It’s been four years now since plucky little Philae attempted to land on comet 67P/CG. It was a thrilling event, with an inexplicable failure that resulted in a slow-motion crash landing and eventual loss of the lander.
This year engineer András Balázs reviews what happened to Philae. As he remarks dryly, “not everything went as planned.” ([1], p. 90)
I’m sure we are all shocked, shocked! to hear that software was involved….
Like all spacecraft, Philae was designed to be fault tolerant, at least for certain values of “fault”. The system included redundant processors and other equipment, software and communications protocols with error detection and recovery features, and fallback recovery processes. For instance, the system had “triple-redundant emergency telecommand decoders”: three duplicate message decoders to assure correct results even in the face of multiple problems.
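To make the idea of triple redundancy concrete, here is a minimal sketch of one common way to combine three redundant units: majority voting. The article does not say whether Philae’s decoders actually worked this way, so treat this as an illustration only. The function name `majority_vote` and the sample commands are made up for the example.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(outputs: Sequence[Optional[bytes]]) -> Optional[bytes]:
    """Return the value a strict majority of decoders agree on, else None.

    Hypothetical helper, not Philae's flight code. A decoder that failed
    outright reports None, which never counts as a vote.
    """
    votes = Counter(o for o in outputs if o is not None)
    if not votes:
        return None  # every decoder failed
    value, count = votes.most_common(1)[0]
    # Require agreement from more than half of all decoders, dead ones included.
    return value if count > len(outputs) / 2 else None


# One corrupted decoder is outvoted by the other two.
good = majority_vote([b"WAKE", b"WAKE", b"W\x00KE"])

# Two independent faults leave no majority, so no command is accepted.
bad = majority_vote([b"WAKE", b"W\x00KE", None])
```

A voter like this masks a single faulty unit out of three. With two simultaneous faults it can only refuse to answer, which is one way redundancy runs out.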
Nevertheless, she failed.
Balázs’ six lessons learned are a litany of humbling epiphanies. They range from not being prepared for “unconceivable” problems to not being prepared for “conceivable” problems. “When Even Redundancy is Useless”. And, of course, there were tradeoffs between science and safety, many missed opportunities and just plain poor decisions. Millions of kilometers from home, every human error is unfixable, and every lost opportunity is lost forever.
“The Philae mission was a jump into the unknown.” ([1], p. 93)
Not only was Philae a bold and risky leap into the unknown, and not only was it, like all space missions, beyond our ability to reach it for repairs if needed, it was the first and only mission. No engineer can hope that the first try will work—that’s called a “throw away”, intended only to teach us how to build a successful version.
But there was no option to do a throw away trial. There seldom is in space missions.
“The software community could benefit from more such evaluations of the problems that so frequently occur in projects. —Michiel van Genuchten and Les Hatton” ([1], p. 90)
The editors of IEEE Software praised Balázs’ frank assessment of the failures of the project. I’ll second the praise. It’s not pleasant to look back at one’s own mistakes and failures.
I have to say that I am disappointed that we still don’t know why both of the redundant harpoons and the hold-down thruster failed to anchor the lander. Suspicion falls on the software that was supposed to detect the impact and trigger the anchoring. (In my head, I’m seeing the software taking too long, the spacecraft bouncing and then shooting the harpoons into nothing….)
But we don’t really know. Sigh.
- András Balázs, A Comet Revisited: Lessons Learned from Philae’s Landing. IEEE Software, 35 (4):89-93, 2018.