Thread overview
ESA's Schiaparelli Mars probe crashed because of integer overflow
Nov 24, 2016
qznc
Nov 24, 2016
Timon Gehr
Nov 25, 2016
Patrick Schluter
Nov 25, 2016
Alix Pexton
Nov 25, 2016
Patrick Schluter
Nov 28, 2016
Kagamin
Nov 25, 2016
Timon Gehr
Nov 25, 2016
Claude
Nov 26, 2016
Walter Bright
Nov 26, 2016
deadalnix
Nov 26, 2016
Walter Bright
Nov 27, 2016
Shachar Shemesh
Nov 27, 2016
deadalnix
Nov 27, 2016
lobo
Nov 27, 2016
Era Scarecrow
Nov 28, 2016
Walter Bright
November 24, 2016
Although the article [0] does not say so literally, it sounds like an integer overflow:

> After trawling through mountains of data, the European Space Agency said Wednesday that while much of the mission went according to plan, a computer that measured the rotation of the lander hit a maximum reading, knocking other calculations off track.

> That led the navigation system to think the lander was much lower than it was, causing its parachute and braking thrusters to be deployed prematurely.

> "The erroneous information generated an estimated altitude that was negative—that is, below ground level," the ESA said in a statement.

That is why we need CheckedInt, folks. Reminder End. ;)


[0] http://phys.org/news/2016-11-glitch-blamed-european-mars-lander.html
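
For readers unfamiliar with the idea behind a checked integer, here is a minimal Python sketch of the difference between wrapping and checked arithmetic on a signed 32-bit range. It is not the D CheckedInt API, and it makes no claim about what the lander's software actually did; the names `wrapping_add` and `checked_add` and the choice of 32 bits are assumptions made for illustration.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1  # assumed width, for illustration only


def wrapping_add(a: int, b: int) -> int:
    """Add like a fixed-width signed 32-bit register: wrap silently on overflow."""
    return (a + b + 2**31) % 2**32 - 2**31


def checked_add(a: int, b: int) -> int:
    """Add, but raise instead of wrapping when the result leaves the int32 range."""
    result = a + b  # Python ints are unbounded, so this never wraps by itself
    if not INT32_MIN <= result <= INT32_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a signed 32-bit integer")
    return result
```

With wrapping arithmetic, `wrapping_add(INT32_MAX, 1)` quietly yields `INT32_MIN`; with the checked version the same call raises, so the error surfaces at the point where it happens rather than in a later calculation.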
November 24, 2016
On 24.11.2016 20:49, qznc wrote:
> Although, the article [0] does not say that literally, it sounds like an
> integer overflow:
>
>> After trawling through mountains of data, the European Space Agency
>> said Wednesday that while much of the mission went according to plan,
>> a computer that measured the rotation of the lander hit a maximum
>> reading, knocking other calculations off track.
>
>> That led the navigation system to think the lander was much lower than
>> it was, causing its parachute and braking thrusters to be deployed
>> prematurely.
>
>> "The erroneous information generated an estimated altitude that was
>> negative—that is, below ground level," the ESA said in a statement.
>
> That is why we need CheckedInt, folks. Reminder End. ;)
>
>
> [0] http://phys.org/news/2016-11-glitch-blamed-european-mars-lander.html

I don't think overflow is what happened. Rather, the statistical model they used to filter the sensor data didn't match reality. It put too much trust into a malfunctioning sensor -- I assume the sensor readings were extremely implausible.
November 25, 2016
On Thursday, 24 November 2016 at 20:22:00 UTC, Timon Gehr wrote:
> On 24.11.2016 20:49, qznc wrote:
>> Although, the article [0] does not say that literally, it sounds like an
>> integer overflow:
>>
>>> After trawling through mountains of data, the European Space Agency
>>> said Wednesday that while much of the mission went according to plan,
>>> a computer that measured the rotation of the lander hit a maximum
>>> reading, knocking other calculations off track.
>>
>>> That led the navigation system to think the lander was much lower than
>>> it was, causing its parachute and braking thrusters to be deployed
>>> prematurely.
>>
>>> "The erroneous information generated an estimated altitude that was
>>> negative—that is, below ground level," the ESA said in a statement.
>>
>> That is why we need CheckedInt, folks. Reminder End. ;)
>>
>>
>> [0] http://phys.org/news/2016-11-glitch-blamed-european-mars-lander.html
>
> I don't think overflow is what happened. Rather, the statistical model they used to filter the sensor data didn't match reality. It put too much trust into a malfunctioning sensor -- I assume the sensor readings were extremely implausible.

Hey, sounds suspiciously similar to the Ariane 5 explosion. Does ESA not learn from its errors, or am I only reading too much into it (probably)?
November 25, 2016
On 25/11/2016 07:14, Patrick Schluter wrote:
> On Thursday, 24 November 2016 at 20:22:00 UTC, Timon Gehr wrote:
>> On 24.11.2016 20:49, qznc wrote:
>>> Although, the article [0] does not say that literally, it sounds like an
>>> integer overflow:
>>>
>>>> After trawling through mountains of data, the European Space Agency
>>>> said Wednesday that while much of the mission went according to plan,
>>>> a computer that measured the rotation of the lander hit a maximum
>>>> reading, knocking other calculations off track.
>>>
>>>> That led the navigation system to think the lander was much lower than
>>>> it was, causing its parachute and braking thrusters to be deployed
>>>> prematurely.
>>>
>>>> "The erroneous information generated an estimated altitude that was
>>>> negative—that is, below ground level," the ESA said in a statement.
>>>
>>> That is why we need CheckedInt, folks. Reminder End. ;)
>>>
>>>
>>> [0] http://phys.org/news/2016-11-glitch-blamed-european-mars-lander.html
>>
>> I don't think overflow is what happened. Rather, the statistical model
>> they used to filter the sensor data didn't match reality. It put too
>> much trust into a malfunctioning sensor -- I assume the sensor
>> readings were extremely implausible.
>
> Hey, sounds suspiciously similar to Ariane 5 explosion. Does ESA not
> learn from its errors or am I only reading too much in it (probably)?

I thought Ariane was caused by error codes from one module being sent on the same bus as telemetry and interpreted as instructions by another module?

A...
November 25, 2016
On 25.11.2016 08:14, Patrick Schluter wrote:
> On Thursday, 24 November 2016 at 20:22:00 UTC, Timon Gehr wrote:
>> ...
>>
>> I don't think overflow is what happened. Rather, the statistical model
>> they used to filter the sensor data didn't match reality. It put too
>> much trust into a malfunctioning sensor -- I assume the sensor
>> readings were extremely implausible.
>
> Hey, sounds suspiciously similar to Ariane 5 explosion. Does ESA not
> learn from its errors or am I only reading too much in it (probably)?

I don't think we have enough information to judge, but remember that writing correct software is hard. This is no less true if it should automatically land a spacecraft on the surface of Mars using real time data from possibly malfunctioning sensors. :)
November 25, 2016
On Friday, 25 November 2016 at 07:14:45 UTC, Patrick Schluter wrote:
> Hey, sounds suspiciously similar to Ariane 5 explosion. Does ESA not learn from its errors or am I only reading too much in it (probably)?

Well, from the little information we have, I suppose we can only be reading too much in it.

So, I too like to think it's just due to an integer overflow. But not from a software engineer's perspective, more from a Marxist approach. One misses a simple test over an integer, and you make a rocket-ship worth billions of good money (that could be used in education, medical care or whatever) explode in tiny cold little pieces, 54 million km from here.

What an ironic and subversive bug, the engineer who did that should be immensely proud of himself. :)
November 25, 2016
On Friday, 25 November 2016 at 09:19:26 UTC, Alix Pexton wrote:
> On 25/11/2016 07:14, Patrick Schluter wrote:
>> On Thursday, 24 November 2016 at 20:22:00 UTC, Timon Gehr wrote:
>>> On 24.11.2016 20:49, qznc wrote:
>>>> Although, the article [0] does not say that literally, it sounds like an
>>>> integer overflow:
>>>>
>>>>> After trawling through mountains of data, the European Space Agency
>>>>> said Wednesday that while much of the mission went according to plan,
>>>>> a computer that measured the rotation of the lander hit a maximum
>>>>> reading, knocking other calculations off track.
>>>>
>>>>> That led the navigation system to think the lander was much lower than
>>>>> it was, causing its parachute and braking thrusters to be deployed
>>>>> prematurely.
>>>>
>>>>> "The erroneous information generated an estimated altitude that was
>>>>> negative—that is, below ground level," the ESA said in a statement.
>>>>
>>>> That is why we need CheckedInt, folks. Reminder End. ;)
>>>>
>>>>
>>>> [0] http://phys.org/news/2016-11-glitch-blamed-european-mars-lander.html
>>>
>>> I don't think overflow is what happened. Rather, the statistical model
>>> they used to filter the sensor data didn't match reality. It put too
>>> much trust into a malfunctioning sensor -- I assume the sensor
>>> readings were extremely implausible.
>>
>> Hey, sounds suspiciously similar to Ariane 5 explosion. Does ESA not
>> learn from its errors or am I only reading too much in it (probably)?
>
> I thought Ariane was caused by error codes from one module being sent on the same bus as telemetry and interpreted as instructions by another module?
>
> A...

Nope, it was an overflowing downcast:
https://around.com/ariane.html
The irony was that the specific module that made the wrong calculation had even been formally proved correct.
This accident also gave Bertrand Meyer (Eiffel) a lot of wind in his sails about design by contract:
https://archive.eiffel.com/doc/manuals/technology/contract/ariane/
In that context it might even be interesting for the D language, as it is one of the few languages that have built-in contracts.
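
To make the "overflowing downcast" and the contract idea concrete, here is a small Python sketch of a narrowing conversion to a signed 16-bit integer, once unchecked and once guarded by an explicit precondition and postcondition. It is only an illustration: the function names and sample values are hypothetical, and Python has no built-in contract syntax, so the contract is written as ordinary checks rather than the way Eiffel or D would express it.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_unchecked(value: float) -> int:
    """Truncate to an integer and keep only the low 16 bits (two's complement)."""
    return (int(value) + 32768) % 65536 - 32768


def to_int16(value: float) -> int:
    """Narrowing conversion with the contract spelled out as explicit checks."""
    # Precondition: the caller must supply a value that fits in 16 bits.
    if not INT16_MIN <= value <= INT16_MAX:
        raise ValueError(f"{value} is outside the signed 16-bit range")
    result = int(value)
    # Postcondition: the result is a valid signed 16-bit integer.
    assert INT16_MIN <= result <= INT16_MAX
    return result
```

The unchecked version maps 40000.0 to -25536 without complaint, while `to_int16(40000.0)` fails at the call site. The point of a contract is that the valid input range is part of the function's interface, so a caller reusing the code under different conditions has to confront it.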
November 25, 2016
On 11/25/2016 4:22 AM, Claude wrote:
> So, I too like to think it's just due to an integer overflow. But not from a
> software engineer perspective, but more from a Marxist approach. One misses a
> simple test over an integer, and you make a rocket-ship worth billions of good
> money (that could be used in education, medical care or whatever) explode in
> tiny cold little pieces, 54 million km from here.
>
> What an ironic and subversive bug, the engineer who did that should be immensely
> proud of himself. :)

I'd like to know what really happened with the code.

But as someone who has worked on flight critical systems for airliners, the designs are required to account for any single failure of anything. That means all inputs must be validated for "reasonableness", and the same for outputs. If any of this is outside reasonable bounds, there must be failover to a backup method.

A negative altitude is not reasonable.
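
As a rough sketch of what such a reasonableness check with failover could look like (in Python, with hypothetical names and bounds; real flight software is far more involved than this):

```python
from typing import Callable

# Hypothetical plausibility bounds for an altitude estimate, in metres.
MIN_ALTITUDE_M = 0.0
MAX_ALTITUDE_M = 150_000.0


def is_reasonable(altitude_m: float) -> bool:
    """True only for values inside the plausible range; NaN fails the comparison."""
    return MIN_ALTITUDE_M <= altitude_m <= MAX_ALTITUDE_M


def select_altitude(primary: Callable[[], float],
                    backup: Callable[[], float]) -> float:
    """Use the primary estimate if it is reasonable, otherwise fail over."""
    for read in (primary, backup):
        value = read()
        if is_reasonable(value):
            return value
    raise RuntimeError("no source produced a reasonable altitude")
```

For example, `select_altitude(lambda: -2000.0, lambda: 3700.0)` rejects the negative primary reading and returns the backup value of 3700.0.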

-----

It reminds me of college, where we were told that if we worked a problem and came up with unreasonable answers, such as negative energy, we were expected to note:

   "I know this answer is unreasonable, but I cannot find the mistake."

and the worst you'd get is a 0. Unreasonable answers, and no note, meant you'd get a negative score!
November 26, 2016
On Saturday, 26 November 2016 at 05:50:19 UTC, Walter Bright wrote:
> It reminds me of college, where we were told that if we worked a problem and came up with unreasonable answers, such as negative energy, we were expected to note:
>
>    "I know this answer is unreasonable, but I cannot find the mistake."
>
> and the worst you'd get is a 0. Unreasonable answers, and no note, meant you'd get a negative score!

You got a great teacher right there!

November 26, 2016
On 11/26/2016 3:16 AM, deadalnix wrote:
> On Saturday, 26 November 2016 at 05:50:19 UTC, Walter Bright wrote:
>> It reminds me of college, where we were told that if we worked a problem and
>> came up with unreasonable answers, such as negative energy, we were expected
>> to note:
>>
>>    "I know this answer is unreasonable, but I cannot find the mistake."
>>
>> and the worst you'd get is a 0. Unreasonable answers, and no note, meant you'd
>> get a negative score!
>
> You got a great teacher right there!
>

It was actually institute policy, not an individual teacher's. Another policy was that no grades could be based on attendance (unless it was P.E.). A third was that if you could pass the finals, you could opt out of any class and still receive full credit for it. A fourth was that grades would not be on a curve: you either met the standard or you didn't.

There's more.

Oh, one more you'll recognize. You'd get a 0 on any computation where you prematurely rounded the results :-) The algebra had to be worked out to its final form before plugging in numbers. (Lots of times intermediate terms would algebraically cancel out, so calculating intermediate values would result in spurious rounding errors.)

I thought it was a fairly enlightened system of grading, quite a step up from what I was used to.
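
The premature-rounding rule above can be illustrated with a small Python sketch. The physics problem and function names are made up for the example: the kinetic energy per unit mass after a fall of height h is v²/2 with v = sqrt(2gh), which simplifies algebraically to g·h.

```python
import math


def energy_premature(g: float, h: float) -> float:
    """Compute v first, round it to one decimal, then plug it into v**2 / 2."""
    v = round(math.sqrt(2 * g * h), 1)  # the premature rounding step
    return v ** 2 / 2


def energy_simplified(g: float, h: float) -> float:
    """Work the algebra out first: (sqrt(2*g*h))**2 / 2 simplifies to g * h."""
    return g * h
```

With g = 9.81 and h = 10, the simplified form gives about 98.1, while the prematurely rounded version gives 98.0: the square root cancels on paper, so the rounding error it introduces is entirely avoidable.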

