737 Max

www.hifisonix.com
Joined 2003
Paid Member
I read through the Barr report.

It seems basic stuff like watchdogs was not implemented, the functions were excessively complex, and, if I understood it correctly, there were up to 16 million potential failure modes: totally out of control and untestable.

Of course, I've never written embedded code that's more than 1-2 thousand lines (C, C++), and keeping track of even that was difficult enough, so I imagine many thousands of lines of code and multiple coders must make for a tough project to manage.

Nevertheless, some of the stuff Toyota did is unforgivable. I have watchdogs on all my stuff since it's real-time.
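
Just to be clear about what I mean by a watchdog here: nothing exotic, just a hardware timer that resets the processor unless the software keeps proving it is still making progress. A minimal sketch of the pattern in C (the register address, reload key, and the health-check functions are all made up for illustration; every MCU family has its own watchdog peripheral):

Code:
#include <stdint.h>
#include <stdbool.h>

/* Placeholder register address and reload key - treat these two
 * defines as invented; consult your MCU's reference manual. */
#define WDT_RELOAD_REG  (*(volatile uint32_t *)0x40003000u)
#define WDT_RELOAD_KEY  0xAAAAu

/* Stubs standing in for the real health checks of each critical task. */
static bool sensors_ok(void)      { return true; }
static bool control_loop_ok(void) { return true; }

int main(void)
{
    /* The hardware watchdog is assumed to be enabled and locked by the
     * start-up code, so software cannot quietly switch it off again. */
    for (;;)
    {
        bool healthy = true;

        healthy = sensors_ok()      && healthy;  /* read and sanity-check inputs  */
        healthy = control_loop_ok() && healthy;  /* run the real-time control law */

        if (healthy)
        {
            WDT_RELOAD_REG = WDT_RELOAD_KEY;  /* kick the dog */
        }
        /* else: stop kicking and let the watchdog force a clean reset */
    }
}

If any task hangs or a check fails, the kicking stops and the hardware restarts the processor into a known state.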

When the 737 report comes out, it will make interesting reading. I somehow feel it will lean more towards bad management decisions than plain bad coding practice. But we won't know until then.
 
There was the Therac-25, which was apparently a smaller project code-wise, but from what I read somewhere it was all written by one person who didn't have even half-decent programming practices. Then again, I learned 6502 assembly circa 1977 and C in 1987, and the only helpful programming paradigm I knew of for many years was structured programming. That's now so old hat, and so taken for granted as part of programming, that it's not even mentioned anymore.

I've been catching up over the last few years; many methods and "paradigms" for improving programming practice have been developed in recent decades. Making the code clearer and easier for humans to read just happens to make bugs easier to see and fix, which makes the code more reliable.

It's arguable that bad management led to bad coding practices, which is inexcusable in companies the size of Boeing and Toyota. As for the Therac-25, if it's in any way "excusable" that a smaller company had bad management practices that led to bad coding practices, then that company shouldn't have taken on the product at all.

I have this idea that goes totally against commercial "trade secrets" and such, but I wonder if safety-critical code should be made open source (or released under an "open readable" license: you can study it and publicly critique it all you want, but you need permission to use the code in your own product).
 
I have a long background in pacemaker design, where software errors are not tolerated. Verification of software (I sometimes spell it softwhere, or softwear) takes 5-6 times longer than hardware verification, which is why we used to say: hardware you can change, software you cannot.
Well, the good part is that it is a rather tightly controlled temperature environment for the electronics.

If the temperature environment strays by +/- 5 degrees, it doesn't matter whether the pacemaker works.

jn
 
This is the document I recall:
https://nepp.nasa.gov/whisker/reference/tech_papers/2011-NASA-GSFC-whisker-failure-app-sensor.pdf



I recall that in the late 1990s there was a lot in EE Times about the then-new RoHS standards and the need for lead-free solder. Even then there was concern about tin whiskers in lead-free solder and its use in satellites. There was talk of allowing leaded solder in safety-critical areas, but it looks like not everyone got the message.
I remember that event sequence chart. What a mess.

They went lead-free by edict at work while I was away on a week's vacation... I really, really took issue with that. It took a while, but I got them to reverse it.
The RoHS exemptions were the "out" I provided upper management.

Even now, many of the COTS units will have one or two components in them that grow whiskers. We have about 5 thousand units, so on occasion we have to troubleshoot. So ridiculous to have to deal with incorrect global decisions.

jn
 
www.hifisonix.com
Joined 2003
Paid Member
Talking about steering wheels: when I was involved with auto ('96 - 2001), there were a few projects in progress at the big tier 1 suppliers looking into 'steer by wire'. The push to get the weight of autos down is huge, and the engineers figured that if they could throw out the traditional steering systems with their linkages and use a small motor plus gear wheel at each front wheel, they'd save quite a bit of weight. I left to go to another business within the company in 2001, but AFAIK it was never taken up by the auto manufacturers.

Steering assist nowadays is almost always done with a small electric motor that helps you as you turn the wheel - again, a big weight and hazardous-materials saving over traditional servo systems. My wife once had a Peugeot 101 - a tiny car - without steering assist. She developed strong arms :D
 
www.hifisonix.com
Joined 2003
Paid Member

Interesting. I did a bit of assembler coding on the Z80 and 8085 back in the day, and then some on one of the ST controllers. I only got back into coding about 12 years ago, on the Philips 8051 derivatives and the Keil 'C' compiler - but I quickly ditched all that and went down the ARM route and the mbed platform when Philips brought out the ARM embedded processors (an industry first - it raised a lot of eyebrows at the time but ended up fundamentally changing the whole industry).

Once you've started using 32-bit embedded processors, it's very difficult to go back to 8-bit - well, let me put it this way: you'd probably be a masochist! I've not tried the Microchip 32-bit devices, but I expect they're also pretty good.
 
AX tech editor
Joined 2002
Paid Member
I will only believe that a piece of software is error-free if all three of the following conditions are simultaneously met:

1 - No more than 2 lines of code;

2 - It ran for 10 years with no problems;

3 - I wrote it. :cool:

If I ran a software company, I would never hire anyone who even believed that error-free software exists. Such persons are dangerous.

Jan
 
www.hifisonix.com
Joined 2003
Paid Member
I once worked with a software engineer (I did the hardware side) who proudly told me when we met for the first time, 'I don't make mistakes'. The project took a year and it was his first big challenge.

He was quite humble by the end of it. If there's one thing I learned, it was to go back and check my work two or three times. I've fallen out of the habit in DIY, and every now and then I get 'humbled' by people pointing out my mistakes. 'Right first time' methodology takes a lot of personal discipline and effort - but it can never totally eliminate what I would call 'honest' mistakes.
 
Reading this article about the latest report, it looks like Boeing will be required to do all the things it should have been doing all along:
The review team’s analysis identified 61 corrective and preventative actions to address the two software anomalies; those actions are organized into four categories to help manage and execute the scope of the work. Below are the four categories and examples of the resulting actions that Boeing has already begun working on:

Perform code modifications: Boeing will review and correct the coding for the mission elapsed timer and service module disposal burn.
Improve focused systems engineering: Boeing will strengthen its review process including better peer and control board reviews, and improve its software process training.
Improve software testing: Boeing will increase the fidelity in the testing of its software during all phases of flight. This includes improved end-to-end testing with the simulations, or emulators, similar enough to the actual flight system to adequately uncover issues.
Ensure product integrity: Boeing will check its software coding as hardware design changes are implemented into its system design.
NASA Update on Orbital Flight Test Independent Review Team – Commercial Crew Program
 
AX tech editor
Joined 2002
Paid Member
You can debate whether the listed omissions etc. are really software errors. For me, they are more design flaws.

For the MCAS to look at only one of the AOA sensors, and to get lost when that part malfunctions, is a design flaw, not a software error. The software worked as designed; they had just forgotten to 'tell' the software what to do when the AOA sensor indicated impossible angles.
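
Just to illustrate the distinction, here is a purely hypothetical sketch in C of the kind of input check such a requirement would imply (it has nothing to do with the real flight code; the function name and the limits are invented):

Code:
#include <stdbool.h>
#include <math.h>

/* Illustrative limits only - these are not real 737 numbers. */
#define AOA_MIN_DEG          -20.0
#define AOA_MAX_DEG           45.0
#define AOA_MAX_DISAGREE_DEG   5.0  /* max allowed left/right vane disagreement */

/* Return true only if both vanes read physically plausible angles and
 * roughly agree with each other. Otherwise the caller should inhibit the
 * automatic trim function and annunciate the failure to the crew. */
bool aoa_input_valid(double left_deg, double right_deg)
{
    if (left_deg  < AOA_MIN_DEG || left_deg  > AOA_MAX_DEG) return false;
    if (right_deg < AOA_MIN_DEG || right_deg > AOA_MAX_DEG) return false;
    if (fabs(left_deg - right_deg) > AOA_MAX_DISAGREE_DEG)  return false;
    return true;
}

Nothing like this was in the requirements, so the code faithfully acted on a single, implausible input - a specification problem, not a coding one.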

Jan
 