Developing software for safety in medical robotics
The use of robotics in medtech continues to grow. Whether it’s a cobot working alongside humans to automate manufacturing or a surgical robot in the OR, a single point of failure can cause serious harm. The incorporated software systems must take safety into account.
IEC 61508-3 offers several techniques for developing software for safety-related systems, which the medical device software development community can draw on when designing and implementing risk-control measures as required by ISO 14971.
Developing “safe” software begins with establishing a software coding standard. IEC 61508-3 promotes using well-known techniques, including:
- Using modular code.
- Using preferred design patterns.
- Avoiding reentrance and recursion.
- Avoiding dynamic memory allocations and global data objects.
- Minimizing the use of interrupt service routines and locking mechanisms.
- Avoiding dead wait loops.
- Using deterministic timing patterns.
Keep it simple
There are other suggestions under the “keep it simple” principle around limiting the use of pointers, unions and type casting, and not using automatic type conversions while encouraging the use of parentheses and brackets to clarify intended syntax.
A hazard analysis might identify that your code or data spaces can get corrupted. There are well-known risk-control measures around maintaining code and memory integrity which can be easily adopted. Running code from read-only memory, protected with a cyclic redundancy check (CRC-32) that can be checked at boot time and periodically during runtime, prevents errant changes to the code space and provides a mechanism to detect these failures.
Segregating data into different memory regions that can be protected through virtual memory space and using CRC-32 over blocks of memory regions or even adding a checksum to each item stored in memory allows these CRC/checksums to be checked periodically.
CRC/checksums can be verified on each read access to a stored item and updated atomically on every write access to these protected items. Building tests into the software is an important tool as well. It’s a good idea to perform a power-on self-test (POST) at power-up to make sure the hardware is working and to check that your code and data spaces are consistent and not corrupt.
What else can happen?
Another hazardous situation arises when controlling and monitoring are performed on the same processor or in the same process. What happens to your safety system if your process gets hung up in a loop? Techniques that separate the monitor from the controlling function introduce some complexity to the software system, but this complexity can be offset by ensuring the controlling function implements the minimum safety requirements while the monitor handles the fault and error recovery.
Fault detection systems and error recovery mechanisms are much easier to implement when designed from the start. Poorly designed software can experience unexpected, inconsistent timing, which results in unexpected failures. It’s possible to avoid these failures by controlling latency in the software. State machines, software watchdogs and timer-driven events are common design elements to control this.
Keep an eye on communications
Inter-device and inter-process communications are another area of concern for safety-related systems. The integrity of these communications must be monitored to ensure they are robust. Using CRC-32 on any protocol between two entities is recommended. Separate CRC-32 on the headers and the payload helps to detect corruption of these messages. Protocols should be written and designed with the idea that at any time, your system could reboot due to some fault. Thus, building in retry attempts and stateless protocols is recommended.
Safe operational software verifies the ranges of all inputs at the interface where it is encountered; checks internal variables for consistency; and defines default settings to help recover from an inconsistent setting or to support a factory reset. Software watchdog processes can be put in place to watch the watcher and ensure that processes are running as they should.
By taking these techniques into account, software developers working on medical robotic devices can better address the concerns of safety-related systems.