
An excerpt from Robotics for Software Engineers by Andreas Bihlmaier

This article explores several specialized software fields that are important for robotics.

Read this if you are a software developer who is interested in learning more about robotics.




Robotics is a highly interdisciplinary field. We need at least mechanics, electronics and software to create a robot. But even within the field of software, there are various specialized fields that are important for robot software. Let’s take a brief look at each of these fields and get to understand their relation to robotics.

Robots as Embedded Real-Time Systems

The first relevant area is software for embedded real-time systems. Embedded software is software that runs within devices that are not used as general-purpose computers. Frequently these devices have limited computational and memory resources. Another common characteristic of embedded software is that users do not directly interact with it. Embedded software is often referred to as firmware.

Real-time software is software under time constraints. For real-time software it is not sufficient to give the correct result (logical correctness); it must also give the result at the correct time (temporal correctness). Because it is such a widespread misconception that real-time means “fast”, let me reiterate this point: real-time software has to guarantee that computations produce a result at a specified point in time. This is the same as stating that the software must exhibit temporal determinism.

Combining both aspects, the field of embedded real-time software concerns software that controls devices under real-time constraints. Three examples of industries employing many embedded real-time software developers are aerospace, automotive, and communications. A modern car contains on the order of 100 electronic control units (ECUs). Each ECU has its own processing unit running embedded real-time software. They communicate with each other via a bus system and control everything in the car, from various engine parts to brakes to windshield wipers. The software is an integral part of the ECUs and the driver likely doesn’t even know about their existence, thus the embedded aspect is evident. The real-time aspect is also easy to figure out, considering not only that fuel needs to be injected into the engine’s cylinders at just the right time, but also that one really wants to have a guaranteed time between stepping on the brake pedal and the brake ECU actuating the brakes.

As you can see from the examples, embedded real-time software is a big and varied field. It has important intersections with robotics as well. Robot systems are commonly composed of devices running their own software. In many cases, this is embedded real-time software that takes care of (low-level) sensor data acquisition and actuator control. Not all of robotics, of course, is about close-to-the-metal embedded real-time software. If your main interest is in the higher layers of the robot software stack, you don’t have to become an expert in embedded real-time software. Nevertheless, everyone working in robotics, no matter on what layer, benefits from a thorough understanding of the basics of embedded and real-time software.

Python is a popular modern multi-paradigm language with a clean syntax and comprehensive standard library; it is often used for robotics and AI. Unfortunately, it is ill-suited for embedded real-time programming. The main drawbacks of Python for this use case are being an interpreted language and having a garbage collector.

In contrast to (pre-)compiled programming languages, interpreted languages (such as Python) are not translated to machine code before execution. Instead, the source code, or an intermediate representation of it, is directly read and executed by the language runtime, i.e. the interpreter. For a simple loop, such as:

 
 iterations = 100000000  # 100 million
 total = 0  # named total to avoid shadowing the built-in sum()
 for i in range(iterations):
   total += i
 print(total)
  

the difference in performance compared to the same loop written and compiled in C++ is on the order of 100x, i.e. Python runs a hundred times slower. Please note that this is an extreme example and there are workarounds in the Python ecosystem. The point here is that there is a significant performance penalty for interpreted languages, in particular on resource-constrained systems. This is also true for memory usage, and memory is often scarce in embedded systems. While memory is in the range of tens of gigabytes (GB) for PCs nowadays, it is still common for microcontrollers to have only a few hundred kilobytes (KB) or a few megabytes (MB) of memory.
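One such workaround is to move the loop out of interpreted code, for example by calling the built-in sum(), which in CPython is implemented in C. The following is a minimal sketch for comparing the two variants. The helper names loop_sum and timed are illustrative, the iteration count is much smaller than in the loop above to keep the run short, and the measured times will differ from machine to machine:

```python
import time


def loop_sum(iterations):
    """Sum 0..iterations-1 with an explicit, interpreted loop."""
    total = 0
    for i in range(iterations):
        total += i
    return total


def timed(func, *args):
    """Call func once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 1_000_000  # far fewer iterations than in the text, to keep the run short
    slow_result, slow_time = timed(loop_sum, n)
    fast_result, fast_time = timed(sum, range(n))  # the loop runs inside sum()
    print(f"explicit loop: {slow_time:.4f} s, built-in sum: {fast_time:.4f} s")
```

Both variants compute the same value; only the place where the loop is executed changes.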

Apart from resource utilization, which is mostly an embedded software issue, garbage collection (GC) poses a significant issue for real-time software. In programming languages without GC, memory must be managed explicitly. It is the software developer’s responsibility to manually allocate memory for use and to manually release it after. Having to perform manual memory management is not only a nuisance, but also a constant source of bugs, e.g. memory leaks. In languages with GC, memory management is automated. Memory is automatically allocated and released. The issue for real-time software is the question of when these automated memory management operations are performed, especially when the GC becomes active and releases memory. In non-real-time software we are only interested in average performance. It does not matter when the GC becomes active and delays program execution, as long as things are sufficiently fast on average. However, in real-time systems, we care very much about deterministic behavior of each and every program execution, not just the average case. The GC introduces random delays into program execution, thus rendering it difficult or impossible to provide real-time guarantees.
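To get a feeling for such pauses, one can time a single collection run. The sketch below assumes CPython, which releases most objects through reference counting and uses its gc module to find objects that reference each other in cycles. The function name collection_pause is illustrative, and the measured pause depends on the machine and the amount of garbage:

```python
import gc
import time


def collection_pause(num_cycles):
    """Create num_cycles reference cycles, then time one full collection."""
    gc.disable()  # keep the collector from running on its own while garbage builds up
    try:
        for _ in range(num_cycles):
            a, b = [], []
            a.append(b)  # a and b reference each other, so reference
            b.append(a)  # counting alone cannot release them
        start = time.perf_counter()
        collected = gc.collect()  # explicit full collection
        pause = time.perf_counter() - start
    finally:
        gc.enable()
    return collected, pause


if __name__ == "__main__":
    for cycles in (1_000, 100_000):
        collected, pause = collection_pause(cycles)
        print(f"{cycles} cycles: {collected} objects collected in {pause * 1000:.2f} ms")
```

In a normal program the collector decides by itself when to run, so a pause like this can land in the middle of a time-critical computation.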

I will still use Python for the following examples of embedded real-time programming, because the concepts can be quite well expressed in Python. Just keep in mind that Python should not be your first choice when implementing these parts of the robot software stack.

When developing embedded software, we must pay special attention to resource utilization:

  • ROM (non-volatile memory)
  • RAM
  • CPU

Non-volatile memory (NVM) in embedded systems is usually not provided by hard disks or solid-state drives, but rather by small amounts of EEPROM or flash memory. The NVM is often not even a separate component, but part of the microcontroller chip itself. For historical reasons, NVM is commonly referred to as read-only memory (ROM) in the context of embedded systems, although it is not strictly read-only anymore.

While most software developers are used to thinking about required storage space for program data, they usually do not pay much attention to the size of the program itself. And why would they? Even the code of large programs usually does not exceed a couple hundred megabytes (MB) with disk space measured in terabytes (TB). In embedded systems the “disk”, i.e. NVM / ROM, has only a couple hundred kilobytes (KB) or a few MB. Thus, program size matters and sometimes needs to be optimized for.

Many software developers also take memory (RAM) usage into account during development. For example, one should generally not read files entirely into memory; rather, one should process them in chunks to avoid running out of memory when processing large files. However, resources in embedded systems are much more constrained than in PCs and servers. Data structures that one would normally keep in main memory can easily exceed the available memory in an embedded system. Thus, memory usage must always be kept in mind and some familiar implementations might have to be adapted.
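As a minimal sketch of chunked processing, the function below reads a file in fixed-size pieces so that memory use is bounded by the chunk size instead of the file size. The function name, the chunk size, and the byte counting (a stand-in for real processing) are illustrative assumptions:

```python
def count_bytes_in_chunks(path, chunk_size=4096):
    """Count the bytes in a file while reading at most chunk_size bytes at a time."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            total += len(chunk)  # stand-in for real per-chunk processing
    return total
```

On an embedded target the chunk size would be chosen to fit the available RAM, which may be far smaller than the 4096 bytes assumed here.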

Finally, CPU performance of embedded systems is much lower than in PCs. To give you an idea, let’s have a closer look at floating point operations per second (FLOPS) as a performance metric: A (high-end) microcontroller has on the order of 0.1 GFLOPS, a (high-end) system on a chip (SoC) about 10 GFLOPS and a (high-end) desktop CPU about 100 GFLOPS. Put differently, there is a performance difference of 1000x between PCs and microcontrollers. Thus, keeping an eye on efficient usage of computational resources by selecting the right approach, algorithms, and data structures is essential in embedded systems.

Moving from embedded to real-time, there are a number of topics to pay attention to:

  • Worst-case execution time (WCET)
  • Stochastic hardware performance
  • Schedulers, task priorities, and preemption
  • Synchronization primitives and inter-process communication (IPC)
  • Interrupts and timers

If we want to guarantee that the results of computations are available on time, we need to start by knowing the duration of these computations. While it sounds simple enough to measure how long it takes to run a certain function, there are some pitfalls you should be aware of. Because we must guarantee that the deadlines are always met, we do not care about the average execution time; instead, we are interested in the worst-case execution time (WCET). Functions that contain conditional statements or loops might have a different execution time depending on their input or the overall program state. We are interested in finding the input data and program state that lead to the worst, i.e. the longest, execution time. Once we have analyzed the program for the worst-case execution paths, it is still not sufficient to measure the execution time only once.

The reason is that most modern CPUs, including those in microcontrollers, have been optimized to provide high performance on average. We cannot go into the details of modern computer architecture here. In a nutshell, caches, pipelining, out-of-order execution, speculative execution and other optimizations greatly improve the average performance of CPUs, but they also lead to a huge difference–sometimes several orders of magnitude–between the fastest and the slowest execution of the same code. Unfortunately, it is practically impossible to analyze what execution times will occur from looking at the source code and the CPU datasheet. The practical solution to get the WCET of a function is to run the function many times while putting the overall system into different states and under different loads. The result is an execution time histogram like the one shown in figure 1.


Figure 1. Histogram of execution times for worst-case execution time (WCET) analysis.


The horizontal axis shows execution times and the vertical axis shows the number of times this execution time was measured. While more advanced WCET analysis techniques also analyze the distribution of execution times, we limit ourselves to looking at the highest measured execution time. Although we cannot prove that a worse execution time cannot occur, doing a sufficient number of measurements under sufficiently diverse system states is good enough in practice.
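The measurement approach described above can be sketched in a few lines. The function names, the bin width, and the workload function are illustrative assumptions. Note that this sketch only repeats the measurement; putting the system into different states and under different loads has to happen around it:

```python
import time
from collections import Counter


def measure_execution_times(func, runs):
    """Run func repeatedly and return each run's duration in nanoseconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter_ns()
        func()
        durations.append(time.perf_counter_ns() - start)
    return durations


def histogram(durations, bin_width_ns):
    """Count how many durations fall into each bin of width bin_width_ns."""
    return Counter((d // bin_width_ns) * bin_width_ns for d in durations)


def observed_wcet(durations):
    """Highest measured execution time: an estimate, not a proven bound."""
    return max(durations)


def workload():
    """Hypothetical function under test."""
    sum(range(1000))


if __name__ == "__main__":
    samples = measure_execution_times(workload, runs=10_000)
    for bin_start, count in sorted(histogram(samples, 10_000).items()):
        print(f"{bin_start / 1000:8.0f} us: {count}")
    print(f"observed WCET: {observed_wcet(samples) / 1000:.1f} us")
```

On a desktop operating system the printed histogram typically shows a long tail of rare slow runs, which is exactly the part that matters for the WCET.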

The next real-time considerations are schedulers, task priorities, and preemption. Let’s look at an example system with two tasks, task A and task B, in which task A must be executed every 10 ms and task B every 100 ms. In other words, task A has a periodic deadline of 10 ms and task B of 100 ms. Task A has a WCET of 2 ms and task B of 50 ms. We cannot simply run the tasks sequentially, as we would miss the deadline of task A 5 times while task B is executing. Instead, we need to pause task B every 10 ms, run task A, then continue running task B until the next period of A. Pausing a task and switching to another task is also known as preemption. The software that decides what tasks to run at what time is the scheduler. If a task is preempted by another one, we speak of one task having a higher priority than the other one. In our case, we need to assign a higher priority to task A relative to task B. The resulting system behavior is illustrated in figure 2.


Figure 2. Example of preemptive scheduling for two tasks with fixed priorities.
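The two-task example can also be replayed in a small simulation. The following is a simplified sketch, not a real scheduler: time advances in 1 ms ticks, every task always needs its full WCET, context switches cost nothing, and each deadline equals the task's period. The function name and the task tuple layout are illustrative:

```python
def simulate(tasks, duration_ms):
    """Simulate fixed-priority preemptive scheduling in 1 ms ticks.

    tasks: list of (name, period_ms, wcet_ms), highest priority first.
    Returns (timeline, missed): the task that ran in each tick (None if
    idle) and the (name, time) pairs at which a new job was released
    although the previous job of that task had not finished.
    """
    remaining = {name: 0 for name, _, _ in tasks}
    timeline, missed = [], []
    for t in range(duration_ms):
        for name, period, wcet in tasks:
            if t % period == 0:  # a new job is released
                if remaining[name] > 0:
                    missed.append((name, t))  # previous job missed its deadline
                remaining[name] = wcet
        # Run the highest-priority task that still has work to do.
        running = next((name for name, _, _ in tasks if remaining[name] > 0), None)
        if running is not None:
            remaining[running] -= 1
        timeline.append(running)
    return timeline, missed


if __name__ == "__main__":
    timeline, missed = simulate([("A", 10, 2), ("B", 100, 50)], 100)
    print("".join(name or "." for name in timeline))
    print("missed deadlines:", missed)
```

With task A at the higher priority, no deadline is missed and task B's 50 ms of work is finished 64 ms after the start. Swapping the two entries, so that B has the higher priority, reproduces the five missed deadlines of task A mentioned above.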


Real-time operating systems (RTOSes) provide a suitable scheduler and corresponding task models for these real-time demands. Tasks are often not completely independent, but instead they have to exchange data with each other. Hence, RTOSes provide synchronization primitives, such as mutexes and semaphores, as well as inter-process communication (IPC) mechanisms, such as shared memory and message queues.
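To illustrate the concepts, the sketch below uses Python's standard library: two threads stand in for RTOS tasks, queue.Queue for a message queue, and threading.Lock for a mutex. An RTOS offers its own APIs for these mechanisms, so the names and structure here are assumptions made for illustration only:

```python
import queue
import threading


def run_pipeline(readings):
    """Pass readings from a producer thread to a consumer thread."""
    messages = queue.Queue(maxsize=4)  # bounded message queue between the tasks
    lock = threading.Lock()            # mutex guarding the shared state
    state = {"count": 0, "total": 0}

    def producer():
        for value in readings:
            messages.put(value)  # blocks while the queue is full
        messages.put(None)       # sentinel: no more data

    def consumer():
        while True:
            value = messages.get()  # blocks until a message arrives
            if value is None:
                break
            with lock:  # only one thread may update the state at a time
                state["count"] += 1
                state["total"] += value

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state
```

The bounded queue also shows a typical embedded design choice: its memory use is fixed up front, and a producer that is too fast is slowed down instead of exhausting memory.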

Finally, real-time systems require interrupts and timers. Interrupts are a hardware mechanism that interrupts the sequential execution of a program and executes a predefined function, the so-called interrupt service routine (ISR). Interrupts can originate either from within the system or externally. In either case they signal that some kind of event has occurred. Hardware timers can be configured to cause an interrupt, either after a defined amount of time or periodically with a defined frequency. For example, the scheduler uses timers to regularly interrupt the running program and execute itself in the ISR to determine whether to continue running the current program or to run a different one. Another example of the use of interrupts is avoiding cyclic checking of inputs for changes, so-called polling. Instead of occupying precious CPU resources with regularly polling an input, such as:

 
 while not input.status():  # busy-wait until the input signal becomes true
   pass
 do_something()  # then run do_something()
  

we could–given hardware support–register an ISR for the input change interrupt:

 
 input.on_change(True, do_something)  # run do_something() when input changes to True
  

When an interrupt is triggered, the CPU and operating system have to switch from executing the current task to running the ISR. This process has a non-zero duration. The duration from the interrupt being triggered in hardware to the ISR being executed is known as interrupt latency. Given what we learned about the (non-)determinism of modern computer architectures above, the latency is not always exactly the same. This deviation from the average duration is called jitter. See figure 3 for an illustration.
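Given a set of measured latencies, these quantities are straightforward to compute. The sketch below takes jitter to be the largest deviation from the average latency, which is one possible reading of the definition above. The function name and the sample values are made up for illustration:

```python
from statistics import mean


def latency_stats(latencies_us):
    """Summarize interrupt latency samples given in microseconds."""
    average = mean(latencies_us)
    return {
        "average": average,
        "worst_case": max(latencies_us),
        # Jitter as the largest deviation from the average latency.
        "jitter": max(abs(sample - average) for sample in latencies_us),
    }


if __name__ == "__main__":
    samples_us = [10, 12, 11, 15, 12]  # made-up measurements
    print(latency_stats(samples_us))
```

For real-time guarantees, the worst-case latency is the number to design against; the average mainly serves as the reference point for the jitter.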


Figure 3. Illustration of interrupts, their latency, and jitter.


This concludes the brief introduction to embedded real-time systems and their relation to robotics software.

If you want to learn more about the book, check it out on Manning’s liveBook platform here.