Did you know that a programming language does not always define what a program does? This happens in the rare but real cases of undefined behavior: operations for which the language specification imposes no requirements on what the program does. This article looks at some cases that lead to undefined behavior, why it is a problem, and how to avoid it.
Why Is Undefined Behavior a Bug?
The first question you might ask is, “Why is undefined behavior a bug?” The short answer is that once a program executes an operation with undefined behavior, the language makes no promise about what happens next, and the result can be dangerous. The program may do whatever the code generated by the compiler happens to do. At best, the compiler notices the problematic code and reports a warning or an error. At worst, the bug introduces security vulnerabilities into the system or silently corrupts data by modifying variables in unintended ways.
However, the existence of undefined behavior in a language is not necessarily a bad thing. It exists largely for the sake of performance: because the compiler is allowed to assume that undefined behavior never happens, it can skip run-time checks that would be computationally expensive and optimize the program more aggressively.
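As a small illustration of that trade-off, consider the sketch below. The function name `increment_is_larger` is made up for this example. Because signed overflow is undefined, an optimizing compiler may assume `x + 1` never overflows and reduce the comparison to a constant, though whether it actually does so depends on the compiler and its settings.

```c
#include <stdbool.h>

/* Hypothetical helper: reports whether x + 1 is greater than x. */
bool increment_is_larger(int x)
{
    /* For x == INT_MAX, x + 1 overflows, which is undefined behavior.
       The compiler may therefore assume that case never occurs and
       compile the whole function down to "return true", with no
       comparison and no overflow check at run time. */
    return x + 1 > x;
}
```

For every input where the addition does not overflow, the function returns true, which is why the shortcut is legal.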
Examples of Undefined Behavior
At this point, you may want some examples of operations that a language leaves undefined. One example is accessing memory outside the bounds of an object, like trying to read the 6th item of an array that only has 5 items. This looks like a careless mistake, but it is easy to make if you forget that arrays in C and C++ are zero-indexed, so the last valid index of a 5-item array is 4 (a classic off-by-one error). Sometimes the operating system contains the error by raising a segmentation fault and terminating the program, but that is not guaranteed: an access just past the end of an array often lands in memory the program owns and silently reads or overwrites unrelated data. The language does not require the program to check for this type of error, so it counts as undefined behavior.
Another well-known example is integer division by zero. On many platforms, the hardware traps and the program is terminated with what is often reported as a “floating point exception” (despite the name, it is raised for integer division). The language does not guarantee this outcome, however, so the operation still falls into this category. Floating-point division by zero is a different case: on implementations that follow IEEE 754, it produces infinity or NaN. When the divisor is a literal zero within a single line, compilers can often detect the problem and issue a warning before the program is ever run.
These two cases are often caught by the compiler or the operating system, but other operations cause undefined behavior without necessarily being caught by either. For example, suppose you assign a 32-bit signed integer its maximum value, 2^31-1 (2147483647), and then add 1 to it. On many machines, the value appears to wrap around to -2147483648, but this is not guaranteed: signed integer overflow is undefined behavior in C and C++, and the compiler is allowed to assume it never happens. Doing further arithmetic on the result can produce incorrect values and errors that propagate through the program’s execution.
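The 5-item array case above can be guarded with an explicit bounds check. The following is a minimal sketch; the function name `get_checked` and its parameters are assumptions made for illustration, not part of any standard library.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copies items[index] into *out and returns true,
   or returns false without touching memory when the access is invalid. */
bool get_checked(const int *items, size_t count, size_t index, int *out)
{
    /* Valid indices run from 0 to count - 1, so index == count
       is already out of bounds. */
    if (items == NULL || out == NULL || index >= count) {
        return false;
    }
    *out = items[index];
    return true;
}
```

With a 5-item array, index 4 succeeds and index 5 is rejected instead of reading past the end.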
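When the divisor is only known at run time, the program has to check it itself. The sketch below shows one way to do that for `int`; the name `divide_checked` is hypothetical. It also rejects `INT_MIN / -1`, a second case where integer division is undefined because the result does not fit in an `int`.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores dividend / divisor in *quotient and
   returns true, or returns false if the division would be undefined. */
bool divide_checked(int dividend, int divisor, int *quotient)
{
    if (divisor == 0) {
        return false; /* division by zero */
    }
    if (dividend == INT_MIN && divisor == -1) {
        return false; /* result would exceed INT_MAX */
    }
    *quotient = dividend / divisor;
    return true;
}
```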
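One way to avoid the overflow described above is to test whether the sum fits before performing the addition. This is a sketch under the assumption that both operands are plain `int` values; `add_checked` is a name invented for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true,
   or returns false if the addition would overflow an int. */
bool add_checked(int a, int b, int *sum)
{
    /* The comparisons are arranged so that they cannot overflow. */
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *sum = a + b;
    return true;
}
```

Calling it with `INT_MAX` and `1` returns false, so the caller can handle the error instead of continuing with an unpredictable value.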
How To Detect and Fix Undefined Behavior?
If you suspect that your program is behaving unusually due to a bug, adding print statements is a simple way to verify the values of variables, which may have changed in ways you did not intend. If you have pinpointed where the bug happens but are unable to fix it in standard ways, even though you are confident that your code is correct, undefined behavior may have occurred. Compilers such as GCC and Clang also provide sanitizers (for example, the -fsanitize=undefined and -fsanitize=address options) that report many kinds of undefined behavior at run time.
Fixing these errors can be confusing, so it’s important to learn which operations the language specification lists as undefined in order to avoid and correct these mistakes. Sometimes the cause is the wrong data type for the operation; in other cases, it is a race condition or the misuse of pointers. Either way, it comes down to checking the most basic assumptions about the problematic piece of code: if an assumption turns out to be unverified, that is a likely source of the problem.
After that, remove the undefined operation rather than relying on the behavior you happened to observe. You can switch to a more suitable data type, for example, or add checks that confirm an operation is valid before the program performs it.
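Choosing a more suitable data type can look like the sketch below, where the function names are assumptions made for illustration. Widening 32-bit operands to 64 bits leaves room for any sum of two such values, and unsigned arithmetic is defined to wrap around, so a counter that is supposed to wrap can use an unsigned type.

```c
#include <stdint.h>

/* Hypothetical helper: adds two 32-bit values in 64-bit arithmetic,
   so the result always fits and no overflow can occur. */
int64_t add_wide(int32_t a, int32_t b)
{
    return (int64_t)a + (int64_t)b;
}

/* Hypothetical helper: advances a counter that is meant to wrap.
   Unsigned arithmetic is reduced modulo 2^32 here, so stepping
   past UINT32_MAX yields 0 by definition. */
uint32_t next_counter(uint32_t counter)
{
    return counter + 1u;
}
```

Which option fits depends on what the value represents: widen when the true sum matters, and use an unsigned type only when wrapping is the intended result.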
In this article, we explained what undefined behavior is in compiled languages, some of the many cases that can cause it, and how to detect, fix, and prevent it. If you want more information about related topics, please visit the webpages in the references below. Also, if you think we missed any crucial points, please leave them in the comments below to help us improve the article.
References
- GeeksforGeeks. (n.d.). Undefined Behavior in C and C++. Retrieved September 22, 2022, from https://www.geeksforgeeks.org/undefined-behavior-c-cpp/
- Raph Levien. (2018, August 17). With Undefined Behavior, Anything is Possible. Retrieved September 22, 2022, from https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html
- John Regehr. (n.d.). Undefined Behavior != Unsafe Programming. Retrieved September 22, 2022, from https://blog.regehr.org/archives/1467
- cppreference.com. (2022, June 3). Undefined behavior. Retrieved September 22, 2022, from https://en.cppreference.com/w/cpp/language/ub