Unreachable code

In computer programming, unreachable code is part of the source code of a program which can never be executed because there exists no control flow path to the code from the rest of the program.[1]

Unreachable code is sometimes also called dead code, although dead code may also refer to code that is executed but has no effect on the output of a program.

Unreachable code is generally considered undesirable for a number of reasons, including:

  • It occupies memory unnecessarily.
  • It can cause unnecessary caching of instructions in the CPU instruction cache, which also decreases data locality.
  • It wastes maintenance effort: time may be spent maintaining and documenting code that is in fact unreachable and hence never executed.

However, unreachable code can have some legitimate uses, such as providing a library of functions that a developer can call or jump to manually from a debugger while the program is halted at a breakpoint. This is particularly useful for examining and pretty-printing the internal state of the program. It may even be reasonable to keep such code in the final version shipped to clients, in case a developer needs to attach a debugger to a client's running instance to investigate a bug that occurs only in production. This differs from debugging calls that are inserted during development and then accidentally left in the production version.
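As a minimal sketch of such a debugger-only helper, the following C fragment defines a function that nothing in the program calls. The names conn_table, g_conns and dump_conns are hypothetical and chosen only for illustration; the assumption is a debugger that can call functions in the halted process, such as GDB's call command.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical application state; all names here are illustrative. */
struct conn_table {
    int used;       /* number of entries in use */
    int capacity;   /* total number of slots */
    int fds[8];     /* file descriptors of the entries in use */
};

struct conn_table g_conns = { 2, 8, { 4, 7 } };

/*
 * Never called by the program itself, so no control flow path reaches it.
 * It is meant to be invoked by hand while the program is stopped, for
 * example in GDB:  call dump_conns(&g_conns)
 * Returns the number of entries printed.
 */
int dump_conns(const struct conn_table *t)
{
    int i;

    assert(t != NULL);
    fprintf(stderr, "conn_table: %d/%d in use\n", t->used, t->capacity);
    for (i = 0; i < t->used; i++)
        fprintf(stderr, "  [%d] fd=%d\n", i, t->fds[i]);
    return t->used;
}
```

Because the function is unreferenced, some build configurations may strip it from the final binary, so a helper of this kind may need to be kept explicitly.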

Causes

The existence of unreachable code can be due to various factors, such as:

  • programming errors in complex conditional branches;
  • a consequence of the internal transformations performed by an optimizing compiler;
  • incomplete testing of a new or modified program that failed to test the bypassed unreachable code;
  • obsolete code that a programmer forgot to delete;
  • unused code that a programmer decided not to delete because it was intermingled with functional code;
  • conditionally useful code that will never be reached, because current input data will never cause that code to be executed;
  • complex obsolete code that was intentionally retained but made unreachable so that it could be revived later if needed;
  • debugging constructs and vestigial development code which have yet to be removed from a program.

In the latter five cases, code which is currently unreachable is often there as part of a legacy, i.e. code that was once useful but is no longer used or required. However, the unreachable code may also be part of a complex component (library, module or routine) in which it remains useful elsewhere. It may serve different input data that the current application never generates, or conditions that are not met in the current runtime environment but can occur in other runtime environments for which the component was developed.

An example of such conditionally unreachable code may be the implementation of a printf() function in a compiler's runtime library, which contains complex code to process all possible string arguments, of which only a small subset is actually used in the program. Without recompiling the runtime library, compilers will typically not be able to remove the unused code sections at compile time.
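The situation can be sketched with a much smaller stand-in for printf(). In the fragment below, mini_format and app_print_count are hypothetical names: mini_format plays the role of a library routine that supports several conversions, while the application only ever requests one of them, so the remaining branches are unreachable for this particular program.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical "library" routine: supports several conversions. */
int mini_format(char *out, size_t n, char spec, long value)
{
    assert(out != NULL);
    switch (spec) {
    case 'd': return snprintf(out, n, "%ld", value);
    case 'x': return snprintf(out, n, "%lx", (unsigned long)value);
    case 'o': return snprintf(out, n, "%lo", (unsigned long)value);
    case 'c': return snprintf(out, n, "%c", (int)value);
    default:  return -1;  /* unsupported conversion */
    }
}

/*
 * Hypothetical application code: it only ever asks for 'd', so the 'x',
 * 'o', 'c' and default branches are never executed in this program.
 */
int app_print_count(char *out, size_t n, long count)
{
    return mini_format(out, n, 'd', count);
}
```

If mini_format were shipped as a precompiled library, a compiler building only the application would have no opportunity to prune the unused branches.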

Examples

Consider the following fragment of C code:

int foo (int X, int Y)
{
    return X + Y;
    int Z = X * Y;
}

The definition int Z = X * Y; is never reached as the function returns before the definition is reached. Therefore, the definition of Z can be discarded.

goto fail bug

An example of real-life code that contained a major security flaw due to unreachable code is Apple's SSL/TLS bug from February 2014, formally known as CVE-2014-1266 and informally known as the "goto fail bug".[2][3] The relevant code fragment[4] is listed below:

static OSStatus
SSLVerifySignedServerKeyExchange(SSLContext *ctx, bool isRsa, SSLBuffer signedParams,
                                 uint8_t *signature, UInt16 signatureLen)
{
    OSStatus        err;
    ...
 
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;
    ...
 
fail:
    SSLFreeBuffer(&signedHashes);
    SSLFreeBuffer(&hashCtx);
    return err;
}

Here, there are two successive goto fail statements. Under the syntax of the C language, only the first belongs to the preceding if statement; the second is unconditional, despite its indentation, and hence always skips the final check. As a consequence, err holds the success value left by the preceding SHA1 update operation, and the signature verification never reports a failure, because the final check is omitted.[2]

Here, the unreachable code is the call to the final function, which should have been reached. Several good coding practices could have prevented this fault, such as code reviews, the proper use of indentation or curly braces, and test coverage analysis.[3] Compiling with Clang and the option -Weverything enables unreachable code analysis, which would raise a warning for this code.[3]
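The effect of curly braces can be illustrated with a simplified sketch. This is not Apple's code: step_update, step_final and verify are hypothetical stand-ins that merely return a simulated error code. With braces around each branch, an accidentally duplicated goto stays inside the conditional block and no longer bypasses the later check.

```c
#include <assert.h>

/* Hypothetical stand-ins for the hashing steps; 0 means success. */
static int step_update(int simulated_err) { return simulated_err; }
static int step_final(int simulated_err)  { return simulated_err; }

/* Returns 0 only if both steps succeed (simplified, illustrative). */
int verify(int update_err, int final_err)
{
    int err;

    if ((err = step_update(update_err)) != 0) {
        goto fail;
        goto fail;  /* duplicated line: itself unreachable, but harmless */
    }
    if ((err = step_final(final_err)) != 0) {
        goto fail;
    }

fail:
    return err;
}

/* Usage: the final step is still checked despite the duplicated goto. */
void verify_demo(void)
{
    assert(verify(0, 0) == 0);
    assert(verify(3, 0) == 3);
    assert(verify(0, 5) == 5);
}
```

The duplicated statement is still unreachable code, so an unreachable-code warning may still flag it, but it no longer changes the outcome of the verification.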

Analysis

Detecting unreachable code is a form of static analysis and involves performing control flow analysis to find any code that will never be executed regardless of the values of variables and other conditions at run time. In some languages (e.g. Java[5]) some forms of unreachable code are explicitly disallowed. The optimization that removes unreachable code is known as dead code elimination.
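The simplest form of this analysis can be sketched as a graph traversal: every basic block that cannot be reached from the entry block of the control flow graph is unreachable. The representation below (struct cfg, at most 16 blocks, at most two successors per block) and the function name mark_reachable are assumptions made for illustration, not the data structures of any particular compiler.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16
#define MAX_SUCCS  2

/* Hypothetical CFG: each block lists up to two successors; -1 means none. */
struct cfg {
    int nblocks;
    int succ[MAX_BLOCKS][MAX_SUCCS];
};

/*
 * Marks every block reachable from `entry` using an explicit-stack
 * depth-first traversal. Returns the number of unreachable blocks.
 */
int mark_reachable(const struct cfg *g, int entry, bool reachable[MAX_BLOCKS])
{
    int stack[MAX_BLOCKS];
    int top = 0, unreachable = 0, i, b, s;

    assert(g->nblocks > 0 && g->nblocks <= MAX_BLOCKS);
    assert(entry >= 0 && entry < g->nblocks);

    for (i = 0; i < g->nblocks; i++)
        reachable[i] = false;

    reachable[entry] = true;
    stack[top++] = entry;

    while (top > 0) {
        b = stack[--top];
        for (i = 0; i < MAX_SUCCS; i++) {
            s = g->succ[b][i];
            /* Each block is marked when pushed, so it is pushed at most once. */
            if (s >= 0 && s < g->nblocks && !reachable[s]) {
                reachable[s] = true;
                stack[top++] = s;
            }
        }
    }

    for (i = 0; i < g->nblocks; i++)
        if (!reachable[i])
            unreachable++;
    return unreachable;
}
```

Modelling the earlier function foo as three blocks (the return statement, the definition of Z, and the function exit), with the entry at the return statement, the traversal leaves only the block holding the definition of Z unmarked. A traversal like this considers only the shape of the graph; it does not reason about the values of conditions.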

Code may become unreachable as a consequence of the internal transformations performed by an optimizing compiler (e.g., common subexpression elimination).

In practice, the sophistication of the analysis performed has a significant impact on the amount of unreachable code that is detected. For example, constant folding and simple flow analysis show that the inside of the if-statement in the following code is unreachable:

int N = 2 + 1;

if (N == 4)
{
   /* unreachable */
}

However, a great deal more sophistication is needed to work out that the corresponding block is unreachable in the following code:

double X = sqrt(2);

if (X > 5)
{
    /* unreachable */
}

The unreachable code elimination technique is in the same class of optimizations as dead code elimination and redundant code elimination.

Unreachability vs. profiling

In some cases, a practical approach may be a combination of simple unreachability criteria and use of a profiler to handle the more complex cases. Profiling in general cannot prove anything about the unreachability of a piece of code, but may be a good heuristic for finding potentially unreachable code. Once a suspect piece of code is found, other methods, such as a more powerful code analysis tool, or even analysis by hand, could be used to decide whether the code is truly unreachable.
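The heuristic nature of profiling can be sketched with hand-written hit counters, a stand-in for what a real profiler or coverage tool would record automatically. All names below (hits, record_hit, classify, count_unhit_sites) are hypothetical. A site with zero hits is only a candidate for being unreachable: it may simply not have been exercised by the inputs used in that run.

```c
#include <assert.h>

enum { SITE_NEGATIVE, SITE_SMALL, SITE_LARGE, SITE_COUNT };

/* One counter per instrumented site, zero-initialized. */
unsigned long hits[SITE_COUNT];

static void record_hit(int site)
{
    assert(site >= 0 && site < SITE_COUNT);
    hits[site]++;
}

/* Hypothetical function under observation, with one counter per branch. */
int classify(int n)
{
    if (n < 0) {
        record_hit(SITE_NEGATIVE);
        return -1;
    }
    if (n < 100) {
        record_hit(SITE_SMALL);
        return 0;
    }
    record_hit(SITE_LARGE);
    return 1;
}

/* Number of sites never hit so far: candidates only, not a proof. */
int count_unhit_sites(void)
{
    int i, unhit = 0;

    for (i = 0; i < SITE_COUNT; i++)
        if (hits[i] == 0)
            unhit++;
    return unhit;
}
```

After calls with only non-negative arguments, the negative branch is reported as never hit even though it is plainly reachable, which is why a zero count calls for follow-up analysis before any code is deleted.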

References

  1. Debray, Saumya K.; Evans, William; Muth, Robert; De Sutter, Bjorn (1 March 2000). "Compiler techniques for code compaction". ACM Transactions on Programming Languages and Systems. 22 (2): 378–415. doi:10.1145/349214.349233.
  2. Adam Langley (2014). "Apple's SSL/TLS bug".
  3. Arie van Deursen (2014). "Learning from Apple's #gotofail Security Bug".
  4. "sslKeyExchange.c - Source code for support for key exchange and server key exchange".
  5. "Java Language Specification".
  • Appel, A. W. (1998). Modern Compiler Implementation in Java. Cambridge University Press.
  • Muchnick, S. S. (1997). Advanced Compiler Design and Implementation. Morgan Kaufmann.
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.