Peephole optimization

In compiler theory, peephole optimization is a kind of optimization performed over a very small set of instructions in a segment of generated code. The set is called a "peephole" or a "window". It works by recognizing sets of instructions that can be replaced by shorter or faster sets of instructions.

Replacement rules

Common techniques applied in peephole optimization:[1]

  • Null sequences – Delete useless operations.
  • Combine operations – Replace several operations with one equivalent.
  • Algebraic laws – Use algebraic laws to simplify or reorder instructions.
  • Special case instructions – Use instructions designed for special operand cases.
  • Address mode operations – Use address modes to simplify code.

There can be other types of peephole optimizations.

Examples

Replacing slow instructions with faster ones

The following Java bytecode

...
aload 1
aload 1
mul
...

can be replaced by

...
aload 1
dup
mul
...

This kind of optimization, like most peephole optimizations, makes certain assumptions about the efficiency of instructions. For instance, in this case, it is assumed that the dup operation (which duplicates and pushes the top of the stack) is more efficient than the aload X operation (which loads a local variable identified as X and pushes it on the stack).

Removing redundant code

Another example is to eliminate redundant load stores.

 a = b + c;
 d = a + e;

is straightforwardly implemented as

MOV b, R0  # Copy b to the register
ADD c, R0  # Add  c to the register, the register is now b+c
MOV R0, a  # Copy the register to a
MOV a, R0  # Copy a to the register
ADD e, R0  # Add  e to the register, the register is now a+e [(b+c)+e]
MOV R0, d  # Copy the register to d

but can be optimised to

MOV b, R0 # Copy b to the register
ADD c, R0 # Add c to the register, which is now b+c (a)
MOV R0, a # Copy the register to a
ADD e, R0 # Add e to the register, which is now b+c+e [(a)+e]
MOV R0, d # Copy the register to d

Removing redundant stack instructions

If the compiler saves registers on the stack before calling a subroutine and restores them when returning, consecutive calls to subroutines may have redundant stack instructions.

Suppose the compiler generates the following Z80 instructions for each procedure call:

 PUSH AF
 PUSH BC
 PUSH DE
 PUSH HL
 CALL _ADDR
 POP HL
 POP DE
 POP BC
 POP AF

If there were two consecutive subroutine calls, they would look like this:

 PUSH AF
 PUSH BC
 PUSH DE
 PUSH HL
 CALL _ADDR1
 POP HL
 POP DE
 POP BC
 POP AF
 PUSH AF
 PUSH BC
 PUSH DE
 PUSH HL
 CALL _ADDR2
 POP HL
 POP DE
 POP BC
 POP AF

The sequence POP regs followed by PUSH for the same registers is generally redundant. In cases where it is redundant, a peephole optimization would remove these instructions. In the example, this would cause another redundant POP/PUSH pair to appear in the peephole, and these would be removed in turn. Removing all of the redundant code in the example above would eventually leave the following code:

 PUSH AF
 PUSH BC
 PUSH DE
 PUSH HL
 CALL _ADDR1
 CALL _ADDR2
 POP HL
 POP DE
 POP BC
 POP AF

Implementation

Modern architectures typically allow for many hundreds of different kinds of peephole optimizations, and it is therefore often appropriate for compiler programmers to implement them using a pattern matching algorithm.[2]

See also

References

  1. Fischer, Charles N; Cytron, Ron K; LeBlanc, Richard J, Jr. (2010). Crafting a Compiler (PDF). Addison-Wesley. ISBN 978-0-13-606705-4. Retrieved 2 July 2018.
  2. Aho, Alfred V; Lam, Monica S; Sethi, Ravi; Ullman, Jeffrey D (2007). ""8.9.2 Code Generation by Tiling an Input Tree"". Compilers - Principles, Techniques, & Tools (second edition) (PDF). Pearson Education. p. 540. Archived (PDF) from the original on 10 June 2018. Retrieved 2 July 2018.

The dictionary definition of peephole optimization at Wiktionary

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.