< ROSE Compiler Framework

The ROSE Inliner inlines functions at function callsites.

Background

An inline function is one for which the compiler copies the code from the function definition directly into the code of the calling function rather than creating a separate set of instructions in memory. Instead of transferring control to and from the function code segment, a modified copy of the function body may be substituted directly for the function call. In this way, the performance overhead of a function call is avoided. The inline specifier is only a suggestion to the compiler that an inline expansion can be performed; the compiler is free to ignore the suggestion.

User Instructions

You must enable EDG 5.0 to inline C++11 code

  • --enable-edg_version=5.0


install the tool

configure and build ROSE as instructed at https://github.com/rose-compiler/rose/wiki


    inlineEverything -c [options] input.c

    The optional options include:

    • -skip-postprocessing: Skip postprocessing which cleanups code
    • -process-headers: Process calls within header files
    • -verbose: Printout debugging information

    Source code

    API and implementation:

    // main API
    bool doInline(SgFunctionCallExp*, bool)
    

    Algorithm

    216 // Main inliner code.  Accepts a function call as a parameter, and inlines
    217 // only that single function call.  Returns true if it succeeded, and false
    218 // otherwise.  The function call must be to a named function, static member
    219 // function, or non-virtual non-static member function, and the function
    220 // must be known (not through a function pointer or member function
    221 // pointer).  Also, the body of the function must already be visible.
    222 // Recursive procedures are handled properly (when allowRecursion is set), by
    223 // inlining one copy of the procedure into itself.  Any other restrictions on
    224 // what can be inlined are bugs in the inliner code.
    225 bool
    226 doInline(SgFunctionCallExp* funcall, bool allowRecursion)
    
    

    Major steps

    • eligibility check: skip things which cannot be inlined
    • if a function call is used as an expression operand: e.g. a = func1() + func2();
      • generate a temp variable to obtain the returned value: e.g. temp = func1();
      • replace the function call expression with the temp variable. e.g. a = temp + temp;
      • a slight optimization: if the function call is the only expression operand: e.g. a= func1(). No temp variable is needed (a can be directly used without another temp variable serving as intermediate variable.)
    • obtain the list of actual argument
    • make a copy of the body of the function to be inlined
    • rename labels in an inlined function definition. goto statements to them will be updated.
    • in the function body
      • create local variables , one per formal argument, initialize each with the actual argument
      • build a paramMap: mapping formal arguments (SgInitializedName) to new local variables (SgVariableSymbol)
      • this pointer is handled similarly: create a local variable , initialized with the caller's this pointer
      • replace variable references in the body with actual arguments // ReplaceParameterUseVisitor(paramMap).traverse(funbody_copy, postorder);
      • insert a label to indicate the end of the inlined function body // rose_inline_end__

    Limitations of this algorithm is not very clean:

    • It generates new local variables and a label.

    Eligibility check

    What can be inlined

    • a named function,
    • static member function, or
    • non-virtual non-static member function, // skip virtual functions
    • the function must be known (not through a function pointer or member function pointer). // empty function reference expression
    • the body of the function must already be visible in the current AST. // skip functions with empty body

    postprocessing

    inlineEverything.C has a clean up step to make the outlined code better

    • cleanupInlinedCode(sageProject);
      • remove unused labels: SageInterface::removeUnusedLabels(top);
      • remove jumps to next statement: SageInterface::removeJumpsToNextStatement(top);
      • simpleCopyAndConstantPropagation() // In code with declarations such as "int foo = bar", where foo and bar are not modified, replace "foo" with "bar" and remove the declaration
      • remove null statements: RemoveNullStatementsVisitor().traverse(top, postorder);
      • move declaration to first use: MoveDeclarationsToFirstUseVisitor().traverse(top, postorder); e.g. int x__2 =7; w= x_2 +3; ==> w=7+3;
      • doSubexpressionExpansionSmart () // Replaces all uses of a variable by its initialing expression. Requires that initname has an assign initializer Replaces all uses of initname in initname's scope by copy of its initializer expression. Then removes initname.
    • changeAllMembersToPublic(sageProject);

    This step does lots of things and can easily trigger some bugs.

    • Even wrose, the cleanup step operates on the entire AST, including C++ headers.
     // Post-inline AST normalizations
    
     // DQ (6/12/2015): These functions first renames all variable (a bit heavy handed for my tastes)
     // and then (second) removes the blocks that are otherwise added to support the inlining.  The removal
     // of the blocks is the motivation for renaming the variables, but the variable renaming is
     // done evarywhere instead of just where the functions are inlined.  I think the addition of
     // the blocks is a better solution than the overly agressive renaming of variables in the whole
     // program.  So the best solution is to comment out both of these functions.  All test codes
     // pass (including the token-based unparsing tests).
     // renameVariables(sageProject);
     // flattenBlocks(sageProject);
    
        cleanupInlinedCode(sageProject);
    
    // In code with declarations such as "int foo = bar", where foo and bar are
    // not modified, replace "foo" with "bar" and remove the declaration
    
    void simpleCopyAndConstantPropagation(SgNode* top) {
      FindReferenceVariablesVisitor().traverse(top, preorder);
      FindCopiesVisitor().traverse(top, preorder);
      FindUsedDeclarationsVisitor vis;
      vis.traverse(top, preorder);
      RemoveUnusedDeclarationsVisitor(vis.used_decls, set<SgFunctionDeclaration*>()).traverse(top, postorder);
    }
    

    Testing

    Test directory with an example translator and test input files

    By looking into Makefile.am, the example translator's source code will generate an executable named "inlineEverything" in your buildtree.

    translator: inlineEverything

    This is the tool you can try to inline your sample code.

    • inlineEverything

    The same Makefile.am's make check rules contain sample command lines to use the tool.

    To test one individual input file (such as template_functions.C), type

    • make inlineEverything_template_functions.C.passed // TODO: udpate to new ways to trigger the tests

    Command line options

    inlineEverything --help

    ---------------------Tool-Specific Help-----------------------------------
    This is a program transformation tool to inline function calls in your C/C++ or Fortran code.
    Usage: inlineEverything -c [options] input.c
    
    
    The optional options include: 
     -skip-postprocessing: Skip postprocessing which cleanups code
     -process-headers:     Process calls within header files
     -verbose:            Printout debugging information
    

    postprocessing

    --------------input----------------
    bash-4.2$ cat specimen25_1.C
     template<typename T>
     void swap(T& x, T& y)
     {
       T tmp = x;
       x = y;
       y = tmp;
     }
     
    int foo (int a, int b)
    {
       swap(a,b);
    }
     
    int main()
    {
    }
     
    ----- command line  -------------
    bash-4.2$ inlineEverything -c specimen26_1.C
     
    -----------output: with postprocessing (cleanup) --------------
    bash-4.2$ cat rose_specimen25_1.C 
    template < typename T >
     void swap ( T & x, T & y )
     {
       T tmp = x;
       x = y;
       y = tmp;
     }
     
    int foo(int a,int b)
    {
    {
        int tmp = a;
        a = b;
        b = tmp;
      }
    }
     
    int main()
    {
    }
    
    
    
    // output without postprocessing: cleanup
    template < typename T >
     void swap ( T & x, T & y )
     {
       T tmp = x;
       x = y;
       y = tmp;
     }
    
    int foo(int a,int b)
    {
    {
        int &x__2 = a;
        int &y__3 = b;
        int tmp = x__2;
        x__2 = y__3;
        y__3 = tmp;
        rose_inline_end__4:
        ;
      }
    }
    
    int main()
    {
    }
    
    
    

    Correctness checking

    make check only check if the number of inlined functions is the same as expected.

    # Note: must use the name convention of specimenXX_N.C , in which N is the number of function calls inlined.   
    # The specimens are named so that the number between the "_" and next "." is the number of function calls that
    # we expect this specimen to inline.
    inlineEverything_specimens =                    \
            specimen01_1.C   ..
    
    
    
    
    inlineEverything_test_targets = $(addprefix inlineEverything_, $(addsuffix .passed, $(inlineEverything_specimens)))
    TEST_TARGETS += $(inlineEverything_test_targets)
    $(inlineEverything_test_targets): inlineEverything_%.passed: % inlineEverything inlineEverything.conf
            @$(RTH_RUN)                                                                                             \
                    TITLE="inlineEverything $< [$@]"                                                                \
                    SPECIMEN="$(abspath $<)"                                                                        \
                    NINLINE="$$(echo $(notdir $<) |sed --regexp-extended 's/specimen[0-9]+_([0-9]+).*/\1/')"        \
                    TRANSLATOR="$$(pwd)/inlineEverything"                                                           \
                    $(srcdir)/inlineEverything.conf $@
    
    
     cat inlineEverything.conf 
    # Test configuration file (see "scripts/rth_run.pl --help" for details)
    # Tests the inliner
    
    # Run the tests in subdirectories for ease of cleanup.
    subdir = yes
    
    # Run the test and then make sure the output contains a certain string
    cmd = ${VALGRIND} ${TRANSLATOR} -rose:verbose 0 ${SPECIMEN} -o a.out |tee ${TEMP_FILE_0}
    cmd = grep "Test inlined ${NINLINE} function" ${TEMP_FILE_0}
    cmd = cat -n rose_*
    cmd = ./a.out
    
    # Extra stuff that might be useful to specify in the makefile
    title = ${TITLE}
    disabled = ${DISABLED}
    timeout = ${TIMEOUT}
    
    

    examples

    We use a set of increasingly more complex examples to explain the inlining algorithm used.

    More example input and output files are available at:

    naked call: no parameters in nor return output

    extern int x;
    
    void incrementX()
       {
          x++;
       }
    
    int main()
       {
         incrementX();
         return x;
       }
    
    
    //----------output, without postprocessing --------
    
    extern int x;
    
    void incrementX()
    {
      x++;
    }
    
    int main()
    {
       // the function body is copied here
      {              
        x++;
        rose_inline_end__2:   // a label for the end of a function is generated. 
        ;
      }
      
      return x;
    }
    
    //-----------output, with postprocessing for clean up
    // unused label and empty statement are removed
    
    extern int x;
    
    void incrementX()
    {
      x++;
    }
    
    int main()
    {
       // the function body is copied here
      {              
        x++;
      }
      
      return x;
    }
    
    
    

    function with a return

    // a function with a return
    extern int x;
    
    
    int incrementX()
    {
      x++;
      return x; 
    }
    
    int main()
    {
      incrementX();
      return x;
    }
    
    //---------- output without postprocessing
    
    // a function with a return
    extern int x;
    
    int incrementX()
    {
      x++;
      return x;
    }
    
    int main()
    {
      {
        x++;
        {                    // a return statement is translated into a block, go to the exit point in the end
          x;
          goto rose_inline_end__2;
        }
        rose_inline_end__2:  // a label for the end of the function: the exit point
        ;
      }
      return x;
    }
    
    //-------- with postprocessing, the code look the same 
    
    // a function with a return
    extern int x;
    
    int incrementX()
    {
      x++;
      return x;
    }
    
    int main()
    {
      {
        x++;
        {
          goto rose_inline_end__2;
        }
    rose_inline_end__2:
        ;
      }
      return x;
    }
    

    function calls as two expressions

    // input code -----------------------
    int foo(int i) {
      return 5+i;
    }
    
    int main(int, char**) {
      int w;
      w = foo(1)+ foo(2);
      return w;
    }
    
    //--------------after inlining-----------------
    // You can see that a temparory variable is used to capture the returned value of a function call.
    // Then the temp variable is used to replace the original function call expression
    int foo(int i)
    {
      return 5 + i;
    }
    
    int main(int ,char **)
    {
      int w;
      int rose_temp__4;
      {
        int i__2 = 1;
        {
          rose_temp__4 = 5 + i__2;
          goto rose_inline_end__3;
        }
        rose_inline_end__3:
        ;
      }
      int rose_temp__8;
      {
        int i__6 = 2;
        {
          rose_temp__8 = 5 + i__6;
          goto rose_inline_end__7;
        }
        rose_inline_end__7:
        ;
      }
      w = rose_temp__4 + rose_temp__8;
      return w;
    }
    
    //----- postprocessing does not simplify the code any further
    int foo(int i)
    {
      return 5 + i;
    }
    
    int main(int ,char **)
    {
      int rose_temp__4;
      {
        {
          rose_temp__4 = 5 + 1;
          goto rose_inline_end__3;
        }
        rose_inline_end__3:
          ;
      }
      int rose_temp__8;
      {
        {
          rose_temp__8 = 5 + 2;
          goto rose_inline_end__7;
        }
    rose_inline_end__7:
        ;
      }
      int w = rose_temp__4 + rose_temp__8;
      return w;
    }
    
    
    

    function call as a single expression

    An optimized transformation:

    • not blindly generate a temp variable to capture the value of a function call.
    • instead directly reuse the original declaration of the lhs variable inside the function body
    int foo(int i) {
      return 5+i;
    }
    
    int main(int, char**) {
      int w;
      w = foo(1);
      return w;
    }
    
    
    //-------------after inlining ----------
    
    int foo(int i)
    {
      return 5 + i;
    }
    
    int main(int ,char **)
    {
      int w;
      {
        int i__2 = 1;
        {
          w = 5 + i__2;
          goto rose_inline_end__3;
        }
    rose_inline_end__3:
        ;
      }
      return w;
    }
    
    
    //postprocessing does not simplify the code further.
    int foo(int i)
    {
      return 5 + i;
    }
    
    int main(int ,char **)
    {
      int w;
      {
        {
          w = 5 + 1;
          goto rose_inline_end__3;
        } 
    rose_inline_end__3:
        ;
      } 
      return w;
    } 
    
    
    

    3 operand operations

    the code is normalized.

    #include <stdlib.h>
    
    int foo() {
      exit (1);
      return 0;
    }
    
    int main(int, char**) {
      int w, x = 7;
      w = x == 8 ? foo() : 0;
      return w;
    }
    
    //----------- after inlining ---------------
    
    #include <stdlib.h>
    
    int foo()
    {
      exit(1);
      return 0;
    }
    
    int main(int ,char **)
    {
      int w;
      int x = 7;
      if (x == 8) {
        int rose_temp__4;
        {
          exit(1);
          {
            rose_temp__4 = 0;
            goto rose_inline_end__2;
          }
    rose_inline_end__2:
          ;
        }
        w = rose_temp__4;
      }
      else {
        w = 0;
      }
      return w;
    }
    
    

    data member access function

    #include <vector>
    typedef int    Index_t ;
    
    struct Domain
    {
      public:
        // non-reference type
        Index_t  numNode()            { return m_numNode ; }
    
        void AllocateNodeElemIndexes()
        {
          Index_t numNode = this->numNode() ;
        }
    
    #if 0  // the best inline result should look like the following
        void AllocateNodeElemIndexes_inlined()
        {
          Index_t numNode = m_numNode; // call site 1 inlined
        }
    #endif
    
      private:
        Index_t   m_numNode ;
    } domain;
    
    //----------------------------after inlining ----------------
    
    #include <vector>
    
    typedef int Index_t;
    
    struct Domain {
    
      // non-reference type
      inline Index_t numNode()
      {
        return (this) -> m_numNode;
      }
    
      inline void AllocateNodeElemIndexes()
      {
    
    //x. split declaration + initializer into two parts
    // a temporary variable to transfer value of initializer
    
        Index_t rose_temp__3;
    
    //x. a new code block to embed the function body
        {
          struct Domain *this__1 = this__1;
          {
            rose_temp__3 = this__1 -> m_numNode;
    
    //x. goto the label after function call
            goto rose_inline_end__2;
          }
    
    //x. label after the function call
    rose_inline_end__2:
          ;
        }
    
        Index_t numNode = rose_temp__3;
      }
    
      Index_t m_numNode;
    
    } domain;
    
    

    C++ template functions

    --------------input----------------
    bash-4.2$ cat specimen25_1.C
     template<typename T>
     void swap(T& x, T& y)
     {
       T tmp = x;
       x = y;
       y = tmp;
     }
     
    int foo (int a, int b)
    {
       swap(a,b);
    }
     
    int main()
    {
    }
     
    ----- command line  -------------
    bash-4.2$ inlineEverything -c -skip-postprocessing specimen26_1.C
     
    -----------output: with postprocessing (cleanup) --------------
    // output without postprocessing: cleanup
    template < typename T >
     void swap ( T & x, T & y )
     {
       T tmp = x;
       x = y;
       y = tmp;
     }
    
    int foo(int a,int b)
    {
    {
        int &x__2 = a;    // local variables for each formal arguments, initialized with actual arguments
        int &y__3 = b;
    
        int tmp = x__2;   // variable references are replace with the local variables
        x__2 = y__3;
        y__3 = tmp;
        rose_inline_end__4:   // a label to indicate the end of the outlined function body. 
        ;
      }
    }
    
    int main()
    {
    }
    
    

    multiple-level function calls

    //------------input 
    
    int foo(int x) {
      return x + 3;
    }
    
    int bar(int y) {
      return foo(y) + foo(2);
    }
    
    int main(int, char**) {
      int w;
      w = bar(1);
      return 0;
    }
    
    //--------------- output, no postprocessing ------------
    
    int
    foo (int x)
    {
      return x + 3;
    }
    
    int
    bar (int y)
    {
      int rose_temp__4;
      {
        int x__2 = y;
        {
          rose_temp__4 = x__2 + 3;
          goto rose_inline_end__3;
        }
      rose_inline_end__3:
        ;
      }
      int rose_temp__8;
      {
        int x__6 = 2;
        {
          rose_temp__8 = x__6 + 3;
          goto rose_inline_end__7;
        }
      rose_inline_end__7:
        ;
      }
      return rose_temp__4 + rose_temp__8;
    }
    
    int
    main (int, char **)
    {
      int w;
      {
        int y__10 = 1;
        int rose_temp__4;
        {
          int x__2 = y__10;
          {
    	rose_temp__4 = x__2 + 3;
    	goto rose_inline_end__3__1;
          }
        rose_inline_end__3__1:
          ;
        }
        int rose_temp__8;
        {
          int x__6 = 2;
          {
    	rose_temp__8 = x__6 + 3;
    	goto rose_inline_end__7__2;
          }
        rose_inline_end__7__2:
          ;
        }
        {
          w = rose_temp__4 + rose_temp__8;
          goto rose_inline_end__11;
        }
      rose_inline_end__11:
        ;
      }
      return 0;
    }
    

    Tutorial

    Official documentation about how to call the inlining API is:


    Troubleshooting


    TEST inlineEverything ../../../../../../sourcetree/tests/nonsmoke/functional/roseTests/astInliningTests/template_functions.C [inlineEverything_template_functions.C.passed]

    inlineEverything_template_functions.C [out]: Note: C++11 input files to ROSE are NOT supported using EDG 4.9 configuration with GNU compilers 4.9 and greater (configure ROSE using EDG 4.12)

    This article is issued from Wikibooks. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.