eddic 1.2.2 - Performances, improved optimizations and additions to standard library

These last weeks, I had more work than expected with my Master thesis so it took me longer to finish this new version of eddic. Moreover, I included more stuff than I though in this version. Anyway, I'm happy to announce the version 1.2.2 of eddic.

It is a minor version regarding the language itself. On the other, there are a lot of changes in the compiler itself.

For the language:

  1. Structures are now correclty copy constructed when passed by value
  2. When the same header is included several times accross the program, it is not parsed again
  3. The vector structure has now functions to insert and remove elements in arbitrary positions
  4. The functions to print bools, floats and integers are now written in EDDI directly. Only the functions to print chars and raw string are now written in assembly

I worked on improving the performances by improving the constant propagation pass that runs less times now and by tuning a bit the data-flow framework, avoiding virtual calls.

Another improvement is that all the mtac::Statement types have been merged in mtac::Quadruple, this removes one level of indirection and simplifies several passes. Moreover, there are now directly stored inside a vector and not allocated via shared_ptr. This removes another level of indirection.

Put together, these two optimizations improved the performances of the compiler by about 15%. On the other hand, now that printF and printI are written in EDDI, it takes much longer to compile. I will work on that for the next version too. One way to improve the performances will be to tune the ordering of passes and also to tune the passes themselves so that they do more work at once. I will also try to merge constant propagation and offset constant propagation together. They perform very similar work.

There are also several improvements in the optimization engine:

  • The loop analysis has been fixed to handle loops bigger than one basic block. There was a problem in my implement of Lengauer and Tarjan making that dominators were not computed.
  • The optimization engine now create a call graph of the program. This call graph is used to remove unused functions that are called but not reachable from the main function.
  • A new analysis pass has been added: pure_analysis. This pass test if a function is pure (no write to pointers or global variables) and thus avoid creating a basic block for it
  • The Loop Invariant Code Motion algorithm has been improved to handle more invariants
  • The Common Subexpression Elimination algorithm has been improved to handle more expression
  • The Induction Variables analysis has been reviewed and several bugs have been corrected. It is now a bit complicated.

A big bug has been fixed in the handling of the MEMSET LTAC instruction. This will be completely reviewed in the next version (See Future Work).

Some analysis starts to become quite complicated. I'm thinking of using SSA in MTAC in order to simplify some of the passes and to easily compute ud-chains. Another thing that I'm thinking is to add a powerful and complete alias analysis that would really improve the efficiency of some passes (offset constant propagation for instance) by making them less conservative.

I also have removed some memory leaks (will try to remove all of them in the next version). I added a new optimization level: O3. This level enables loop unrolling and complete loop peeling.

Future Work

The next version of the EDDI compiler (eddic) will be the version 1.2.3. The inliner will be improved to work directly in the call graph in postorder. That should produce better code. I will also try to improve the inlining heuristics. A first basic version of loop unswitching will be added as well. I will add a small local constant propagation pass for globals. I will also continue to work on the performances of the passes to avoid repeating them too much. MEMSET will be completely reviewed. That should produce smaller and faster code. Until now, the sizes of the types bool and chars were the same as int. They will be optimized to take only 1 byte.

I will also continue the improvements of the data structures by merging all ltac::Statement into ltac::Instruction and storing them directly.

And there will probably be some bug fixing as well.

Download

You can find the EDDI Compiler sources on the Github repository: https://github.com/wichtounet/eddic

The version is available in the v1.2.2 tag available in the GitHub or directly in the master branch.

Related articles

  • EDDI Compiler 1.1.4 – Graph Coloring Register Allocation
  • eddic 1.2.3 - Better data-flow analysis
  • eddic 1.2.1 - string, concatenation and vector
  • eddic 1.2.0 - Single inheritance, copy construction
  • EDDI Compiler 1.0.2 – Better pointer support and Dead-Code Elimination
  • EDDI Compiler 1.0 - Structures and Global Optimizations
  • Comments

    Comments powered by Disqus