- FTL Performance
- FTL Optimization Strategy
- LLVM Patch Points
- FTL-Style LLVM IR
- MCJIT and the LLVM C API
- Linking WebKit with LLVM
- FTL Efficiency
- Optimization Improvements
- Extending Patch Points
- LLVM IR has now adopted features for supporting speculative, profile-driven optimization and avoiding the performance penalty associated with abstractions when they cannot be removed.
- The dynamic link time initialization overhead of the static initializers that LLVM defines is unacceptable at program launch time - especially if only part of the library, or none of it, is actually used.
- LLVM defines global variables that register exit-time destructors. When a multi-threaded parent application attempts to exit normally, those destructors can run while other threads are still using LLVM, causing a crash instead of a clean exit.
- As with static initializers, weak vtables introduce an unnecessary and unacceptable dynamic link time overhead.
- In general only a limited set of methods - the LLVM API - should be exported from the shared library.
- LLVM usurps process-level API calls like assert, raise, and abort.
- The resulting size of the LLVM shared library naively built from static libraries is larger than it needs to be. Build logic and conditional compilation should be added to ensure that only the passes and platform support required by the JIT client are ultimately linked into the shared library.
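One common way to enforce the "export only the LLVM API" requirement above is a linker export list. The following is a hedged sketch of a GNU ld version script (the file name is hypothetical; it relies on the convention that LLVM C API symbols begin with the `LLVM` prefix):

```
/* libllvm.exports - illustrative linker version script.
   Exports only LLVM C API symbols; hides everything else,
   which also suppresses the weak-vtable export problem. */
{
  global:
    LLVM*;   /* the LLVM C API */
  local:
    *;       /* keep all other symbols internal */
};
```

Passed to the linker via `-Wl,--version-script=libllvm.exports` (or an equivalent `-exported_symbols_list` on Darwin), this keeps internal symbols from leaking into the shared library's dynamic symbol table.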
Note that the first patch point operand is an identifier that tells the runtime the program location of the intrinsic, allowing it to find the correct stack map record for the program state at that location. After the above optimization, not only does LLVM avoid performing repeated checks within the loop, but it also avoids maintaining additional runtime state throughout the loop body. Generally, high level optimization requiring knowledge of language-specific semantics is best performed on a higher level IR. But in this case, extending LLVM with one aspect of high level semantics allows LLVM's loop and expression analysis to be directly leveraged and naturally extended into a new class of optimization.
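For concreteness, the `@patchpoint` shorthand used here corresponds to LLVM's experimental patch point and stack map intrinsics. A minimal sketch of how the identifier and live state appear in real IR (the value names and byte counts are illustrative, not taken from FTL itself):

```llvm
; Intrinsic declarations as documented in LLVM's stack map support.
declare void @llvm.experimental.stackmap(i64, i32, ...)
declare void @llvm.experimental.patchpoint.void(i64, i32, i8*, i32, ...)

; id = 1 identifies this program location in the generated stack map;
; %state0 and %state1 are live values recorded for deoptimization.
call void (i64, i32, i8*, i32, ...)
  @llvm.experimental.patchpoint.void(i64 1, i32 15, i8* %target, i32 0,
                                     i64 %state0, i64 %state1)
```

At run time, the client looks up record id 1 in the emitted stack map section to recover where `%state0` and `%state1` live (register or stack slot) at that point.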
```
      %a = cmp <TrapConditionA>
      call @patchpoint(1, %a, <state-before-loop>)
Loop: %b = cmp <TrapConditionB>
      call @patchpoint(2, %b, <state-inside-loop>)
      <do something...>
```

Could be safely optimized into:

```
      %c = cmp <TrapConditionC>  ; where C implies both A and B
      call @patchpoint(1, %c, <state-before-loop>)
Loop: <do something...>
```