LLVM Weekly - #124, May 16th 2016

Welcome to the one hundred and twenty-fourth issue of LLVM Weekly, a weekly newsletter (published every Monday) covering developments in LLVM, Clang, and related projects. LLVM Weekly is brought to you by Alex Bradbury. Subscribe to future issues at http://llvmweekly.org and pass it on to anyone else you think may be interested. Please send any tips or feedback to asb@asbradbury.org, or @llvmweekly or @asbradbury on Twitter.

The canonical home for this issue is at llvmweekly.org.

News and articles from around the web

The main news this week is the announcement of Scala-native, an ahead-of-time compiler for Scala using LLVM. Jos Dirkens has written a getting started guide if you want to compile it and try it out. There's also more information in the slides from the announcement talk.

On the mailing lists

More of the students taking part in Google Summer of Code with LLVM-related projects have introduced themselves and their plans. Vivek Pandya will be working on interprocedural register allocation. Scott Egerton will be working on capture tracking improvements. Jie Chen will be working on better alias analysis, specifically improving cfl-aa. Matthias Reisinger will be working on enabling polyhedral optimisations in Julia, and Zhengyan Liu has plans for SAFECode memory hardening.
Renato Golin kicked off a discussion about whether LLVM's release process could be better aligned with downstream users. This thread covered a broad range of topics and triggered a lot of discussion, but luckily there's no need to summarise it as Renato has done the job for us.
Nicolai Hähnle notes that libLLVM.so currently contains about 1.7MB in its .data.rel.ro section, of which about 1.3MB comes from the MCInstrDesc tables created by tablegen, which represent a massive number of pointers to be relocated. He suggests reducing this by storing offsets instead of pointers. Reducing the relocations will both reduce binary size and increase the portion of the binary that can be mapped as shared. So far, responses to the thread are supportive of the idea.
James Knight has written a detailed post on how it's not really possible to write an LL/SC loop guaranteed to make forward progress in LLVM IR right now. There are restrictions on what you can do between a load-linked and a store-conditional instruction that the code generator may not meet.
A public llvm-foundation mailing list has been announced, which is intended to facilitate discussions related to the Foundation.
As well as the long, technically detailed and precise threads each week, it's nice to highlight cases where a simple question has a simple answer. How do you register a pass as being opt-in based on a command-line flag? Answer: have it run every time, but return immediately if the desired command-line flag isn't present.
Sanjoy Das has shared an RFC on adding a callee-saved register verifier. As is clarified later in the thread, the intention is to ensure that code not generated by LLVM (e.g. output from another JIT or hand-written assembly) properly adheres to the calling convention and doesn't clobber registers it shouldn't. The proposed pass would simply add code to check that the test values written to the callee-saved registers aren't modified.
In response to questions about pass ordering, Mehdi Amini has written a helpful description of what exactly happens when you do opt -mymodulepass0 -myfunctionpass -mymodulepass1.
Konstantin Vladimirov wonders if there's an option to force the register allocator to use as many architectural registers as possible to reduce dependencies. The short answer is there isn't currently, but it would be interesting to investigate.
Diana Picus has shared an RFC on modifying llc so it no longer exits after the first error. Generally people are in favour, and the patch should hopefully land soon (it had to be temporarily backed out after exposing some test case failures in lldb).
Nico Weber has noted that now, with AVX512, Clang's intrinsics headers are huge. This can cause compile-time issues: for instance, Nico reports that building all of the v8 JS engine is 6% faster after removing the avx512 includes. The thread participants haven't yet decided on the best way forward, beyond the potential immediate step of adding include guards so that the AVX512 intrinsic headers aren't included when not compiling for AVX512 platforms.

LLVM commits

The outdated guide on cross-compiling LLVM has been brought up to date. r269054.
The WebAssembly backend gained preliminary fast instruction selection (fast-isel) support. r269083, r269203, r269273.
Loop unrolling (other than in the case of explicit pragmas) is now disabled at -Os in LLVM. You may recall last week it was enabled for -Os in Clang, but with different thresholds. r269124.
A new cost-tracking system has been implemented for the loop unroller. r269388.
LLVM's Sparc backend has seen the addition of more LEON-specific features, e.g. signed and unsigned multiply-accumulate. r268908.
llc's -run-pass option will now work with any pass known to the pass registry. Previously it would silently do nothing if you specified indirectly added analysis passes or passes not present in the optimisation pipeline. r269003.
WebAssembly register stackification and coloring are now run very late in the optimisation pipeline. The commit message suggests it's useful to think of these passes as domain-specific liveness-based compression rather than a conventional optimisation. r269012.
When declaring globals in textual LLVM IR, you must now assign them explicitly, e.g. @0 = global i32 42. r269096.
The internal assembler is now enabled by default for 32-bit MIPS targets. r269560.

Clang commits

Clang now supports __float128. r268898.
Clang gained a new warning that triggers when casting away calling conventions from a function. r269116.
The recently developed include-fixer tool now has documentation. r269167.

Other project commits

compiler-rt's CMake build system can now build builtins without a full toolchain, allowing you to bootstrap a cross-compiler. r268977.
LLD will now sort relocations to optimise dynamic linker performance. r269066.