LLVM Project Blog

LLVM Project News and Details from the Trenches

Tuesday, September 19, 2017

Clang ♥ bash -- better auto completion is coming to bash



Compilers are complex pieces of software and have a multitude of command-line options to fine-tune parameters. Clang is no exception: it has 447 command-line options. It’s nearly impossible to memorize all these options and their correct spellings; that's where shell completion can be very handy. When you type in the first few characters of a flag and hit tab, the shell will autocomplete the rest for you.

Background
However, such an autocompletion feature is not available yet, as there's no easy way to get a complete list of the options Clang supports. For example, bash doesn’t have any autocompletion support for Clang, and while some shells like zsh ship a script for command-line autocompletion, they use hard-coded lists of command-line options that are not automatically updated when a new option is added to Clang. These shells also can’t autocomplete arguments that some flags take (-std=[tab], for instance).


This is the problem we were working to solve during this year’s Google Summer of Code. We’re adding a feature to Clang that enables complete, exact command-line option completion and is portable to any shell. To start with, we'll provide a completion script for bash that uses this feature.

Implementation
Clang now has a new command-line option called --autocomplete. This flag receives the incomplete user input from the shell, queries the internal data structures of the current Clang binary, and returns a list of possible completions. With this API, we can always get an accurate list of options and values at any time, for any newer version of Clang.

We built a bash autocompletion script using this feature as the first implementation. You can find its source code here. We also wrote a sample Qt text-entry autocompletion to show how this API can be used from a UI application, as seen below:

[Animation: Qt text entry autocompleting clang flags via --autocomplete (final.gif)]

You can always complete one flag at a time. So if you want to use the API, select the flag that the user is currently typing and pass it to the --autocomplete flag of the selected clang binary. For example, querying `-tr` displays all flags starting with `-tr`, each with its description behind it (separated from the flag by a tab character).
The API also supports completing the values of flags. If a flag supports value completion, you can also provide an incomplete value behind the flag, separated by a comma, to get completions for it.
If you provide nothing after the comma, the list of all possible values for this flag is displayed. All three forms are sketched below.
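A minimal sketch of these queries (assuming clang 5.0 or newer; the output shown is abbreviated and illustrative, and the exact descriptions may differ between versions):

 # Complete a flag prefix: each candidate is followed by its description,
 # separated by a tab character.
 $ clang --autocomplete=-tr
 -traditional-cpp    Enable some traditional CPP emulation
 ...

 # Complete a value: the partial value follows the flag after a comma.
 $ clang --autocomplete=-std=,c+
 c++98
 c++11
 ...

 # Nothing after the comma lists every value the flag accepts.
 $ clang --autocomplete=-std=,
 ...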

How to get it
This feature is available now with LLVM/clang 5.0, and we’ll also be adding it to the standard bash completion package. Make sure you have the latest clang version on your machine, and source this script. If you want to make the change permanent, just source it from your .bashrc and enjoy typing your clang invocations!
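For example, assuming the completion script was saved as ~/clang_completion.bash (a hypothetical path), the setup looks like this:

 # Load the completion into the current shell session
 source ~/clang_completion.bash
 # Or make it permanent by loading it from .bashrc
 echo 'source ~/clang_completion.bash' >> ~/.bashrc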

Monday, September 11, 2017

2017 US LLVM Developers' Meeting Program

The LLVM Foundation is excited to announce the selected proposals for the 2017 US LLVM Developers' Meeting!

Keynotes:


Talks:


BoFs:


Tutorials:


Lightning Talks:


Student Research Competition:


Posters:

If you are interested in any of these talks, you should register to attend the 2017 US LLVM Developers' Meeting! Tickets are limited, so register now!

Friday, August 18, 2017

LLVM on Windows now supports PDB Debug Info

For several years, we’ve been hard at work on making clang a world-class toolchain for developing software on Windows.  We’ve written about this several times in the past, and we’ve had full ABI compatibility (minus bugs) for some time. One area that has been notoriously hard to achieve compatibility on is debug information, but over the past 2 years we’ve made significant leaps.  If you just want the TL;DR, then here you go: if you’re using clang on Windows, you can now get PDB debug information!


Background: CodeView vs. PDB
CodeView is a debug information format invented by Microsoft in the mid 1980s. For various reasons, other debuggers developed an independent format called DWARF, which eventually became standardized and is now widely supported by many compilers and programming languages.  CodeView, like DWARF, defines a set of records that describe mappings between source lines and code addresses, as well as types and symbols that your program uses.  The debugger then uses this information to let you set breakpoints by function name, display the value of a variable, etc.  But CodeView is only somewhat documented, with the most recent official documentation being at least 20 years old.  While some records still have the documented format, others have evolved, and entirely new records have been introduced that are not documented anywhere.


It’s important to understand though that CodeView is just a collection of records.  What happens when the user says “show me the value of Foo”?  The debugger has to find the record that describes Foo.  And now things start getting complicated.  What optimizations are enabled?  What version of the compiler was used?  (These could be important if there are certain ABI incompatibilities between different versions of the compiler, or as a hint when trying to reconstruct a backtrace in heavily optimized code, or if the stack has been smashed).  There are a billion other symbols in the program, how can we find the one named Foo without doing an exhaustive O(n) search?  How can we support incremental linking so that it doesn’t take a long time to re-generate debug info when only a small amount of code has actually changed?  How can we save space by de-duplicating strings that are used repeatedly?  Enter PDB.


PDB (Program Database) is, as you might have guessed from the name, a database.  It contains CodeView but it also contains many other things that allow indexing of the CodeView records in various ways.  This allows for fast lookups of types and symbols by name or address, the philosophical equivalent of “tables” for individual input files, and various other things that are mostly invisible to you as a user but largely responsible for making the debugging experience on Windows so great.  But there’s a problem: While CodeView is at least kind-of documented, PDB is completely undocumented.  And it’s highly non-trivial.


We’re Stuck (Or Are We?)
Several years ago, we decided that the path forward was to abandon any hope of emitting CodeView and PDB, and instead focus on two things:
  1. Make clang-cl emit DWARF debug information on Windows
  2. Port LLDB to Windows and teach it about the Windows ABI, which would be significantly easier than teaching Visual Studio and/or WinDbg to be able to interpret DWARF (assuming this is even possible at all, given that everything would have to be done strictly through the Visual Studio / WinDbg extensibility model)
In fact, I even wrote another blog post about this very topic a little over 2 years ago.  So I got it to work, and I eventually got parts of LLDB working on Windows for simple debugging scenarios.


Unfortunately, it was beginning to become clear that we really needed PDB.  Our goal has always been to create as little friction as possible for developers who are embedded in the Windows ecosystem.  Tools like Windows Performance Analyzer and vTune are very powerful and standard tools in engineers’ existing repertoires.  Organizations already have infrastructure in place to archive PDB files, and collect & analyze crash dumps.  Debugging with PDB is extremely responsive given that the debugger does not have to index symbols upon startup, since the indices are built into the file format.  And last but not least, tools such as WinDbg are already great for post-mortem debugging, and frankly many (perhaps even most) Windows developers will only give up the Visual Studio debugger when it is pried from their cold dead hands.


I got some odd stares (to put it lightly) when I suggested that we just ask Microsoft if they would help us out.  But ultimately we did, and… they agreed!  This came in the form of some code uploaded to the Microsoft GitHub repo which we were on our own to figure out.  Although they were only able to upload a subset of their PDB code (meaning we had to do a lot of guessing and exploration, and the code didn’t compile either since half of it was missing), it filled in enough blanks that we were able to do the rest.


After about a year and a half of studying this code, hacking away, studying the code some more, hacking away some more, etc, I’m proud to say that lld (the LLVM linker) can finally emit working PDBs.  All the basics like setting breakpoints by line, or by name, or viewing variables, or searching for symbols or types, everything works (minus bugs, of course).


For those of you who are interested in digging into the internals of a PDB, we have also been developing a tool for expressly this purpose.  It’s called llvm-pdbutil and is the spiritual counterpart to Microsoft’s own cvdump utility.  It can dump the internals of a PDB, convert a PDB to YAML and vice versa, find differences between two PDBs, and much more.  Brief documentation for llvm-pdbutil is here, and a detailed description of the PDB file format internals is here, consisting of everything we’ve learned over the past 2 years (still a work in progress, as I have to divide my time between writing the documentation and actually making PDBs work).
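As a rough sketch of what that looks like in practice (the subcommand and flag names below are best-effort assumptions; check llvm-pdbutil --help on your build):

 llvm-pdbutil dump --summary hello.pdb         # dump high-level information about a PDB
 llvm-pdbutil pdb2yaml hello.pdb > hello.yaml  # convert a PDB to YAML for inspection
 llvm-pdbutil diff old.pdb new.pdb             # find differences between two PDBs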


Bring on the Bugs!
So this is where you come in.  We’ve tested simple debugging scenarios with our PDBs, but we still consider this alpha in terms of debug info quality.  We’d love for you to try it out and report issues on our bug tracker.  To get you started, download the latest snapshot of clang for Windows.  Here are two simple ways to test out this new functionality:
  1. Have clang-cl invoke lld automatically
    1. clang-cl -fuse-ld=lld -Z7 -MTd hello.cpp
  2. Invoke clang-cl and lld separately.
    1. clang-cl -c -Z7 -MTd -o hello.obj hello.cpp
    2. lld-link -debug hello.obj
We look forward to the onslaught of bug reports!


We would like to extend a very sincere and deep thanks to Microsoft for their help in getting the code uploaded to the github repository, as we would never have gotten this far without it.


And to leave you with something to get you even more excited for the future, it's worth reiterating that all of this is done without a dependency on any Windows-specific API, DLL, or library.  It's 100% portable.  Do I hear cross-compilation?

Zach Turner (on behalf of the LLVM Windows Team)

Friday, March 10, 2017

Devirtualization in LLVM and Clang

This blog post is part of a series of blog posts from students who were funded by the LLVM Foundation to attend the 2016 LLVM Developers' Meeting in San Jose, CA. Please visit the LLVM Foundation's webpage for more information on our Travel Grants program. 

This post is from Piotr Padlewski on his work that he presented at the meeting:

This blogpost will show how C++ devirtualization is performed in current (4.0) clang and LLVM and also ongoing work on -fstrict-vtable-pointers features.

Devirtualization done by the frontend


In order to transform a virtual call into a direct call, the frontend must be sure that there are no overrides of the virtual function in the program, or it must know the dynamic type of the object. Compilation proceeds one translation unit at a time, so, barring LTO, there are only a few cases when the compiler may conclude that there are no overrides:

  • either the class or virtual method is marked as final
  • the class is defined in an anonymous namespace and has no derived classes in its translation unit

The latter is trickier for clang, which translates the source code in chunks on the fly (see: ASTProducer and ASTConsumer), so it is not able to determine whether there are any derived classes later in the source. This could be dealt with in a couple of ways:
  • give up immediate code generation
  • run data flow analysis in LLVM to find all the dynamic types passed to the function, which has static linkage
  • hope that every use of the virtual function, which is necessarily in the same translation unit, will be inlined by LLVM -- static linkage increases the chances of inlining

Store to load propagation in LLVM

In order to devirtualize a virtual call we need:
  • the value of the vptr - which vtable it points to
  • the value of the vtable slot - which exact virtual function it is

Because vtables are constant, the latter value is much easier to get once we have the value of the vptr. The only thing we need is the vtable definition, which can be provided by emitting it with available_externally linkage.

In order to figure out the vptr value, we have to find the store to the same location that defines it. There are two analyses responsible for this:

  • MemDep (Memory Dependence Analysis) is a simple linear algorithm that, for each queried instruction, iterates through all instructions above it and stops when the first dependency is found. Because queries might be performed for each instruction, we end up with a quadratic algorithm. Of course, quadratic algorithms are not welcome in compilers, so MemDep can only check a certain number of instructions.
  • Memory SSA, on the other hand, has constant complexity because of caching. To find out more, watch “Memory SSA in 5 minutes” (https://www.youtube.com/watch?v=bdxWmryoHak). MemSSA is a pretty new analysis and it doesn’t have all the features MemDep has, therefore MemDep is still widely used.
The main LLVM pass that does store-to-load propagation is GVN - Global Value Numbering.



Finding vptr store

In order to figure out the vptr value, we need to see the store from the constructor. To avoid relying on the constructor's availability or inlining, we decided to use the @llvm.assume intrinsic to indicate the value of the vptr. Assume is akin to assert - an optimizer seeing a call to @llvm.assume(i1 %b) can assume that %b is true after it. We can indicate the vptr value by comparing it with the vtable and then calling @llvm.assume with the result of this comparison.

call void @_ZN1AC1Ev(%struct.A* %a) ; call ctor
 %3 = load {...} %a                  ; Load vptr
 %4 = icmp eq %3, @_ZTV1A      ; compare vptr with vtable
 call void @llvm.assume(i1 %4)


Calling multiple virtual functions

A non-inlined virtual call will clobber the vptr. In other words, the optimizer will have to assume that a virtual function might change the vptr in the passed object. This sounds like something that never happens, because the vptr is “const”. The truth is that it is actually weaker than a C++ const member, because it changes multiple times during the construction of an object (every base type constructor or destructor must set the vptr). But the vptr can't be changed during a virtual call, right? Well, what about this?

void A::foo() { // virtual
  static_assert(sizeof(A) == sizeof(Derived));
  new(this) Derived;
}

This is a call to the placement new operator - it doesn’t allocate new memory, it just creates a new object in the provided location. So, by constructing a Derived object in the place where an object of type A was living, we change the vptr to point to Derived’s vtable. Is this code even legal? The C++ standard says yes.

However, it turns out that if someone called foo twice (on the same object), the second call would be undefined behavior. The standard pretty much says that a call or dereference through a pointer to an object whose lifetime has ended is UB, and because the standard agrees that nuking an object from the inside ends its lifetime, the second call is UB. Be aware that this is only because a zombie pointer is used for the second call. The pointer returned by placement new is considered alive, so performing calls on that pointer is valid. Note that we also silently used that fact in the use of assume.

(un)clobbering vptr

We need to somehow say that the vptr is invariant during its lifetime. We decided to introduce new metadata for that purpose - !invariant.group. The presence of the invariant.group metadata on a load/store tells the optimizer that every load and store to the same pointer operand within the same invariant group can be assumed to load or store the same value. With -fstrict-vtable-pointers, Clang decorates vtable loads with invariant.group metadata corresponding to the caller's pointer type.

We can enhance the load of the virtual function (the second load) by decorating it with !invariant.load, which is equivalent to saying “a load from this location always returns the same value”, which is true because vtables never change. This way we don’t rely on having the definition of the vtable.

A call like:

void g(A *a) {
  a->foo();
  a->foo();
}

will be translated to:

define void @function(%struct.A* %a) {
 %1 = load {...} %a, !invariant.group !0
 %2 = load {...} %1, !invariant.load !1
 call void %2(%struct.A* %a)

 %3 = load {...} %a, !invariant.group !0
 %4 = load {...} %3, !invariant.load !1
 call void %4(%struct.A* %a)
 ret void
}

!0 = !{!"_ZTS1A"} ; mangled type name of A
!1 = !{}

And now, by the magic of GVN and MemDep:

define void @function(%struct.A* %a) {
 %1 = load {...} %a, !invariant.group !0
 %2 = load {...} %1, !invariant.load !1
 call void %2(%struct.A* %a)
 call void %2(%struct.A* %a)
 ret void
}

With this, llvm-4.0 is able to devirtualize function calls inside loops.
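If you want to reproduce this locally, here is a minimal sketch (the file name g.cpp is hypothetical, and the exact pass pipeline may differ between LLVM versions):

 # Emit IR with the invariant.group annotations but skip the LLVM optimization
 # pipeline, so the raw metadata stays visible.
 clang++ -O2 -fstrict-vtable-pointers -Xclang -disable-llvm-passes -S -emit-llvm g.cpp -o g.ll
 # Promote allocas and run GVN on its own to watch the redundant
 # vptr/vfunction loads get merged away.
 opt -mem2reg -gvn -S g.ll -o g.gvn.ll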

Barriers

In order to prevent the middle end from finding a load/store with the same !invariant.group metadata that comes from the construction/destruction of a dead dynamic object, @llvm.invariant.group.barrier was introduced. It returns another pointer that aliases its argument but is considered different for the purposes of load/store invariant.group metadata. The optimizer won’t be able to figure out that the returned pointer is the same because intrinsics don’t have a definition. The barrier must be inserted in all the places where the dynamic object changes:
  • constructors
  • destructors
  • placement new of dynamic object

Dealing with barriers

Barriers hinder some other optimizations. Some ideas for how this could be fixed:

  • stripping invariant.group metadata and barriers just after devirtualization. Currently this is done before codegen. The problem is that most of the devirtualization comes from GVN, which also does most of the optimizations we would miss because of the barriers. GVN is expensive, therefore it is run only once. Stripping early also might make less sense in LTO mode, because that would limit devirtualization in the link phase.
  • teaching important passes to look through the barrier. It might be very tricky to preserve the semantics of the barrier, but e.g. looking for the dependency of a load without invariant.group by jumping through the barrier to find a store without invariant.group is likely to do the trick.
  • removing the invariant.group barrier when its argument comes from an alloca and is never used, etc.
To find out more details about devirtualization, check out my talk (http://llvm.org/devmtg/2016-11/#talk6) from the 2016 LLVM Developers' Meeting.

About author

Piotr Padlewski is an undergraduate student at the University of Warsaw, currently working on C++ static analysis at IIIT.

Monday, March 6, 2017

Some news about apt.llvm.org

apt.llvm.org provides Debian and Ubuntu repositories for every maintained version of these distributions. LLVM, Clang, clang extra tools, compiler-rt, polly, LLDB and LLD packages are generated for the stable, stabilization and development branches.

As it seems that we have more and more users of these packages, I would like to share an update about various recent changes.

New features

LLD
First, the cool new stuff: lld is now provided and built for i386/amd64 on all supported Debian and Ubuntu versions. The test suite is also executed and the coverage results are great.

4.0
Then, following the branching for the 4.0 release, I created new repositories to provide this release.
For example, for Debian stable, just add the following to /etc/apt/sources.list.d/llvm.list:

deb http://apt.llvm.org/jessie/ llvm-toolchain-jessie-4.0 main
deb-src http://apt.llvm.org/jessie/ llvm-toolchain-jessie-4.0 main
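After adding the repository, a typical sequence to trust the archive signature and install the 4.0 packages looks like this (a sketch; package names follow the versioned -X.Y convention used on apt.llvm.org):

 # Retrieve and register the key used to sign the repositories
 wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | sudo apt-key add -
 sudo apt-get update
 # Install the 4.0 toolchain
 sudo apt-get install clang-4.0 lld-4.0 lldb-4.0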

llvm-defaults
Obviously, the trunk is now 5.0. If llvm-defaults is used, clang, lldb and other meta packages will be automatically updated to this version.
As a consequence, and also because those branches are dead, the 3.7 and 3.8 jobs have been disabled. Please note that both repositories are still available on apt.llvm.org and won't be removed.
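To illustrate the difference between the llvm-defaults meta packages and the versioned packages (a sketch; run with appropriate privileges):

 sudo apt-get install clang      # unversioned meta package, follows the trunk (currently 5.0)
 sudo apt-get install clang-4.0  # or pin an explicit branch instead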

Zesty: New Ubuntu
Packages for the next Ubuntu 17.04 (zesty) are also generated for 3.9, 4.0 and 5.0.

libfuzzer
It was implemented a few months ago but not clearly communicated: libfuzzer also has its own packages, libfuzzer-X.Y-dev (for example libfuzzer-3.9-dev, libfuzzer-4.0-dev or libfuzzer-5.0-dev).


Changes in the infrastructure


In order to support the load, I started to use new blades that Google sponsored (thanks again to Nick Lewycky) for an initiative that I was running for Debian and IRILL. The six new blades removed all the wait time. With a new Salt configuration, I automated the deployment of the slaves. In case the load increases again, we will have access to more blades.

I also took the time to fix some long ongoing issues:
  • all repositories are signed, and it is verified that they actually are
  • i386 and amd64 packages are now uploaded at once instead of being uploaded separately. This was causing checksum errors when one of the two architectures built correctly and the other failed (e.g. a test failure)
Last but not least, the code coverage results are produced in a more reliable manner.


More information about the implementation and services

Since what is shipped on apt.llvm.org is exactly the same as in Debian and Ubuntu, the packaging files are stored on the Debian subversion server.

A Jenkins instance is in charge of the orchestration of the whole build infrastructure.

The trunk packages are built twice a day for every Debian and Ubuntu version. Branches (currently 3.9 and 4.0) are rebuilt only when the trigger job finds a change.

In both cases, the Jenkins source job checks out the Debian SVN branches for the relevant version, checks out/updates the LLVM/clang/etc. repositories, and repacks everything to create the source tarballs and Debian files (dsc, etc.). The completion of this job triggers the binary jobs to start. These jobs, thanks to the Debian Jenkins glue, create or update the Debian/Ubuntu versions.

Then the builds are done the usual way through pbuilder for both i386 and amd64, and all the test suites are executed. If any LLVM test fails on i386 or amd64, the whole build fails. If both builds and the LLVM test suite are successful, the sync job starts and rsyncs the packages to the LLVM server to be replicated on the CDN. If one or both builds fail, a notification is sent to the administrator.

Some Debian static analysis (lintian) is executed on the packages to prevent some packaging errors. From time to time, some interesting issues are found.

In parallel, some binary builds have some special hooks like Coverity, code coverage or installation of more recent versions of gcc for Ubuntu precise.

Report bugs

Bugs can be reported on the bugzilla of the LLVM project in the product "Packaging" and the component "deb packages".
  

Common issues

Packaging quickly moving projects like LLVM or clang can be challenging, and in some cases it is hard to keep up with the rhythm, in particular with regard to tests. For Debian unstable or the latest version of Ubuntu, the matrix is further complicated by new versions of the basic pieces of the operating system, like gcc/g++ or libstdc++.

It is also not uncommon that some tests end up being ignored in the process.

How to help


Some newcomer bugs are available to get started on.
Related to all this, a Google Summer of Code 2017 under the LLVM umbrella has been proposed: Integrate libc++ and OpenMP in apt.llvm.org

Help is also needed to keep track of the new test failures and get them fixed upstream. For example, a few tests have been marked as expected to fail to avoid crashes.

Tuesday, February 21, 2017

2016 LLVM Developers' Meeting - Experience from Johannes Doerfert, Travel Grant Recipient

This blog post is part of a series of blog posts from students who were funded by the LLVM Foundation to attend the 2016 LLVM Developers' Meeting in San Jose, CA. Please visit the LLVM Foundation's webpage for more information on our Travel Grants program.

This post is from Johannes Doerfert:
2016 was my third time attending the US LLVM Developers' Meeting, and for the third year in a row I was impressed by the quality of the talks, the organization and the diversity of attendees. The hands-on experiences that are presented, combined with innovative ideas and cutting-edge research, make it a perfect venue for me as a PhD student. The honest interest in the presented topics and the lively discussions that include students, professors and industry people are two of the many things that I experienced most strongly at these developer meetings.

For the last two years I was mainly attending as a Polly developer that talked about new features and possible applications of Polly. This year however my roles were different. First, I was attending as part of the organization team of the European LLVM developers meeting 2017 [0] together with my colleagues Tina Jung and Simon Moll. In this capacity I answered questions about the venue (Saarbruecken, Germany [1,2]) and the alterations in contrast to prior meetings. Though, more importantly, I advertised the meeting to core developers that usually do not attend the European version. Second on my agenda was the BoF on a parallel extension to the LLVM-IR which I organized with Simon Moll. In this BoF, but also during the preparation discussion on the mailing list [3], we tried to collect motivating examples, requirements as well as known challenges for a parallel extension to LLVM. These insights will be used to draft a proposal that can be discussed in the community.

Finally, I attended as a 4th year PhD student who is interested in contributing his work to the LLVM project (not only Polly). As my current research required a flexible polyhedral value (and iteration space) analysis, I used the opportunity to implement one with an interface similar to scalar evolution. The feedback I received on this topic was strictly positive. I will soon post a first version of this standalone analysis and start a public discussion. Since I hope to finish my studies at some (not too distant) point in time, I seized the opportunity to inquire about potential options for the time after my PhD.

As a final note I would like to thank the LLVM Foundation for their student travel grant that allowed me to attend the meeting in the first place.


[0] http://llvm.org/devmtg/2017-03/
[1] http://sic.saarland/
[2] https://en.wikipedia.org/wiki/Saarbr%C3%BCcken
[3] http://lists.llvm.org/pipermail/llvm-dev/2016-October/106051.html

Wednesday, December 14, 2016

LLVM's New Versioning Scheme

Historically, LLVM's major releases always added "0.1" to the version number, producing major versions like 3.8, 3.9, and 4.0 (expected by March 2017). With our next release though, we're changing this.  The LLVM version number will now increase by "1.0" with every major release, which means that the first major release after LLVM 4.0 will be LLVM 5.0 (expected September 2017).
We believe that this approach will provide a simpler and more standard approach to versioning.