LLVM Project News and Details from the Trenches

Sunday, May 29, 2011

LLVM @ "The Architecture of Open Source Applications"

LLVM is featured in a chapter of the new book The Architecture of Open Source Applications. This chapter talks about the high-level design of LLVM, and how it differs from other contemporary compilers and JITs out there, why you might want to use it (if you're looking for compiler libraries), a simple example of writing an optimization, how the code is structured, a 10,000 foot view of how the code generator works, and some of the interesting capabilities LLVM has due to its design. If you're curious what this whole LLVM thing is, then this is a great place to start.

This book is available inexpensively in dead tree or PDF form, and the author royalties are donated to charity. The content is also available for free under the creative commons license. Share and enjoy,

-Chris Lattner

Monday, May 23, 2011

C++ at Google: Here Be Dragons

Google has one of the largest monolithic C++ codebases in the world. We have thousands of engineers working on millions of lines of C++ code every day. To help keep the entire thing running and all these engineers fast and productive we have had to build some unique C++ tools, centering around the Clang C++ compiler. These help engineers understand their code and prevent bugs before they get to our production systems.

Saturday, May 21, 2011

What Every C Programmer Should Know About Undefined Behavior #3/3

In Part 1 of the series, we took a look at undefined behavior in C and showed some cases where it allows C to be more performant than "safe" languages. In Part 2, we looked at the surprising bugs this causes and some widely held misconceptions that many programmers have about C. In this article, we look at the challenges that compilers face in providing warnings about these gotchas, and talk about some of the features and tools that LLVM and Clang provide to help get the performance wins, while taking away some of the surprise.

Saturday, May 14, 2011

What Every C Programmer Should Know About Undefined Behavior #2/3

In Part 1 of our series, we discussed what undefined behavior is, and how it allows C and C++ compilers to produce higher performance applications than "safe" languages. This post talks about how "unsafe" C really is, explaining some of the highly surprising effects that undefined behavior can cause. In Part #3, we talk about what friendly compilers can do to mitigate some of the surprise, even if they aren't required to.

I like to call this "Why undefined behavior is often a scary and terrible thing for C programmers". :-)

Friday, May 13, 2011

What Every C Programmer Should Know About Undefined Behavior #1/3

People occasionally ask why LLVM-compiled code sometimes generates SIGTRAP signals when the optimizer is turned on. After digging in, they find that Clang generated a "ud2" instruction (assuming X86 code) - the same as is generated by __builtin_trap(). There are several issues at work here, all centering around undefined behavior in C code and how LLVM handles it.

This blog post (the first in a series of three) tries to explain some of these issues so that you can better understand the tradeoffs and complexities involved, and perhaps learn a few more of the dark sides of C. It turns out that C is not a "high level assembler" like many experienced C programmers (particularly folks with a low-level focus) like to think, and that C++ and Objective-C have directly inherited plenty of issues from it.