LLVM Project News and Details from the Trenches

Monday, December 19, 2011

NVIDIA CUDA 4.1 Compiler Now Built on LLVM

From the NVIDIA CUDA compiler team:

CUDA is a parallel programming model and platform created by NVIDIA for harnessing the power of hundreds of cores in modern graphics processing units (GPUs). NVIDIA provides free support for CUDA C and C++ in the CUDA toolkit. The CUDA programming environment consists of a compiler targeting NVIDIA GPUs and has been adopted by thousands of developers.

At NVIDIA we have switched over to using LLVM inside the CUDA C/C++ compiler for Fermi and future architectures. We use LLVM for optimizations and PTX code generation and for generating debug information for CUDA debugging. From a developer's perspective the new compiler is functionally on par with the previous compilers and produces better code with better compile times. We have extended the LLVM core compiler to understand data parallel programming model. It is now available, as part of CUDA 4.1 and you can learn more here.

Our experience with the use of LLVM has been very positive, starting with a modern compiler infrastructure and with high quality optimizations contributed by a large community of developers. The effort required to learn LLVM infrastructure is quite small and reasonable.

LLVM 3.1 vector changes

Intel uses the Low-Level Virtual Machine (LLVM) in a number of products, including the Intel® OpenCL SDK. The SDK's implicit vectorization module generates LLVM-IR (intermediate representation) which uses vector types.

LLVM-IR supports operations that use vector data types, and the LLVM code generator needs to do non-trivial work in order to efficiently compile vector operations into SIMD instructions. Recently, there were changes to the LLVM code generation that enabled better code generation for vector operations. In addition to many low level optimizations, this post talks about two major changes: the implementation of vector-select, and the support for vectors-of-pointers.

Monday, November 28, 2011

LLVM 3.0 Exception Handling Redesign

One of the biggest IR changes in the LLVM 3.0 release is a redesign and reimplementation of the LLVM IR exception handling model. The old model, while it worked for most cases, fell over in some key situations, leading to obscure miscompilations, missed optimizations, and poor compile time. This post talks about the changes in LLVM 3.0 and how to move an existing LLVM front-end to the new design. It assumes some familiarity with the Itanium C++ ABI for exception handling.

Saturday, November 26, 2011

LLVM 3.0 Type System Rewrite

One of the most pervasive IR (and thus compiler API) changes in LLVM 3.0 was a complete reimplementation of the LLVM IR type system. This change was long overdue (the original type system lasted from LLVM 1.0!) and made the compiler faster, greatly simplified a critical subsystem of VMCore, and eliminated some design points of IR that were frequently confusing and inconvenient. This post explains why the change was made along with how the new system works.

Sunday, September 18, 2011

Greedy Register Allocation in LLVM 3.0

LLVM has two new register allocators: Basic and Greedy. When LLVM 3.0 is released, the default optimizing register allocator will no longer be linear scan, but the new greedy register allocator.
With its global live range splitting, the greedy algorithm generates code that is 1-2% smaller, and up to 10% faster than code produced by linear scan.

Sunday, May 29, 2011

LLVM @ "The Architecture of Open Source Applications"

LLVM is featured in a chapter of the new book The Architecture of Open Source Applications. This chapter talks about the high-level design of LLVM, and how it differs from other contemporary compilers and JITs out there, why you might want to use it (if you're looking for compiler libraries), a simple example of writing an optimization, how the code is structured, a 10,000 foot view of how the code generator works, and some of the interesting capabilities LLVM has due to its design. If you're curious what this whole LLVM thing is, then this is a great place to start.

This book is available inexpensively in dead tree or PDF form, and the author royalties are donated to charity. The content is also available for free under the creative commons license. Share and enjoy,

-Chris Lattner

Monday, May 23, 2011

C++ at Google: Here Be Dragons

Google has one of the largest monolithic C++ codebases in the world. We have thousands of engineers working on millions of lines of C++ code every day. To help keep the entire thing running and all these engineers fast and productive we have had to build some unique C++ tools, centering around the Clang C++ compiler. These help engineers understand their code and prevent bugs before they get to our production systems.

Saturday, May 21, 2011

What Every C Programmer Should Know About Undefined Behavior #3/3

In Part 1 of the series, we took a look at undefined behavior in C and showed some cases where it allows C to be more performant than "safe" languages. In Part 2, we looked at the surprising bugs this causes and some widely held misconceptions that many programmers have about C. In this article, we look at the challenges that compilers face in providing warnings about these gotchas, and talk about some of the features and tools that LLVM and Clang provide to help get the performance wins, while taking away some of the surprise.

Saturday, May 14, 2011

What Every C Programmer Should Know About Undefined Behavior #2/3

In Part 1 of our series, we discussed what undefined behavior is, and how it allows C and C++ compilers to produce higher performance applications than "safe" languages. This post talks about how "unsafe" C really is, explaining some of the highly surprising effects that undefined behavior can cause. In Part #3, we talk about what friendly compilers can do to mitigate some of the surprise, even if they aren't required to.

I like to call this "Why undefined behavior is often a scary and terrible thing for C programmers". :-)

Friday, May 13, 2011

What Every C Programmer Should Know About Undefined Behavior #1/3

People occasionally ask why LLVM-compiled code sometimes generates SIGTRAP signals when the optimizer is turned on. After digging in, they find that Clang generated a "ud2" instruction (assuming X86 code) - the same as is generated by __builtin_trap(). There are several issues at work here, all centering around undefined behavior in C code and how LLVM handles it.

This blog post (the first in a series of three) tries to explain some of these issues so that you can better understand the tradeoffs and complexities involved, and perhaps learn a few more of the dark sides of C. It turns out that C is not a "high level assembler" like many experienced C programmers (particularly folks with a low-level focus) like to think, and that C++ and Objective-C have directly inherited plenty of issues from it.

Friday, April 22, 2011

Regular Expression Commands

Greetings LLDB users. I want to start writing regular blog posts for the new and cool features and things you can do in LLDB. Today I will start with one that was just added: regular expression commands.