Google Summer of Code: C++ Modernizer Improvements
Google Summer of Code (GSoC) offers students stipends to participate in open source projects during the summer. This year, I was accepted to work on the Clang C++ Modernizer, a project formerly known as the C++11 Migrator, driven by a team at Intel. The goal of the tool is to modernize C++ code by using features of the newer C++ standards in order to improve maintainability, readability, and compile-time and runtime performance. The project was featured in the April blog post “Status of the C++11 Migrator” and has been evolving since, both in terms of architecture and features.
This article presents the improvements made to the tool in the last few months, which include my work from this summer for GSoC. For a complete overview of the tool and how to install it, please visit the documentation: http://clang.llvm.org/extra/clang-modernize.html#getting-started. For a demonstration of the tool you can take a look at the Going Native 2013 talk given by Chandler Carruth: The Care and Feeding of C++'s Dragons. clang-modernize is featured starting at ~33min.
Transform all Files That Make up a Translation Unit
A major improvement since the last version is the ability to transform every file that composes a translation unit, not only the main source file. This means headers also get transformed if they need to be, which makes the modernizer more useful.
To avoid changing files that shouldn’t be changed, e.g. system headers or headers for third-party libraries, there are a few options to control which files should be transformed:
- -include Takes a comma-separated list of paths allowed to be transformed. All files within the entire directory tree rooted at each given path are marked as modifiable. For safety, the default behaviour is that no extra files will be transformed.
- -exclude Takes a comma-separated list of paths forbidden to be transformed. Can be used to prune out subtrees from included directory trees.
- -include-from and -exclude-from Respectively equivalent to -include and -exclude but takes a filename as argument instead of a comma-separated list of paths. The file should contain one path per line.
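Example, assuming a project with the following files:
- src/foo.cpp
- include/foo.h
- lib/third-party.h
To transform foo.cpp and the headers under include/:
clang-modernize -include=include/ src/foo.cpp -- -std=c++11 -I include/ -I lib/
To transform every file that is part of the translation unit except those under lib/:
clang-modernize -include=. -exclude=lib/ src/foo.cpp -- -std=c++11 -I include/ -I lib/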
The Transforms
Right now there is a total of 6 transforms, two of which are new:
- Add-Override Transform: adds the ‘override’ specifier to overridden member functions.
- Loop Convert Transform: makes use of range-based for loops.
- Pass-By-Value Transform [new]: replaces const-ref parameters that would benefit from using the pass-by-value idiom.
- Replace Auto-Ptr Transform [new]: replaces uses of the deprecated std::auto_ptr by std::unique_ptr.
- Use-Auto Transform: makes use of the auto type specifier in variable declarations.
- Use-Nullptr Transform: replaces null literals and macros by nullptr where applicable.
Improvement to Add-Override
Since the last article in April, the Add-Override Transform has been improved to handle user-defined macros. Some projects, like LLVM, use a macro that expands to the ‘override’ specifier for backward compatibility with non-C++11-compliant compilers. clang-modernize can detect those macros and use them instead of the ‘override’ identifier.
The command line switch to enable this functionality is -override-macros.
Example:
clang-modernize -override-macros foo.cpp
Before:
#define LLVM_OVERRIDE override
struct A {
  virtual void foo();
};
struct B : A {
  virtual void foo();
};
After:
#define LLVM_OVERRIDE override
struct A {
  virtual void foo();
};
struct B : A {
  virtual void foo() LLVM_OVERRIDE;
};
Improvement to Use-Nullptr
This transform has also been improved to handle user-defined macros that behave like NULL. The user specifies which macros can be replaced by nullptr by using the command line switch -user-null-macros=<string>.
Example:
clang-modernize -user-null-macros=MY_NULL bar.cpp
Before:
#define MY_NULL 0
void bar() {
  int *p = MY_NULL;
}
After:
#define MY_NULL 0
void bar() {
  int *p = nullptr;
}
New Transform: Replace Auto-Ptr
This transform was a result of GSoC work. The transform replaces uses of std::auto_ptr by std::unique_ptr. It also inserts calls to std::move() when needed.
Before:
#include <memory>
void steal(std::auto_ptr<int> x);
void foo(int i) {
  std::auto_ptr<int> p(new int(i));
  steal(p);
}
After:
#include <memory>
void steal(std::unique_ptr<int> x);
void foo(int i) {
  std::unique_ptr<int> p(new int(i));
  steal(std::move(p));
}
New Transform: Pass-By-Value
Also a product of GSoC, this transform uses the move semantics added in C++11 to avoid a copy in functions that take, by const reference, a parameter whose type has a move constructor. By changing to pass-by-value semantics, a copy can be avoided if an rvalue argument is provided. For lvalue arguments, the number of copies remains unchanged.
The transform is currently limited to constructor parameters that are copied into class fields.
Example:
clang-modernize pass-by-value.cpp
Before:
#include <string>
class A {
public:
  A(const std::string &Copied, const std::string &ReadOnly)
      : Copied(Copied), ReadOnly(ReadOnly) {}
private:
  std::string Copied;
  const std::string &ReadOnly;
};
After:
#include <string>
#include <utility>
class A {
public:
  A(std::string Copied, const std::string &ReadOnly)
      : Copied(std::move(Copied)), ReadOnly(ReadOnly) {}
private:
  std::string Copied;
  const std::string &ReadOnly;
};
std::move() is a library function declared in <utility>. If need be, this header is added by clang-modernize.
There is a lot of room for improvement in this transform. Other situations that are safe to transform likely exist. Contributions are most welcome in this area!
Usability Improvements
We also worked hard on improving the overall usability of the modernizer. Invoking the modernizer now requires fewer arguments since most of the time the arguments can be inferred.
- If no compilation database or flags are provided, -std=c++11 is assumed.
- All transforms are enabled by default.
- Files don’t need to be explicitly listed if a compilation database is provided. The modernizer will get files from the compilation database. Use -include to choose which ones.
- Automatically reformat code affected by transforms using LibFormat.
- A new command line switch to choose transforms to apply based on compiler support.
Reformatting Transformed Code
LibFormat is the library used behind the scenes by clang-format, a tool to format C, C++ and Obj-C code. clang-modernize uses this library as well to reformat transformed code. When enabled with -format, the default style is LLVM. The -style option can control the style in a way identical to clang-format.
Example:
format.cpp:
#include <iostream>
#include <vector>
void f(const std::vector<int> &my_container) {
  for (std::vector<int>::const_iterator I = my_container.begin(),
                                        E = my_container.end();
       I != E; ++I) {
    std::cout << *I << std::endl;
  }
}
Without reformatting:
$ clang-modernize -use-auto format.cpp
#include <iostream>
#include <vector>
void f(const std::vector<int> &my_container) {
  for (auto I = my_container.begin(),
                                        E = my_container.end();
       I != E; ++I) {
    std::cout << *I << std::endl;
  }
}
With reformatting:
$ clang-modernize -format -style=LLVM -use-auto format.cpp
#include <iostream>
#include <vector>
void f(const std::vector<int> &my_container) {
  for (auto I = my_container.begin(), E = my_container.end(); I != E; ++I) {
    std::cout << *I << std::endl;
  }
}
For more information about this option, take a look at the documentation: Formatting Command Line Options.
Choosing Transforms based on Compiler Support
Another useful command-line switch is -for-compilers. This option enables all transforms the given compilers support.
As an example, imagine that your project dropped support for a “legacy” version of a compiler. You can automagically modernize your code to the new minimum versions of the compilers you want to support:
To support Clang >= 3.1, GCC >= 4.6 and MSVC 11:
clang-modernize -format -for-compilers=clang-3.1,gcc-4.6,msvc-11 foo.cpp
For more information about this option and to see which transforms are available for each compiler, please read the documentation.
What’s next?
The ability to transform many translation units in parallel will arrive very soon. Think of clang-modernize -j as in make and ninja. Modernization of large code bases will become much faster as a result.
More transforms are coming down the pipe as well as improvements to existing transforms such as the pass-by-value transform.
We will continue fixing bugs and adding new features. Our backlog is publicly available: https://cpp11-migrate.atlassian.net/secure/RapidBoard.jspa?rapidView=1&view=planning
Get involved!
Interested in the tool? Found a bug? Have an idea for a transform that can be useful to others? The project is Open Source and contributions are most welcome!
The modernizer has its own bug and project tracker. If you want to file or fix a bug just go to: https://cpp11-migrate.atlassian.net
A few other addresses to keep in mind:
- Clang C++ Modernizer User’s Manual
- IRC channel: #llvm on irc.oftc.net
- Mailing lists:
  - cfe-dev for questions and general discussions
  - cfe-commits for patches
- Phabricator to submit patches
Final word
Finally, I want to thank my mentor Edwin Vane and his team at Intel, Tareq Siraj and Ariel Bernal, for the great support they provided me. Also thanks to the LLVM community and the Google Summer of Code team for giving me this opportunity to work on the C++ Modernizer this summer.
-- Guillaume Papin