GSoC 2024: ABI Lowering in ClangIR

ClangIR is an ongoing effort to build a high-level intermediate representation (IR) for C/C++ within the LLVM ecosystem. Its key advantage lies in its ability to retain more source code information. While ClangIR is making progress, it still lacks certain features, notably ABI handling. Currently, ClangIR lowers most functions without accounting for ABI-specific calling convention details.

Goals

The “Build & Run SingleSource Benchmarks with ClangIR - Part 2” Google Summer of Code 2024 builds on my contributions from GSoC 2023 by addressing one of the main issues I encountered: target-specific lowering. It focuses on extending ClangIR’s code generation capabilities, particularly in ABI-lowering for X86-64. Several tests rely on operations and types (e.g., va_arg calls and complex data types) that require target-specific information to compile correctly.

The concrete steps to achieve this were:

Implement foundational infrastructure that can scale to multiple architectures while adhering to ClangIR design principles such as CodeGen parity, feature guarding, and AST backreferences.
Handle basic calling convention scenarios as a proof of concept to validate the foundational infrastructure.
Add lowering for a second architecture to further validate the infrastructure’s extensibility to multiple architectures.
Unify target-specific ClangIR lowering into the library, as there are a few isolated methods handling target-specific code lowering like cir.va_arg.
Integrate calling convention lowering into the main pipeline to ensure future contributions and continued development of this infrastructure.

Contributions

The list of contribution (PRs) can be found here.

Target Lowering Library

The most significant contribution of this project was the development of a modular TargetLowering library. This ensures that target-specific MLIR lowering passes can leverage this shared library for lowering logic. The library also follows ClangIR’s feature guarding principles, ensuring that any contributor can refer to the original CodeGen for contributions, and any unimplemented feature is asserted at specific code points, making it easy to track missing functionality.

Calling Convention Lowering Pass

As a proof of concept, the initial development of the TargetLowering library focused on implementing a calling convention lowering pass that targets multiple architectures. Currently, ClangIR ignores the target ABI during CodeGen to retain high-level information. For example, structs are not unraveled to improve argument-passing efficiency. ABI-specific LLVM attributes are also ignored. This pass addresses these issues by properly tagging LLVM attributes and rewriting function definitions and calls to handle unraveled structs. This was implemented for both X86-64 and AArch64, demonstrating the library’s multi-architecture support.

Shortcomings

Target-Specific Lowering Unification

While some target-specific lowering code was moved into the library, it was copied and pasted rather than properly integrated. This is not ideal for leveraging the library’s multi-architecture features.

Inclusion in the Main Pipeline

This is still a work in progress, as the library is not yet mature enough to handle most pre-existing ClangIR tests. There are also feature guards with unreachable statements for many unimplemented features.

Future Work

Now that there is a base infrastructure for handling target-agnostic to target-specific CIR code, there is a large amount of future work to be done, including:

Improving DataLayout-related queries using MLIR’s built-in tools.
Implementing calling convention lowering for additional types, such as pointers.
Extending the TargetLowering library to support more architectures.
Unifying remaining target-specific lowering code from other parts of ClangIR.

Acknowledgements

I would like to thank my Google Summer of Code mentors, Bruno Cardoso Lopes and Nathan Lanza, for another great GSoC experience. I also want to thank the LLVM community and Google for organizing the program.