Instruction Relationship Framework in LLVM
The article provides an overview of the new Relationship framework of TableGen. This TableGen feature is used to describe user defined relationships between instructions. It was added to LLVM in October 2012.Motivation:
The motivation for this feature stemmed from the Hexagon backend. Much like other processors, Hexagon provides multiple variations for many instructions. It is a common requirement in machine instruction passes to switch between various formats of the same instruction. For example, consider an Add instruction with predicated true (Add_pt) and predicated false (Add_pf) forms. Let's assume that a non-predicated Add instruction is selected during target lowering. However, during if-conversion, the optimization pass might decide to change the non-predicated Add into the predicated true Add_pt form. These transformations require a framework to relate non-predicated forms to the respective predicated forms. In the absence of such a framework, this transformation is typically achieved using large switch cases. There are many deficiencies in using a switch case based approach. The manual implementation of switch case clauses requires a very high maintenance cost. It also results in lost optimization opportunities due to incomplete implementation of switch cases. The lack of a relationship model resulted in around 15% of Hexagon backend code dedicated to switch cases with several of those functions growing to over thousands of lines of code.This problem inspired us to explore some alternatives. We started to look for a framework that was easy to maintain, flexible, scalable, and less error prone. After some initial discussions and brainstorming in the Hexagon group, we decided to modify TableGen to express instruction relations. The initial design was submitted to the LLVM-dev mailing list for review. Jakob Stoklund gave valuable suggestions and helped with the design of the relationship framework. The idea was to add a query language that can be used to define different kind of relationships between instructions. TableGen was extended to parse relationship models. It uses the information to construct tables which are queried to determine new opcodes corresponding to a relationship. The Hexagon backend relies heavily on the relationship framework and it has significantly improved the code quality of our target.
Before getting into the implementation details, let's consider an API that takes an instruction opcode as input and returns its predicated true/false form. We'll first look at the switch-case based solution and then compare it with the relationship based implementation.
short getPredicatedTrue(short opcode) {
switch (opcode) {
default:
return -1;
case Hexagon::Add:
return Hexagon::Add_pt;
case Hexagon::Sub:
return Hexagon::Sub_pt;
case Hexagon::And:
return Hexagon::And_pt;
case Hexagon::Or:
return Hexagon::Or_pt;
case ... :
return ...
}
short getPredicatedFalse(short opcode) {
switch (opcode) {
default:
return -1;
case Hexagon::Add:
return Hexagon::Add_pf;
case Hexagon::Sub:
return Hexagon::Sub_pf;
case Hexagon::And:
return Hexagon::And_pf;
case Hexagon::Or:
return Hexagon::Or_pf;
case ... :
return ...
}
short getPredicatedOpcode(short opcode, bool predSense) {
return predSense ? getPredicatedTrue(opcode)
: getPredicatedFlase(opcode);
}
The relationship framework offers a very systematic solution to this problem. It requires instructions to model their attributes and categorize their groups. For instance, a field called 'PredSense' can be used to record whether an instruction is predicated or not and its sense of predication. Each instruction in the group has to be unique such that no two instructions can share all the same attributes. There must be at least one field with different value. The instruction groups are modeled by assigning a common base name to all the instructions in the group. One of the biggest advantages of this approach is that, by modeling these attributes and groups once, we can define multiple relationships with very little effort.
With the relationship framework, the getPredicatedOpcode API can be implemented as below:
short getPredicatedOpcode(short opcode, bool predSense) {
return predSense ? getPredicated(opcode, PredSenseTrue)
: getPredicated(opcode, PredSenseFalse);
}
Architecture:
The entire framework is driven by a class called InstrMapping. The TableGen back-end has been extended to include a new parser for the relationship models implemented using InstrMapping class. Any relationship model must derive from this class and assign all the class members to the appropriate values. This is how the class looks like:
class InstrMapping {
// Used to reduce search space only to the instructions using this relation model
string FilterClass;
// List of fields/attributes that should be same for all the instructions in
// a row of the relation table. Think of this as a set of properties shared
// by all the instructions related by this relationship.
list RowFields = [];
// List of fields/attributes that are same for all the instructions
// in a column of the relation table.
list ColFields = [];
// Values for the fields/attributes listed in 'ColFields' corresponding to
// the key instruction. This is the instruction that will be transformed
// using this relation model.
list KeyCol = [];
// List of values for the fields/attributes listed in 'ColFields', one for
// each column in the relation table. These are the instructions a key
// instruction will be transformed into.
list > ValueCols = [];
}
def getPredicated : InstrMapping {
// Choose a FilterClass that is used as a base class for all the instructions modeling
// this relationship. This is done to reduce the search space only to these set of instructions.
let FilterClass = "PredRel";
// Instructions with same values for all the fields in RowFields form a row in the resulting
// relation table.
// For example, if we want to relate 'Add' (non-predicated) with 'Add_pt'
// (predicated true) and 'Add_pf' (predicated false), then all 3
// instructions need to have a common base name, i.e., same value for BaseOpcode here. It can be
// any unique value (Ex: XYZ) and should not be shared with any other instruction not related to 'Add'.
let RowFields = ["BaseOpcode"];
// List of attributes that can be used to define key and column instructions for a relation.
// Here, key instruction is passed as an argument to the function used for querying relation tables.
// Column instructions are the instructions they (key) can transform into.
//
// Here, we choose 'PredSense' as ColFields since this is the unique attribute of the key
// (non-predicated) and column (true/false) instructions involved in this relationship model.
let ColFields = ["PredSense"];
// The key column contains non-predicated instructions.
let KeyCol = ["none"];
// Two value columns - first column contains instructions with PredSense=true while the second
// column has instructions with PredSense=false.
let ValueCols = [["true"], ["false"]];
}
multiclass ALU32_Pred {
let PredSense = !if(PredNot, "false", "true"), isPredicated = 1 in
def NAME : ALU32_rr<(outs IntRegs:$dst),
(ins PredRegs:$src1, IntRegs:$src2, IntRegs: $src3),
!if(PredNot, "if (!$src1)", "if ($src1)")#
" $dst = "#mnemonic#"($src2, $src3)",
[]>;
}
multiclass ALU32_base {
let BaseOpcode = BaseOp in {
let isPredicable = 1 in
def NAME : ALU32_rr<(outs IntRegs:$dst),
(ins IntRegs:$src1, IntRegs:$src2),
"$dst = "#mnemonic#"($src1, $src2)",
[(set (i32 IntRegs:$dst), (OpNode (i32 IntRegs:$src1),
(i32 IntRegs:$src2)))]>;
defm pt : ALU32_Pred; // Predicate true
defm pf : ALU32_Pred; // Predicate false
}
}
let isCommutable = 1 in {
defm Add : ALU32_base<"add", "ADD", add>, PredRel;
defm And : ALU32_base<"and", "AND", and>, PredRel;
defm Xor: ALU32_base<"xor", "XOR", xor>, PredRel;
defm Or : ALU32_base<"or", "OR", or>, PredRel;
}
defm Sub : ALU32_base<"sub", "SUB", sub>, PredRel;
With the help of this extra information, TableGen is able to construct the following API. It offers the same functionality as switch-case based approach and significantly reduces the maintenance overhead:
int getPredicated(uint16_t Opcode, enum PredSense inPredSense) {
static const uint16_t getPredicatedTable[][3] = {
{ Hexagon::Add, Hexagon::Add_pt, Hexagon::Add_pf },
{ Hexagon::And, Hexagon::And_pt, Hexagon::And_pf },
{ Hexagon::Or, Hexagon::Or_pt, Hexagon::Or_pf },
{ Hexagon::Sub, Hexagon::Sub_pt, Hexagon::Sub_pf },
{ Hexagon::Xor, Hexagon::Xor_pt, Hexagon::Xor_pf },
}; // End of getPredicatedTable
unsigned mid;
unsigned start = 0;
unsigned end = 5;
while (start < end) {
mid = start + (end - start)/2;
if (Opcode == getPredicatedTable[mid][0]) {
break;
}
if (Opcode < getPredicatedTable[mid][0])
end = mid;
else
start = mid + 1;
}
if (start == end)
return -1; // Instruction doesn't exist in this table.
if (inPredSense == PredSense_true)
return getPredOpcodeTable[mid][1];
if (inPredSense == PredSense_false)
return getPredicatedTable[mid][2];
return -1;
}
//===------------------------------------------------------------------===//
// Generate mapping table to relate predicate-true instructions with their
// predicate-false forms
//
def getFalsePredOpcode : InstrMapping {
let FilterClass = "PredRel";
let RowFields = ["BaseOpcode"];
let ColFields = ["PredSense"];
let KeyCol = ["true"];
let ValueCols = [["false"]];
}