From Code like a Pro in C# by Jort Rodenburg
This article gives an overview of exactly how C# code is compiled.
How CLI-compliant languages are compiled
In this article, you get an in-depth look at how C# (and other Common Language Infrastructure compliant languages) compiles. Knowing the entire compilation story prepares you to take advantage of all of C#’s features while understanding some of the pitfalls related to memory and execution. The C# compilation process has three states (C#, Common Intermediate Language, and native code) and two stages: going from C# to Common Intermediate Language and going from Common Intermediate Language to native code.
NOTE Native code is sometimes referred to as machine code.
By looking at what it takes to go from one step to the next, and by following a method as the compiler and CLR compile the high-level C# code down to runnable native code, we gain an understanding of the complex machine that is C# and .NET 5. Beginner resources often skip this process, yet advanced resources assume you already understand it.
Figure 1. The complete C# compilation process. It goes from C# code to Common Intermediate Language to native code. Understanding the compilation process gives us knowledge around some of the internal choices made around C# and .NET.
We use a combination of static compilation and JIT compilation to compile C# to native code:
- After a developer writes the C# code, they compile it. This results in Common Intermediate Language stored within Portable Executable (PE for 32-bit, PE+ for 64-bit) files such as “.exe” and “.dll” files for Windows. These files are distributed to users.
- When we launch a .NET program, the operating system invokes the Common Language Runtime. The Common Language Runtime JIT compiles the CIL to the native code appropriate for the platform it is running on. This allows CLI-compliant languages to run on a lot of platforms and compiler types. However, it would be remiss not to mention the major negative implication of using a virtual machine and JITter to run your code: performance.
A statically compiled program has a leg up at execution time since there is no waiting on a runtime to compile the code.
DEFINITION Static and Just-In-Time (JIT) compilation are two commonly used ways of compiling code. Static compilation compiles all the source code ahead of time. JIT compilation compiles code at the last possible moment, while the program runs. C# uses a combination of the two: the C# source is statically compiled down to bytecode (CIL), and that bytecode is JIT compiled to native code.
Step 1: C# code (high-level)
The first time I encountered the Pythagorean Theorem was in 2008. I was in the Dutch version of high school and saw in the mathematics textbook that we would cover the Pythagorean Theorem that year. A couple of days later, late at night, I was in the car with my father. We had been driving for a while, so the conversation had reached a natural slow-point. In a completely out of character moment, I asked him: “What is the Pythagorean Theorem?” The question clearly took him aback since I had shown little academic interest, especially in mathematics, at that point in time. For the next ten minutes, he attempted to explain to me, somebody with the mathematical abilities of a grapefruit, what the Pythagorean Theorem was. I was surprised I actually understood what he was talking about, and now, years later, it has proven to be an excellent resource to show you the first step in the C# compilation process.
In this section, we look at the first step in the C# compilation process: compiling C# code. The program we follow through the compilation process is the Pythagorean Theorem. The reason for using a program representing the Pythagorean Theorem to teach you the C# compilation process is straightforward: we can condense the Pythagorean Theorem to a couple of lines of code that are understandable with a high school level of mathematical knowledge. This lets us focus on the compilation story instead of on implementation details.
NOTE If a quick refresher is in order, the Pythagorean Theorem states that in a triangle with one right angle, usually called a right triangle, the square of the length of the hypotenuse (c) is equal to the sum of the squares of the lengths of the other two sides (a and b), or a² + b² = c².
Figure 2. The C# Compilation Process, Step 1: C# Code. This is the static compilation phase.
We start by writing a simple method that calculates the result of the Pythagorean Theorem when given two arguments.
Listing 1. Pythagorean Theorem (High Level)
public static double Pythagoras(double sideLengthA, double sideLengthB) { // #A
    double squaredLength = sideLengthA * sideLengthA + sideLengthB * sideLengthB; // #B
    return squaredLength;
}
#A We declare a static method with a public access modifier, returning a floating-point number, called “Pythagoras”, that expects two floating-point (double) arguments: “sideLengthA” and “sideLengthB”. The method is static so that it matches the CIL we look at in Step 2.
#B We perform the Pythagorean Theorem and assign the result to a variable called “squaredLength”.
If we run this code and give it the arguments of [3, 8] we see that the result is 73, which is correct. Alternatively, since we are using 64-bit floating-point numbers (doubles), we can also test argument sets like [47.21, 99.04]. The result is 12037.7057.
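To try those two argument sets yourself, a minimal console sketch could look like the following. The class name `Triangle` and the `Main` method are assumptions added for illustration, and the method is declared static here so it can be called without creating an instance.

```csharp
using System;

// Hypothetical container class; the name "Triangle" is an assumption for this sketch.
public static class Triangle
{
    // Same calculation as Listing 1: the sum of the squares of both sides.
    public static double Pythagoras(double sideLengthA, double sideLengthB)
    {
        double squaredLength = sideLengthA * sideLengthA + sideLengthB * sideLengthB;
        return squaredLength;
    }

    public static void Main()
    {
        Console.WriteLine(Pythagoras(3, 8)); // prints 73

        // Doubles are binary floating point, so round before displaying
        // or comparing a result such as 12037.7057.
        Console.WriteLine(Math.Round(Pythagoras(47.21, 99.04), 4));
    }
}
```

When comparing doubles in your own code, prefer a small tolerance over `==`, since the last few digits of a floating-point result are not guaranteed to match a hand-calculated value.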
C# Access Modifiers from open to restricted. Using the correct access modifier helps with encapsulating our data and protecting our classes.
Now we compile the code. Let us assume that the method in Listing 1 is part of a class called Pythagoras, which is part of a project and solution called HelloPythagoras. To compile a .NET 5 (or .NET Framework/.NET Core) solution to Intermediate Language stored in a PE/PE+ file, you can either use the build or compile button in your IDE or run the following command in your command line:
dotnet build [solution file path]
A solution file ends with the file extension “.sln”. The command to build our solution is:
dotnet build HelloPythagoras.sln
After running the command, the compiler launches. First, the compiler restores all the required dependency packages through the NuGet Package Manager. Then the command-line tool compiles the project and stores the output in a new folder called “bin”. Within the “bin” folder, the output lands in one of two further folders: “Debug” or “Release”. Which one depends on the mode we set the compiler to (you can define your own modes if you want to). By default, the compiler compiles in “Debug” mode. “Debug” mode contains all the debug information (stored in .pdb files) that you need to step through an application with breakpoints.
To compile in “Release” mode through the command line, append the --configuration Release flag to the command. Alternatively, in Visual Studio there is a drop-down to select “Debug” or “Release” mode. This is the easiest, quickest, and likeliest way you compile your code.
At this point, the C# high-level code is compiled into an executable file containing the Intermediate Language code.
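The difference between the two build modes can also be observed from inside the program. The sketch below assumes your project defines the DEBUG compilation symbol for “Debug” builds, which the default project templates do; the class and method names are hypothetical.

```csharp
using System;

public static class BuildInfo
{
    // Assumption: DEBUG is defined only in the "Debug" configuration,
    // as it is in the default SDK project templates.
    public static string Configuration()
    {
#if DEBUG
        return "Debug";
#else
        return "Release";
#endif
    }

    public static void Main()
    {
        Console.WriteLine($"Built in {Configuration()} mode");
    }
}
```

The `#if` directive is resolved by the static compiler, so only one of the two return statements ends up in the compiled file.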
Step 2: Common Intermediate Language (assembly level)
From a day-to-day perspective, your job is done. The code is in an executable form and you can wrap up your ticket or user-story. From a technological perspective, the journey is just getting started. The C# code is statically compiled down to Common Intermediate Language, but CIL cannot be run by the operating system.
Figure 3. The C# Compilation Process, Step 2: Intermediate Language. Here we go from static to JIT compilation.
So how do you get from CIL to native code? The missing piece is the Common Language Runtime. This part of .NET 5 translates Common Intermediate Language to native code. It is the “runtime” of .NET. We can compare the CLR to something like the Java Virtual Machine (JVM) when looking at complexity and usage numbers. The CLR has been part of .NET since the very beginning (starting with release 1.0 in February 2002). It is also good to note that with the movement towards .NET Core and .NET 5, a new implementation of the CLR is taking the place of the old CLR: CoreCLR. This article uses the term CLR for both the regular Common Language Runtime and CoreCLR.
Code written in any language that implements a technical standard called Common Language Infrastructure (CLI) can be compiled down to Common Intermediate Language. The CLI describes the infrastructure behind the .NET ecosystem, whose specific flavors are implementations of the CLI themselves, and gives languages a basis to form their type system around. Because the CLR can take any piece of Intermediate Language, and the .NET compiler can generate this CIL from any CLI-compliant language, we can have CIL code generated from mixed-source code. C#, Visual Basic, and F# are the most common .NET programming languages, but there are a bunch more.
Until 2017, Microsoft also supported J#, a CLI compliant implementation of Java. Theoretically, you could download the compatible compiler and use J#, but you would miss out on some modern Java features in exchange for developing on the .NET platform. It is also good to note that .NET 5 includes new Java interoperability functionality that does not use J#.
NOTE The CLR is an immensely complicated piece of software. If you want to know more about the (traditional, Windows-based) CLR, see Jeffrey Richter’s CLR Via C# (fourth edition; Microsoft Press, 2012).
Since the compiler embeds CIL in executable files, we need to use a disassembler to view the CIL. All .NET flavors come with such a tool called IL DASM (which stands for Intermediate Language Disassembler). To use IL DASM, we need to run the “Developer Command Prompt for Visual Studio” which is installed alongside Visual Studio. This is a command prompt environment that gives us access to .NET tools. Be aware that IL DASM is only available for Windows.
Once in the developer command prompt, we can invoke IL DASM on our compiled file and specify an output file:
> ildasm HelloPythagoras.dll /output:HelloPythagoras.il
If we do not specify an output file, the command-line tool launches the GUI for IL DASM. In there you can also view the IL code of the disassembled executable. The output file can have whatever file extension you want, as it is a plain text file. Note that in .NET Framework, ildasm operates against the .exe file, not the .dll. .NET 5 and .NET Core use the .dll file.
When we open the HelloPythagoras.il file in a text editor or look at the IL DASM GUI, a file filled with mysterious code opens. This is the CIL code. We focus on the CIL for the Pythagoras method (if compiled in “Debug” mode) as shown in listing 2.
Listing 2. Pythagorean Theorem (Common Intermediate Language)
.method public hidebysig static float64
        Pythagoras(float64 sideLengthA,
                   float64 sideLengthB) cil managed
{
  .maxstack 3
  .locals init ([0] float64 squaredLength,
                [1] float64 V_1)
  IL_0000: nop
  IL_0001: ldarg.0
  IL_0002: ldarg.0
  IL_0003: mul
  IL_0004: ldarg.1
  IL_0005: ldarg.1
  IL_0006: mul
  IL_0007: add
  IL_0008: stloc.0
  IL_0009: ldloc.0
  IL_000a: stloc.1
  IL_000b: br.s IL_000d
  IL_000d: ldloc.1
  IL_000e: ret
}
If you have ever worked in or seen assembly level programming, you might notice some similarities. Common Intermediate Language is definitely harder to read and more “close to the metal” than regular C# code, but it is not as mysterious as it might look. By stepping through the CIL line-by-line, you see that this is just a different syntax for programming concepts you already know. The CIL code generated by the compiler on your machine may look slightly different (especially the numbers used with the ldarg opcode), but the functionality and types of opcodes should be the same.
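If IL DASM is not available on your platform, reflection offers a rough way to peek at what your own compiler emitted. The following is a sketch rather than a replacement for a disassembler: it prints the raw CIL bytes instead of readable opcodes, and the class name `IlInspector` is an assumption. The reported stack size and local count depend on the build mode.

```csharp
using System;
using System.Reflection;

public static class IlInspector
{
    public static double Pythagoras(double sideLengthA, double sideLengthB)
    {
        double squaredLength = sideLengthA * sideLengthA + sideLengthB * sideLengthB;
        return squaredLength;
    }

    // Returns the raw CIL bytes the compiler generated for Pythagoras.
    public static byte[] GetPythagorasIl()
    {
        MethodInfo method = typeof(IlInspector).GetMethod(nameof(Pythagoras));
        MethodBody body = method.GetMethodBody();
        return body.GetILAsByteArray();
    }

    public static void Main()
    {
        MethodBody body = typeof(IlInspector).GetMethod(nameof(Pythagoras)).GetMethodBody();
        Console.WriteLine($"max stack size: {body.MaxStackSize}");
        Console.WriteLine($"local variables: {body.LocalVariables.Count} (init: {body.InitLocals})");

        // Each byte is an opcode or an operand: 0x5A is mul, 0x58 is add, 0x2A is ret.
        Console.WriteLine(BitConverter.ToString(GetPythagorasIl()));
    }
}
```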
The very first thing we see is the method declaration:
.method public hidebysig static float64 Pythagoras(float64 sideLengthA, float64 sideLengthB) cil managed
We can easily deduce that the method is public, static, and returns a 64-bit floating-point number (known as a double in C#). We can also see that the method is named Pythagoras and takes in two arguments called sideLengthA and sideLengthB, both 64-bit floating-point numbers. The two terms that seem odd are hidebysig and cil managed.
First, the term hidebysig tells us that the Pythagoras method hides every inherited method with the same name and method signature. When omitted, the method hides all inherited methods with the same name (not limited to signature match). Second, cil managed means that the method body is Common Intermediate Language and that it runs in managed mode. The other side of the coin would be unmanaged. This refers to whether the CLR executes the method, handles its memory, and has all the metadata that it requires. By default, all your C# code runs in managed mode. Even code you designate as “unsafe” (after enabling the compiler “unsafe” flag) is still compiled to CIL and executed by the CLR; it gains manual memory handling through pointers. Truly unmanaged code is native code that runs outside the CLR’s control, such as a native library called through platform invoke.
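Hiding by signature is easier to see with inheritance in play. In this sketch (all type names are hypothetical), the derived class hides only the parameterless overload, so the other inherited overload stays reachable.

```csharp
using System;

public class Shape
{
    public string Describe() => "Shape.Describe()";
    public string Describe(int sides) => $"Shape.Describe({sides})";
}

public class RightTriangle : Shape
{
    // "new" hides only the inherited method with this exact signature.
    public new string Describe() => "RightTriangle.Describe()";
}

public static class HideBySigDemo
{
    public static void Main()
    {
        var triangle = new RightTriangle();
        Console.WriteLine(triangle.Describe());   // RightTriangle.Describe()
        Console.WriteLine(triangle.Describe(3));  // Shape.Describe(3)

        // Reflection exposes the same flag that IL DASM prints as "hidebysig".
        Console.WriteLine(typeof(Shape).GetMethod("Describe", Type.EmptyTypes).IsHideBySig); // True
    }
}
```

Had the method hidden by name instead, the call to `Describe(3)` on a `RightTriangle` would no longer find the inherited overload.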
Moving into the method itself, we can split the method into two parts: the setup (constructor) and execution (the logic). First, let’s look at the constructor:
.maxstack 3
.locals init ([0] float64 squaredLength, [1] float64 V_1)
There are some unfamiliar terms here. To start, .maxstack 3 tells us that the maximum number of elements allowed on the evaluation stack during execution is 3. The static compiler automatically generates this number and tells the CLR JITter how many elements to reserve for the method. This is a very important part of the method code. Imagine not being able to tell the CLR how much memory we need. It may decide to reserve all available stack space on the system, or not reserve any at all. Either scenario would be catastrophic.
.locals init (…)
When we declare a variable in a CLI-compliant programming language, the compiler assigns the variable a scope and initializes the variable’s value to a default value at compile time. The locals keyword tells us that the variables declared in this code block are local in scope (scoped to the method, not the class), while init means that we are initializing the declared variables to their default values. The compiler assigns each variable null or a zeroed-out value, depending on whether the variable is a reference or value type.
Expanding the .locals init (…) code block reveals the variables we are declaring and initializing:
.locals init (
  [0] float64 squaredLength,
  [1] float64 V_1
)
The CIL declares two local variables and initializes them to zero values: squaredLength and V_1.
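The same default values are visible from C# through the default keyword. This is a small sketch with hypothetical names; note that the C# compiler itself refuses to read a local variable before you assign it, so the zeroing done by init is mostly visible at the CIL level.

```csharp
using System;

public static class DefaultValues
{
    // Returns the default value for any type: zeroed-out for value types,
    // null for reference types.
    public static T DefaultOf<T>() => default;

    public static void Main()
    {
        Console.WriteLine(DefaultOf<double>() == 0.0);  // True
        Console.WriteLine(DefaultOf<string>() == null); // True
    }
}
```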
Now you might say, hang on a second! We only declared one local variable in our C# code: squaredLength. What is this V_1 business? Have a look at the C# code again:
public static double Pythagoras(double sideLengthA, double sideLengthB) {
    double squaredLength = sideLengthA * sideLengthA + sideLengthB * sideLengthB;
    return squaredLength;
}
We only explicitly declared one local variable. However, we are returning squaredLength by value rather than by reference. This means that under the hood a new variable is declared, initialized, and assigned the value of squaredLength. This is V_1.
To summarize, we looked at the method signature and the setup. Now we can dive into the weeds of the logic. Let’s also split this part up into two sections: the evaluation of the Pythagorean Theorem and the returning of the resulting value.
IL_0000: nop
IL_0001: ldarg.0
IL_0002: ldarg.0
IL_0003: mul
IL_0004: ldarg.1
IL_0005: ldarg.1
IL_0006: mul
IL_0007: add
IL_0008: stloc.0
To start, we see an operation (we also call these operations opcodes) called nop. This is also called the “Do Nothing Operation” or “No Operation” because on its own a nop operation does nothing. nop operations are widely used in CIL and Assembly code to enable breakpoint debugging. Along with the PDB file that is generated in Debug builds, the CLR can inject instructions to stop program execution at a nop operation. This allows us to “step through” code at runtime.
Next up, we look at the evaluation of the Pythagorean Theorem itself:
double squaredLength = sideLengthA * sideLengthA + sideLengthB * sideLengthB;
The following two operations are a doubleheader: two ldarg.0 operations. The first operation (IL_0001) loads the first sideLengthA occurrence onto the stack. The second operation (IL_0002) loads the second sideLengthA occurrence onto the stack as well.
After we have loaded the first mathematical evaluation’s arguments onto the stack, the CIL code calls the multiplication operation:
IL_0003: mul
This results in the two arguments loaded during IL_0001 and IL_0002 being multiplied and stored into a new element on the stack. The mul operation pops its two operands off the stack as part of the multiplication, so only the product remains. The garbage collector plays no part in this.
We repeat this process for the squaring of the sideLengthB arguments:
IL_0004: ldarg.1
IL_0005: ldarg.1
IL_0006: mul
So now we have elements in the stack containing the values of sideLengthA² and sideLengthB². To fulfill the Pythagorean Theorem, and our code, we have to add these two values up and store the result in squaredLength. This is done in IL_0007 and IL_0008.
IL_0007: add
IL_0008: stloc.0
Similar to the mul operations (IL_0003 and IL_0006), the add operation (IL_0007) evaluates the addition of the previously stored arguments and places the resulting value into an element on the stack. The CIL takes this element and stores it into the squaredLength variable we initialized in setup ([0] float64 squaredLength) through the stloc.0 command (IL_0008). The stloc.0 operation pops a value from the stack and stores it in the variable at index 0.
We have now fully evaluated and stored the Pythagorean Theorem result into a variable. All that remains is to return the value from the method, just like we promised in our original method signature.
IL_0009: ldloc.0
IL_000a: stloc.1
IL_000b: br.s IL_000d
IL_000d: ldloc.1
IL_000e: ret
First, we load the value of the variable at location 0 onto the stack (IL_0009). In the previous segment, we ended with storing the value of the Pythagorean Theorem into the variable at location 0, so that must be squaredLength. But, as mentioned before, we are passing the variable by value, not reference, so we create a copy of squaredLength to return out of the method with. Luckily, we declared and initialized a variable just for this purpose at index 1: V_1 ([1] float64 V_1). We store the value into index 1 through the stloc.1 operation (IL_000a).
Next up we see another strange operation: br.s IL_000d (IL_000b). This is an unconditional branching operator that jumps to the instruction at IL_000d. It signifies that the return value is calculated and stored away for returning. The compiler emits this branch for debugging purposes. In this method it behaves much like a nop operation, because the branch target is simply the next instruction. All different branches of your code (conditionals with other return values) jump to the same return sequence when return is called. The br.s operator takes up two bytes (one for the opcode and one for the branch offset), whereas most opcodes take up one byte. It therefore occupies both IL_000b and IL_000c, which is why the next instruction, ldloc.1, sits at IL_000d. This gives the debugger a single place to stop at the loading of the stored return value and manipulate it if wanted.
Finally, we are ready to return out of the method through IL_000d and IL_000e:
IL_000d: ldloc.1
IL_000e: ret
The ldloc.1 (IL_000d) operation loads the previously-stored return value. This is followed by the ret operator, which takes the value we loaded at IL_000d and returns it from the method.
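To make the walkthrough concrete, the opcodes can be replayed by hand on an ordinary C# stack. This is only an illustration of the idea of a stack machine, not how the CLR executes CIL; the class name and the `Run` method are assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class EvaluationStackSketch
{
    // Replays the opcodes of Listing 2 and records how deep the stack gets.
    public static (double Result, int MaxDepth) Run(double sideLengthA, double sideLengthB)
    {
        var args = new[] { sideLengthA, sideLengthB };
        var locals = new double[2];              // [0] squaredLength, [1] V_1
        var stack = new Stack<double>();
        int maxDepth = 0;

        void Push(double value)
        {
            stack.Push(value);
            maxDepth = Math.Max(maxDepth, stack.Count);
        }

        // nop: nothing to do
        Push(args[0]);                           // ldarg.0
        Push(args[0]);                           // ldarg.0
        Push(stack.Pop() * stack.Pop());         // mul: pops two values, pushes one
        Push(args[1]);                           // ldarg.1
        Push(args[1]);                           // ldarg.1
        Push(stack.Pop() * stack.Pop());         // mul
        Push(stack.Pop() + stack.Pop());         // add
        locals[0] = stack.Pop();                 // stloc.0
        Push(locals[0]);                         // ldloc.0
        locals[1] = stack.Pop();                 // stloc.1
        // br.s: jumps to the ldloc.1 that follows, nothing to simulate
        Push(locals[1]);                         // ldloc.1
        return (stack.Pop(), maxDepth);          // ret
    }

    public static void Main()
    {
        var (result, maxDepth) = Run(3, 8);
        Console.WriteLine($"{result} (max depth {maxDepth})"); // 73 (max depth 3)
    }
}
```

The deepest point is reached right after the second ldarg.1, when the first product and two copies of sideLengthB sit on the stack together, which lines up with the .maxstack 3 directive.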
That brings us to the end of the section. Hopefully, you are now a bit more comfortable with the nitty-gritty parts of the static compilation step of C# and .NET.
Listing 3. The CIL source code of the Pythagorean Theorem method
.method public hidebysig static float64            // #A
        Pythagoras(float64 sideLengthA,             // #B
                   float64 sideLengthB) cil managed // #C
{
  .maxstack 3                                       // #D
  .locals init ([0] float64 squaredLength,          // #E
                [1] float64 V_1)
  IL_0000: nop                                      // #F
  IL_0001: ldarg.0                                  // #G
  IL_0002: ldarg.0
  IL_0003: mul                                      // #H
  IL_0004: ldarg.1                                  // #I
  IL_0005: ldarg.1
  IL_0006: mul                                      // #J
  IL_0007: add                                      // #K
  IL_0008: stloc.0                                  // #L
  IL_0009: ldloc.0                                  // #M
  IL_000a: stloc.1                                  // #N
  IL_000b: br.s IL_000d                             // #O
  IL_000d: ldloc.1                                  // #P
  IL_000e: ret                                      // #Q
}
#A Start of a method that is public, static, returns a double and hides other methods with the same signature.
#B The method is called “Pythagoras”. It expects two arguments of type float64 (double).
#C This is a CIL (Common Intermediate Language) method and runs in managed mode.
#D The maximum number of simultaneous elements needed on the stack is 3.
#E Two local variables of type float64 are declared and initialized: squaredLength at index 0 and V_1 at index 1.
#F A “Do Nothing” operation. Used by debuggers for breakpoints.
#G The first “sideLengthA” argument is loaded into memory.
#H The two “sideLengthA” values loaded into memory are multiplied and stored in a stack element.
#I The first “sideLengthB” argument is loaded into memory.
#J The two “sideLengthB” values loaded into memory are multiplied and stored in a stack element.
#K The squared values of sideLengthA and sideLengthB are added together and stored in a stack element.
#L The sum previously stored in a stack element is popped and stored in the variable at index 0: “squaredLength”.
#M The value for “squaredLength” is loaded into memory.
#N The value of “squaredLength” that was just loaded is stored in the variable with index 1: “V_1”.
#O The branching operator. Signifies the completion of the method and storage of the return value, and jumps to IL_000d.
#P The return value (variable V_1) is loaded into memory.
#Q We return out of the method with the value of V_1.
Step 3: Native code (Processor Level)
The last step in the compilation process is the conversion from Common Intermediate Language to native code that the processor can actually run. Until now, the code has been statically compiled, but that changes here. When .NET 5 executes an application, the CLR launches and scans the executable files for the CIL code. Then, the CLR invokes the JIT compiler to convert the CIL into native code as it runs. Native code is the lowest level of code that is (somewhat) human-readable. A processor can execute this code directly because of the inclusion of pre-defined operations (opcodes), similar to how Common Intermediate Language includes the CIL operation codes.
Figure 4. The C# Compilation Process, Step 3: Native code. This is the JIT phase.
JIT-ing our code comes at a performance cost, but also means that we can execute .NET based code on any platform supported by the CLR and a compiler. We can see this in practice with .NET Core and the new CoreCLR. The CoreCLR can JIT Intermediate Language to native code for Windows, macOS, and Linux.
Figure 5. CoreCLR can JIT for targets such as Linux, Windows, and macOS. This allows for cross-platform execution of C# code.
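The cost of JIT compilation shows up mostly on the first call of a method. The sketch below times a first and second call and also shows how to ask the runtime to compile a method ahead of its first call. Treat the timings as illustrative only: the runtime may inline a method this small or recompile it later, and the class and method names are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

public static class JitWarmup
{
    public static double Pythagoras(double sideLengthA, double sideLengthB)
        => sideLengthA * sideLengthA + sideLengthB * sideLengthB;

    // Asks the runtime to compile Pythagoras now instead of on its first call.
    public static void Prepare()
    {
        RuntimeMethodHandle handle = typeof(JitWarmup).GetMethod(nameof(Pythagoras)).MethodHandle;
        RuntimeHelpers.PrepareMethod(handle);
    }

    public static void Main()
    {
        var timer = Stopwatch.StartNew();
        Pythagoras(3, 8);                 // first call: may include JIT compilation
        long first = timer.ElapsedTicks;

        timer.Restart();
        Pythagoras(3, 8);                 // second call: native code already exists
        long second = timer.ElapsedTicks;

        Console.WriteLine($"first call: {first} ticks, second call: {second} ticks");
    }
}
```

Calling `Prepare()` before the first timed call moves the compilation cost out of the measurement, which is a cheap way to see how much of the first call was spent in the JITter.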
Because of the JIT nature of this compilation step, viewing the actual native code is a bit tricky. One way to view native code generated from your Intermediate Language is to use a command-line tool called ngen, which ships with .NET Framework (on .NET Core and .NET 5, ahead-of-time compilation is handled by ReadyToRun publishing instead). This tool allows you to generate so-called native images containing native code from the Common Intermediate Language stored in a PE file ahead of time. The CLR stores native code output in a subfolder of “%SystemRoot%/Assembly/NativeAssembly” (only available on Windows). Be aware, however, that you cannot use the regular file explorer to navigate here, nor would the resulting output be legible. After running ngen, the CLR sees that the CIL is already compiled (statically) to native code and executes based on that. This comes with the expected performance boost; however, the native code and CIL code can get out of sync when a new build is released, which can cause unexpected side effects if the CLR decides to use the older statically compiled native image instead of re-compiling the new, updated code.
In day-to-day operations, you likely don’t touch CIL all that much, nor are you overly concerned about the CIL to native code compilation. However, understanding the compilation process is a fundamental block of knowledge as it sheds light on design decisions in .NET 5.
That’s all for now.