.NET Internals
Category | C# |
---|
Overview
This document provides information on the internals of .NET gathered from Microsoft blogs, documentation, and the .NET Core source code.
Initialization
The entry point of a .NET application is located in the Execution Engine (EE) in mscoree.dll.
After checking the metadata of the executable, the correct .NET runtime is loaded.
The actual execution starts in the _CorExeMain function, which initializes the CLR and begins execution at the managed entry point of the executable.
Garbage Collector
To reduce memory fragmentation, the GC regularly compacts the heap by moving objects in memory.
The GC divides objects into small and large objects.
Small objects are allocated on the Small Object Heap (SOH) and large objects are allocated on the Large Object Heap (LOH).
LOH
The LOH is used for allocating memory for large objects.
A large object is greater than or equal to 85,000 bytes.
Typically, large arrays are allocated on the LOH.
Because of the performance impact of copying large blocks of memory, the GC does not compact the LOH by default. The GC performs a sweep by creating a list of freed memory locations for future allocations.
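The 85,000-byte threshold can be observed from managed code. The following is a minimal sketch, assuming a runtime where GC.GetGeneration reports LOH objects as belonging to the highest generation; LohDemo and its method names are hypothetical. It also shows the opt-in GCSettings.LargeObjectHeapCompactionMode setting, which requests a one-time LOH compaction.

```csharp
using System;
using System.Runtime;

public static class LohDemo
{
    // Returns the generation the GC reports for a freshly allocated byte array.
    public static int GenerationOfNewArray(int lengthInBytes)
    {
        byte[] buffer = new byte[lengthInBytes];
        return GC.GetGeneration(buffer);
    }

    // Requests that the LOH be compacted during the next blocking full collection,
    // then forces that collection.
    public static void CompactLohOnce()
    {
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();
    }

    public static void Main()
    {
        // Well below the threshold: allocated on the small object heap.
        Console.WriteLine(GenerationOfNewArray(1_000));

        // Above the threshold: allocated on the LOH, reported as the highest generation.
        Console.WriteLine(GenerationOfNewArray(100_000) == GC.MaxGeneration);

        CompactLohOnce();
    }
}
```

Compacting the LOH copies large blocks of memory, so the setting is best reserved for cases where LOH fragmentation has actually been measured.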
Marshalling
There are two techniques for calling into the CLR from managed code.
- FCall
- QCall
FCall
FCall calls directly into the CLR code.
FCalls are identified as extern methods marked with [MethodImpl(MethodImplOptions.InternalCall)].
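The InternalCall flag is visible through reflection. The sketch below uses a hypothetical helper, IsInternalCall, that tests the implementation flags of a method. Which core library methods carry the flag is an assumption that varies between runtime versions, so the example only lists what it finds on System.Object without asserting a specific result.

```csharp
using System;
using System.Reflection;

public static class InternalCallDemo
{
    // True when the method is implemented inside the runtime (InternalCall flag set).
    public static bool IsInternalCall(MethodBase method)
    {
        return (method.GetMethodImplementationFlags() & MethodImplAttributes.InternalCall) != 0;
    }

    // An ordinary managed method, used as a counter-example.
    public static int Add(int a, int b) => a + b;

    public static void Main()
    {
        MethodInfo add = ((Func<int, int, int>)Add).Method;
        Console.WriteLine(IsInternalCall(add)); // ordinary IL method, flag not set

        // List the methods declared on System.Object that carry the flag, if any.
        const BindingFlags all = BindingFlags.Public | BindingFlags.NonPublic |
                                 BindingFlags.Instance | BindingFlags.Static |
                                 BindingFlags.DeclaredOnly;
        foreach (MethodInfo method in typeof(object).GetMethods(all))
        {
            if (IsInternalCall(method))
            {
                Console.WriteLine(method.Name);
            }
        }
    }
}
```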
QCall
QCall calls into the CLR via P/Invoke.
Memory
Heap Memory
Objects and Value types
The new operator creates objects or value types and invokes their constructors.
The new operator is translated into the newobj CIL instruction.
newobj ctor
The newobj instruction allocates an uninitialized object or value type and calls the constructor method ctor.
The allocation process is as follows:
- Allocates a new instance of the class associated with the provided constructor.
- Initializes all the fields in the new instance to zero or null references.
- Calls the constructor with the provided arguments and the newly created instance.
- The initialized object reference is pushed on the stack.
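The order of the steps above can be observed from inside a constructor: the fields are already zeroed when the constructor body starts. A minimal sketch, where Tracked and its members are hypothetical names chosen for illustration:

```csharp
using System;

public sealed class Tracked
{
    public int Count;
    public string Label;

    // Snapshots of the field values as seen at the start of the constructor body.
    public readonly int CountSeenByCtor;
    public readonly bool LabelWasNullInCtor;

    public Tracked(int count, string label)
    {
        // The instance was zero-initialized before this body runs.
        CountSeenByCtor = Count;
        LabelWasNullInCtor = Label is null;

        // Only now are the fields given their final values.
        Count = count;
        Label = label;
    }
}

public static class NewObjDemo
{
    public static void Main()
    {
        Tracked t = new Tracked(5, "five"); // compiled to newobj Tracked::.ctor(int32, string)
        Console.WriteLine(t.CountSeenByCtor);     // 0
        Console.WriteLine(t.LabelWasNullInCtor);  // True
        Console.WriteLine(t.Count);               // 5
    }
}
```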
Arrays
The new operator is also used to create arrays.
With one-dimensional arrays, the new operator is translated into the newarr CIL instruction.
Stack Memory
The stackalloc operator allocates a block of memory on the stack.
The stackalloc operator is translated into the localloc CIL instruction.
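A minimal sketch of stackalloc in safe code, assuming C# 7.2 or later with Span<T> available; StackAllocDemo and SumOfSquares are hypothetical names. The buffer lives in the stack frame of the method and disappears when the method returns, so the size is kept small and bounded.

```csharp
using System;

public static class StackAllocDemo
{
    // Fills a stack-allocated buffer with squares and returns their sum.
    public static int SumOfSquares(int count)
    {
        // Stack space is limited, so reject sizes that are not small.
        if (count < 0 || count > 256)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        Span<int> squares = stackalloc int[count]; // no heap allocation
        for (int i = 0; i < squares.Length; i++)
        {
            squares[i] = i * i;
        }

        int sum = 0;
        foreach (int value in squares)
        {
            sum += value;
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(4)); // 0 + 1 + 4 + 9 = 14
    }
}
```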
localloc
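The localloc instruction allocates memory from the local dynamic memory pool (the stack frame of the current method), not from the managed heap. The memory is reclaimed when the method returns.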
Memory Layout
Managed pointers are aligned on 4-byte or 8-byte address boundaries depending on the platform.
Type Size
A type that does not contain any data member has a size of 1 byte.
[StructLayout(LayoutKind.Sequential)]
struct MyStruct
{
    int value;
}
The size of MyStruct is 4 bytes.
[StructLayout(LayoutKind.Sequential)]
struct MyStruct
{
}
The size of MyStruct is 1 byte.
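Both sizes can be checked at run time. A minimal sketch, assuming .NET Core 3.0 or later for Unsafe.SizeOf; the two structs are renamed WithInt and Empty (hypothetical names) so that they can live in the same program:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct WithInt
{
    public int Value;
}

[StructLayout(LayoutKind.Sequential)]
public struct Empty
{
}

public static class SizeDemo
{
    public static void Main()
    {
        // Managed size of each type.
        Console.WriteLine(Unsafe.SizeOf<WithInt>()); // 4
        Console.WriteLine(Unsafe.SizeOf<Empty>());   // 1

        // Size of the unmanaged (marshalled) representation.
        Console.WriteLine(Marshal.SizeOf<WithInt>()); // 4
        Console.WriteLine(Marshal.SizeOf<Empty>());   // 1
    }
}
```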
Value Types
A value type contains the values of the fields without any object header.
Reference Types
+----------------------+
| Instance |
+----------------------+
| Object Header |
| Method Table Address |
| Field1 |
| ... |
| FieldN |
+----------------------+
A reference type contains a pointer (4 or 8 bytes) to the Method Table (MT) that contains type information (shared per type).
Before the pointer to the Method Table, the reference type contains an Object Header (4 or 8 bytes).
The pointer to the Method Table is followed by the values of the fields.
If the object has a finalizer, the reference type also contains a pointer to the finalizer chain link (4 or 8 bytes) at the end of the object.
Special types
Strings and Arrays have special implementations.
- They are implemented in managed and native code.
- Their memory layout in managed and native code is identical.
- Their size is not fixed among different instances.
Arrays
Arrays are represented by the System.Array type.
- Array implements IList
- T[] implements IList<T> (SZArray internally)
Value type arrays store an integer (4 bytes) that holds the length of the array, followed by the items of the array stored inline.
Reference type arrays store the length of the array after the Method Table pointer (for bounds checking).
The length is followed by the object reference of each item in the array.
For multidimensional arrays, the length of each dimension is stored instead.
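The difference between single-dimensional (SZArray) and multidimensional arrays shows up in the interfaces they implement and in the per-dimension lengths. A minimal sketch; ArrayDemo and the Is helper are hypothetical names:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ArrayDemo
{
    // Runtime type test, routed through object so the compiler does not fold the result.
    public static bool Is<T>(object value) => value is T;

    public static void Main()
    {
        int[] vector = { 1, 2, 3 };
        int[,] grid = new int[2, 3];

        Console.WriteLine(Is<IList>(vector));      // True
        Console.WriteLine(Is<IList<int>>(vector)); // True
        Console.WriteLine(Is<IList>(grid));        // True
        Console.WriteLine(Is<IList<int>>(grid));   // False: only T[] gets the generic interfaces

        // A multidimensional array keeps one length per dimension.
        Console.WriteLine(grid.Rank);         // 2
        Console.WriteLine(grid.GetLength(0)); // 2
        Console.WriteLine(grid.GetLength(1)); // 3
        Console.WriteLine(grid.Length);       // 6
    }
}
```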
String
A string is stored starting with an integer (4 bytes) that represents the length of the string, followed by the characters of the string in-place.
✔ Memory locality: no pointer lookup is required to access the string data.
Types
Value types
Value types are represented by the System.ValueType type.
Instance methods of value types have a different signature. The implicit this is inserted as the first parameter of the method and is passed by reference.
void MyMethod( [ref MyStruct this], int myArg)
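Because this is passed by reference, an instance method of a struct mutates the caller's variable rather than a copy. A minimal sketch; Counter and IncrementExplicit are hypothetical names, with the static method spelling out the hidden ref parameter by hand:

```csharp
using System;

public struct Counter
{
    public int Value;

    // Instance method: `this` refers to the caller's storage.
    public void Increment()
    {
        Value++;
    }

    // Hand-written equivalent with the hidden parameter made explicit.
    public static void IncrementExplicit(ref Counter self)
    {
        self.Value++;
    }
}

public static class ValueTypeThisDemo
{
    public static void Main()
    {
        Counter c = new Counter();
        c.Increment();                    // mutates c in place
        Counter.IncrementExplicit(ref c); // same effect
        Console.WriteLine(c.Value);       // 2

        Counter copy = c;                 // assignment copies the value
        copy.Increment();
        Console.WriteLine(c.Value);       // 2, the original is unaffected
        Console.WriteLine(copy.Value);    // 3
    }
}
```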
Object types
Object types are represented by the System.Object type.
Instance methods of object types have a different signature. The implicit this is inserted as the first parameter of the method.
void MyMethod( [ MyClass this], int myArg)
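The hidden first parameter can be made visible with an open instance delegate, which takes the target object as an explicit first argument. A minimal sketch; Greeter and its members are hypothetical names:

```csharp
using System;
using System.Reflection;

public sealed class Greeter
{
    private readonly string _name;

    public Greeter(string name)
    {
        _name = name;
    }

    public string Greet(int times) => _name + " x" + times;
}

public static class ObjectThisDemo
{
    // Builds a delegate whose first parameter plays the role of `this`.
    public static Func<Greeter, int, string> CreateOpenGreet()
    {
        MethodInfo greet = typeof(Greeter).GetMethod(nameof(Greeter.Greet))
            ?? throw new InvalidOperationException("Greet not found");
        return (Func<Greeter, int, string>)Delegate.CreateDelegate(
            typeof(Func<Greeter, int, string>), greet);
    }

    public static void Main()
    {
        Func<Greeter, int, string> openGreet = CreateOpenGreet();
        Console.WriteLine(openGreet(new Greeter("Ada"), 3)); // Ada x3
    }
}
```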
Metadata
Constructor
A constructor is labeled as .ctor in the metadata.
Static Constructor
A static constructor is labeled as .cctor in the metadata.
The BeforeFieldInit flag indicates that the runtime may run the type initializer at any time before the first access to a static field of the type, instead of exactly at first use.
The compiler automatically adds this flag to any type that does not declare an explicit static constructor.
class Test
{
    static object obj = new object();
}
Marked with the BeforeFieldInit flag.
class Test
{
    static object obj;
    static Test()
    {
        obj = new object();
    }
}
Not marked with the BeforeFieldInit flag.
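The flag can be read back through reflection. A minimal sketch, assuming the standard C# compiler; the two Test variants are renamed WithInitializer and WithStaticCtor (hypothetical names) so that both fit in one program:

```csharp
using System;
using System.Reflection;

class WithInitializer
{
    static object obj = new object();
}

class WithStaticCtor
{
    static object obj;

    static WithStaticCtor()
    {
        obj = new object();
    }
}

public static class BeforeFieldInitDemo
{
    // Reads the BeforeFieldInit bit from the type's metadata attributes.
    public static bool HasBeforeFieldInit(Type type)
    {
        return (type.Attributes & TypeAttributes.BeforeFieldInit) != 0;
    }

    public static void Main()
    {
        Console.WriteLine(HasBeforeFieldInit(typeof(WithInitializer))); // True
        Console.WriteLine(HasBeforeFieldInit(typeof(WithStaticCtor)));  // False
    }
}
```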
Threading
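Depending on the GC mode, collections run on dedicated GC threads (server GC and background GC) or on the thread that triggered the collection (non-concurrent workstation GC).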
.NET threads have two stacks: a user stack (1 MB) and a kernel stack (12 KB on 32-bit platforms or 24 KB on 64-bit platforms).
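The user stack size can be overridden per thread through the Thread constructor that takes a maxStackSize argument. A minimal sketch; ThreadStackDemo and RunOnThread are hypothetical names, and the operating system may round or clamp the requested size:

```csharp
using System;
using System.Threading;

public static class ThreadStackDemo
{
    // Runs `work` on a new thread created with the requested maximum stack size.
    public static int RunOnThread(int maxStackSizeBytes, Func<int> work)
    {
        int result = 0;
        Thread thread = new Thread(() => result = work(), maxStackSizeBytes);
        thread.Start();
        thread.Join(); // wait for completion so that `result` is safe to read
        return result;
    }

    public static void Main()
    {
        // Request a 256 KB user stack instead of the default size.
        Console.WriteLine(RunOnThread(256 * 1024, () => 6 * 7)); // 42
    }
}
```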