Introduction

I spent a while trying to figure out how to calculate the heap size of a managed object, and I've finally worked it out.

After trying the methods from here, I came across this code snippet from here, which proved promising:

Marshal.ReadInt32(typeof(T).TypeHandle.Value, 4)

After further research, I discovered that TypeHandle.Value is a pointer to the type's MethodTable, and that the snippet reads the DWORD at offset 4 from it. (This is elaborated on later.)
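To make the snippet concrete, here is a minimal sketch that wraps it in a helper and prints the value for two types. The names BaseSizeDemo and ReadBaseSize are my own, purely for illustration, and the printed numbers depend on the runtime and bitness, so treat them as something to observe rather than rely on.

```csharp
using System;
using System.Runtime.InteropServices;

public static class BaseSizeDemo
{
    // Hypothetical helper: reads the 32-bit value at offset 4 from the
    // address stored in TypeHandle.Value (the same read as the snippet above).
    public static int ReadBaseSize(Type type)
    {
        return Marshal.ReadInt32(type.TypeHandle.Value, 4);
    }

    public static void Main()
    {
        Console.WriteLine(ReadBaseSize(typeof(object)));
        Console.WriteLine(ReadBaseSize(typeof(string)));
    }
}
```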

Background

As you may already know, the layout of a managed object in the heap is as follows (64-bit):

Offset  Size  Type
-8      8     Object Header
0       8     MethodTable*
8       ...   Fields

A MethodTable contains a type's information necessary for the CLR. The first two fields inside the MethodTable are used for calculation of the heap size:

Offset  Size  Type   Name
0       4     DWORD  m_dwFlags
4       4     DWORD  m_BaseSize

As you may have noticed, the code snippet from earlier reads m_BaseSize, the DWORD at offset 4. However, the first DWORD is just as important in calculating the size.

The engineers of the CLR are very creative in minimizing the size of objects. The lowest WORD in m_dwFlags is the component size of a type. If the type is an "array type" such as an int[] or string, the value of the lowest WORD will be the size of one component (read: element). For example, for a string, the component size will be 2 (sizeof(char)), and for an int[], it will be 4 (sizeof(int)). The other WORD is used as flags.
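The component size can be read the same way as the base size. The sketch below is an illustration only: ComponentSizeDemo and ReadComponentSize are hypothetical names, and it assumes a little-endian machine, so that the lowest WORD of m_dwFlags sits at offset 0. The value is only meaningful for array-like types.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComponentSizeDemo
{
    // Hypothetical helper: reads the lowest WORD of m_dwFlags (offset 0).
    // Assumes little-endian layout; only meaningful for arrays and strings.
    public static ushort ReadComponentSize(Type type)
    {
        return unchecked((ushort) Marshal.ReadInt16(type.TypeHandle.Value, 0));
    }

    public static void Main()
    {
        Console.WriteLine(ReadComponentSize(typeof(string))); // char elements
        Console.WriteLine(ReadComponentSize(typeof(int[])));  // int elements
    }
}
```

Per the description above, this should report sizeof(char) for string and sizeof(int) for int[].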

Going back to the snippet above, the second DWORD, m_BaseSize, is the base instance size of the object when allocated on the heap. By default, this value is 24 (64-bit) or 12 (32-bit) because that is the minimum size of an object:

#define MIN_OBJECT_SIZE     (2*sizeof(uint8_t*) + sizeof(ObjHeader))
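The arithmetic behind the macro can be checked with a small sketch. It assumes that sizeof(ObjHeader) equals the pointer size (8 bytes on 64-bit, 4 bytes on 32-bit), which is what the 24 and 12 above imply. MinObjectSizeDemo and MinObjectSize are illustrative names, not runtime APIs.

```csharp
using System;

public static class MinObjectSizeDemo
{
    // Mirrors the macro above. Assumption: sizeof(ObjHeader) equals the
    // pointer size (8 bytes on 64-bit, 4 bytes on 32-bit).
    public static int MinObjectSize(int pointerSize)
    {
        int objHeaderSize = pointerSize;
        return 2 * pointerSize + objHeaderSize;
    }

    public static void Main()
    {
        Console.WriteLine(MinObjectSize(8));           // 24 (64-bit)
        Console.WriteLine(MinObjectSize(4));           // 12 (32-bit)
        Console.WriteLine(MinObjectSize(IntPtr.Size)); // current process
    }
}
```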

m_BaseSize alone is typically enough to calculate the heap size of an object, but there are two special types in the CLR that have dynamic sizes; that is, their sizes vary per instance. Those are strings and arrays. Therefore, the runtime uses this formula for calculating the size of objects in the heap:

MT->GetBaseSize() + ((OBJECTTYPEREF)->GetSizeField() * MT->GetComponentSize())

In other words:

Base instance size + (length * component size)

For instance, a plain System.Object has a component size of 0, so its size evaluates to this (64-bit):

24 + (1 * 0) == 24

Using this formula, we can calculate the heap size of any object.
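As a worked example, the formula can be written as a pure function. The base size of 26 used for the string case is an assumption, chosen to be consistent with the 64-bit WinDbg output shown at the end of this article. SizeFormulaDemo and its HeapSize helper are illustrative names only.

```csharp
using System;

public static class SizeFormulaDemo
{
    // Base instance size + (length * component size)
    public static int HeapSize(int baseSize, int length, int componentSize)
    {
        return baseSize + length * componentSize;
    }

    public static void Main()
    {
        // Plain object: component size is 0, so only the base size counts.
        Console.WriteLine(HeapSize(24, 1, 0)); // 24
        // "foo": assumed base size 26, three chars of 2 bytes each.
        Console.WriteLine(HeapSize(26, 3, 2)); // 32
    }
}
```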

Implementation

Disclaimer: This may be considered very evil.

Note: I aliased UInt32 as DWORD and UInt16 as WORD.

Thankfully, replicating the MethodTable is easy with the StructLayout and FieldOffset attributes:

[StructLayout(LayoutKind.Explicit)]
public unsafe struct MethodTable
{
    [FieldOffset(0)] private DWFlags m_dwFlags;

    [FieldOffset(4)] private DWORD m_BaseSize;
    ...

Because only one WORD is used for the component size, I made a separate struct that splits the two WORDs for convenience:

[StructLayout(LayoutKind.Explicit)]
internal struct DWFlags
{
    [FieldOffset(0)] internal WORD m_componentSize;
    [FieldOffset(2)] internal WORD m_flags;
    ...

Now that we have a representation of the MethodTable, all that remains is to acquire one. Going back to TypeHandle.Value, we know that it already points to a MethodTable, so we only need to cast it to a MethodTable*:

var methodTable = (MethodTable*) typeof(T).TypeHandle.Value;
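For readers who want something they can compile, here is a minimal, self-contained sketch of the idea. It is not the full struct from this article: MethodTableSketch models only the first two DWORDs, flattens the flags struct into two WORD fields, and the BaseSize and ComponentSize properties and the two helper methods are assumptions of mine. It must be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;
using DWORD = System.UInt32;
using WORD = System.UInt16;

// Hypothetical stand-in for the MethodTable struct; only the first two
// DWORDs are modeled, and the property names are assumptions.
[StructLayout(LayoutKind.Explicit)]
public struct MethodTableSketch
{
    [FieldOffset(0)] private WORD m_componentSize; // low WORD of m_dwFlags
    [FieldOffset(2)] private WORD m_flags;         // high WORD of m_dwFlags
    [FieldOffset(4)] private DWORD m_BaseSize;

    public WORD ComponentSize { get { return m_componentSize; } }
    public DWORD BaseSize { get { return m_BaseSize; } }
}

public static class MethodTableDemo
{
    // Casts TypeHandle.Value to a pointer to the sketch struct and reads it.
    public static unsafe DWORD BaseSizeOf(Type type)
    {
        var mt = (MethodTableSketch*) type.TypeHandle.Value;
        return mt->BaseSize;
    }

    public static unsafe WORD ComponentSizeOf(Type type)
    {
        var mt = (MethodTableSketch*) type.TypeHandle.Value;
        return mt->ComponentSize;
    }

    public static void Main()
    {
        Console.WriteLine(BaseSizeOf(typeof(string)));
        Console.WriteLine(ComponentSizeOf(typeof(string)));
    }
}
```

Reading through a struct pointer instead of Marshal.ReadInt32 keeps the offsets in one place, which helps if more fields are added later.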

Now we can calculate the heap size of any object at runtime. You can write your own methods for calculating it. Here is an example of my code to show how you can calculate the size:

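public static unsafe int HeapSize<T>(ref T t) where T : class
{
    // Use the runtime type rather than typeof(T): T may be a base type
    // (e.g. object) whose MethodTable has a different base size and a
    // component size of 0. Note that this throws if t is null.
    var methodTable = (MethodTable*) t.GetType().TypeHandle.Value;

    // Arrays: base size + (element count * element size)
    var arr = t as Array;
    if (arr != null) {
        return (int) methodTable->BaseSize + arr.Length * methodTable->ComponentSize;
    }

    // Strings: base size + (character count * sizeof(char))
    var str = t as string;
    if (str != null) {
        return (int) methodTable->BaseSize + str.Length * methodTable->ComponentSize;
    }

    // Every other reference type has a fixed size
    return (int) methodTable->BaseSize;
}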

Note: I only followed the specified formula for array-type objects, because otherwise the formula would still evaluate to the base size.

Now all that's left to do is to verify it works.

string s = "foo";

HeapSize gives us:

HeapSize(ref s) == 32

WinDbg gives us:

!DumpObj /d 000001f98001bc08
        Name:        System.String
        MethodTable: 00007fff1c1a6830
        EEClass:     00007fff1ba86cb8
        Size:        32(0x20) bytes
        File:        C:\WINDOWS\Microsoft.Net\assembly\GAC_64\mscorlib\
                        v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
        String:      foo

And that is how the GC calculates the heap size of objects!