Garbage Collector Internals
The CLR GC is a highly efficient, scalable, and reliable automatic memory manager. Much time and effort went into researching its optimal behavioral characteristics. Before delving into the details of the CLR GC, it is important to state what the GC is and what assumptions were made during its design and implementation. Let's begin by looking at some of the key assumptions.

- The CLR GC assumes that everything is garbage unless otherwise told. This means that the GC is ready to collect all objects on the managed heap unless told otherwise. In essence, it implements a reference tracking scheme for all live objects in the system (we will define what live means shortly) where objects without any references to them are considered garbage and can be collected (a small illustration follows this list).
- The CLR GC assumes that all objects on the managed heap will be short lived (or ephemeral). In other words, the GC attempts to collect short-lived objects more often than long-lived objects operating under the assumption that if an object has been around for a while, chances are it will be around for a little longer and there is no need to attempt to collect that object again.
- The CLR GC tracks an object's age via the use of generations. Young objects are placed in generation 0 and older objects in generations 1 and 2. As an object grows older, it is promoted from one generation to the next. As such, a generation can be said to define the age of an object.
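To make the first assumption concrete, the following standalone snippet (an illustrative sketch, not one of the chapter's listings) uses GC.GetTotalMemory to watch a large allocation appear and then disappear once its last reference is dropped; the exact numbers printed will vary from run to run.

using System;

class RootednessDemo
{
    static void Main()
    {
        long before = GC.GetTotalMemory(true);

        byte[] data = new byte[1000000];                       // rooted by the local 'data'
        Console.WriteLine(GC.GetTotalMemory(true) - before);   // roughly 1,000,000 bytes higher
        GC.KeepAlive(data);                                    // keep 'data' reachable up to this point

        data = null;                                           // last reference gone: the array is now garbage
        Console.WriteLine(GC.GetTotalMemory(true) - before);   // typically back near zero after the forced collection
    }
}

Passing true to GC.GetTotalMemory forces a collection before the value is returned, which is what makes the drop visible once the array is no longer referenced.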
Let's look at each of the parts of the definition more concretely and begin with how generations define the age of an object.
Generations
The CLR GC defines three generations, very innovatively called generation 0, generation 1, and generation 2. Each of the generations contains objects of a certain age, where generation 0 contains newly allocated objects and generation 2 contains the oldest objects. An object moves from one generation to the next by surviving a garbage collection. Surviving means that the object was still being referenced (or was still rooted) at the time of the garbage collection. Each of the generations can be garbage collected at any time, but the frequency of garbage collections depends on the generation. Remember from the previous section that one of the assumptions the CLR makes is that most objects are going to be short-lived (i.e., live in generation 0). Due to that assumption, generation 0 is collected far more frequently than generation 2 in the hope of pruning short-lived objects quickly. Figure 5-5 shows the overall algorithm when it comes to how the generations are garbage collected.
Figure 5-5 High-level overview of generational garbage collection algorithm
In Figure 5-5, we can see that a garbage collection is triggered when a new allocation request causes the budget for generation 0 to be exceeded. When that happens, the garbage collector collects all objects in generation 0 that have no roots associated with them and promotes all rooted objects to generation 1. Much in the same way that generation 0 has a budget defined, so does generation 1; if, as part of promoting objects from generation 0 to generation 1, that budget is exceeded, the GC repeats the process of collecting objects with no roots in generation 1 and promoting rooted objects to generation 2. The process repeats itself for generation 2. If, while promoting to generation 2, the GC cannot collect any objects and the budget for generation 2 is exceeded, the CLR heap manager tries to allocate another segment to hold generation 2 objects. If the creation of a new segment fails, an OutOfMemoryException is thrown. The CLR heap manager also releases segments that are no longer in use; we will discuss this process in more detail later in the chapter.
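To get a feel for how much more frequently the lower generations are collected under this budget-driven scheme, the following standalone snippet (an illustrative sketch, not one of the chapter's listings) churns through a large number of short-lived allocations, repeatedly exceeding the generation 0 budget, and then prints the per-generation collection counts using GC.CollectionCount. On a typical run, the generation 0 count dwarfs the generation 2 count.

using System;

class CollectionCountDemo
{
    static void Main()
    {
        // Allocate many short-lived objects; each byte[] becomes garbage almost immediately.
        for (int i = 0; i < 1000000; i++)
        {
            byte[] temp = new byte[1024];
        }

        // GC.CollectionCount(n) reports how many times generation n has been collected so far.
        Console.WriteLine("Gen 0 collections: {0}", GC.CollectionCount(0));
        Console.WriteLine("Gen 1 collections: {0}", GC.CollectionCount(1));
        Console.WriteLine("Gen 2 collections: {0}", GC.CollectionCount(2));
    }
}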
Let's take a practical look at how an object is collected and promoted. Listing 5-2 shows the source code behind the application we will use to illustrate the generational concepts.
Listing 5-2. Example source code to illustrate generational concepts
using System;
using System.Text;
using System.Runtime.Remoting;

namespace Advanced.NET.Debugging.Chapter5
{
    class Name
    {
        private string first;
        private string last;

        public string First { get { return first; } }
        public string Last { get { return last; } }

        public Name(string f, string l)
        {
            first = f;
            last = l;
        }
    }

    class Gen
    {
        static void Main(string[] args)
        {
            Name n1 = new Name("Mario", "Hewardt");
            Name n2 = new Name("Gemma", "Hewardt");

            Console.WriteLine("Allocated objects");
            Console.WriteLine("Press any key to invoke GC");
            Console.ReadKey();

            n1 = null;
            GC.Collect();

            Console.WriteLine("Press any key to invoke GC");
            Console.ReadKey();

            GC.Collect();

            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }
    }
}

The source code and binary for Listing 5-2 can be found in the following folders:
- Source code: C:\ADND\Chapter5\Gen
- Binary: C:\ADNDBin\05Gen.exe
Let's run the application under the debugger and see how we can verify our theories on how n1 and n2 are collected and promoted. When the application is running under the debugger, resume execution until the first Press any key to invoke GC prompt. At that point, we need to break execution and find the addresses to the two object instances, which can easily be done via the ClrStack command as shown in the following:
0:000> !ClrStack -a OS Thread Id: 0x1c0c (0) ESP EIP 0028f3b4 77709a94 [NDirectMethodFrameSlim: 0028f3b4] Microsoft.Win32.Win32Native.ReadConsoleInput(IntPtr, InputRecord ByRef, Int32, Int32 ByRef) 0028f3cc 793e8f28 System.Console.ReadKey(Boolean) PARAMETERS: intercept = 0x00000000 LOCALS: <no data> 0x0028f3dc = 0x00000001 <no data> <no data> <no data> <no data> <no data> <no data> <no data> <no data> 0028f40c 793e8e33 System.Console.ReadKey() 0028f410 003000f3 Advanced.NET.Debugging.Chapter5.Gen.Main(System.String[]) PARAMETERS: args = 0x01c55818 LOCALS: <CLR reg> = 0x01da5938 <CLR reg> = 0x01da5948 0028f65c 79e7c74b [GCFrame: 0028f65c]The addresses of the two objects on the managed heap are 0x01da5938 and 0x01da5948. How can we figure out which generation objects on the managed heap belong to? The answer to that lies in understanding the correlation between managed heap segments and generations. As previously discussed, each managed heap consists of one or more segments where the objects reside. Furthermore, part of the segment(s) is dedicated to a given generation. Figure 5-6 shows an example of a hypothetical managed heap segment.
Figure 5-6 Hypothetical managed heap segment
In Figure 5-6,
the managed heap segment is divided into three generations, each with
its own starting address managed by the CLR heap manager. Generations 0
and 1 are part of a single segment known as the ephemeral segment where
short-lived objects live. Because the GC operates under the assumption that
most objects are short lived, most objects are not expected to live past
generation 0 or, at a maximum, generation 1. Objects that live in
generation 2 are the oldest objects and get collected very infrequently.
It is possible that generation 2 can also be part of the ephemeral
segment even though generation 2 is not collected as often. By looking
at an object's address and knowing the address ranges for each of the
generations, we can find out which generation an object belongs to. How
do we know what the generational starting addresses for the CLR heap
manager are? The answer lies in a command called eeheap. The eeheap command displays various memory statistics of data consumed by internal CLR data structures. By default, eeheap
displays verbose data, meaning that information related to the GC as
well as the loader is displayed. To display information only about the
GC, the –gc switch can be used. Let's run the command in our existing debug session and see what we get:0:004> !eeheap -gc Number of GC Heaps: 1 generation 0 starts at 0x01da1018 generation 1 starts at 0x01da100c generation 2 starts at 0x01da1000 ephemeral segment allocation context: none segment begin allocated size 002c7db0 790d8620 790f7d8c 0x0001f76c(128876) 01da0000 01da1000 01da8010 0x00007010(28688) Large object heap starts at 0x02da1000 segment begin allocated size 02da0000 02da1000 02da3250 0x00002250(8784) Total Size 0x289cc(166348) ––––––––––––––––––––––––––––– GC Heap Size 0x289cc(166348)Part of the output shows clearly the starting addresses of each of the generations. If we look at the object addresses in the debug session of our sample application, we can see the following:
<CLR reg> = 0x01da5938
<CLR reg> = 0x01da5948

Both of these addresses, corresponding to our objects, fall within the address range of generation 0 (starting at 0x01da1018); hence we can conclude that both of them live within the realm of that generation. This makes perfect sense because we are currently at the point in the code flow where the objects were just allocated and a garbage collection is pending. If we resume execution of the application and subsequently break execution again the next time we see the Press any key to invoke GC prompt, we should see a difference in which generation the objects belong to. If we look at the source code, we can see that prior to invoking a garbage collection, we set the n1 reference to null, which in essence makes the object rootless and one that should be garbage collected. Furthermore, n2 is still rooted and as such should be promoted to generation 1 during the garbage collection. Let's take a look by following the same process as earlier: find the object addresses, use the eeheap command to find the generational address ranges, and see which generation each object falls into:
0:000> !ClrStack -a OS Thread Id: 0x1910 (0) ESP EIP 0021f394 77709a94 [NDirectMethodFrameSlim: 0021f394] Microsoft.Win32.Win32Native.ReadConsoleInput(IntPtr, InputRecord ByRef, Int32, Int32 ByRef) 0021f3ac 793e8f28 System.Console.ReadKey(Boolean) PARAMETERS: intercept = 0x00000000 LOCALS: <no data> 0x0021f3bc = 0x00000001 <no data> <no data> <no data> <no data> <no data> <no data> <no data> <no data> 0021f3ec 793e8e33 System.Console.ReadKey() 0021f3f0 01690111 Advanced.NET.Debugging.Chapter5.Gen.Main(System.String[]) PARAMETERS: args = 0x01da5818 LOCALS: <CLR reg> = 0x00000000 <CLR reg> = 0x01da5948 0021f644 79e7c74b [GCFrame: 0021f644] 0:000> !eeheap -gc Number of GC Heaps: 1 generation 0 starts at 0x01da6c00 generation 1 starts at 0x01da100c generation 2 starts at 0x01da1000 ephemeral segment allocation context: none segment begin allocated size 002c7db0 790d8620 790f7d8c 0x0001f76c(128876) 01da0000 01da1000 01da8c0c 0x00007c0c(31756) Large object heap starts at 0x02da1000 segment begin allocated size 02da0000 02da1000 02da3240 0x00002240(8768) Total Size 0x295b8(169400) –––––––––––––––––––––––––––––– GC Heap Size 0x295b8(169400)The most interesting part of the output is in the eeheap command output. We can see now that the generational address ranges have changed slightly. More specifically, the starting address of generation 0 has changed from 0x01da1018 to 0x01da6c00, which in essence implies that generation 1 has become bigger (because the starting address of generation 1 remains unchanged). If we correlate the address of our n2 object (0x01da5948) with the generational address ranges that the eeheap command displayed, we can see that the n2 object falls into generation 1. Again, this is fully expected because n2 previously lived in generation 0 and was still rooted at the time of the garbage collection, thereby promoting the object to the next generation. I will leave it as an exercise to you to see what happens on the final garbage collection in the sample application.
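As a complement to the debugger-based approach, an object's current generation can also be queried from code with GC.GetGeneration. The following standalone snippet (an illustrative sketch, not one of the chapter's listings) shows an object being promoted as it survives induced collections; treat the values in the comments as typical rather than guaranteed.

using System;

class GetGenerationDemo
{
    static void Main()
    {
        object o = new object();
        Console.WriteLine(GC.GetGeneration(o));  // typically 0: freshly allocated

        GC.Collect();                            // o survives because it is still rooted
        Console.WriteLine(GC.GetGeneration(o));  // typically 1 after surviving one collection

        GC.Collect();
        Console.WriteLine(GC.GetGeneration(o));  // typically 2 after surviving another collection

        GC.KeepAlive(o);                         // keep o rooted past the last read in retail builds
    }
}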
Although the SOS debugger extension provides the means of finding out which generation any given object belongs to, it is a somewhat tedious process as it requires that addresses be checked against potentially changing generational addresses within any given managed heap segment. Furthermore, there is no concrete way to list all the objects that fall into any given generation, making it hard to get an overall picture of the per generation utilization. Fortunately, the SOSEX extension comes to the rescue with a command named dumpgen. With the dumpgen command, you can easily get a list of all objects that belong to the generation specified as an argument to the command. For example, using the same sample application as shown in Listing 5-2, here is the output when running dumpgen:
0:000> !dumpgen 0 01da6c00 12 **** FREE **** 01da6c0c 68 System.Char[] 2 objects, 80 bytes 0:000> !dumpgen 1 01da100c 12 **** FREE **** 01da1018 12 **** FREE **** 01da1024 72 System.OutOfMemoryException 01da106c 72 System.StackOverflowException 01da10b4 72 System.ExecutionEngineException 01da10fc 72 System.Threading.ThreadAbortException 01da1144 72 System.Threading.ThreadAbortException 01da118c 12 System.Object 01da1198 28 System.SharedStatics 01da11b4 100 System.AppDomain ... ... ... 01da5948 16 Advanced.NET.Debugging.Chapter5.Name 01da5958 28 Microsoft.Win32.Win32Native+InputRecord 01da5974 12 System.Object 01da5980 20 Microsoft.Win32.SafeHandles.SafeFileHandle 01da5994 36 System.IO.__ConsoleStream 01da59b8 28 System.IO.Stream+NullStream ... ... ...We can see that there aren't a lot of objects in generation 0; instead, we have a ton of objects in generation 1 including our n2 instance at address 0x01da5948. The dumpgen command really makes life easier when looking at generation specific data.
So far, we have discussed how objects live in managed heap segments divided into generations and how these objects are either garbage collected or promoted to the next generation, depending on if they are still referenced (or still rooted). One question that still remains is what it means for an object to be rooted. The next section introduces the notion of roots, which are at the heart of the decision-making process the GC uses to determine if an object can be collected.
Roots
One of the most fundamental aspects of a garbage collection is being able to determine which objects are still being referenced and which objects are not and can therefore be considered for garbage collection. Contrary to popular belief, the GC itself does not implement the logic for detecting which objects are still being referenced; rather, it uses other components in the CLR that have far more knowledge about the lifetimes of the objects. The CLR uses the following components to determine which objects are still referenced:

- Just-In-Time (JIT) compiler. The JIT compiler is the component responsible for translating IL to machine code and has detailed knowledge of which local variables are considered active at any given point in time. The JIT compiler maintains this information in a table that it subsequently references when the GC asks for objects that are still considered to be alive.
- Stack walker. This comes into play when unmanaged calls are made to the execution engine. During these calls, it is imperative that any managed objects used during the call also be part of the reference tracking system.
- Handle table. The CLR maintains a set of handle tables on a per application domain basis that can contain, for example, pointers to pinned reference types on the managed heap. During a GC inquiry, these handle tables are probed for live references to objects on the managed heap.
- Finalize queue. We will discuss the notion of object finalizers shortly, but for the time being, view objects with finalizers as objects that can be considered dead from an application's perspective but still need to be kept alive for cleanup purposes.
- Finally, an object is also considered rooted if it is referenced, directly or indirectly, by an object that falls into any of the above categories.
Listing 5-3. Sample application to illustrate object roots
using System;
using System.Text;
using System.Threading;

namespace Advanced.NET.Debugging.Chapter5
{
    class Name
    {
        private string first;
        private string last;

        public string First { get { return first; } }
        public string Last { get { return last; } }

        public Name(string f, string l)
        {
            first = f;
            last = l;
        }
    }

    class Roots
    {
        public static Name CompleteName = new Name("First", "Last");

        private Thread thread;
        private bool shouldExit;

        static void Main(string[] args)
        {
            Roots r = new Roots();
            r.Run();
        }

        public void Run()
        {
            shouldExit = false;

            Name n1 = CompleteName;

            thread = new Thread(this.Worker);
            thread.Start(n1);

            Thread.Sleep(1000);

            Console.WriteLine("Press any key to exit");
            Console.ReadKey();

            shouldExit = true;
        }

        public void Worker(Object o)
        {
            Name n1 = (Name)o;
            Console.WriteLine("Thread started {0}, {1}", n1.First, n1.Last);

            while (true)
            {
                // Do work
                Thread.Sleep(500);
                if (shouldExit) break;
            }
        }
    }
}

The source code and binary for Listing 5-3 can be found in the following folders:
- Source code: C:\ADND\Chapter5\Roots
- Binary: C:\ADNDBin\05Roots.exe
In Listing 5-3, the Name object instance referenced by CompleteName ends up with three roots:

- We have a static reference to the object instance at the Roots class level, serving as our first root to the object.
- In the Run method, we assign a local variable reference (n1) to the object instance, serving as our second root. The n1 local variable is not used after the thread has started and is subject to becoming invalid even before the end of the method scope (in retail builds); in debug builds, the reference is guaranteed to remain valid until the end of the scope is reached (see the sketch following this list).
- In the Run method, we pass the local variable reference n1 to the thread method during thread startup serving as our third root.
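As a side note on the lifetime issue called out in the second bullet: if you ever need to guarantee that a local reference remains a root up to a specific point, GC.KeepAlive can be used. The following is a minimal, hypothetical variant of the Run method from Listing 5-3 (not the book's original code) illustrating the pattern:

public void Run()
{
    shouldExit = false;

    Name n1 = CompleteName;

    thread = new Thread(this.Worker);
    thread.Start(n1);

    Thread.Sleep(1000);
    Console.WriteLine("Press any key to exit");
    Console.ReadKey();
    shouldExit = true;

    // GC.KeepAlive keeps n1 reported as a live reference up to this exact point,
    // even in retail builds where the JIT would otherwise stop reporting it after
    // its last use. (In Listing 5-3 the object is also rooted by the static
    // CompleteName field, so this call only affects the stack root.)
    GC.KeepAlive(n1);
}

With that noted, let's run the application in Listing 5-3 under the debugger, locate the Name instance, and ask SOS for its roots: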
0:005> ~0s eax=002cef9c ebx=002cef94 ecx=792274ec edx=79ec9058 esi=002cedf0 edi=00000000 eip=77709a94 esp=002ceda0 ebp=002cedc0 iopl=0 nv up ei pl zr na pe nc cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246 ntdll!KiFastSystemCallRet: 77709a94 c3 ret 0:000> !ClrStack -a OS Thread Id: 0x2358 (0) ESP EIP 002cef6c 77709a94 [NDirectMethodFrameSlim: 002cef6c] Microsoft.Win32.Win32Native.ReadConsoleInput(IntPtr, InputRecord ByRef, Int32, Int32 ByRef) 002cef84 793e8f28 System.Console.ReadKey(Boolean) PARAMETERS: intercept = 0x00000000 LOCALS: <no data> 0x002cef94 = 0x00000001 <no data> <no data> <no data> <no data> <no data> <no data> <no data> <no data> 002cefc4 793e8e33 System.Console.ReadKey() 002cefc8 00890212 Advanced.NET.Debugging.Chapter5.Roots.Run() PARAMETERS: this = 0x01c758e0 LOCALS: <CLR reg> = 0x01c758d0 002cefe8 0089013f Advanced.NET.Debugging.Chapter5.Roots.Main(System.String[]) PARAMETERS: args = 0x01c75888 LOCALS: <CLR reg> = 0x01c758e0 002cf208 79e7c74b [GCFrame: 002cf208] 0:000> !do 0x01c758d0 Name: Advanced.NET.Debugging.Chapter5.Name MethodTable: 001b311c EEClass: 001b13a0 Size: 16(0x10) bytes (C:\ADNDBin\05Roots.exe) Fields: MT Field Offset Type VT Attr Value Name 790fd8c4 4000001 4 System.String 0 instance 01c75898 first 790fd8c4 4000002 8 System.String 0 instance 01c758b4 last 0:000> !gcroot 0x01c758d0 Note: Roots found on stacks may be false positives. Run "!help gcroot" for more info. Scan Thread 0 OSTHread 2358 ESP:2cefbc:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name) Scan Thread 1 OSTHread 1630 Scan Thread 3 OSTHread 254c ESP:47df428:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df42c:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df438:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df4d0:Root:01c75984(System.Threading.ThreadHelper)-> 01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df4d8:Root:01c75984(System.Threading.ThreadHelper)-> 01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df4f4:Root:01c75984(System.Threading.ThreadHelper)-> 01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df500:Root:01c75984(System.Threading.ThreadHelper)-> 01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df5c0:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)-> 01c758d0(Advanced.NET.Debugging.Chapter5.Name) ESP:47df5c4:Root:01c75998(System.Threading.ParameterizedThreadStart)-> 01c75984(System.Threading.ThreadHelper) ESP:47df754:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)-> 01c75984(System.Threading.ThreadHelper) ESP:47df758:Root:01c75998(System.Threading.ParameterizedThreadStart)-> 01c75984(System.Threading.ThreadHelper) ESP:47df764:Root:01c75998(System.Threading.ParameterizedThreadStart)-> 01c75984(System.Threading.ThreadHelper) ESP:47df76c:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)-> 01c75984(System.Threading.ThreadHelper) DOMAIN(0037FCF8):HANDLE(Pinned):a13fc:Root:02c71010(System.Object[])-> 01c758d0(Advanced.NET.Debugging.Chapter5.Name)As you can see from the gcroot output, the command scans a number of different sources to find and build the reference chain to the object specified. Regardless of the source, the output of the GCRoot command results in the following general format:
<root>-><reference 1>-><reference 2>-><reference X>-><object>

Depending on the source probed, each of the elements takes on a slightly different format as shown.
- Local variables on a thread's stack. The root element typically looks like the following: <stack register>:<stack pointer>:Root:<object>. The stack register depends on the architecture. For example, on x86 machines it shows as ESP and on x64 machines it shows as RSP. The stack pointer shows the location on the stack where the object is rooted, and the object address
is the address of the object that is holding a reference to the next
object in the reference chain. Let's take a look at an example:
ESP:47df428:Root:01c758d0(Advanced.NET.Debugging.Chapter5.Name)
We can see that there is a local variable located at stack (ESP) location 0x047df428. Furthermore, the output tells us that this constitutes a root to the object at address 0x01c758d0, which is a reference to the Advanced.NET.Debugging.Chapter5.Name type.

- Handle tables. All handle tables are scanned as part of GCRoot
execution looking for references to the specified object. If a
reference is found, the output of the command takes on the following
general syntax: DOMAIN(<address>):HANDLE(<type>):<handle address>:Root:<object>. The domain address field indicates the address of the application domain to which the handle reference belongs. The handle type specifies the type of the handle. The possible handle types are Weak, WeakTrackResurrection, Normal, and Pinned. Next is the handle address,
which is the address to the handle itself. Please keep in mind that the
handle type is a value type and if you want to dump out the contents
you must use the DumpVC command rather than DumpObj. Finally, the root object address is shown. Let's take a look at an example:
DOMAIN(002EFCD8):HANDLE(Pinned):2813fc:Root:02c81010 (System.Object[])->01c858d0(Advanced.NET.Debugging. Chapter5.Name)
The preceding output indicates that the object at address 0x01c858d0 is rooted by an object that resides in the handle table corresponding to the application domain with address 0x002efcd8. Furthermore, the address of the handle value holding the reference is located at address 0x002813fc and the type of the handle value is pinned. Lastly, the actual object that holds the reference is at address 0x02c81010, which is of type System.Object[].
- F-reachable queue. The f-reachable queue is scanned to see if
there are any references to the specified object. If a root reference to
the object is found on the f-reachable queue, it will be displayed in
the following general format: Finalizer queue:Root:<object address>(<object type>).
The first part of the output indicates that the source of the root is
the f-reachable queue. Next, the address of the referenced object is
displayed, followed by the object type. What follows is an example of
the output of GCRoot when run against an object that is on the f-reachable queue:
Finalizer queue:Root:01d15750(Advanced.NET.Debugging.Chapter5.Name)
In the preceding output, we can see that the object at address 0x01d15750 of type Advanced.NET.Debugging.Chapter5.Name is rooted by the f-reachable queue.

- Other objects. The last source of output for the GCRoot command is other objects; that is, an object kept alive through a chain of references whose root falls into any of the preceding categories is reported together with that chain.
A final note on GCRoot and thread stacks: the roots it reports from a stack scan may be false positives. Consider the following code:

public void Run()
{
    Name n1 = new Name("A", "B");
    Console.WriteLine("Press any key to exit");
    Console.ReadKey();
}

In the source code, we have a simple instance of the Name class assigned to the n1 local variable. If we ran the GCRoot command on the n1 reference, we would expect to see only one reference on the thread stack:
0:000> !GCRoot 0x01e9580c
Note: Roots found on stacks may be false positives. Run "!help gcroot" for more info.
Scan Thread 0 OSTHread 1638
ESP:1df29c:Root:01e9580c(Advanced.NET.Debugging.Chapter5.Name)
ESP:1df2a0:Root:01e9580c(Advanced.NET.Debugging.Chapter5.Name)
Scan Thread 2 OSTHread 14ac

The output clearly shows that thread 0 apparently has two references to the object on the thread stack. How is this possible? The GCRoot command works by assuming that every address on the stack is the address of an object and then trying to verify that assumption using various metadata information. In light of this, values that are (or were previously) present on the stack can be treated as first-class references to objects and listed in the output of GCRoot. If you suspect that the output of GCRoot, insofar as thread stacks are concerned, is incorrect, the best approach is to use the u command to unassemble the stack frames and correlate the stack registers in the GCRoot output with the unassembled code to see which references are truly valid.
Finalization
The garbage collection mechanism described so far assumes that collected objects do not require any special cleanup code. At times, objects that encapsulate other resources require that these resources be cleaned up as part of object destruction. A great example is an object that wraps an underlying native resource such as a file handle. Without explicit cleanup code, the memory behind the managed object is cleaned up by the GC, but the underlying handle that the object encapsulates is not (because the GC has no special knowledge of native handles). The net result is, naturally, a resource leak. To provide a proper cleanup mechanism, the CLR introduces what are known as finalizers. A finalizer can be compared to a destructor in the native C++ world: whenever an object is freed (or garbage collected), the destructor (or finalizer) is run. In C#, a finalizer is declared very similarly to a C++ destructor by using the ~<class name>() notation. An example is shown in the following listing:

public class MyClass
{
    ...
    ...
    ...
    ~MyClass()
    {
        // Cleanup code
    }
}

When the class is compiled into IL, the finalizer gets translated into a method called Finalize. The key thing about objects with finalizers is that the garbage collector treats them a little differently than other objects. Because the garbage collector is in fact an automatic memory manager, it also has the responsibility of executing all finalization code that an object may have during a garbage collection. To keep tabs on which objects have finalizers, the garbage collector maintains a queue called the finalization queue. Objects that are created on the managed heap and contain finalizers are automatically placed on the finalization queue during creation. Please note that the finalization queue does not contain objects that are considered garbage; rather, it contains all objects with finalizers that are alive on the managed heap. When an object with a finalizer becomes rootless and a garbage collection occurs, the GC places the object on a different queue known as the f-reachable queue. This queue contains all objects with defined finalizers that are considered to be garbage and need to have their finalizers executed. The f-reachable queue is itself considered a root, meaning that an object on it is still alive. It is important to note that the finalizer code for each of the objects on the f-reachable queue is not executed as part of the garbage collection phase. Instead, each .NET process contains a special thread known as the finalization thread. The finalization thread wakes up, on request of the GC, and checks the state of the f-reachable queue. If there are any objects on the f-reachable queue, the finalization thread picks them up one by one and executes their finalize methods.

When the garbage collection finishes, collected objects with finalizers remain on the f-reachable queue (rooted and alive) until the finalization thread executes their finalize methods. At that point, an object is removed from the f-reachable queue, is considered rootless, and can be truly reclaimed by the garbage collector. The next time a garbage collection is started, those objects are collected. Figure 5-7 illustrates an example of the finalization process.
Figure 5-7 Example of finalization process
Step 1 in Figure 5-7 consists of allocating Obj D and Obj E,
both of which contain finalize methods. As part of the allocation, the
objects are placed on the managed heap as well as on the finalization
queue to indicate that the objects need to be finalized when no longer
in use. In step 2, Obj D and Obj E have both become
rootless when a garbage collection occurs. At that point, both objects
are moved from the finalization queue to the f-reachable queue to
indicate that the finalize methods are now ready to be run. At some
point in the future (nondeterministic), step 3 is executed and the
finalizer thread wakes up and starts running the finalize methods for
both of the objects. Even after the finalizer has finished, both objects
are still rooted on the f-reachable queue. Lastly, in step 4, another
garbage collection occurs and the objects are removed from the
f-reachable queue (no longer rooted) and then collected from the managed
heap by the garbage collector.

An interesting aspect of having a dedicated thread executing the finalize methods is that the CLR makes no guarantees about when that thread wakes up and executes. As such, it is possible that it will take some time before an object with a finalizer is actually cleaned up. When dealing with objects that aggregate scarce resources, it may not always be feasible to wait a long period of time for the resource to be reclaimed. In such situations, it is best to implement an explicit and deterministic cleanup pattern such as the IDisposable and/or Close patterns. Finally, having a dedicated thread also means that you have no control over the state of that thread, and making assumptions based on that state can break your application.
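For completeness, the following is a minimal sketch of the IDisposable pattern mentioned above, applied to a hypothetical NativeResourceHolder class that wraps a native handle; the handle field and ReleaseHandle helper are stand-ins for illustration only. The pattern gives callers a deterministic way to release the resource while keeping the finalizer as a safety net, and it calls GC.SuppressFinalize so an explicitly disposed object does not pay the finalization cost described above.

using System;

public class NativeResourceHolder : IDisposable
{
    private IntPtr handle;          // hypothetical native handle acquired elsewhere
    private bool disposed;

    public void Dispose()
    {
        Dispose(true);
        // The resource is already released; remove the object from the
        // finalization queue so the finalizer does not need to run.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (!disposed)
        {
            ReleaseHandle();        // release the unmanaged resource
            disposed = true;
        }
    }

    ~NativeResourceHolder()
    {
        // Safety net: runs only if Dispose was never called.
        Dispose(false);
    }

    private void ReleaseHandle()
    {
        if (handle != IntPtr.Zero)
        {
            // e.g., CloseHandle(handle) via P/Invoke in a real implementation
            handle = IntPtr.Zero;
        }
    }
}

Callers can then wrap instances in a using statement, and the finalizer only runs for instances that were never disposed.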
Let's take a look at a concrete example of an object with a finalize method and see if we can track the object during a garbage collection. Listing 5-4 shows the source code of the application we will be utilizing.
Listing 5-4. Simple object with a finalize method
using System;
using System.Text;
using System.Runtime.InteropServices;

namespace Advanced.NET.Debugging.Chapter5
{
    class NativeEvent
    {
        private IntPtr nativeHandle;

        public IntPtr NativeHandle { get { return nativeHandle; } }

        public NativeEvent(string name)
        {
            nativeHandle = CreateEvent(IntPtr.Zero, false, true, name);
        }

        ~NativeEvent()
        {
            if (nativeHandle != IntPtr.Zero)
            {
                CloseHandle(nativeHandle);
                nativeHandle = IntPtr.Zero;
            }
        }

        [DllImport("kernel32.dll")]
        static extern IntPtr CreateEvent(IntPtr lpEventAttributes, bool bManualReset, bool bInitialState, string lpName);

        [DllImport("kernel32.dll")]
        static extern IntPtr CloseHandle(IntPtr lpEvent);
    }

    class Finalize
    {
        static void Main(string[] args)
        {
            Finalize f = new Finalize();
            f.Run();
        }

        public void Run()
        {
            NativeEvent nEvent = new NativeEvent("MyNewEvent");

            //
            // Use nEvent
            //

            nEvent = null;

            Console.WriteLine("Press any key to GC");
            Console.ReadKey();

            GC.Collect();

            Console.WriteLine("Press any key to GC");
            Console.ReadKey();

            GC.Collect();

            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }
    }
}

The source code and binary for Listing 5-4 can be found in the following folders:
- Source code: C:\ADND\Chapter5\Finalize
- Binary: C:\ADNDBin\05Finalize.exe

Let's run the application under the debugger and, when the first Press any key to GC prompt appears, break execution and inspect the finalization queues using the FinalizeQueue command:
0:004> !FinalizeQueue SyncBlocks to be cleaned up: 0 MTA Interfaces to be released: 0 STA Interfaces to be released: 0 –––––––––––––––––––––––––––––––––– generation 0 has 6 finalizable objects (003d3160->003d3178) generation 1 has 0 finalizable objects (003d3160->003d3160) generation 2 has 0 finalizable objects (003d3160->003d3160) Ready for finalization 0 objects (003d3178->003d3178) Statistics: MT Count TotalSize Class Name 00123128 1 12 Advanced.NET.Debugging.Chapter5.NativeEvent 7911c9c8 1 20 Microsoft.Win32.SafeHandles.SafePEFileHandle 791037c0 1 20 Microsoft.Win32.SafeHandles.SafeFileMappingHandle 79103764 1 20 Microsoft.Win32.SafeHandles.SafeViewOfFileHandle 79101444 1 20 Microsoft.Win32.SafeHandles.SafeFileHandle 790fe704 1 56 System.Threading.Thread Total 6 objectsThere are several pieces of useful information in the output. First, the finalization queues for each generation are shown. In this particular case, generation 0 has 6 finalizable objects and generations 1 and 2 have none. For each of the finalization queues, the FinalizeQueue command also shows the address range of the queue itself for that particular generation. For example, generation 0's finalization queue starts at address 0x003d3160 and ends at address 0x003d3178. We can use the dd command to dump the queue as shown here:
0:004> dd 003d3160 l6
003d3160  01fc1df0 01fc5090 01fc5964 01fc5998
003d3170  01fc683c 01fc6850

The elements in the queue can be looked at further by using the do command. If we want to look at the object at address 0x01fc5964 in more detail, we would use the command shown here:
0:004> !do 01fc5964
Name: Advanced.NET.Debugging.Chapter5.NativeEvent
MethodTable: 00123128
EEClass: 00121804
Size: 12(0xc) bytes
 (C:\ADNDBin\05Finalize.exe)
Fields:
      MT    Field   Offset                 Type VT     Attr    Value Name
791016bc  4000001        4        System.IntPtr  1 instance      1f0 nativeHandle

The next piece of useful information from the FinalizeQueue command is the f-reachable queue, which is shown in the following output:
Ready for finalization 0 objects (000c3178->000c3178)

The output indicates that at this point there are no objects that are ready to be finalized. This makes perfect sense because a garbage collection has not yet occurred.
The final piece of output in the FinalizeQueue command is the statistics section, which shows a summarized list of all objects in either the finalization queue or the f-reachable queue.
Before we resume execution, we need to discuss the magic finalization thread that exists in all managed processes. What does the stack trace of this thread look like? To find the answer, use the ~*kn command to display the stack traces of all the threads in the process including frame numbers. In the output, one thread in particular looks interesting:
2 Id: 1a10.c10 Suspend: 1 Teb: 7ffdd000 Unfrozen # ChildEBP RetAddr 00 011cf604 77709254 ntdll!KiFastSystemCallRet 01 011cf608 7618c244 ntdll!ZwWaitForSingleObject+0xc 02 011cf678 79e789c6 KERNEL32!WaitForSingleObjectEx+0xbe 03 011cf6bc 79e7898f mscorwks!PEImage::LoadImage+0x1af 04 011cf70c 79e78944 mscorwks!CLREvent::WaitEx+0x117 05 011cf720 79ef2220 mscorwks!CLREvent::Wait+0x17 06 011cf73c 79fb997b mscorwks!WKS::WaitForFinalizerEvent+0x4a 07 011cf750 79ef3207 mscorwks!WKS::GCHeap::FinalizerThreadWorker+0x79 08 011cf764 79ef31a3 mscorwks!Thread::DoADCallBack+0x32a 09 011cf7f8 79ef30c3 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3 0a 011cf834 79fb9643 mscorwks!Thread::ShouldChangeAbortToUnload+0x30a 0b 011cf85c 79fb960d mscorwks!ManagedThreadBase_NoADTransition+0x32 0c 011cf86c 79fba09b mscorwks!ManagedThreadBase::FinalizerBase+0xd 0d 011cf8a4 79f95a2e mscorwks!WKS::GCHeap::FinalizerThreadStart+0xbb 0e 011cf93c 76184911 mscorwks!Thread::intermediateThreadProc+0x49 0f 011cf948 776ee4b6 KERNEL32!BaseThreadInitThunk+0xe 10 011cf988 776ee489 ntdll!__RtlUserThreadStart+0x23 11 011cf9a0 00000000 ntdll!_RtlUserThreadStart+0x1bFrames 6 and 7 in the stack trace indicate that in fact this is the finalizer thread for the process. Frame 6 in particular shows that the thread is currently waiting for finalizer events (or objects that need to be finalized). Let's set a breakpoint on the return address of frame 6 (0x79fb997b), which will trigger any time the finalizer thread is awakened to perform work:
bp 79fb997bWhen the breakpoint is set, resume execution and press any key to trigger the first garbage collection. You'll notice that a breakpoint is hit, as shown in the following:
0:003> g Breakpoint 0 hit eax=00000001 ebx=00000001 ecx=7618c42d edx=77709a94 esi=00000000 edi=00493a48 eip=79fb997b esp=00b7f768 ebp=00b7f770 iopl=0 nv up ei pl nz na po nc cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202 mscorwks!WKS::GCHeap::FinalizerThreadWorker+0x79: 79fb997b 3bde cmp ebx,esiThe breakpoint corresponds to the finalizer thread breakpoint set earlier and indicates that the finalizer is ready to execute the Finalize methods on the objects in the f-reachable queue. How do we find out what objects are in the f-reachable queue? You guessed it: by using the FinalizeQueue command:
0:002> !FinalizeQueue SyncBlocks to be cleaned up: 0 MTA Interfaces to be released: 0 STA Interfaces to be released: 0 –––––––––––––––––––––––––––––––––– generation 0 has 0 finalizable objects (003d3170->003d3170) generation 1 has 4 finalizable objects (003d3160->003d3170) generation 2 has 0 finalizable objects (003d3160->003d3160) Ready for finalization 2 objects (003d3170->003d3178) Statistics: MT Count TotalSize Class Name 00123128 1 12 Advanced.NET.Debugging.Chapter5.NativeEvent 7911c9c8 1 20 Microsoft.Win32.SafeHandles.SafePEFileHandle 791037c0 1 20 Microsoft.Win32.SafeHandles.SafeFileMappingHandle 79103764 1 20 Microsoft.Win32.SafeHandles.SafeViewOfFileHandle 79101444 1 20 Microsoft.Win32.SafeHandles.SafeFileHandle 790fe704 1 56 System.Threading.ThreadThis time, the output states that there are two objects in the f-reachable queue, starting at address 0x003d3160, that the finalization thread is about to execute. If we dump out the contents of the f-reachable queue and each of the objects, we can see the following:
0:002> dd 003d3170 l2 003d3170 01fc5090 01fc5964 0:002> !do 01fc5090 Name: Microsoft.Win32.SafeHandles.SafePEFileHandle MethodTable: 7911c9c8 EEClass: 791fb61c Size: 20(0x14) bytes (C:\Windows\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll) Fields: MT Field Offset Type VT Attr Value Name 791016bc 40005c1 4 System.IntPtr 1 instance 3eab28 handle 79102290 40005c2 8 System.Int32 1 instance 4 _state 7910be50 40005c3 c System.Boolean 1 instance 1 _ownsHandle 7910be50 40005c4 d System.Boolean 1 instance 1 _fullyInitialized 0:002> !do01fc5964 Name: Advanced.NET.Debugging.Chapter5.NativeEvent MethodTable: 00123128 EEClass: 00121804 Size: 12(0xc) bytes (C:\ADNDBin\05Finalize.exe) Fields: MT Field Offset Type VT Attr Value Name 791016bc 4000001 4 System.IntPtr 1 instance 1f0 nativeHandleThe first object is of type SafePEFileHandle and the second object is of type NativeEvent, which happens to be the object we are interested in. If we resume execution, the finalizer thread executes the Finalize method of our NativeEvent class. What happens to the objects on the f-reachable queue after finalization has completed? Well, the objects are removed from the f-reachable queue, which renders them rootless; they will be collected during the next garbage collection.
This concludes our discussion of finalization. As you can see, there is a lot of work being done under the hood whenever a finalizable type comes into play. Not only does the CLR need additional data structures (such as the finalization queue and f-reachable queue), but it also spins up a dedicated thread to run the Finalize methods for the objects that are being collected. Furthermore, an object with a Finalize method does not get collected in just one garbage collection, but rather two, which in essence means that objects with Finalize methods always get promoted to generation 1 before they are truly dead, making them far more expensive objects to work with.
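The two-collection life cycle can also be observed from code. The following standalone snippet (an illustrative sketch, not one of the chapter's listings) forces a collection, waits for the finalization thread to drain the f-reachable queue via GC.WaitForPendingFinalizers, and then collects again so the finalized object's memory can actually be reclaimed:

using System;

class FinalizationCostDemo
{
    class Finalizable
    {
        ~Finalizable() { Console.WriteLine("Finalizer ran"); }
    }

    static void Main()
    {
        Finalizable f = new Finalizable();
        f = null;                        // object is now rootless, but it has a finalizer

        GC.Collect();                    // first collection: object moves to the f-reachable queue
        GC.WaitForPendingFinalizers();   // block until the finalization thread has run the finalizer
        GC.Collect();                    // second collection: the memory can now be reclaimed

        Console.WriteLine("Done");
    }
}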
Reclaiming GC Memory
We have discussed the GC in quite a bit of detail. We now know exactly what the GC does when an object is considered garbage. The one missing piece of information is what the GC does with the memory that becomes available after an object is garbage collected. Does the memory get put on some sort of free list and then reused when another allocation request arrives? Does the memory get freed? Is fragmentation ever a problem on the managed heap? The answer is a combination of all three. If a collection that occurs in generations 0 and 1 leaves a gap on the managed heap, the garbage collector compacts all live objects so that they reside next to each other and coalesces any free blocks on the managed heap into a larger block that is located after the last live object (starting at the current allocation pointer). Figure 5-8 shows an example of the compacting and coalescing.
Figure 5-8 Garbage collection compacting and coalescing phase
In Figure 5-8,
the initial state of the managed heap contains five rooted objects (A
through E). At some point during execution, objects B and D become
rootless and are candidates to be reclaimed during a garbage collection.
When the garbage collection occurs, the memory occupied by objects B
and D is reclaimed, which leads to gaps on the managed heap. To remove
these gaps, the garbage collector compacts the remaining live objects
(Obj A, C, and E) and coalesces the two free blocks (used to hold Obj B
and D) into one free block. Lastly, the current allocation pointer is
updated as a result of the compacting and coalescing.

The ephemeral segment contains both generation 0 and generation 1 (and also part of generation 2), but generation 2 can consist of multiple managed heap segments. As more and more objects make it to generation 2, the need to grow generation 2 also increases. The way that the CLR heap manager grows generation 2 is by allocating more segments. When objects in generation 2 are collected, the CLR heap manager decommits memory in the segments, and when a segment is no longer needed, it is freed entirely. In certain situations and allocation patterns, generation 2 grows and shrinks quite frequently, leading to a large number of calls to allocate and free virtual memory (the VirtualAlloc and VirtualFree APIs). Two common drawbacks of this approach are that these calls can be expensive (a transition to kernel mode is required) and that they can fragment the VM address space. As such, CLR 2.0 introduces a feature called VM hoarding, which essentially does not free segments but rather keeps them on a standby list to be utilized when more memory is required. To utilize the VM hoarding feature, the CLR host itself must specify that it wants to use it.
Because the cost of a compaction is directly proportional to the size of the object (the bigger the object, the costlier the compaction), the garbage collector introduces another type of heap called the large object heap (LOH). Objects that are large enough to severely hurt the performance of a compaction are placed on the LOH, which we will discuss next.
Large Object Heap
The large object heap (LOH) consists of objects that are greater than or equal to 85,000 bytes in size. The decision to separate objects of that size into their own heap is related to the fact that during the compacting phase of a garbage collection, the cost of compacting an object is directly proportional to the size of the object being compacted. Rather than having large objects on the standard heap eating up garbage collection time during compaction, the LOH was created. The LOH is best viewed as an extension of generation 2, and the LOH is collected only when a generation 2 collection occurs, implying that a collection of the LOH is only done during a full garbage collection. Because compacting large objects is very expensive, the GC avoids compacting the LOH altogether and instead uses a process known as sweeping, which maintains a free list to keep track of available memory in the LOH segment(s). Figure 5-9 shows an example of a LOH with two segments.
Figure 5-9 LOH example
Please note that although the LOH does not perform any compaction, it
does do coalescing of adjacent free blocks. That is, if you ever end up
with two free adjacent blocks, the GC coalesces those blocks into a
larger block and adds it to the free list (while also removing the two
smaller blocks).

To find out the current state of the LOH in the debugger, we can again use the eeheap –gc command, which includes details on the LOH:
0:004> !eeheap -gc Number of GC Heaps: 1 generation 0 starts at 0x01fc6c18 generation 1 starts at 0x01fc100c generation 2 starts at 0x01fc1000 ephemeral segment allocation context: none segment begin allocated size 00308030 790d8620 790f7d8c 0x0001f76c(128876) 01fc0000 01fc1000 01fc8c24 0x00007c24(31780) Large object heap starts at 0x02fc1000 segment begin allocated size 02fc0000 02fc1000 02fc3240 0x00002240(8768) Total Size 0x295d0(169424) –––––––––––––––––––––––––––––– GC Heap Size 0x295d0(169424)The LOH section in the command output shows the starting point of the LOH as well as per-segment information such as the segment, start, and end address of the segment and total size of the segment. In the preceding example, we can see that the LOH has one segment (0x02fc000) starting at address 0x02fc1000 and ending at 0x02fc3240 with a total size of 0x00002240. The last piece of information is the total size of all segments in the LOH. One interesting question related to the LOH is how the contents of the LOH can be dumped. There are a couple of options that both revolve around using DumpHeap command switches. The first switch of interest is the –min switch, which tells the DumpHeap command that you are only interested in objects of the specified size. Because we know that LOH objects are greater than or equal to 85,000 bytes in size, we can use the following command:
0:004> !DumpHeap -min 85000 Address MT Size 02c53250 7912dae8 100016 total 1 objects Statistics: MT Count TotalSize Class Name 7912dae8 1 100016 System.Byte[]Here, we can see that there is one object of size 100016 on the LOH. You can verify or convince yourself that the object is in fact on the LOH by looking at the address. If the address of the object falls within the LOH segments addresses, it must be located on the LOH (with the exception of free objects, which can reside both in the SOH as well as the LOH).
The next option we have is to specify a starting address for the DumpHeap command. If we specify the starting address of the LOH, we can ask the command to dump out all objects on the LOH. The switch to use is the –startAtLowerBound switch, which takes the address as a parameter. Using the same LOH as earlier, the following command can be used:
0:004> !DumpHeap -startAtLowerBound 02c51000 Address MT Size 02c51000 002a6360 16 Free 02c51010 7912d8f8 4096 02c52010 002a6360 16 Free 02c52020 7912d8f8 4096 02c53020 002a6360 16 Free 02c53030 7912d8f8 528 02c53240 002a6360 16 Free 02c53250 7912dae8 100016 02c6b900 002a6360 16 Free total 9 objects Statistics: MT Count TotalSize Class Name 002a6360 5 80 Free 7912d8f8 3 8720 System.Object[] 7912dae8 1 100016 System.Byte[] Total 9 objectsAgain, we see the object of size 100016, but even more interesting is that we see objects that are smaller than 85,000 bytes on the LOH. What are these objects and how did they end up on the LOH? The answer is that these very, very small objects are placed there by the CLR heap manager, which uses them for its own purposes. Generally speaking, you always see a select few objects with a size less than 85,000 bytes exclusively used by the GC.
Let's take a look at a small sample application that allocates a single large object of size 10,000 bytes (see Listing 5-5). We will then use the debuggers to see if we can locate the object on the LOH and see what happens when the object is collected.
Listing 5-5. Sample application demonstrating LOH
using System;
using System.Text;
using System.Runtime.InteropServices;

namespace Advanced.NET.Debugging.Chapter5
{
    class LOH
    {
        static void Main(string[] args)
        {
            LOH l = new LOH();
            l.Run();
        }

        public void Run()
        {
            byte[] b = null;

            Console.WriteLine("Press any key to allocate on LOH");
            Console.ReadKey();

            b = new byte[100000];

            Console.WriteLine("Press any key to GC");
            Console.ReadKey();

            b = null;
            GC.Collect();

            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }
    }
}

The source code and binary for Listing 5-5 can be found in the following folders:
- Source code: C:\ADND\Chapter5\LOH
- Binary: C:\ADNDBin\05LOH.exe
0:004> !eeheap -gc Number of GC Heaps: 1 generation 0 starts at 0x01f01018 generation 1 starts at 0x01f0100c generation 2 starts at 0x01f01000 ephemeral segment allocation context: none segment begin allocated size 004a8008 790d8620 790f7d8c 0x0001f76c(128876) 01f00000 01f01000 01f5c334 0x0005b334(373556) Large object heap starts at 0x02f01000 segment begin allocated size 02f00000 02f01000 02f03250 0x00002250(8784) Total Size 0x7ccf0(511216) –––––––––––––––––––––––––––––– GC Heap Size 0x7ccf0(511216) 0:004> !dumpheap -startatlowerbound 02f01000 Address MT Size 02f01000 00496360 16 Free 02f01010 7912d8f8 4096 02f02010 00496360 16 Free 02f02020 7912d8f8 4096 02f03020 00496360 16 Free 02f03030 7912d8f8 528 02f03240 00496360 16 Free total 7 objects Statistics: MT Count TotalSize Class Name 00496360 4 64 Free 7912d8f8 3 8720 System.Object[] Total 7 objectsWe start by finding the starting point of the LOH by using the eeheap command. The starting point in this case is 0x02f01000. Then, we feed the starting address to the dumpheap command using the –startatlowerbound switch to output all objects on the LOH. In the output, we can see that the only objects that are on the LOH are the mysterious object arrays that are smaller than 85,000 bytes. Other than that, we have no other objects present. Next, resume execution and again manually break execution when the Press any key to GC is shown.
We issue the same dumpheap command as before to see if we can spot our 100KB allocation:
0:003> !dumpheap -startatlowerbound 02f01000 Address MT Size 02f01000 00496360 16 Free 02f01010 7912d8f8 4096 02f02010 00496360 16 Free 02f02020 7912d8f8 4096 02f03020 00496360 16 Free 02f03030 7912d8f8 528 02f03240 00496360 16 Free 02f03250 7912dae8 100016 02f1b900 00496360 16 Free total 9 objects Statistics: MT Count TotalSize Class Name 00496360 5 80 Free 7912d8f8 3 8720 System.Object[] 7912dae8 1 100016 System.Byte[] Total 9 objectsWe can see that our allocation is stored at address 0x02f03250 on the LOH. Next, we resume execution until we see the Press any key to exit prompt. At this point, a garbage collection has occurred, so let's see what the LOH looks like by using the same dumpheap command again:
0:003> !dumpheap -startatlowerbound 02f01000 Address MT Size 02f01000 00496360 16 Free 02f01010 7912d8f8 4096 02f02010 00496360 16 Free 02f02020 7912d8f8 4096 02f03020 00496360 16 Free 02f03030 7912d8f8 528 total 6 objects Statistics: MT Count TotalSize Class Name 00496360 3 48 Free 7912d8f8 3 8720 System.Object[]This time, we can see how the object has been removed from the LOH and the free blocks available as a result of the collection.
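As a quick programmatic cross-check, a freshly allocated large object reports generation 2 from GC.GetGeneration because the LOH is collected along with generation 2. The following standalone snippet (an illustrative sketch, not one of the chapter's listings) contrasts a small and a large array; the exact behavior right at the 85,000-byte boundary depends on object overhead, so the sizes chosen here stay well clear of it.

using System;

class LohThresholdDemo
{
    static void Main()
    {
        byte[] small = new byte[80000];    // well under the LOH threshold
        byte[] large = new byte[100000];   // well over the LOH threshold

        // A fresh small array starts out in generation 0; a fresh large array
        // reports generation 2 because the LOH is collected with generation 2.
        Console.WriteLine(GC.GetGeneration(small));   // typically 0
        Console.WriteLine(GC.GetGeneration(large));   // typically 2
    }
}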
Pinning
As we saw in the Reclaiming GC Memory section, the garbage collector employs a technique known as compaction to reduce fragmentation on the GC heap. When a compaction occurs, objects may end up moving around on the heap so that they can be placed together, thereby avoiding gaps. As part of the object move, because the address of the object changes, all references to the object are also updated. This works well assuming all references to the object are contained within the CLR, but quite often it is necessary for .NET applications to work outside of the boundary of the CLR by using the interoperability services (such as platform invocation or COM interoperability). If a reference to a managed object is passed to an underlying native API, the object might be moved while the native API is reading and/or writing to the memory, causing serious problems because the CLR clearly cannot notify the native API of the address change. Figure 5-10 illustrates the problem.
Figure 5-10 Interoperability services and GC compaction problem
From the flow in Figure 5-10, we can see that the initial state of the managed heap includes five objects starting with Obj A at address 0x02000000. At a certain point, a platform invocation call to an asynchronous native API is required. Furthermore, the address of Obj C (0x02000090) needs to be passed to the API. Upon successfully calling the asynchronous native API, a garbage collection occurs causing Obj A and Obj B
to be collected. This leaves a gap of two free objects on the managed
heap and the garbage collector dutifully rectifies the problem by
compacting the managed heap and therefore moving Obj C to address 0x02000000.
It also coalesces the two free blocks and places them at the end of the
heap. After the garbage collection has finished, the asynchronous API
call we made earlier decides to write to the address initially passed to
it (0x02000090), which originally held Obj C. As you
can see, with the asynchronous API writing to that address, we will
experience a managed heap corruption as the memory is no longer occupied
by Obj C.

Because the invocation of native code is such a common task, a solution had to be devised that allowed for safe invocation in light of a compacting garbage collector. The solution is called pinning and refers to the capability to pin specific objects on the managed heap. When an object is pinned, the garbage collector will not move the object for any reason until the object is unpinned. If Obj C in Figure 5-10 had been pinned prior to invoking the asynchronous native API, the managed heap corruption would not have occurred because the garbage collector would not have moved Obj C during the compaction phase.
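It is worth noting that for the common case of a synchronous native call, C# also offers the fixed statement, which pins an object only for the duration of a block. The following is a minimal sketch (not one of the chapter's listings; it must be compiled with /unsafe, and the FillBuffer import from a hypothetical native.dll is made up purely for illustration):

using System;
using System.Runtime.InteropServices;

class FixedPinningSketch
{
    // Hypothetical native function (illustration only) that writes into a caller-supplied buffer.
    [DllImport("native.dll")]
    static extern unsafe void FillBuffer(byte* buffer, int size);

    static unsafe void Main()
    {
        byte[] data = new byte[256];

        // The fixed statement pins 'data' for the duration of the block, so the
        // GC cannot move the array while native code writes into it.
        fixed (byte* p = data)
        {
            FillBuffer(p, data.Length);
        }
        // When the block exits, 'data' is automatically unpinned.
    }
}

This only helps when the native call completes before the fixed block exits; for asynchronous scenarios like the one in Figure 5-10, a longer-lived pin obtained through GCHandle, as the sample that follows demonstrates, is required.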
Let's take a look at an example of a simple application that performs pinning and see what it looks like in the debugger. Listing 5-6 shows the source code of the application.
Listing 5-6. Sample application using pinning
using System;
using System.Text;
using System.Runtime.InteropServices;

namespace Advanced.NET.Debugging.Chapter5
{
    class Pinning
    {
        static void Main(string[] args)
        {
            Pinning p = new Pinning();
            p.Run();
        }

        public void Run()
        {
            SByte[] b1 = null;
            SByte[] b2 = null;
            SByte[] b3 = null;

            Console.WriteLine("Press any key to alloc");
            Console.ReadKey();

            b1 = new SByte[100];
            b2 = new SByte[200];
            b3 = new SByte[300];

            GCHandle h1 = GCHandle.Alloc(b1, GCHandleType.Pinned);
            GCHandle h2 = GCHandle.Alloc(b2, GCHandleType.Pinned);
            GCHandle h3 = GCHandle.Alloc(b3, GCHandleType.Pinned);

            Console.WriteLine("Press any key to GC");
            Console.ReadKey();

            GC.Collect();

            Console.WriteLine("Press any key to exit");
            Console.ReadKey();

            h1.Free();
            h2.Free();
            h3.Free();
        }
    }
}

The source code and binary for Listing 5-6 can be found in the following folders:
- Source code: C:\ADND\Chapter5\Pinning
- Binary: C:\ADNDBin\05Pinning.exe
Resume execution of the application until you see the Press any key to GC prompt. At this point, we manually break execution and use a command called GCHandles. The GCHandles command displays a list of all the handles available in the process:
0:004> !GCHandles GC Handle Statistics: Strong Handles: 15 Pinned Handles: 7 Async Pinned Handles: 0 Ref Count Handles: 0 Weak Long Handles: 0 Weak Short Handles: 1 Other Handles: 0 Statistics: MT Count TotalSize Class Name 790fd0f0 1 12 System.Object 790feba4 1 28 System.SharedStatics 790fcc48 2 48 System.Reflection.Assembly 790fe17c 1 72 System.ExecutionEngineException 790fe0e0 1 72 System.StackOverflowException 790fe044 1 72 System.OutOfMemoryException 790fed00 1 100 System.AppDomain 790fe704 2 112 System.Threading.Thread 79100a18 4 144 System.Security.PermissionSet 790fe284 2 144 System.Threading.ThreadAbortException 7912ee44 3 636 System.SByte[] 7912d8f8 4 8736 System.Object[] Total 23 objectsThe GCHandles command walks the handle tables and looks for all types of different handles (strong, weak, pinned, etc.) and displays a summary of the results as well as a statistical section with detailed information on each type found. In the preceding output, we can see that we have 15 strong handles, 7 pinned handles, and 1 weak short handle. In addition, in the Statistics section, we can see the three SByte arrays that we allocated and pinned. The GCHandles command provides a good overview of the handle activity in any given process, but if further information is required, such as the type of handle for each of the types listed in the Statistics section, we have to use an additional command called objsize. One of the functions of the objsize command is to output the size of the object passed in as an argument. If no arguments are specified, it scans all the referenced objects in the process and outputs the size as well as other useful information:
0:004> !objsize
Scan Thread 0 OSTHread 2558
ESP:2fed54: sizeof(01d9599c) = 20 ( 0x14) bytes (Microsoft.Win32.SafeHandles.SafeFileHandle)
ESP:2fee18: sizeof(01d96d9c) = 312 ( 0x138) bytes (System.SByte[])
ESP:2fee20: sizeof(01d96c58) = 112 ( 0x70) bytes (System.SByte[])
ESP:2fee24: sizeof(01d96cc8) = 212 ( 0xd4) bytes (System.SByte[])
ESP:2fee30: sizeof(01d958b4) = 12 ( 0xc) bytes (Advanced.NET.Debugging.Chapter5.Pinning)
...
...
...
Scan Thread 2 OSTHread 2c80
DOMAIN(004DFD10):HANDLE(Strong):1c119c: sizeof(01d958a4) = 16 ( 0x10) bytes (System.Object[])
...
...
...
DOMAIN(004DFD10):HANDLE(WeakSh):1c12fc: sizeof(01d91de8) = 56 ( 0x38) bytes (System.Threading.Thread)
DOMAIN(004DFD10):HANDLE(Pinned):1c13e4: sizeof(01d96d9c) = 312 ( 0x138) bytes (System.SByte[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13e8: sizeof(01d96cc8) = 212 ( 0xd4) bytes (System.SByte[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13ec: sizeof(01d96c58) = 112 ( 0x70) bytes (System.SByte[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13f0: sizeof(02d93030) = 708 ( 0x2c4) bytes (System.Object[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13f4: sizeof(02d92020) = 4276 ( 0x10b4) bytes (System.Object[])
DOMAIN(004DFD10):HANDLE(Pinned):1c13f8: sizeof(01d9118c) = 12 ( 0xc) bytes (System.Object)
DOMAIN(004DFD10):HANDLE(Pinned):1c13fc: sizeof(02d91010) = 19332 ( 0x4b84) bytes (System.Object[])

The output has been abbreviated, but clearly shows that our SByte arrays have been pinned as shown by HANDLE(Pinned).
Although the notion of pinning objects solves the problem of movable objects during native code invocations, it presents a problem to the garbage collector: fragmentation, one of the very problems that compaction is meant to solve. If many pinned objects are interleaved with free blocks on the managed heap, situations may occur where there isn't enough contiguous free space available to satisfy an allocation request. Figure 5-11 shows a hypothetical example of a managed heap fragmented by excessive pinning.
Figure 5-11 Hypothetical example of a fragmented managed heap
In the layout illustrated in Figure 5-11, we can see several smaller free blocks intertwined with live objects (Obj A through D). If a garbage collection occurs, the layout of the managed heap will remain unchanged. The reason is simple: The garbage collector cannot perform a compaction because all of the live objects are pinned and hence cannot be moved. Because the free blocks are not adjacent, it cannot coalesce them either. Even though free blocks are available, memory allocation requests may in fact fail if the size of the requested allocation is greater than the largest contiguous free block (32 bytes in this hypothetical layout).
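To make the effect concrete, here is a minimal sketch (the array sizes and counts are arbitrary, chosen only to make the interleaving visible) that interleaves pinned and unpinned allocations and then forces a collection, leaving the heap peppered with small free gaps that cannot be compacted away:

using System;
using System.Runtime.InteropServices;

class FragmentationSketch
{
    static void Main()
    {
        // Arbitrary sizes and counts purely for illustration.
        byte[][] buffers = new byte[1000][];
        GCHandle[] pins = new GCHandle[500];

        // Interleave pinned and unpinned allocations.
        for (int i = 0; i < buffers.Length; i++)
        {
            buffers[i] = new byte[1024];
            if (i % 2 == 0)
            {
                pins[i / 2] = GCHandle.Alloc(buffers[i], GCHandleType.Pinned);
            }
        }

        // Drop the unpinned buffers and force a collection. The pinned buffers
        // cannot be moved, so the freed slots between them remain as small,
        // non-adjacent gaps instead of being compacted into one large block.
        for (int i = 1; i < buffers.Length; i += 2)
        {
            buffers[i] = null;
        }
        GC.Collect();

        Console.WriteLine("Heap now contains pinned objects interleaved with free gaps");
        Console.ReadKey();

        // Unpin everything once the experiment (or the native call) is done.
        for (int i = 0; i < pins.Length; i++)
        {
            pins[i].Free();
        }
    }
}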
We will take a look at a real-world managed heap fragmentation problem in detail later in the chapter.

Garbage Collection Modes
The last topic we will discuss is the set of modes that the garbage collector can run in. There are three primary modes of operation (a minimal configuration sketch follows the list):
- Nonconcurrent workstation
- Concurrent workstation
- Server
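Which mode is used is decided by the hosting process or by configuration rather than in code. As a minimal sketch (the file name MyApp.exe.config is only a placeholder), a .NET Framework application configuration file could request the server mode and disable concurrent collection as follows:

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <runtime>
    <!-- Request the server flavor of the GC (per-CPU heaps and GC threads). -->
    <gcServer enabled="true"/>
    <!-- Disable concurrent collection. -->
    <gcConcurrent enabled="false"/>
  </runtime>
</configuration>

On a single-processor machine the runtime typically falls back to the workstation flavor even if server GC is requested, so the setting is best treated as a request rather than a guarantee.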
I think I found a small typo – “Denormalization can be defined as coping of the same data..” – “coping” should be “copying”.
When relational databases began, there was a thing called ISAM, which was the system du jour for large databases at the time. The interesting part about ISAM in reference to NoSQL was that the application code navigated indexes to find the data, and the indexes were designed into the application. Changing these became a nightmare: the tight coupling of data and access methods was very difficult to change, and often it was easier to chuck it and start again. Relational databases provided the freedom to model the data, and then, as the system evolved and you needed to provide different queries for reporting and so on, you could tune your queries by adding indexes, adding columns here and there, and modifying the schema.
There was much argument at the time from the ISAM guys saying this couldn’t possibly work: how can you optimise your access paths without knowing them beforehand, query optimisation will be too slow, and so on. It’s funny to be seeing the same old arguments going back the other way.
I’ll pick one of your points above among the many – ‘Relational modeling is typically driven by the structure of available data’ – this is not true. Relational databases are designed with a view of what data needs to be stored for the application. You could be referring to queries – in which case this would be true, since querying and reporting, by their nature, can only give you the data you have. There is an incredible body of work on how to design databases to suit applications, not the other way round.
High transaction rates are not a problem with RDBMSes, and likewise availability, but here’s the rub: the present incumbents (Oracle et al.) are very expensive, and they are also complex – the black arts of being a DBA are legendary. So creating a high-availability platform using a commercial RDBMS is expensive because of licensing and specialized skills. So what’s happening, IMHO, is that we’re going back to the bad old days – not because it’s better, but because it’s cheaper: NoSQL is free, and for uncomplicated data models – like shopping sites – it probably will do. *However*, know what you’re trading off, because as your site gets larger and more complicated access paths are required – say the CEO wants sales reports every two weeks but your data structure stores them monthly – it will hit the wall.
So like everything, NoSQL is a trade-off – in this case cost versus coding hours, amongst other things. If your coding time is cheap, then by all means reproduce all the things an RDBMS does; however, know that you’ll be delaying the inevitable: something will come along that your NoSQL model doesn’t cover, and changing the model isn’t as easy as typing ‘Create Index…’.
A very nice exposition of NoSQL though, and it does have its place, much the same way as Microsoft Access does :-)
I would like to mention that many NoSQL systems provide data introspection tools, so it is often possible to do ad-hoc queries. Of course, many applications use their own binary data format, but in this case custom introspection tools are often developed.
NoSQL data modeling often starts from the application-specific queries as opposed to relational modeling:
Relational modeling is typically driven by structure of available data, the main design theme is ”What answers do I have?”
Relational databases are not very convenient for hierarchical or graph-like data modeling and processing
and I’m afraid all these statements are false. The implication is that RDBMSes are deficient and NoSQL is superior: again an incorrect inference, and that was my point in calling NoSQL a back-to-the-past technology, for the reasons I gave. As I’ve said, NoSQL has its place, but look at it with open eyes. If you said that NoSQL can be used for performance and cost reasons in particular cases, then I would be less inclined to argue.
I admit that the third statement may be controversial, but I don’t think that a judgement like “are not very convenient” can be true or false. There is a group of people, including me, who think that graph databases are more convenient than RDBMSes in certain cases, but of course there are different opinions.
I completely agree with you that performance, scalability and cost reasons are the main drivers of NoSQL. Obviously, complex data modeling is not an end in itself ;)
The real equation is that storage of data in a structured store represents a significant investment in software, hardware, and particularly in human capital. Then maintenance and updates to this store require significant additional investment of human capital.
It’s justified and it really works well, but things have changed. Here’s how:
- Cost of storage hardware has decreased by 1400x in past years.
- Cost of network transport has decreased by around 400x in the same time period.
- Data read times have improved only 12x in that same time.
So to get at data at the scale we now see becoming common, the data must be stored in a fashion that can be distributed over a large pool of compute resources.
In light of that, key-value stores now find themselves being the Belle of the Ball again.
The cost of insertion and maintenance for data in a Hadoop cluster is lower than in an RDBMS by a couple of orders of magnitude. It’s not completely clear yet, but the cost of design and maintenance will probably show a full order of magnitude improvement.
Ignore this paradigm evolution at your own peril.
I worked with eCommerce systems that used Nested Sets or Materialized Paths, but these structures were chosen to meet high performance requirements, say, thousands of requests/sec against a set of tens of thousands of products. But high performance doesn’t come for free – these structures are relatively difficult to implement and update. Of course I don’t know your functional and performance requirements, but from what you said (small data size, statistical processing) I can advise you to consider something straightforward – per-product documents in MongoDB, a search engine like Solr (which can be helpful for free-text data like product descriptions), or a standard relational DB. Complicated modeling should be avoided unless it is unavoidable because of performance requirements or whatever.
http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html
https://issues.apache.org/jira/browse/SOLR-3076
These are worth mentioning too.
Wrong. I have done rdb modeling for 20 years and we don’t do that. Statements like this undermine the integrity of your otherwise useful article.
Can I ask you what exactly is wrong?
Really nice overview of data modelling techniques for NoSQL databases.
If you don’t mind, I’d like to translate your article into Korean and publish it on my community website.
And I promise to respect all your rights and never use it for profit.
Would you give me permission to do so, please?
Thanks for your time and concern. I’m looking forward to hearing from you.
Great, long discussion on NoSQL data modeling.
Existing NoSQL systems are little more than bucketed hashtables, and provide terrible consistency properties (i.e. eventual consistency) as a result. HyperDex provides linearizability. It does so while supporting retrieval by secondary attributes. And it does so with very high performance.
I liked the first few bullet points, but the article went south around the middle, when it started assuming that the only way to support retrieval by secondary attributes is by building indices. This is a very RDBMS-centric view, and it poses problems for consistency. HyperDex’s internal data organization (called hyperspace hashing) enables it to sidestep these problems.
Overall, the author needs to do some further reading on recent developments in the NoSQL field.
Also, a note on all these controversies of RDB vs. NoSQL and How vs. What:
By saying “What answers do I have?” I think what Ilya meant is that SQL is by nature a declarative language. SQL tends to specify “what should be accomplished” without worrying about what technologies are used to produce the query results, and an RDBMS uses SQL to manipulate and define data.
On the other hand NoSQL tends to focus on “how” to process data based on data access patterns.
NoSQL could also be declarative; for example, by adding Hive on top of HBase, SQL can be used to query a NoSQL system.
An RDB could also be procedural, by using PL/SQL.
So from my point of view, the boundary between RDB and NoSQL being “what” versus “how” is getting less definite over time.
Also, (Voice in the Wind) mentioned we’re going back to the bad old days, but nothing is going back. Whether NoSQL uses denormalization or not, it is a perfectly normal chronological evolution of DB systems. In fact, the 5th RDB normal form is all about denormalization. NoSQL became a recent trend in DB systems because many DB engineers felt performance and consistency issues with RDBs in complex operations on large data.
Who knows? In a couple of years RDBs may find the right hardware and technologies to support large data.
As (Voice in the Wind) said, all DBs currently have trade-offs and their place. Even within NoSQL, key-value stores, Bigtable, and MongoDB all have their pros and cons.
Unfortunately, DB engineers today will need to be thoroughly accustomed to all of them to make the right decisions for business needs.
I need full information on NoSQL; I need to give a seminar on NoSQL. Please email me important docs.
ling…thanks!
Just a small typo: ”entires can be partitioned across multiple servers” – I guess you meant “entries”.
{
“skill” : “Math”,
“level” : “Low”,
“name” : “John”
}
{
“skill” : “Poetry”,
“level” : “High”,
“name” : “John”
}
Then you could run the query “skill:Math AND level:High”, and then remove all duplicate names.
“Relational modeling is typically driven by structure of available data” … Statements like this undermine the integrity of your otherwise useful article.
Robert,
Can I ask you what exactly is wrong?
I think a more accurate statement about relational modeling would be something like:
Relational modeling is typically driven by modeling the business.
Once the relational data model represents the business, update anomalies are avoided and reporting is just a matter of joining the appropriate structures (with the noted performance issues at scale).
I agree each modeling technique and platform has an appropriate use, based on data volumes, available funding, data sources, and access patterns.
Excellent post, thank you Ilya for putting in the effort to make it so thorough and professional!
Great post. I’m reblogging this for my reference and of course for spreading the information to others.
A good blog post. If it had even more photos, it might be even better.
Thanks -Seymour