SharpDevelop Community

David Srbecký's blog

Internals of SharpDevelop's Debugger

In this blog post, I will explain the internal workings of SharpDevelop's managed debugger.  It might be useful for anyone who wants to contribute to SharpDevelop or to use SharpDevelop's debugger for any other purpose (the debugger is an LGPL library independent of SharpDevelop, so you can easily reuse it).


Some terminology first.  A program which debugs another program is called the debugger.  The program which is being debugged is called the debuggee.  In this case SharpDevelop is the debugger and your HelloWorld program is the debuggee.

The debugger can start a new debuggee process or it can attach to an existing one.  While the debuggee is running, there is not much that the debugger can do.  Almost all operations are forbidden.  The debugger has to wait until the debuggee pauses - usually because a user's breakpoint is hit.  Once the debuggee is paused, the debugger can investigate its state - it can look at the callstack, read local variables and so on.  Stepping or pressing "Continue" will put the debuggee into the running state again.

Investigating variables

The first and most important thing to grasp is that the debugger and the debuggee are completely separate processes.  Memory spaces of different processes are strictly separated by the operating system and therefore the debugger cannot obtain a reference (pointer) to any object in the debuggee.  You might as well imagine the processes being on two different computers.  If the debugger wants to investigate the debuggee, it needs to use some form of interprocess communication.  The low-level COM API takes care of this and the debugging library provides the functionality in the Value class.  You can obtain an instance of the Value class by, for example, calling StackFrame.GetLocalVariableValue(string name).  The Value does not hold the actual value of the local variable; it instead acts as a reference to the value in the remote process.  If the value is of a primitive type like string or integer, you can simply request the actual content.  However, if the value is a class, you will have to enumerate its fields and properties and get the values for the ones that you are interested in.  You are of course free to get the fields of the new values as well and drill down as far as you want.
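The handle idea can be illustrated with a small model.  This is only a conceptual sketch in Python, not the debugger's actual API: FakeDebuggee, RemoteValue and get_local are hypothetical names, and the "remote process" is simulated by a dictionary that the debugger side only reaches through request-like methods.

```python
# Conceptual model only: these classes are hypothetical and are NOT the
# SharpDevelop debugger API. The "debuggee" is simulated by a dictionary.


class FakeDebuggee:
    """Stands in for the remote process; callers never touch _heap directly."""

    def __init__(self):
        # address -> (type name, payload); objects map field name -> address
        self._heap = {
            1: ("String", "Hello, world"),
            2: ("Int32", 42),
            3: ("Greeter", {"myHelloWorldMessage": 1, "count": 2}),
        }
        self._locals = {"greeter": 3}

    # The methods below play the role of interprocess requests.
    def address_of_local(self, name):
        return self._locals[name]

    def type_of(self, address):
        return self._heap[address][0]

    def read_primitive(self, address):
        type_name, payload = self._heap[address]
        if isinstance(payload, dict):
            raise TypeError(f"{type_name} is not a primitive")
        return payload

    def field_addresses(self, address):
        payload = self._heap[address][1]
        return dict(payload) if isinstance(payload, dict) else {}


class RemoteValue:
    """A handle to a value in the debuggee; it stores an address, not data."""

    def __init__(self, debuggee, address):
        self._debuggee = debuggee
        self._address = address

    @property
    def type_name(self):
        return self._debuggee.type_of(self._address)

    def primitive(self):
        # Primitives can be copied across the process boundary on request.
        return self._debuggee.read_primitive(self._address)

    def fields(self):
        # Objects are explored field by field; each field is another handle.
        addresses = self._debuggee.field_addresses(self._address)
        return {
            name: RemoteValue(self._debuggee, address)
            for name, address in addresses.items()
        }


def get_local(debuggee, name):
    """Hypothetical counterpart of asking a stack frame for a local variable."""
    return RemoteValue(debuggee, debuggee.address_of_local(name))


debuggee = FakeDebuggee()
greeter = get_local(debuggee, "greeter")
fields = greeter.fields()
message = fields["myHelloWorldMessage"].primitive()
print(greeter.type_name, sorted(fields), message)
```

Note that the debugger side never needs to know the "Greeter" type in advance: it discovers the field names at run time and asks for each value separately.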

There is one more good reason why this model is appropriate.  When the debugger's code was compiled, it did not know that the user would create a field called "myHelloWorldMessage" and therefore it could not reference it.  Even if a direct reference to the object in the other process was somehow available, the debugger would still have to use reflection to figure out what fields the object contains and then get their values one by one.  In fact, most of the debugger's API inherits from the abstract reflection classes, so if you are familiar with reflection, you should not have any problems with the debugging API.

Lifespan of values

The .NET garbage collector (GC) presents a significant complication to the debugger.  While the debuggee is paused, no code can be executed, including the garbage collector, so it is safe to investigate the debuggee as much and for as long as we want.  However, if the debuggee is resumed even for just a few instructions, the GC might have run and it might have moved all variables around in memory.  The GC takes care to update all references within the debuggee so that the debuggee does not even notice.  Unfortunately, it does not tell the debugger.  This means that whenever the debuggee is resumed, all of the debugger's Values become invalid because they might be pointing to the wrong memory.  The next time the debuggee is paused, the debugger has to obtain all values again.  This is a bigger problem than it might initially seem - getting the value of a property and calling Object.ToString() both require that the debuggee is resumed for a while so that the methods can be injected into the debuggee and executed.  Imagine that you have used the tooltips to drill down to object "" which contains two properties - FirstName and Surname.  After you evaluate the "FirstName" property, all values will become invalid and you will have to obtain "" again just so that you can evaluate "Surname".

To get around this problem, SharpDevelop uses expressions to obtain the values.  That is, whenever it might need to use a value later, it stores the string expression rather than the value.  So when the user has "" open and expands "Person", SharpDevelop will first generate the expression "" and then it will evaluate it.  The expression evaluator has a cache which ensures that any recently evaluated values will be reused rather than obtained again (if they are still valid).
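The interplay between invalidation and the expression cache can be sketched as follows.  Again this is a conceptual Python model under stated assumptions, not the real implementation: all class names, the generation counter used to detect staleness, the dotted-path "expressions" and the sample data are hypothetical and chosen only for illustration.

```python
# Conceptual model only: hypothetical names, not the SharpDevelop API.


class StaleValueError(Exception):
    """Raised when a value is used after the debuggee has been resumed."""


class FakeDebuggee:
    def __init__(self, variables):
        self._variables = variables  # nested dicts stand in for objects
        self.generation = 0          # bumped every time the debuggee runs
        self.lookups = 0             # counts simulated remote lookups

    def resume_briefly(self):
        """Simulates running debuggee code, e.g. a property getter."""
        self.generation += 1

    def lookup(self, path):
        self.lookups += 1
        node = self._variables
        for part in path.split("."):
            node = node[part]
        return node


class Value:
    """Handle that is only usable until the debuggee is resumed."""

    def __init__(self, debuggee, data):
        self._debuggee = debuggee
        self._data = data
        self._generation = debuggee.generation

    @property
    def is_valid(self):
        return self._generation == self._debuggee.generation

    def read(self):
        if not self.is_valid:
            raise StaleValueError("debuggee was resumed; obtain the value again")
        return self._data


class ExpressionEvaluator:
    """Keeps expressions as the stable key and values as a disposable cache."""

    def __init__(self, debuggee):
        self._debuggee = debuggee
        self._cache = {}

    def evaluate(self, expression):
        cached = self._cache.get(expression)
        if cached is not None and cached.is_valid:
            return cached  # still valid, so reuse it
        value = Value(self._debuggee, self._debuggee.lookup(expression))
        self._cache[expression] = value
        return value


debuggee = FakeDebuggee({"person": {"FirstName": "Jane", "Surname": "Doe"}})
evaluator = ExpressionEvaluator(debuggee)

person = evaluator.evaluate("person")
evaluator.evaluate("person")              # served from the cache
lookups_before_resume = debuggee.lookups

debuggee.resume_briefly()                 # e.g. a property was evaluated
stale = not person.is_valid               # the old handle is now unusable
surname = evaluator.evaluate("person.Surname").read()
print(lookups_before_resume, stale, surname, debuggee.lookups)
```

The GUI in this model would hold on to the string "person.Surname" rather than to a Value, so a resume costs at most one re-evaluation and never leaves it with a dangling handle.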

At one point in the past, the Value class was designed so that it would remember how it was obtained and automatically reevaluate itself if needed.  However, this approach turned out to be quite difficult to debug, since a relatively simple call could cause a complicated chain of events.  The expression-based approach is more explicit and thus allows better reasoning about the program - both in terms of behaviour and performance.

Debugger's components

 - COM API:  The low-level unmanaged debugging API of the .NET Framework.  It contains interfaces such as ICorDebug or ICorDebugManagedCallback.

 - COM wrappers:  Auto-generated thin layer over the COM API which makes it a bit easier to use.  It converts 'out' parameters to return values and tracks returned COM objects so that they can be explicitly released (this is necessary so that the debugger does not lock assemblies).  The layer also contains several hand-written methods that handle marshaling of strings and other objects.

 - NDebugger:  The debugging library itself.  It provides access to variables and types via a reflection-like interface.  It provides commands for setting breakpoints, stepping and pretty much everything you would expect from a debugger.

 - ExpressionEvaluator:  Extension on top of NDebugger which can evaluate C# expressions.  It depends on SharpDevelop's NRefactory.

 - AbstractTree:  This provides a GUI-independent model for the tree that you can see in the Local Variables pad or in the tooltips.

 - GUI: The actual GUI in SharpDevelop.  This level connects SharpDevelop with the debugging library.

 - Visualizers: Extensions in the GUI.





NoMoDo said:

I've been reading the code of some of the debugger parts, and it has been a very interesting, educating, and illuminating experience. Thanks so much for the code and this blog entry as well!

August 10, 2010 4:34 PM