Identifying High CPU in GC (.NET) because of LOH – using Windbg

I am sure most of us are aware that one of the common reasons for High CPU usage .NET is because of, percentage time spent on GC is high. There are lot of write up about this. Tess has amazing blog post  on this specifically, which explains in detail how to identify the symptoms. But one thing that I want share was the experience I had ,where in i could identify the real call stack which was causing allocation on LOH by have a break-point on the CLR Garbage collector itself. I am going to assume that you are aware of CLR GC and LOH and basic debugging using windbg.

FYI the Perfmon counter like “% Time spent in GC”, “Large Object Heap size” can identify we have an issue with allocation in LOH. But this only indicates there is a problem in High allocations that’s causing  GC to work hard and in turn causing high CPU usage . But does not point where the exact problem is .If there is a large code base and all of sudden if there is an issue like this it is hard to pinpoint where the root cause of the problem.

I am going to walk through an example on a button click the code allocates objects on LOH and identify it using windbg.

using System;
usingSystem.Collections.Generic;
usingSystem.Linq;
usingSystem.Windows.Forms;

namespaceWindowsFormsApplication1
{
public partial classForm1: Form
{
publicList<byte[]> buffer = newList<byte[]>();

publicForm1()
{
InitializeComponent();
}

private voidButton1Click(objectsender, EventArgs e)
{
AllocateinLOH();
}

voidAllocateinLOH()
{
var b = new byte[85001];
b.ToList().ForEach(
(x) => x = new byte());

buffer.Add(new byte[85001]);
}

}
}

In the above code AllocateinLOH will cause the heap allocations to LOH because the size is more than 85000 bytes. The next step is to launch the application and attach it to the debugger. To figure out the cause I could have got multiple memory dumps  and checked at every call stack when GC was happening . That is the harder way and we could be spending lot of time on that.

To get to solve the problem i always like to approach from the bottom of the stack. I was certain there should be a function within the CLR / GC which should be doing this and I was certain it has to be within MSCORWKS.dll. The next step was examine symbols using “x” command and here is the output

0:000> x mscorwks!wks::gc_heap::*

000007fe`e90d1340 mscorwks!WKS::gc_heap::alloc_allocated = <no type information>
000007fe`e89fd380 mscorwks!WKS::gc_heap::limit_from_size = <no type information>
000007fe`e8907e20 mscorwks!WKS::gc_heap::fix_older_allocation_area = <no type information>
000007fe`e90d1310 mscorwks!WKS::gc_heap::max_free_space_items = <no type information>
000007fe`e90cba38 mscorwks!WKS::gc_heap::gc_low = <no type information>
000007fe`e8dae150 mscorwks!WKS::gc_heap::grow_brick_card_tables = <no type information>
000007fe`e8a12d40 mscorwks!WKS::gc_heap::fix_generation_bounds = <no type information>
000007fe`e8a13040 mscorwks!WKS::gc_heap::fix_large_allocation_area = <no type information>
000007fe`e8a2c680 mscorwks!WKS::gc_heap::allocate_large_object = <no type information>
000007fe`e90c0478 mscorwks!WKS::gc_heap::slow = <no type information>
000007fe`e8ed8ec0 mscorwks!WKS::gc_heap::c_promote_callback = <no type information>

The above results are only partial because it was not necessary to get all the methods. The command “x mscorwks!wks::gc_heap::*” causes the debugger to get all the functions that are within the mscorwks dll with class name gc_heap. I knew it is gc_heap because prior to this I issued the command “x mscorwks!*gc*”. Because we don’t have private symbols for mscorwks we cannot see the type information. But we don’t need that for our purpose. The public symbols has FPO information for us to have break-point.

Form the above result the method name should be “allocate_large_object”

The next step was to put a break-point on the method

bp mscorwks!wks::gc_heap::allocate_large_object “!CLRStack”

And then click button and here is the callstack output

0:004> g
OS Thread Id: 0xf6f4 (0)
Child-SP         RetAddr          Call Site
000000000027e000 000007ff00190684 WindowsFormsApplication1.Form1.AllocateinLOH()
000000000027e060 000007feeab00555 WindowsFormsApplication1.Form1.Button1Click(System.Object, System.EventArgs)
000000000027e090 000007feeb1f5873 System.Windows.Forms.Control.OnClick(System.EventArgs)
000000000027e0d0 000007feeb1aadf7 System.Windows.Forms.Button.OnMouseUp(System.Windows.Forms.MouseEventArgs)
000000000027e130 000007feeb702b7d System.Windows.Forms.Control.WmMouseUp(System.Windows.Forms.Message ByRef, System.Windows.Forms.MouseButtons, Int32)
000000000027e200 000007feeab49b5a System.Windows.Forms.Control.WndProc(System.Windows.Forms.Message ByRef)
000000000027e3b0 000007feeab49954 System.Windows.Forms.ButtonBase.WndProc(System.Windows.Forms.Message ByRef)
000000000027e430 000007feeab53aa6 System.Windows.Forms.Button.WndProc(System.Windows.Forms.Message ByRef)
000000000027e460 000007feeab53945 System.Windows.Forms.Control+ControlNativeWindow.WndProc(System.Windows.Forms.Message ByRef)
000000000027e4b0 000007feeab52244 System.Windows.Forms.NativeWindow.Callback(IntPtr, Int32, IntPtr, IntPtr)
000000000027e560 000007fee8aab07a DomainBoundILStubClass.IL_STUB(Int64, Int32, Int64, Int64)
000000000027e810 000007feeab6e883 DomainBoundILStubClass.IL_STUB(MSG ByRef)
000000000027e990 000007feeab6e0f8 System.Windows.Forms.Application+ComponentManager.System.Windows.Forms.

UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32, Int32, Int32)
000000000027ebe0 000007feeab6db65 System.Windows.Forms.Application+ThreadContext.RunMessageLoopInner(Int32, System.Windows.Forms.ApplicationContext)
000000000027ed30 000007ff00190171 System.Windows.Forms.Application+ThreadContext.RunMessageLoop(Int32, System.Windows.Forms.ApplicationContext)
000000000027ed90 000007fee8aad502 WindowsFormsApplication1.Program.Main()
mscorwks!WKS::gc_heap::allocate_large_object:
000007fe`e8a2c680 4053            push    rbx

Bingo now at the bottom of the stack we can see the “allocate_large_object” and on the top of the stack it is the managed code that we wrote “AllocateinLOH”.  Now we have solved the reason behind the High CPU usage in GC because of LOH.

The “allocate_large_object” is not documented by Microsoft as a public API and I don’t know whether the name would be same going forward from .NET framework 4.0.  This holds good until .NET framework 3.5. The idea behind this is just digging in to the framework can give us some information which has saved us valuable time and effort.

 

Identifying High CPU in GC (.NET) because of LOH – using Windbg

I am sure most of us are aware that one of the common reasons for High CPU usage .NET is because of, percentage time spent on GC is high. There are lot of write up about this. Tess has amazing blog post  on this specifically, which explains in detail how to identify the symptoms. But one thing that I want share was the experience I had ,where in i could identify the real call stack which was causing allocation on LOH by have a break-point on the CLR Garbage collector itself. I am going to assume that you are aware of CLR GC and LOH and basic debugging using windbg.

FYI the Perfmon counter like “% Time spent in GC”, “Large Object Heap size” can identify we have an issue with allocation in LOH. But this only indicates there is a problem in High allocations that’s causing  GC to work hard and in turn causing high CPU usage . But does not point where the exact problem is .If there is a large code base and all of sudden if there is an issue like this it is hard to pinpoint where the root cause of the problem.

I am going to walk through an example on a button click the code allocates objects on LOH and identify it using windbg.

using System;
usingSystem.Collections.Generic;
usingSystem.Linq;
usingSystem.Windows.Forms;

namespaceWindowsFormsApplication1
{
public partial classForm1: Form
{
publicList<byte[]> buffer = newList<byte[]>();

publicForm1()
{
InitializeComponent();
}

private voidButton1Click(objectsender, EventArgs e)
{
AllocateinLOH();
}

voidAllocateinLOH()
{
var b = new byte[85001];
b.ToList().ForEach(
(x) => x = new byte());

buffer.Add(new byte[85001]);
}

}
}

In the above code AllocateinLOH will cause the heap allocations to LOH because the size is more than 85000 bytes. The next step is to launch the application and attach it to the debugger. To figure out the cause I could have got multiple memory dumps  and checked at every call stack when GC was happening . That is the harder way and we could be spending lot of time on that.

To get to solve the problem i always like to approach from the bottom of the stack. I was certain there should be a function within the CLR / GC which should be doing this and I was certain it has to be within MSCORWKS.dll. The next step was examine symbols using “x” command and here is the output

0:000> x mscorwks!wks::gc_heap::*

000007fe`e90d1340 mscorwks!WKS::gc_heap::alloc_allocated = <no type information>
000007fe`e89fd380 mscorwks!WKS::gc_heap::limit_from_size = <no type information>
000007fe`e8907e20 mscorwks!WKS::gc_heap::fix_older_allocation_area = <no type information>
000007fe`e90d1310 mscorwks!WKS::gc_heap::max_free_space_items = <no type information>
000007fe`e90cba38 mscorwks!WKS::gc_heap::gc_low = <no type information>
000007fe`e8dae150 mscorwks!WKS::gc_heap::grow_brick_card_tables = <no type information>
000007fe`e8a12d40 mscorwks!WKS::gc_heap::fix_generation_bounds = <no type information>
000007fe`e8a13040 mscorwks!WKS::gc_heap::fix_large_allocation_area = <no type information>
000007fe`e8a2c680 mscorwks!WKS::gc_heap::allocate_large_object = <no type information>
000007fe`e90c0478 mscorwks!WKS::gc_heap::slow = <no type information>
000007fe`e8ed8ec0 mscorwks!WKS::gc_heap::c_promote_callback = <no type information>

The above results are only partial because it was not necessary to get all the methods. The command “x mscorwks!wks::gc_heap::*” causes the debugger to get all the functions that are within the mscorwks dll with class name gc_heap. I knew it is gc_heap because prior to this I issued the command “x mscorwks!*gc*”. Because we don’t have private symbols for mscorwks we cannot see the type information. But we don’t need that for our purpose. The public symbols has FPO information for us to have break-point.

Form the above result the method name should be “allocate_large_object”

The next step was to put a break-point on the method

bp mscorwks!wks::gc_heap::allocate_large_object “!CLRStack”

And then click button and here is the callstack output

0:004> g
OS Thread Id: 0xf6f4 (0)
Child-SP         RetAddr          Call Site
000000000027e000 000007ff00190684 WindowsFormsApplication1.Form1.AllocateinLOH()
000000000027e060 000007feeab00555 WindowsFormsApplication1.Form1.Button1Click(System.Object, System.EventArgs)
000000000027e090 000007feeb1f5873 System.Windows.Forms.Control.OnClick(System.EventArgs)
000000000027e0d0 000007feeb1aadf7 System.Windows.Forms.Button.OnMouseUp(System.Windows.Forms.MouseEventArgs)
000000000027e130 000007feeb702b7d System.Windows.Forms.Control.WmMouseUp(System.Windows.Forms.Message ByRef, System.Windows.Forms.MouseButtons, Int32)
000000000027e200 000007feeab49b5a System.Windows.Forms.Control.WndProc(System.Windows.Forms.Message ByRef)
000000000027e3b0 000007feeab49954 System.Windows.Forms.ButtonBase.WndProc(System.Windows.Forms.Message ByRef)
000000000027e430 000007feeab53aa6 System.Windows.Forms.Button.WndProc(System.Windows.Forms.Message ByRef)
000000000027e460 000007feeab53945 System.Windows.Forms.Control+ControlNativeWindow.WndProc(System.Windows.Forms.Message ByRef)
000000000027e4b0 000007feeab52244 System.Windows.Forms.NativeWindow.Callback(IntPtr, Int32, IntPtr, IntPtr)
000000000027e560 000007fee8aab07a DomainBoundILStubClass.IL_STUB(Int64, Int32, Int64, Int64)
000000000027e810 000007feeab6e883 DomainBoundILStubClass.IL_STUB(MSG ByRef)
000000000027e990 000007feeab6e0f8 System.Windows.Forms.Application+ComponentManager.System.Windows.Forms.

UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32, Int32, Int32)
000000000027ebe0 000007feeab6db65 System.Windows.Forms.Application+ThreadContext.RunMessageLoopInner(Int32, System.Windows.Forms.ApplicationContext)
000000000027ed30 000007ff00190171 System.Windows.Forms.Application+ThreadContext.RunMessageLoop(Int32, System.Windows.Forms.ApplicationContext)
000000000027ed90 000007fee8aad502 WindowsFormsApplication1.Program.Main()
mscorwks!WKS::gc_heap::allocate_large_object:
000007fe`e8a2c680 4053            push    rbx

Bingo now at the bottom of the stack we can see the “allocate_large_object” and on the top of the stack it is the managed code that we wrote “AllocateinLOH”.  Now we have solved the reason behind the High CPU usage in GC because of LOH.

The “allocate_large_object” is not documented by Microsoft as a public API and I don’t know whether the name would be same going forward from .NET framework 4.0.  This holds good until .NET framework 3.5. The idea behind this is just digging in to the framework can give us some information which has saved us valuable time and effort.

 

Function hit count using Pseudo-Register in Windbg

What if we want to know the number of times a function was invoked. We can have “.echo” or “.printf” on break-point of a function and count the output manually. The better way to do this is using pseudo-registers.

In my previous post I had mentioned about alias inside the debugger. The debugger also provides User defined Pseudo-Registers for scripting inside the debugger. We can use them to manipulate values within our scripts. There are 20 of them from $t0,$t1.. $t19. These are pre-defined names that we cannot change.
The syntax to assign value to the register is “r”

r @$t0 = 0

FYI the default value is 0.

bp 000007ff0003c050  ”  r @$t0 =@$t0 + 1;gc “

The above command will increment the $t0 register by one more, every time the break-point is hit. And gc is to go to next conditional break-point. To get the output of the register we can use the expression evaluator command “?”

0:004> ? @$t0
Evaluate expression: 5 = 00000000`00000005

There quite a few places we can use these inside the debugger. Example we could calculate the total size of multiple instances of an object type inside the debugger. That is we could get !objsize of each instance for type DataTable and then total it, to get the total memory consumption of datatable within our process.

 

Conditional Breakpoint in .NET using Windbg

With my few years of production debugging .NET code ,one thing that has really helped me a lot is Windbg. Lot us of know that using sos, sosex and Windbg we should be able to troubleshoot most of the .NET Code. But certain tips / tricks makes us productive in those crucial moments. I am assuming that you are aware of basic usage of sos and windbg.

We know by using !bpmd command we can stick in a break-point on a method. But the issue is we would want to break in to the debugger only on a certain condition, very similar to VS.NET break-point condition.

Here is the sample code that i am going to be using to set the conditional break-point. This is a simple Winforms app.

  1: using System;
  2: using System.Windows.Forms;
  3:
  4: namespace WindowsFormsApplication2
  5: {
  6:     public partial class Form1 : Form
  7:     {
  8:         public int Foo;
  9:
 10:         public Form1()
 11:         {
 12:             InitializeComponent();
 13:         }
 14:
 15:         private void Button1Click(object sender, EventArgs e)
 16:         {
 17:             Test(textBox1.Text);
 18:         }
 19:
 20:         void Test(string s)
 21:         {
 22:             Console.WriteLine(s);
 23:         }
 24:
 25:     }
 26: }
 27:

In this sample code I would like to put a conditional break-point on the Test method. After attaching the application to windbg ,look for the object Form1 in the heap.

0:000> !dumpheap -type WindowsFormsApplication2.Form1
Address MT Size
0000000002c92548 000007ff004b7b78 480
total 1 objects
Statistics:
MT Count TotalSize Class Name
000007ff004b7b78 1 480 WindowsFormsApplication2.Form1
Total 1 objects

The next step was to get the address of the function Test.

0:000> .shell -ci “!dumpmt -md 000007ff004b7b78” FIND “Test”
000007ff004a3508 000007ff004b7af0 JIT WindowsFormsApplication2.Form1.Test(System.String)
.shell: Process exited

The reason to get address of the function is to use the native “bp” command to set the break-point. FYI the sos also uses only the built in bp command for setting the break-point. The difference is condition that we can pass to the break-point. In the above I use the .shell command to look for Test function address instead of manually looking for the Test function. The .shell command comes in very handy.

In this exercise I would like to break into the debugger only if the argument “s” matches certain condition. I set the bp on the method test using the command “bp 000007ff004a3508”. And here is the result when the break-point hits.

0:000> g
Breakpoint 0 hit
rax=0000000002d9fcd8 rbx=0000000000000000 rcx=0000000002c92548
rdx=0000000002d9fcd8 rsi=0000000000000001 rdi=0000000000000000
rip=000007ff004a3508 rsp=000000000028d3e8 rbp=000000000028d590
r8=000000000028cde0 r9=000007feed0b14c0 r10=000007feff469f20
r11=000007ff00060120 r12=00000000003a9460 r13=0000000000000202
r14=000000001b3e23b8 r15=0000000000030672
iopl=0 nv up ei pl nz na po nc
cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000206
000007ff`004a3508 e9f35a2c00 jmp 000007ff`00769000

FYI I am using a 64-bit machine and that’s the reason my pointers are much bigger than usual x86. We are interested on the argument “s”  that is passed to the method which is @rdx register “rdx=0000000002d7ed00”. To verify that we setting the break-point on the correct argument we can test it by using command

0:000> .printf “%mu”,@rdx+10
testa

Not many of them are aware of how to get just the string from string object , instead of the all additions from !dumpobj. The above command would get just the string . The command .printf contains “%mu” because it is null terminated unicode string and @rdx is the register which contains the argument “s”. The “@rdx+10” is the actual location of the string in memory and for the x86 it would be “@rdx+c” for the actual string. Now that we are sure the @rdx is the register we can build condition for the argument. Here it is

bp 000007ff004a3508 “.block {as /mu ${/v:cmp} @rdx+10; .if ( $spat( “${cmp}”, “*test*” ) ) { !clrstack; } .else { gc }}”

And here is the detailed explanation of the above condition within quotes. The .block command is used for alias evaluation. Alias is like variables within windbg. The “as /mu ${v:cmp} @rdx+10” command creates an string alias by name  of cmp which contains the value of argument “s”. This condition would be evaluated only when the functions first line of code is executed so @rdx will always have the value that is passed to the function. The next

“.if ( $spat( “${cmp}”, “*test*” )  ) { !clrstack; } .else { gc }}”

command is real crux where the code compares the alias cmp with “*test*”. Notice i am using a built-in function called $spat which is nothing but a string pattern function. So from the above condition i am instructing the break-point to give a callstack if the argument “s” has something like “*test*” . If not the command “gc” means go to the next conditional break-point,similar  F5 in VS.NET. So, for example if the function is called 40 times and of which we are interested only once when it is  something like “*test*” ,with the existing  !bpmd we would wasted our time 39 times.

The “” is a escape character in windbg ,i am using it because i would have to a string compare.

 

Identify and Patch .NET Code using Windbg

 

The last week was really an interesting one with debugging production code. I was debugging a Winforms application which was using .NET framework 3.5 version. The real problem was with the latest release of the code, there was bug which caused certain elements on the UI not to be displayed. This is was High priority bug and very important to the business.

The code that was causing this bug was an integer variable inside a class.

using System;
using System.Windows.Forms;

namespace WindowsFormsApplication2
{
    public partial class Form1 : Form
    {
        public int Foo;
        public Form1()
        {
            InitializeComponent();
        }
        private void Button1Click(object sender, EventArgs e)
        {
            if (Foo == 100) 
                this.label1.Visible = true;
        }
    }
}

The code was something like this. If Foo was equal 100 then the label was set to be visible. Reflector came in handy to disassemble  the code.With the latest code release the  “if (Foo == 100) “ condition was introduced.

The next step was to verify and validate that this condition was the reason for this bug. Though this looks very simple because i have shown a very contrived example which does not involve all the dependencies of the real world business application.

Fired up windbg looked up the heap for object type Form1 using dumpheap.

0:004> !dumpheap -type Form1

         Address               MT     Size

0000000002cb2548 000007ff004b7b68      480    
total 1 objects

Statistics:

              MT    Count    TotalSize Class Name

000007ff004b7b68        1          480 WindowsFormsApplication2.Form1

Total 1 objects

The next was figure of the offset of Foo. So used the !do command on Form1 instance. !do 0000000002cb2548 and here is the partial output of the command   

000007ff00107050  4001e9b     1780        System.Object  0   static 0000000002cb2d48 EVENT_MAXIMIZEDBOUNDSCHANGED

0000000000000000  4000002      1b8                       0 instance 0000000000000000 components

000007ff00694f40  4000003      1c0 …dows.Forms.Button  0 instance 0000000002cdc040 button1

000007ff006962c8  4000004      1c8 …ndows.Forms.Label  0 instance 0000000002cdc2f8 label1

000007ff002683d8  4000005      1d0         System.Int32  1 instance                0 Foo

So from the result i could identify that the Foo variable was on the 1d0 offset of the Form1 object.

After couple of test case runs i dumped the object and here was the output

000007ff00694f40  4000003      1c0 …dows.Forms.Button  0 instance 0000000002cdc040 button1

000007ff006962c8  4000004      1c8 …ndows.Forms.Label  0 instance 0000000002cdc2f8 label1

000007ff002683d8  4000005      1d0         System.Int32  1 instance               23 Foo

So the Foo’s value was now 23. I am sure most of us are used to updating the variable’s value in VS.NET using immediate window. I did something similar to that but instead used windbg.

In Windbg numbers are hex values, so to set the value as 100 it would have to be 64. You can fire up calc to figure this out or you use the command in windbg ?64

Evaluate expression: 100 = 00000000`00000064

FYI “?” expression evaluator in windbg. Now the final step of updating the Foo in memory. The command to do that is

ed 0000000002cb2548+1d0 64

“e” command is enter values in memory. “e” has many flavors like eu,ed,ea. And ed command is for updating  Double-word values. So ed is to update double-word value and the memory location is 0000000002cb2548+1d0 which is the Form1 memory location  along with Foo is offset. 

Voila here is the output of !dumpobj after updating the memory

000007ff002683d8  4000005      1d0         System.Int32  1 instance              100 Foo

Now we could make an emergency patch and be certain the patch would work with the fix already tested in production using the debugger. Having windbg in your toolbox is always saves a lot of time.  

 

Home cooked web monitoring use Rx

One of the key things in software is to write succinct , declarative and asynchronous code. The answer to that is Reactive Extensions for .NET 

I have been digging into Rx for sometime now. Though there isn’t much of documentation I have been kind of successful in getting certain things done with it.

One of requirement that came up from our Operations was to monitor some websites and notify someone if the site was not accessible. But the added constraints to this were check for the site status only at a certain interval and report failure only if it was more than certain percentage within a certain duration.

So it was like check the site status every 5 seconds , buffer the results for a minute and in the buffered response if it had more 3  failures then tweet someone about it.

And here is the code to solve the problem

 (from time in Observable.Interval(TimeSpan.FromSeconds(5))
 let req = WebRequest.Create("http://www.nonexisting.com")
 from res in Observable.
    FromAsyncPattern<WebResponse>(
        req.BeginGetResponse, req.EndGetResponse)()
 .Materialize()select res).Buffer(new TimeSpan(0, 0, 1, 0)).
    Select(failed => 
        failed.Where(
        n => n.Kind == NotificationKind.OnError)).
    Where(failed => failed.Count() > 3).
    Subscribe(x => Tweet("Nonexisting.com Failed thrice"));
Console.Read();

The “from time in Observable.Interval(TimeSpan.FromSeconds(5)” is for ensuring that an Observable is generated every 5 seconds to check the status of the website. In the next line the code creates a web request.

Using the FromAsyncPattern I was able to reduce all the plumbing code to handle async i/o calls for the web request.  The materialize is for the sequence to continue even if there is an exception. Here is an good write up on Materialize. And the rest is just the usual Linq where the code filters Notification type of error.

This was a fun exercise. I will continue to explore Rx and blog about it.

 

Resharper Template for F#

I have been hacking F# lately and I am big fan of ReSharper. So why not create a few Live Templates for the things i do daily in F#.

Download R# live template for F#. I am sure there are few more things that can be part of this template. I have posted the code in MSDN Code gallery so that others can find and contribute.