How to find the source of a memory leak in Windows 7?

I've got a Windows 7 machine that keeps running low on RAM. I can see the free RAM going down over a few hours until the machine becomes unresponsive. I've checked the process list and none of the processes uses that much RAM.

I've also checked the number of handles per process and various other indicators but still can't find why the machine runs out of RAM.

Is there any good way to check how the memory is used in Windows?

Edit

Here is the result of tasklist a few minutes before the machine becomes unresponsive:

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
System Idle Process              0 Services                   0         24 K
System                           4 Services                   0        300 K
smss.exe                       196 Services                   0      1,024 K
csrss.exe                      272 Services                   0      4,336 K
wininit.exe                    320 Services                   0      4,184 K
csrss.exe                      332 Console                    1      3,516 K
winlogon.exe                   372 Console                    1      6,316 K
services.exe                   416 Services                   0      8,112 K
lsass.exe                      432 Services                   0     10,088 K
lsm.exe                        440 Services                   0      3,664 K
svchost.exe                    548 Services                   0      8,152 K
svchost.exe                    620 Services                   0      6,564 K
svchost.exe                    660 Services                   0     15,764 K
LogonUI.exe                    724 Console                    1     18,428 K
svchost.exe                    768 Services                   0      7,992 K
svchost.exe                    828 Services                   0      9,724 K
svchost.exe                    852 Services                   0     28,092 K
svchost.exe                    176 Services                   0     13,096 K
spoolsv.exe                    824 Services                   0     10,608 K
svchost.exe                    952 Services                   0     11,632 K
svchost.exe                   1076 Services                   0      8,524 K
fshoster32.exe                1120 Services                   0      9,148 K
fsorsp.exe                    1200 Services                   0      8,036 K
fsgk32.exe                    1324 Services                   0      3,084 K
cygrunsrv.exe                 1552 Services                   0      5,852 K
conhost.exe                   1864 Services                   0      2,996 K
sshd.exe                      1896 Services                   0      7,804 K
FSMA32.EXE                    2024 Services                   0      1,628 K
svchost.exe                   1320 Services                   0      5,092 K
fssm32.exe                    1704 Services                   0      2,196 K
FSHDLL64.EXE                  2120 Services                   0        644 K
SearchIndexer.exe             3260 Services                   0     13,596 K
sshd.exe                    138920 Services                   0      8,696 K
sshd.exe                    138448 Services                   0      8,696 K
sshd.exe                    138660 Services                   0      8,696 K
bash.exe                    137924 Services                   0      5,380 K
bash.exe                    137820 Services                   0      3,832 K
SAV32CLI.EXE                136344 Services                   0    133,868 K
WmiPrvSE.exe                139444 Services                   0      7,168 K
sshd.exe                    139672 Services                   0      8,692 K
sshd.exe                    139876 Services                   0      8,684 K
bash.exe                    139992 Services                   0      5,432 K
bash.exe                    140040 Services                   0      3,996 K
bash.exe                    140200 Services                   0      5,400 K
bash.exe                    139424 Services                   0      4,048 K
typeperf.exe                139300 Services                   0      5,372 K
sleep.exe                   138268 Services                   0      2,272 K
sshd.exe                    139612 Services                   0      7,168 K
sshd.exe                    137720 Services                   0      5,700 K
bash.exe                    139524 Services                   0      5,304 K
bash.exe                    138952 Services                   0      3,756 K
tasklist.exe                137580 Services                   0      5,164 K
bash.exe                    139460 Services                   0      5,452 K
bash.exe                    139796 Services                   0        104 K

At that point, wmic OS get FreePhysicalMemory /Value reports about 400 MB of free memory out of 2 GB.
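To put a number on "none of them takes that much RAM", the Mem Usage column can be totalled and compared with the installed RAM. The following is a minimal Python sketch, not part of the original post. It assumes the English tasklist layout shown above, session names without spaces, and memory values ending in " K". SAMPLE holds only four rows copied from the listing; paste the full output to total everything. The names parse_tasklist and total_kb are made up for this example.

```python
import re

# Four rows copied from the tasklist output above; replace with the full listing.
SAMPLE = """\
Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
System Idle Process              0 Services                   0         24 K
svchost.exe                    852 Services                   0     28,092 K
SAV32CLI.EXE                136344 Services                   0    133,868 K
bash.exe                    139796 Services                   0        104 K
"""

# A data row is: image name, PID, session name, session number, "<digits,commas> K".
# The header and the "====" separator contain no PID digits, so they never match.
ROW = re.compile(
    r"^(?P<name>.+?)\s+(?P<pid>\d+)\s+\S+\s+\d+\s+(?P<kb>[\d,]+) K\s*$"
)


def parse_tasklist(text):
    """Return a list of (image_name, pid, mem_kb) tuples from tasklist text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            mem_kb = int(match.group("kb").replace(",", ""))
            rows.append((match.group("name"), int(match.group("pid")), mem_kb))
    return rows


def total_kb(rows):
    """Sum the Mem Usage column in kilobytes."""
    return sum(mem_kb for _, _, mem_kb in rows)


rows = parse_tasklist(SAMPLE)
print(f"{len(rows)} processes, {total_kb(rows) / 1024:.1f} MB in total")
```

As far as I know, Mem Usage is the per-process working set, so pages shared between processes are counted more than once and the total overstates process memory. If the total is still far below the RAM in use, the missing memory is probably held outside the process working sets (kernel pools, page tables, drivers), which this column cannot show.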

RamMap:

[screenshot of RamMap]

Task Manager:

[screenshot of Task Manager]

laurent

Posted 2014-06-04T18:30:11.283

Reputation: 5 258

Can you restart your computer and post a screenshot of your Task Manager processes? Please make sure to display processes from all users. If you have more than 50-60 after a reboot then there are definitely things you can do, but if nothing looks fishy then things could get tricky. – MonkeyZeus – 2014-06-04T18:41:27.220

Are you running out of virtual memory or physical memory? The cause and the solution are different. A low virtual memory warning is a configuration problem; being low on physical memory means you have too many processes running. – Ramhound – 2014-06-04T18:41:44.707

Do you have any evidence that the cause is a memory leak? It sounds like you have good evidence that it's not a memory leak. (Though it could be a busted driver, I guess.) – David Schwartz – 2014-06-04T19:01:56.553

Thanks for the feedback. I've added some more info to the post, in particular the task list. It is the free physical memory that keeps going down over time. Is it possible to check how it is used? – laurent – 2014-06-04T19:15:42.370

It seemed that sshd.exe and bash.exe were spawned multiple times. Do you run server apps on that machine? If you're interested in what spawned them, you may use Process Explorer to check process trees. – Scott Rhee – 2014-06-05T01:55:44.733

Post screenshots of RAMMap: http://technet.microsoft.com/en-us/sysinternals/ff700229.aspx – magicandre1981 – 2014-06-05T04:07:46.350

@ScottRhee, yes it's a vm that's being accessed via SSH. Then I guess there must be one instance of bash.exe per sshd.exe. – laurent – 2014-06-05T15:55:31.320

@magicandre1981, I have added a screenshot of RamMap. – laurent – 2014-06-05T16:27:29.713

To rule out bad drivers, have a look at Paged and Nonpaged Pool Sizes (for example in Process Explorer) - if they get too high, check PoolMon (but you'll probably need to debug your vm with a kernel debugger to get more details) – mihi – 2014-06-05T16:51:33.903

I'd set up a few perfmon performance counters (length of processor queue, disk activity percentage, swap disk activity percentage, page faults, resident ram, virtual ram size, to name a few) and have a look how they change over time if you can see any discrepancies in these values - if you know what exactly goes up/down, monitor it in more detail. – mihi – 2014-06-05T16:53:49.047

@mihi, thanks, I'm going to try this. I've also added a screenshot of Task Manager taken some time before the crash. In particular, I see there's a process with 9 million page faults, though I don't know if this is relevant. – laurent – 2014-06-05T17:52:52.697

As page faults is an accumulating counter, it does not tell much if you don't know how long the process has been running. (Page Fault Delta is better to see a current snapshot of what is swapping, and correlating page faults to uptime if you want to know if the machine has been out of RAM earlier) – mihi – 2014-06-06T17:12:42.407
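The difference between a cumulative counter and its delta can be sketched in a few lines of Python. This is only an illustration: the two samples below are made-up numbers, not measurements from this machine, and page_fault_deltas is a hypothetical helper. Each sample maps (PID, image name) to the cumulative page fault count read at one point in time.

```python
def page_fault_deltas(before, after):
    """Return {(pid, name): faults added between the two samples}, largest first.

    Processes that are missing from the first sample, or whose counter went
    backwards (PID reuse), are skipped because no meaningful delta exists.
    """
    deltas = {}
    for key, count in after.items():
        if key in before and count >= before[key]:
            deltas[key] = count - before[key]
    return dict(sorted(deltas.items(), key=lambda item: item[1], reverse=True))


# Made-up cumulative counts, taken some seconds apart.
sample_1 = {(1704, "fssm32.exe"): 9_000_000, (852, "svchost.exe"): 120_000}
sample_2 = {(1704, "fssm32.exe"): 9_000_150, (852, "svchost.exe"): 125_000}

for (pid, name), delta in page_fault_deltas(sample_1, sample_2).items():
    print(f"{name} (PID {pid}): +{delta} page faults")
```

In this invented example the process with 9 million total faults added only 150 during the interval, while a process with a much smaller total added 5,000, which is why the total alone says little about what is faulting right now.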

for me the rammap-screenshot looks quite good. 400+ mb of cached files, no leaks anywhere to be seen. @this.lau_: check out http://channel9.msdn.com/Events/TechEd/NorthAmerica/2011/WCL405 – akira – 2014-06-06T17:44:57.793

@akira the usage is bad. The page table entry is too high. I posted how to trace this. – magicandre1981 – 2014-06-15T08:47:52.793

@magicandre1981: "too high" implies you know proper values. what are they? – akira – 2014-06-15T08:59:17.347

@akira look at my values. Everything over 100MB is too much. – magicandre1981 – 2014-06-15T18:00:44.540

@magicandre1981: because of … ? you are not explaining where the "100mb" value comes from. – akira – 2014-06-15T18:42:23.937

Answers

8

The high memory usage comes from high page table usage. To see which processes are responsible, install the Windows Performance Toolkit, open a command prompt as admin and run this command:

xperf -on ReferenceSet -BufferSize 1024 -MaxFile 512 -FileMode Circular  && timeout 5 && xperf -d MemUsage.etl

Open the MemUsage.etl with Windows Performance Analyzer (WPA.exe), drag and drop the graph "ResidentSet" from the left graph list to the analysis pane:

[screenshot of the ResidentSet graph in WPA]

Now move the "Page Category" column to the left side and expand the "Page Table" entry:

[screenshot of the expanded Page Table entry]

Here you see the processes which have high page table usage. On the right side (after the blue line), you see the page table memory usage in MB for each process.
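If the capture has to be repeated, it can be scripted. The following is a rough Python sketch, not part of the original answer. It reuses exactly the xperf flags from the command above and assumes that xperf is on the PATH and that the script runs from an elevated prompt. It insists on an absolute output path because a relative name like MemUsage.etl is written to the current directory. The function names xperf_commands and capture are made up for this example.

```python
import subprocess
import time
from pathlib import PureWindowsPath


def xperf_commands(output_etl):
    """Build the start and stop command lines for a ReferenceSet trace."""
    out = PureWindowsPath(output_etl)
    if not out.is_absolute():
        # A relative name would land in whatever directory the prompt is in.
        raise ValueError(f"use an absolute path for the trace file: {output_etl}")
    start = ["xperf", "-on", "ReferenceSet", "-BufferSize", "1024",
             "-MaxFile", "512", "-FileMode", "Circular"]
    stop = ["xperf", "-d", str(out)]
    return start, stop


def capture(output_etl, seconds=5):
    """Start the trace, wait, then stop it and write the ETL file (Windows only)."""
    start, stop = xperf_commands(output_etl)
    subprocess.run(start, check=True)  # raises CalledProcessError if xperf fails
    time.sleep(seconds)                # same role as "timeout 5" in the one-liner
    subprocess.run(stop, check=True)
```

Calling capture(r"C:\MemUsage.etl") would then produce the file that is opened in WPA in the steps above.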

magicandre1981

Posted 2014-06-04T18:30:11.283

Reputation: 86 560

in windows8.1 i do not see 'residentset' as an available graph-option. how can i get it? a different xperf-flag? – akira – 2014-06-19T10:07:42.910

i answer my own comment: update xperf. – akira – 2014-06-19T10:29:29.323

This doesn't work for me in Windows 7 - the first call to xperf gives xperf: error: NT Kernel Logger: Invalid flags. (0x3ec). – benshepherd – 2014-07-15T07:42:59.217

@benshepherd for me it works. Make sure you use the latest WPT from the 8.1 SDK: https://www.dropbox.com/s/e5ol59a6n9g3ctb/Win7_xperf.png – magicandre1981 – 2014-07-15T17:54:10.803

I have xperf version 4.8.7701, downloaded from http://msdn.microsoft.com/en-us/library/windows/desktop/bg162891.aspx. As far as I can tell it's the latest. – benshepherd – 2014-07-16T06:42:11.090

@benshepherd this is an old version. Use the version from the Win8.1 SDK. – magicandre1981 – 2014-07-16T15:50:20.217

Thanks! So I followed your instructions - the vast majority of my page table is used by "Unknown"! https://www.dropbox.com/s/km5a91qxa4hpnmu/wpa-pagetable.png Now what?? – benshepherd – 2014-07-17T16:18:42.263

@benshepherd I have no idea. I asked someone from Microsoft. – magicandre1981 – 2014-07-18T03:52:45.600

@magicandre1981: Did you get any response? 7 days of uptime, and my page table is up to 950MB, 94% of which is "Unknown". – benshepherd – 2014-07-21T15:48:32.960

@benshepherd can you share the ETL file? (zip and upload it to a cloud service) – magicandre1981 – 2014-07-21T17:38:57.197

https://www.dropbox.com/s/edhppb76021htja/MemUsage.zip – benshepherd – 2014-07-21T17:41:47.383

@benshepherd I wrote a mail to my contact at MS and included the link. Let's hope they can see anything from it. – magicandre1981 – 2014-07-23T18:51:26.680

@benshepherd you should call Microsoft support about this issue. My contact stated that this is not normal and should be investigated by Support. – magicandre1981 – 2014-07-30T18:01:29.263

I also found "Unknown" (-1) was using the majority of the page table. @benshepherd did you find any further hints as to why this might be happening? – sparrowt – 2014-12-08T17:20:06.847

@sparrowt you should also phone Microsoft support about the issue. – magicandre1981 – 2014-12-09T05:15:25.607

I eventually discovered the problem: it was Lenovo "RapidBoot Shield", and removing it solved my problem: http://superuser.com/a/850346/79763 – sparrowt – 2014-12-09T10:09:28.070

I had more luck with "wprui.exe", which is also bundled with the Windows Performance SDK 8.1 and provides a UI for xperf. Specifically, I collected the "Resident Set analysis" profile. – alex – 2015-03-20T06:01:12.960

@alex this profile adds a lot of extra information (.net events, JavaScript, store app events in Win8.x) which are not needed. I prefer to only capture the data that I need. – magicandre1981 – 2015-03-20T19:22:34.187

I can't find where it stores MemUsage.etl. Running it again gives me an error saying the file already exists.. but where!? – B T – 2016-12-05T23:36:38.343

@BT it stores it in the current dir. if you open cmd as admin via the start menu this will be system32. you can use xperf -d C:\MemUsage.etl to store the file in C:\ (replace it with the folder you like) – magicandre1981 – 2016-12-06T05:09:43.317

@magicandre1981 It was giving me this misleading error: http://www.msfn.org/board/topic/155479-xperf-error-nt-kernel-logger-cannot-create-a-file-when-that-file-al/ – B T – 2016-12-06T06:17:23.593

do what I wrote on msfn. stop ProcessExplorer and any other tool that uses ETW and now run xperf. you can also run this command in Win8.1: wpr.exe -start ReferenceSet -filemode && timeout 5 && wpr.exe -stop C:\HighMemoryUsage.etl – magicandre1981 – 2016-12-06T16:03:19.913

0

the only "strange" things i can see here are these:

  • you have a process called scan.exe which is hogging 98% cpu right now
  • you have a process called fssm32.exe which has 9 million page faults. fssm32.exe looks like a program from the f-secure virus scanner.
  • you also have a process called SAV32CLI.EXE which hogs another 130mb+ of memory. it looks like you feel better protected running 2 virus scanners at the same time: f-secure and sophos.

the rammap-screenshot looks ok for me: you have a pool of ~400mb cached files, 85mb of them in active use, 300mb+ in standby mode (meaning they are freed as soon as you need more ram). looking at your amount of sshd.exe and bash.exe processes this looks legit as well.

the best way to learn the inner workings of how windows manages memory is this talk here: http://channel9.msdn.com/Events/TechEd/NorthAmerica/2011/WCL405 .. you are already using some of the tools the speaker developed.

akira

Posted 2014-06-04T18:30:11.283

Reputation: 52 754

my Microsoft contacts (Microsoft Premier Field Engineers) also told me that several hundred MB of page table is not normal for Windows. – magicandre1981 – 2014-07-30T18:02:54.357