I've only basic Windows Server knowledge and I've inherited responsibility for a Terminal Server installation with 20-30 concurrent users (Windows Server 2003).

There are intermittent performance problems, which I think ultimately come down to the low spec of the server (triple core, 4 GB memory using PAE). I'm trying to keep things running until a major upgrade later this year.

One thing I've noticed is that processes from various sessions often consume 100% of a core. I think the freezes happen when this occurs on several sessions at once. Is there anything I can do to limit the CPU use of individual sessions? Alternatively is it possible to reserve one core so that it is not used by individual sessions but is available to handle logins etc instead?

  • If it's choking on resources from logged-in users, you might not want to reserve a processor just so more can log in at that point. That's like making sure there's "just enough room" inside the door to fit more ravers at a dance club already overcrowded past fire code. – Bart Silverstrim Mar 26 '12 at 14:56
  • good point - though the active sessions can be as low as the dozen mark when the freezes occur, or as high as 30 –  Mar 26 '12 at 14:58
  • Sounds like there's a particular issue to narrow down. I'll elaborate slightly in my answer... – Bart Silverstrim Mar 26 '12 at 15:08
  • Are any specific processes showing up with regularity? @BartSilverstrim mentioned poorly written AV software, and I would like to call particular attention to this. I have seen Symantec Endpoint Protection periodically monopolize an 8-core, 80-user terminal server by launching the same UI process simultaneously for all 80 users; the kludge that the vendor recommended as a workaround stopped the freezes but introduced other problems. Switching to better-behaved AV software had numerous performance benefits throughout the organization. – Skyhawk Mar 26 '12 at 15:20
  • @Miles I've swapped AVG for Microsoft FEP already (disabling the GUI for non-admin users) and that improved things a certain amount. Thunderbird (3.1) seems to be the worst culprit but I'm loath to try a newer version in case it makes things worse. –  Mar 26 '12 at 15:34
  • @JackDouglas It's been a long time since I've evaluated Thunderbird in any context, but as of last time (a few years ago) I remember that it was *very* resource-hungry when it had to open a large mailbox. Seems like *everyone* has a large mailbox these days... – Skyhawk Mar 26 '12 at 15:39
  • @Miles very good point, I'll look into that and issues like compacting, expunging etc, thanks –  Mar 26 '12 at 15:46

1 Answer

You would want to look up resource quotas (memory and CPU quotas), as described in http://kurtsh.com/2007/07/16/howto-throttle-the-cpu-on-desktops-terminal-servers/ or http://technet.microsoft.com/en-us/library/cc732553.aspx, although they may be specific to Windows Server 2008. This should at least give you a starting point to search.
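On the "reserve one core" part of the question: Windows expresses processor affinity as a bitmask with one bit per core (this is the value the `SetProcessAffinityMask` API takes). The following is only a minimal sketch of how such a mask could be computed; the function name `affinity_mask` is hypothetical, and actually applying the mask to session processes is left out.

```python
def affinity_mask(total_cores, reserved):
    """Return a bitmask with one bit set for every core NOT in `reserved`.

    Hypothetical helper for illustration: bit N set means core N may be used.
    """
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    out_of_range = [c for c in reserved if not 0 <= c < total_cores]
    if out_of_range:
        raise ValueError(f"reserved cores out of range: {out_of_range}")

    mask = 0
    for core in range(total_cores):
        if core not in reserved:
            mask |= 1 << core  # allow this core

    if mask == 0:
        raise ValueError("at least one core must remain usable")
    return mask


# Three cores visible to the server, core 0 kept free for logins etc.
mask = affinity_mask(3, {0})
print(bin(mask))  # 0b110
```

With three cores, that leaves only two for 20-30 users, so treat it as something to test rather than a fix.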

When we ran terminal services, we found that restricting certain applications and practices could mitigate the type of issue you see, namely the resource hogging. (We were running on 2000 back then, though; things seem to have improved over time.)

Some users like screen savers on the terminal. Use policy to disable them.

Create policies for idle logoff.

Monitor certain habits such as running Flash animations in a loop. We had someone drive a terminal into the ground because they had The Weather Channel on a radar loop that leaked memory.

Use performance monitor to check for other constraints; poorly written AV software can bog you down when it launches a per-user monitoring instance, for example.

This is one of the few times where a fragmented disk can be bad since you have ~25 users with their own caches of tiny files scattered around the disk. Check for fragmentation and do an off-hours cleaning.

~25 users on a system like you described will indeed bog it down; that was about our limit on systems with terminal services before it affected other users. You can't expect miracles in tuning it at a certain point.

What RAID level are you running? Slow disk subsystems can cause bogdowns, especially if you have a morning rush of logins. Upgrade that and you may see a decent speed boost, although you said you're trying to nurse this one along until a major upgrade...

Monitor for unauthorized software installations. Doesn't take much for a certain type of software to hog everything.

Use utilities like Process Monitor (procmon) and Process Explorer from Sysinternals (free) to find what could be bogging your system down. There's no reason it should be freezing; it may be slow, but not locked up. Those utilities may help narrow down the root cause. They've been lifesavers for us at times when we'd otherwise be left scratching our heads.

That's about all I remember off the top of my head...hopefully others will have better advice on alleviating your resource shortage on the system.

NOTE - if you're getting this freeze with low numbers of users as well as high numbers, there could be a particular application or action causing the problem. In addition to the Sysinternals tools, I'd start looking for patterns. Is there anything in the logs? Who is logged in at the time, and what were they doing? If the screen freezes with their activity still showing, can you see what users were doing? What time of day does this happen? Can you get users to send you a note of what they were doing when it goes splat?
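As a rough sketch of the pattern hunting described above, assuming you have somehow collected periodic samples of per-process CPU use: the tuple layout (timestamp, user, process, CPU percent), the function name `frequent_hogs`, and the sample data are all made up for illustration, not output from any real tool.

```python
from collections import Counter


def frequent_hogs(samples, threshold=90.0, min_hits=2):
    """Return (user, process) pairs seen at or above `threshold` percent CPU
    in at least `min_hits` samples, most frequent first."""
    hits = Counter()
    for _timestamp, user, process, cpu_percent in samples:
        if cpu_percent >= threshold:
            hits[(user, process)] += 1
    return [(key, count) for key, count in hits.most_common() if count >= min_hits]


# Invented sample data, one tuple per observation.
samples = [
    ("09:00", "alice", "thunderbird.exe", 98.0),
    ("09:00", "bob", "winword.exe", 12.0),
    ("09:05", "alice", "thunderbird.exe", 95.0),
    ("09:05", "carol", "thunderbird.exe", 99.0),
    ("09:10", "alice", "thunderbird.exe", 97.0),
]

print(frequent_hogs(samples))
```

If the same process or the same user keeps turning up around the time of the freezes, that is where I'd point the Sysinternals tools first.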

We had an issue with servers spontaneously rebooting on our TS cluster. Even Microsoft was at a loss to explain it. Turned out to be a particular application running that shouldn't have caused a reboot, but did. Removed it from the servers and our reboots went away. But it literally took months to figure it out!

Bart Silverstrim
  • To chime in with the "slow disk" - make sure you have write-caching enabled on your RAID, which usually requires a battery-backed write cache. Without that, your writes will queue, and interactive performance (very visible to Citrix/TS users) will suffer *a lot*. – mfinni Mar 26 '12 at 15:21
  • @mfinni I trust the hardware about as far as I could throw it - particularly the UPS. The RAID is software-only so no battery-backed cache (the TS is virtualized on top of Linux/KVM). –  Mar 26 '12 at 15:36
  • @BartSilverstrim "Adobe Flash" and "Terminal Server" sound like a *very* frightening combination. – Skyhawk Mar 26 '12 at 15:44
  • Jack - if you're going to have that much virtualization, you'll need some serious instrumentation on that hardware to find out if your IO is queuing. Good luck, and I wouldn't do that again. Triple core? That sounds like an AMD Phenom desktop processor; is this "server" really a server? – mfinni Mar 26 '12 at 15:47
  • @mfinni I've got myself confused here - it is quad-core Xeon X3210, with 3 cores available to the VM. Still I'd call it 'very low-end' myself and I'm looking forward to replacing it :-) –  Mar 26 '12 at 16:00
  • I'm trialling ThreadMaster, it looks like it might be just what I'm after, thanks. –  Mar 27 '12 at 08:22