79

A little background: We have several Windows servers (2003, 2008) for our department. We're a division of IT so we manage our own servers. Of the four of us here I'm the only one with a slight amount of IT knowledge. (Note the "slight amount".) My boss says the servers need to be restarted at least weekly. I disagree. Our IT Department says that because she restarts them constantly that's the reason why our hard drives fail and power supplies go out on them. (That's happened to a few of our servers a couple times over the last four years, and very recently.)

So the question is: How often does everyone restart their Windows servers? Is there an industry standard or recommendation? Is our IT department correct in saying that because we re-start that's why we're having hardware issues? (I need a reason if I'm going to change her mind!)

John Gardeniers
Evan
  • 113
    Oh, about every [second Tuesday of the month](http://en.wikipedia.org/wiki/Patch_Tuesday). :) – jscott May 26 '11 at 14:38
  • 4
    Dang! We were doing every fourth Thursday of the month! :) – Evan May 26 '11 at 14:40
  • 2
    I'm in the opposite boat. We're not allowed to reboot servers without a weeklong back-and-forth debate among the entire department that takes about 12 manhours per server. Yes, this includes reboots for patching, which effectively means that it never gets done. – Hyppy May 26 '11 at 15:46
  • 19
    Restarting weekly shouldn't cause a drastic increase in hardware failures either. – JamesRyan May 26 '11 at 16:46
  • 3
    It sounds like your servers get rebooted more often than my laptop. I generally put it into sleep mode when I'm not using it. The usual reason for doing a reboot is installing Windows updates or software. – Phil May 26 '11 at 20:59
  • I recently restarted a production server for the first time since 2007. It turns out it wasn't necessary. :( – Gabe May 27 '11 at 01:28
  • 2
    @Hyppy: sounds like some of your folks need a good thorough read of this thread to get back from "paranoid risk extinguishment" to "rational risk management". Risk *management* gets you the best results for your effort; risk *extinguishment* just gets you into as bad of a situation as risk *ignorance*. – ParanoidMike Aug 17 '11 at 21:41

13 Answers

119

My boss says the servers need to be restarted at least weekly

I strongly disagree. Microsoft has made great strides since the good-ole [NT, anyone?] days with regard to stability and uptime. It's a shame the consensus within IT support has not changed along with this.

How often does everyone restart their Windows servers?

Only when required -- Either because of an OS/software update, a critical software failure which cannot be recovered via other methods, hardware upgrade/replacement or other activity that cannot happen without a restart.1

Is there an industry standard or recommendation?

I have never seen a standard recommendation, per se, but I could not agree with any recommendation [except from MS themselves] which would indicate a required reboot at a specific time interval "just-because".

Is our IT department correct in saying that because we re-start that's why we're having hardware issues?

Restarting [and, more so, power cycling] is the most stressful period of hardware activity for a computer. You have most everything spinning up to 100% (disks and fans), as well as significant fluctuations in component temperatures. Modern hardware is incredibly resilient, but that shouldn't be a reason for just bouncing servers, on a whim, a few times a week.

1 Aside, I loathe when techs "just" reboot a Windows server in the case of a failed service, or the like. I understand the need to get the service running again, but a reboot should be the last step in troubleshooting a server. Identifying, and fixing[!], the root cause of failure should almost never result in "Meh, just reboot it...."

jscott
  • 2
    Thank you for the thorough answer. We do updates once a month, which obviously when we do those we have to restart. I appreciate the answer. – Evan May 26 '11 at 15:08
  • 5
    I have to disagree with your addendum. If the service defines the server (for instance an NFS server which stops sharing exports), and you know that a clean reboot will bring back up the service in X minutes, and after basic troubleshooting you determine it will take x+5 to resolve the issue, it is most expedient to just reboot. You can do a cause analysis afterwards. Now, that is my method of doing it anyways, and you could argue for and against quite easily :) Just how I roll. – Matthew May 26 '11 at 15:44
  • 34
    @Matthew: Performing root cause analysis after-the-fact is all well and good if there isn't transient information about the cause lost by rebooting. I think I speak for a number of people when I say that I'd rather have one more extended downtime to ferret-out and fix the root cause of an outage than a number of shorter downtime incidents when I decide to "just reboot" and potentially lose the ability to use volatile information to assist in root cause analysis. – Evan Anderson May 26 '11 at 15:56
  • great thorough explanation – Jim B May 26 '11 at 15:57
  • 8
    @Matthew In service failure cases, I would expect the tech to try restarting *the service*, as a troubleshooting step, *before* rebooting the whole box. – jscott May 26 '11 at 16:06
  • 6
    @Evan I agree with you; however, I think there has to be a threshold of incidents before it becomes a problem. E.g., if it happens once a month and is resolved in 10 mins with a reboot, the business may never care about root cause. I think you and I would like to know, but uptime is more important than root cause. However, if it happens 3 times a week, it's a whole different story. – Jim B May 26 '11 at 16:09
  • +1 - great post. agree on all points. – Cypher May 26 '11 at 17:35
  • Good points here - and a thorough explanation. However, none of this gives the questioner anything he can use to convince management. A standard (or "best practices") document from Microsoft would be good. A knowledge-base article would be good too. Even a quote from the Big Bad Boy himself specifically relating to this would be good. – Mei May 26 '11 at 22:48
  • I run every main task on a separate virtual machine (Linux AND Windows). This way, when there is a problem, I do not have to take down all the other systems. Since the host OS is Debian and not directly connected to the internet, I don't worry about having to update or reboot it. When, for example, my Windows RDP or Linux mail server is having trouble, I won't have to take down the web and fileshare servers as well. In the ultimate case of a reboot (the host systems have more than 100 days of uptime now) – HTDutchy May 27 '11 at 10:27
  • @David You're correct, my response is somewhat anecdotal as written. The biggest problem [to providing supporting citations] is that Microsoft documentation doesn't provide a list of things *not* to do to your servers. Even their [Best Practices for WSUS](http://technet.microsoft.com/en-us/library/cc708536(WS.10).aspx) only says, roughly, reboot *if* needed and schedule it outside of production hours. – jscott May 27 '11 at 10:50
  • I fully agree that rebooting should not be a regular troubleshooting step. But I work on mission critical systems where a down system needs to get back up as soon as possible. While we try to find the root cause of problems often we need to restart it to just get it up and running and then have to do what we can to find the cause of the problem from logs and such. It is a completely different, and annoying, environment. – JLZenor Jun 10 '12 at 01:26
  • any OS that needs a reboot to fix something is but a toy. – SnakeDoc Dec 06 '13 at 22:16
53

Windows servers need to be rebooted monthly, if you're applying patches. You are applying patches, right? Right?
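If you tie the monthly reboot to patching, the maintenance date can be derived from Patch Tuesday (the second Tuesday of the month, as mentioned in the comments on the question). A minimal sketch in Python; the function name `patch_tuesday` is just an illustrative choice, and rebooting on that exact day rather than some days later is an assumption you would adapt to your own change window:

```python
import calendar
from datetime import date


def patch_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of the given month."""
    # weekday() uses Monday == 0, so calendar.TUESDAY == 1
    first_weekday = date(year, month, 1).weekday()
    # Days from the 1st of the month to the first Tuesday (0..6)
    days_to_first_tuesday = (calendar.TUESDAY - first_weekday) % 7
    # The second Tuesday is one week after the first
    return date(year, month, 1 + days_to_first_tuesday + 7)


if __name__ == "__main__":
    for month in (5, 6, 7):
        print(patch_tuesday(2011, month).isoformat())
```

A scheduled task or monitoring check could compare today's date against this value to decide whether a patch-and-reboot window is due.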

Hyppy
  • 3
    You only apply patches monthly? – John Gardeniers May 27 '11 at 02:37
  • Strictly speaking, xe's only applying _the patches that themselves require a reboot_ monthly. Not all PTFs require a reboot, and not all monthly updates even contain any such fixes at all. – JdeBP May 27 '11 at 10:52
  • 2
    I only reboot Windows servers when an update _requires_ it. Sometimes it will go a couple months without a patch that requires a reboot. I do, however, have linux servers that have not rebooted in years and run without a hitch. I think the longest I've seen in my network is a linux box that got put in a closet and forgotten (it did get automatic updates). I ssh'd in and the uptime was at 3 years. A year later it was rebooted due to the power supply failing. – James May 28 '11 at 00:04
  • If it were linux, or BSD, you could patch your server *without* needing a reboot. You only must reboot for kernel updates (and with a server oriented distro, those are infrequent). – SnakeDoc Dec 06 '13 at 22:17
18

I'll give an alternative answer for a very specific case. The advances of the last 2-3 years may have changed this, but if you have heavily-used TS or Citrix servers that run a lot of interactive applications (like Office), it's been a good idea to do weekly reboots off-hours, just to start from a clean slate for resources like stuck sessions, used desktop heap, etc. If you have your farm set up right and stagger the reboots, even if you have light use off-hours, users should not be impacted.

Sure, it's regular reboots of servers, but they're being used like desktops.
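To illustrate what staggering could look like, here is a rough sketch in Python that only computes a schedule; it does not reboot anything. The server names, the 30-minute gap and the batch size are all made-up assumptions, and a real farm would also need to drain sessions before each restart:

```python
from datetime import datetime, timedelta
from typing import Dict, Sequence


def stagger_reboots(
    servers: Sequence[str],
    first_slot: datetime,
    gap: timedelta,
    batch_size: int = 1,
) -> Dict[str, datetime]:
    """Assign each server a reboot time so that at most
    `batch_size` servers share the same time slot."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    schedule = {}
    for index, name in enumerate(servers):
        slot = index // batch_size  # servers in one batch share a slot
        schedule[name] = first_slot + slot * gap
    return schedule


if __name__ == "__main__":
    # Hypothetical four-server farm, rebooted two at a time
    farm = ["ts01", "ts02", "ts03", "ts04"]
    plan = stagger_reboots(
        farm, datetime(2011, 5, 29, 2, 0), timedelta(minutes=30), batch_size=2
    )
    for name, when in plan.items():
        print(name, when.strftime("%Y-%m-%d %H:%M"))
```

With a batch size of 2 and four servers, half the farm stays available during each slot, which is the point of staggering.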

mfinni
  • 4
    Mm... good call on the TS/Citrix case. – Hyppy May 26 '11 at 15:41
  • Similar experience here using Citrix with CCH's audit management software. –  May 27 '11 at 00:56
  • 1
    The same applied back in the MetaFrame days, when Citrix themselves recommended nightly reboots if that was practical. – John Gardeniers May 27 '11 at 02:38
  • Yeah, Metaframe ... whoof. I don't miss playing with the printer driver mapping file. It's certainly gotten a lot better from an IT management perspective. – mfinni May 27 '11 at 14:15
10

This is more a political and psychological issue than a technical one.

In my experience, certain people who worked with some of the much older versions of Windows got it into their heads that they needed weekly reboots, and they have enshrined that philosophy in a little corner of their mind (they never do seem to notice when a reboot is missed while they're on vacation, though). Unless you've got some very unstable systems and applications, it's no longer based in reality.

On the flip side, frequent reboots may catalyze hardware failure, but are not terribly likely to be the cause of it.

Shane Madden
  • 7
    My boss is good friends with the retired network administrator who told her that they needed to be rebooted at least weekly...which explains why she is so adamant about that. Thank you for the answer. – Evan May 26 '11 at 15:26
  • 5
    No wonder he is "retired"...is that a euphemism for fired? – KCotreau May 26 '11 at 18:28
3

If everything is working correctly, the only time they should need to be restarted is for maintenance. Scheduled reboots are truly only a requirement when A) upgrading software, B) performing hardware maintenance, or C) dealing with a memory leak that can't be solved by restarting the software/service causing it. While Windows isn't known for long uptimes, it does happen (my last job had some Win2k boxes that were up for months at a time; they just worked). Just remember that any patching will most likely require reboots.

Matthew
  • Thank you for the answer. This should help in persuading her. – Evan May 26 '11 at 15:27
  • 1
    I've found Windows NT, 2000 and 2003 boxes on our work network that have been up and running for a number of years. Until recently our data center had a yearly patching policy, and with over 600 servers it's not uncommon to see uptimes in the 250+ day range. My servers (I have about 120) get updated and rebooted whenever Microsoft releases patches. Sometimes, like last month, we don't have a cycle. The uptime depends on what is running on the server and how well the things work together. 2003 R2 with the stuff I have to run needs to be rebooted every 35 days; funny stuff happens after that. – Christopher Thornton May 27 '11 at 23:48
3

Microsoft has done a great job of improving their server OS over the years. And some servers you can run for 6 - 12 months before they start experiencing problems, some only make it 2 - 3 months. It all depends on what services and apps the servers are running. But they will all have a problem at some point. Windows updates, memory leaks, imperfect software, are just a few reasons.

For our clients with maintenance contracts we install updates and reboot their servers monthly. These clients have a much lower incidence of unplanned server issues, on the order of 1/5th as many issues as those that don't reboot regularly.

For those who say rebooting causes premature hardware failure: there was a time when restarting hard drives and systems was a potential issue. Today, however, HDDs and other components are built to withstand thousands of start-stop cycles. If your server hardware is weak, would you rather know about it at a controlled time when you are there to address the problem quickly, or through a random failure with a call in the middle of the business day saying a department is down?

I feel there is no downside to regular monthly restarts, while the upsides are clear and proven over time.

Skyhawk
Todd H
2

I'm by no means an expert on the subject, but depending on what services you have running, some may be susceptible to overflow on certain timing functions, such as timeGetTime() and GetTickCount().

timeGetTime returns a 32-bit result, which equals the number of milliseconds since the computer was started. This wraps around to zero after approximately 49.7 days.
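The 49.7-day figure is just 2^32 milliseconds expressed in days. The sketch below (plain Python, not the Windows API; all names are illustrative) checks that arithmetic and shows the usual way application code can stay correct across a single wrap, by taking the difference modulo 2^32 instead of comparing raw tick values:

```python
MS_PER_DAY = 24 * 60 * 60 * 1000
WRAP = 2 ** 32  # a 32-bit millisecond counter holds 0 .. 2**32 - 1


def days_until_wrap() -> float:
    """Days a 32-bit millisecond counter runs before wrapping to zero."""
    return WRAP / MS_PER_DAY


def naive_elapsed_ms(start: int, now: int) -> int:
    """Plain subtraction: goes negative once the counter has wrapped."""
    return now - start


def elapsed_ms(start: int, now: int) -> int:
    """Elapsed time between two 32-bit tick readings, correct as long as
    the counter wrapped at most once between them."""
    return (now - start) % WRAP


if __name__ == "__main__":
    print(round(days_until_wrap(), 2))
    start = WRAP - 100  # reading taken 100 ms before the wrap
    now = 400           # reading taken 400 ms after the wrap
    print(naive_elapsed_ms(start, now))
    print(elapsed_ms(start, now))
```

Software that does the naive comparison is the kind that could misbehave after roughly 49.7 days of uptime; software that uses the modular difference is not affected by the wrap itself.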

Matthew
  • 2
    Err, no. I have a server (on a completely isolated, trusted network - don't preach to me) that has been up for the best part of 14 months with **NO** ill effects. – Ben Pilbrow May 26 '11 at 19:34
  • 3
    I didn't mean to imply that **every** server and instance would have this problem, but that if the server uses software that utilizes these functions and did not account for such would encounter computational problems. – Matthew May 26 '11 at 19:51
  • 2
    The 32-bit timer issue is valid, but it's an issue that individual software vendors need to carefully avoid in their own code. Windows is no longer susceptible to failures related to this timer (as it was in the past), but if you have software installed that does not account for timer rollback, then it can cause unanticipated effects. – tylerl May 26 '11 at 19:52
  • My point was that in *any* modern day operating system, this is not an issue. I also have a very hard time believing any OS would expose any API's that would allow the OS to crash every 2 months. – Ben Pilbrow May 26 '11 at 19:59
  • 1
    Are you referring to this [Microsoft KB](http://support.microsoft.com/kb/q216641/)? – jscott May 26 '11 at 20:14
  • 9
    Err, this is an _NT 4_ bug; Win2k+ does not suffer from this. I think we can safely say NT 4 is dead in 2011, and if someone somewhere is running it ... they deserve what they get at this point. – Zypher May 26 '11 at 20:19
  • 1
    @Zypher: Are you talking about the KB @jscott linked to? If so, that's a *Win95* bug! No version of NT has ever had a bug relating to the uptime counter – Gabe May 27 '11 at 01:33
  • The main reason NT servers needed regular reboots was because of the memory leaks, not because of an overflow issue. – John Gardeniers May 27 '11 at 02:35
  • @Zypher - Many famous retailers are using Windows NT 4.0 on their cash machine systems installed within a store, which I have witnessed myself. As for restarts, I have worked with and seen Windows 2003 R2 and later servers that only get restarted once a year... It all depends on what one is running and the environment. One of our Exchange 2007 servers got restarted after 2 years; it had 200 mailboxes, and as it was working as desired with all functionality, we never upgraded it or installed update rollups or SPs onto it. – Mutahir May 27 '11 at 07:29
  • 1
    Right on. We were just hit hard by this crazy bug in Windows own networking stack: http://support.microsoft.com/kb/2553549. In short, if you manage to get the server up for 497 days, you are screwed. @Ben Pilbrow, I suppose your server didn't get to this stage? – sayap Mar 23 '12 at 21:34
2

I used to restart all my Windows servers each week and there was certainly a time when that was required. These days I only restart them when an update requires it. Of course that means they still get restarted every few weeks anyway.

John Gardeniers
1

I rely on the windows updates to configure my 'reboot schedule'. Let Windows manage itself.. for once! Only very rarely is a reboot required with our setup due to memory leaks...

1

I am a network administrator with a company that operates several Windows 2003 and 2008 servers. I restart servers on a monthly basis, typically not waiting longer than 3 months, as it is crucial to take that short period of downtime.

However, for patches and Windows updates I will be installing WSUS on a domain controller to apply updates etc. on a schedule of my choosing. This is to prevent servers from updating themselves and unexpectedly rebooting...

GMitch
1

All you Windows Haters should check out the Netcraft.com Sites with longest running systems by average uptime (http://uptime.netcraft.com/up/today/top.avg.html). This shows the sites that have been running longest since their last reboot and 95% of the top 50 are Windows 2003 and 2000 machines. As always, your mileage may vary.

Mark Lawrence
0

Specifying only "Windows" may be too broad to reach a reasonable decision. You will come to a better decision if you consider the services, roles and features that you run on the Windows machine (e.g., web services, database servers, etc.).

The quality and behavior of third-party applications and web services run on a specific server may call for more or less frequent restarts of the hosting Windows machine than machines without them.

Some third-party applications (imperfectly designed ones; well, no one is perfect!) may fail to release acquired system resources such as memory, locks, and sockets in a graceful and timely fashion. This may, for example, leave some crashed applications, services or drivers stuck in a pending or starting state when re-run, which might not be easily fixed without a reboot.

In practice, disk I/O, network and memory hungry applications under a high, stressed workload and with few system resources available may leave your Windows machine lagging, unstable or thrashing, which may prompt you to restart it sooner.

If you have to run such faulty applications, have to serve more users than the typical capacity of your hardware/software, or are forced to co-locate incompatible services on one physical machine, you may decide that you should restart Windows periodically. In this case you can adjust the restart period by listening to users' complaints about server speed!

F.I.V
-7

The right answer is never, unless you do a software upgrade. The last time I rebooted my server was about 2 years ago, and the reason was a power failure.

  • 3
    I hope you're either talking about a linux server or I hope your server is not in professional use... – HTDutchy May 27 '11 at 10:22
  • 3
    Every server that gets patched needs to be restarted to apply some of those patches. Any server that is exposed to a public network needs to be patched. – railmeat May 27 '11 at 12:55
  • I have a couple of NT 4 domain controllers that get booted about once a year. No more updates and not targeted by bad guys much any more...(they are not internet facing) – hsmiths May 28 '11 at 03:40