Why does updating a large organization's web application take hours?

0

Often, web applications are updated. If the update is properly planned, the maintenance is announced and visitors know what to expect and why the downtime is happening.

However, I have trouble understanding the length of some of these maintenance windows. I have maintained medium-sized web applications. Since updates were tested in advance in a staging environment, updating the application involved only pushing the new code to the server, performing data migrations and reloading/restarting the server configuration. This topped out at 15 minutes for maintenance updates, but generally took only a few seconds.

Maintenance updates at large corporations tend to take a lot longer. I have seen organizations take 3 hours to roll out minor updates; updates to governmental systems may take eight to nine hours, or if you're unlucky an entire weekend. It is unclear to me what their process might be. I understand they may have large clusters and databases to work with, but I can't imagine what could take that much time. What are they doing in those hours? Is this inefficiency and planning for disaster on their part, or is there something I'm missing completely?

(I'm ignoring hardware maintenance or major version rollouts here.)

ralphje

Posted 2013-09-02T14:21:17.113

Reputation: 103

Question was closed 2013-09-08T15:42:31.950

So? They have larger and more mutually dependent systems, and they take the time to verify each step one more time in the live environment, and who knows what else is going on behind the scenes. Think of replacing servers, rewiring hardware, implementing couplings to other software. You just have no idea about what's going on behind the (their) scenes. – Jan Doggen – 2013-09-02T15:04:55.837

Your example of domain registrations is comparing apples and oranges. There we are talking about data updates having to propagate across organizations who all do it at their own automated intervals. – Jan Doggen – 2013-09-02T15:06:16.937

In addition to what @JanDoggen said, the maintenance windows are pessimistically sized, meaning that the break might be announced (and sometimes also enforced by shutting down interfaces) for a longer time than the actual work will take, just to make sure the announced window is long enough (in case something unforeseen happens). There might also be some other maintenance tasks occurring during the break even though they are not mentioned in communication to outsiders. And finally, some systems are just awful to upgrade, meaning plenty of manual work. – zagrimsan – 2013-09-03T09:11:05.000

Even though you only know the web interface, there might be numerous backends involved in the process too, and restarting them might require a specific order to ensure correct behaviour, since not all enterprise system integrations are as error-tolerant as one would wish. It might take the 15 minutes you mentioned just to bring everything gracefully down and close all interfaces... – zagrimsan – 2013-09-03T09:20:11.383

I take the liberty of combining all our comments into an answer ;-) Feel free to edit. – Jan Doggen – 2013-09-03T09:54:36.780

Answers

0

There are plenty of reasons, e.g.:

  • They have larger and more mutually dependent systems.
  • They take the time to verify each step one more time in the live environment.
  • There might also be some other maintenance tasks occurring during the break even though they are not mentioned in communication to outsiders. Think of replacing servers, rewiring hardware, implementing couplings to other software.
  • Some systems are just awful to upgrade, meaning plenty of manual work.
  • There might be numerous backends involved in the process too, and restarting them might require a specific order to ensure correct behaviour, since not all enterprise system integrations are as error-tolerant as one would wish.

Etcetera...
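To illustrate the point about restart order: if each backend declares what it depends on, a safe order is a topological sort of that dependency graph. This is only a rough sketch. The service names and their dependencies are invented for the example, and it assumes Python 3.9+ for the standard-library `graphlib` module. A real enterprise setup would likely also need health checks and waits between steps, which is where the time goes.

```python
from graphlib import TopologicalSorter

# Hypothetical services; each one maps to the set of services it depends on.
DEPENDS_ON = {
    "database": set(),
    "message_queue": set(),
    "auth_backend": {"database"},
    "billing_backend": {"database", "message_queue"},
    "web_frontend": {"auth_backend", "billing_backend"},
}


def startup_order(depends_on):
    """Return an order in which every service starts after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


def shutdown_order(depends_on):
    """Shut down in reverse: dependents go down before what they rely on."""
    return list(reversed(startup_order(depends_on)))


if __name__ == "__main__":
    print("stop: ", shutdown_order(DEPENDS_ON))
    print("start:", startup_order(DEPENDS_ON))
```

With dozens of backends, each needing a graceful shutdown and a verified start, walking this order by hand can easily take far longer than the code deployment itself.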

Also, the maintenance windows are pessimistically sized, meaning that the break might be announced (and sometimes also enforced by shutting down interfaces) for a longer time than the actual work will take just to make sure the announced window is long enough (in case something unforeseen happens).
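As a rough illustration of how such pessimistic sizing might add up (all step names, durations, the rollback time and the safety factor below are made-up numbers, not taken from any real organization):

```python
# Illustrative only: every step and number here is an assumption (minutes).
STEPS = [
    ("drain traffic and close interfaces", 15),
    ("back up databases", 40),
    ("deploy and migrate", 30),
    ("restart backends in order", 20),
    ("verify in the live environment", 30),
]
ROLLBACK_MINUTES = 45  # assumed time to restore if verification fails
SAFETY_FACTOR = 1.5    # assumed padding for the unforeseen


def announced_window(steps, rollback_minutes, safety_factor):
    """Return (expected work, announced window), both in minutes."""
    expected = sum(minutes for _, minutes in steps)
    # Plan for the case where everything runs and then has to be rolled back.
    worst_case = expected + rollback_minutes
    return expected, worst_case * safety_factor


if __name__ == "__main__":
    expected, announced = announced_window(STEPS, ROLLBACK_MINUTES, SAFETY_FACTOR)
    print(f"expected work: {expected} min, announced window: {announced:.0f} min")
```

With these example numbers, a little over two hours of expected work turns into an announced window of four and a half hours.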

Jan Doggen

Posted 2013-09-02T14:21:17.113

Reputation: 3 591