8

I have a Server 2012 R2 machine on which I just installed KB2919355 (the 800+ MB mega patch recently released for Windows 8.1 and Server 2012 R2). The server is a Dell PowerEdge R715. Disks are 2x500GB SAS in RAID1 on an H200 controller.

The server was working perfectly until the update, and I have the Dell management tools installed, so I know there were no failing disk alerts or anything. The day before, I had upgraded the firmware of the H200 controller, but the system rebooted after that without any issues.

After rebooting for the update, it came up to a black screen with a movable mouse cursor but nothing else - Ctrl-Shift-Esc and Ctrl-Alt-Del do nothing. Let it sit there for over an hour, nothing changed.

Booted with the "don't automatically restart on BSOD" option, and got INACCESSIBLE_BOOT_DEVICE as the error reason. Strangely, it says "We're just collecting some error info, and then you can restart. (0% complete)" and stays at 0%, never making any progress.

Tried to reboot with Last Known Good Configuration, same BSOD.

Rebooting into Repair My Computer works. From the command prompt I can see all the partitions and all the files appear to be intact. chkdsk reports no errors.

After this, the server managed to boot normally once. After I rebooted it again, it never came back up despite repeated boot attempts; they all end in the INACCESSIBLE_BOOT_DEVICE blue screen.

The issue seems to be with LSI RAID controller cards. There is a thread on TechNet where others report similar issues with Supermicro machines - http://social.technet.microsoft.com/Forums/en-US/6bf5815f-55d9-4403-8f41-a16ebcb83735/patch-kb2919355-makes-supermicro-machines-crash?forum=winserver8setup

I have a support case open with Dell, who is trying to replicate this issue in their lab. There probably isn't anything else anyone can do here.

Update

On Dell's advice, I wiped the system and did a fresh install of Server 2012 R2 Datacenter with GUI. I did nothing to it except install Windows updates.

After installing KB2919355, the server rebooted properly. After rebooting again, it blue screened with the INACCESSIBLE_BOOT_DEVICE error.

I highly recommend NOT installing this update on any servers with LSI based RAID cards until this issue is resolved. Hopefully Dell will come up with a solution quickly.

Update from Dell Support

This is an issue we are now looking into on a larger scale and most likely will have to be addressed by Microsoft as it is more widespread than just Dell. We will continue to work it until our resources have been exhausted. I will keep you posted as to what we find.

Another update from Dell Support

Dell has been unable to replicate this problem in their lab. I have confirmed that 2 of my systems have the same issue, and reproducing it is easy: install Windows, install updates until it offers KB2919355, and the server dies on the second reboot after the update is installed.

They are currently building me an exchange machine to swap one of them with, so they have a broken machine to test with. Hopefully that helps them resolve it quickly.

Grant
  • Seems like you're not alone: http://social.technet.microsoft.com/Forums/windowsserver/en-US/cbe63608-aab1-4ceb-8828-eb358ac766e4/windows-server-2012r2-fail-to-boot-after-installing-kb2919355-update?forum=winserver8gen – MichelZ Apr 16 '14 at 18:21
  • Have you tried updating BIOS and stuff? – MichelZ Apr 16 '14 at 18:23
  • BIOS and all firmwares are up to date (using the built in dell firmware updating tools). Currently booting it from a linux live cd so I can take a look at the disk partitions. – Grant Apr 16 '14 at 18:35
  • I really want to figure out what went wrong here and how to fix it. Luckily this server is non critical, so I can take some time to fix it. But I have more servers to update and need to be prepared if this happens again. – Grant Apr 19 '14 at 00:29
  • I guess if you can - open a case with `Microsoft` and `DELL`. It seems to have to do with `LSI` controllers. I don't think there's anything we can do at this point – MichelZ Apr 19 '14 at 07:13
  • @MichelZ I have support contracts. Looks like this is one of those times when they are useful. Will be contacting dell support Monday morning to see what they say. I'm sure they'll blame microsoft, microsoft will blame lsi, lsi will blame dell...should make for a fun Monday. – Grant Apr 20 '14 at 03:15
  • Let's hope that someone still cares to fix it – MichelZ Apr 20 '14 at 06:46
  • I'm going to keep an eye on this, as I have a production server running 2012 R2 (not updated) and the H200. – DanBig Apr 21 '14 at 17:43
  • @DanBig do NOT install this update. I will keep updating with what Dell finds, but you may want to open your own support case as well if you have support on that server. Dell is working on replicating it in their own lab. – Grant Apr 21 '14 at 17:50
  • @michelz want to make your "contact dell support" comment into an answer? – Grant Apr 26 '14 at 02:30
  • @Grant done that – MichelZ Apr 26 '14 at 05:34

4 Answers

3

Please see KB2977012 for a workaround and (in the future) solutions.

Current status (2014/05/13):

Microsoft is researching this problem and will post more information in this article when the information becomes available.

Workaround:

Start the computer from media for Windows RT 8.1, Windows 8.1, or Windows Server 2012 R2, select the Repair your computer option, click Troubleshoot, and then click Command Prompt.

Note For this workaround, the media that you use should not include Update Rollup 2919355.

At a command prompt, run the following command:

Bcdedit /store <path of Boot Configuration Data (BCD)> /set {default} truncatememory 4294967296

Note The path of the BCD file is <drive letter>:\BOOT\BCD, in which the drive letter is that of the system partition. This command adds an entry named truncatememory to the BCD file under Windows Boot Loader. The new entry will have a value format of 0x100000000. For example, run the following command:

Bcdedit /store C:\BOOT\BCD /set {default} truncatememory 4294967296
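
As a quick sanity check on the number in that command, the short Python sketch below (not part of the KB article; the names are illustrative only) confirms that 4294967296 bytes is exactly 4 GiB and matches the 0x100000000 value mentioned in the note.

```python
# The truncatememory value used in the workaround, in bytes.
TRUNCATE_BYTES = 4294967296

GIB = 1024 ** 3  # bytes per GiB


def describe(value_bytes: int) -> str:
    """Return the value as a hex string and as a GiB figure."""
    return f"{hex(value_bytes)} = {value_bytes / GIB:g} GiB"


print(describe(TRUNCATE_BYTES))
```

This is consistent with the comment further down that the workaround effectively limits the server to 4 GB of RAM while the option is set.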

Restart the computer. The computer should now boot to the desktop.

Note If you were installing Update Rollup 2919355 when this problem occurred, the computer will continue to complete the installation of the update. After you successfully start Windows, uninstall Update Rollup 2919355.

To remove the truncatememory boot option, run the following command at a command prompt:

Bcdedit /deletevalue truncatememory

Start the computer normally.
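
If you need to apply this on several machines where the system partition shows up under different drive letters in the recovery environment, a minimal Python sketch like the following could generate the exact command lines quoted above. The helper names (`bcd_store_path`, `set_truncatememory_command`) are hypothetical; the sketch only assembles text using the flags that appear in the KB text, and it does not run Bcdedit itself.

```python
import re

# Removal command exactly as quoted from the KB article above.
DELETE_COMMAND = "Bcdedit /deletevalue truncatememory"


def bcd_store_path(drive_letter: str) -> str:
    """Build <drive letter>:\\BOOT\\BCD for a single-letter drive."""
    if not re.fullmatch(r"[A-Za-z]", drive_letter):
        raise ValueError(f"not a drive letter: {drive_letter!r}")
    return f"{drive_letter.upper()}:\\BOOT\\BCD"


def set_truncatememory_command(drive_letter: str,
                               limit_bytes: int = 4294967296) -> str:
    """Return the workaround command line for the given system partition."""
    store = bcd_store_path(drive_letter)
    return f"Bcdedit /store {store} /set {{default}} truncatememory {limit_bytes}"


print(set_truncatememory_command("c"))
print(DELETE_COMMAND)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; you would still type or paste the result at the recovery command prompt.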

MichelZ
  • Yay! Someone is finally taking this seriously. Hopefully they come up with a permanent fix soon - limiting my hyper-v server to 4GB of RAM isn't going to work very well :) – Grant May 13 '14 at 14:55
  • Yah, the recommendation is to uninstall the rollup and wait for a fix – MichelZ May 13 '14 at 15:41
  • I had the "Inaccessible_boot_device" bug while installing from scratch the latest Windows 8.1 with Update on a Dell Poweredge R210 II. Most intriguingly, it would boot fine in about 10% of the times, but fail and reboot most frequently. The workaround worked fine for me, many thanks, I spent 2 days looking for a solution on the internet to no success. Many thanks, really, honestly. – msb May 22 '14 at 23:08
2

You should open a case with Microsoft and DELL. It seems to have to do with LSI controllers. I don't think there's anything we can do at this point

MichelZ
0

I think your RAID controller firmware update touched some areas of the Dell BIOS, so it started to look for SAN boot devices.

Try to open a case with Dell about this.

In the meantime, check your HBA and iSCSI boot settings and disable them.

Nils
  • doesn't seem related to the san. I even tried pulling the hba card. still blue screens. – Grant Apr 20 '14 at 03:17
0

I raised a call with Dell for my PowerEdge T110 II + H200 controller + Windows 2012 R2 Foundation. There's a potential hotfix that exactly matches the symptoms we're getting:

  • KB2919355 installed
  • When turned on or restarted, the server gets into a boot loop, crashing back to the start as the Windows flag appears
  • It either boots by itself after 3 or 4 attempts, or goes to the boot recovery wizard. Turning it off and on again eventually gets it to boot
  • Once booted, the server runs normally until the next shutdown/restart
  • Turning off automatic reboot shows an INACCESSIBLE_BOOT_DEVICE blue screen crash

You can request the fix by email at https://support.microsoft.com/kb/2966870

Dell confirmed the fix applies to us, and I applied it last night (it's being offered automatically via Windows Update now). I rebooted the server six times, all without issue. Looks like it's fixed.
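
For anyone auditing several servers, here is a minimal Python sketch of the check implied above: given a list of installed hotfix IDs (gathered however you like, for example from PowerShell's `Get-HotFix`), it reports whether the rollup is present without the fix. The function name and status strings are my own invention, and it is an assumption that the fix shows up under its own KB number on your systems. It says nothing about whether a given machine actually has an affected controller.

```python
ROLLUP = "KB2919355"  # the update rollup discussed in this thread
FIX = "KB2966870"     # the hotfix that resolved it in my case


def boot_loop_risk(installed_hotfixes) -> str:
    """Classify a machine by which of the two KBs are installed."""
    ids = {h.strip().upper() for h in installed_hotfixes}
    if ROLLUP not in ids:
        return "rollup not installed"
    if FIX in ids:
        return "rollup installed, fix present"
    return "rollup installed, fix missing"


print(boot_loop_risk(["KB2919355", "KB2883200"]))
print(boot_loop_risk(["kb2919355", "KB2966870"]))
```

A machine reported as "rollup installed, fix missing" would be the one to patch (or roll back) before its next reboot.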

Dave
  • Since you're already talking to Dell...I noticed there is also an H200 driver update being pushed by windows update, but no information on what it fixes...wonder if that is part of the fix? – Grant Jun 17 '14 at 17:17
  • There's been a Windows Update H200 driver for about a year; I don't think it's a new driver: Dell confirmed the A09 (Card)/A10 (Integrated) drivers from Mid-2013 will work fine. – Dave Jun 18 '14 at 15:26
  • this one was released by LSI recently, May 2014. and pushed by windows update starting 6/9/2014. – Grant Jun 18 '14 at 15:35
  • I saw the new H200 Driver last night. I didn't apply it (Just KB2966870), which seemed to resolve the issue. Not sure what updates it fixes, but it doesn't seem to be needed to stop the reboot issue. – Dave Jun 19 '14 at 08:59