16

I have a Mirrored Dynamic disk on my Windows 2003 Server. How do you monitor the health of the volume?

Is there a way to have the server send an email when there is an issue with the volume? Is there a way to have the server run S.M.A.R.T. tests?

EDIT: Nothing says WTF like logging into a client server, running DISKPART LIST VOLUME and seeing this.

Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
----------  ---  -----------  -----  ----------  -------  ---------  --------
Volume 0     X   xDrive       NTFS   Mirror       233 GB  Failed Rd
Volume 1     C                NTFS   Simple        57 GB  Healthy    System
Volume 2     D                       DVD-ROM         0 B  Healthy
Volume 3     F                RAW    Partition    466 GB  Healthy
Volume 4     E   New Volume   NTFS   Partition    932 GB  Healthy
NitroxDM
  • We're talking a software mirror here, right? If so, great question. – Chris_K Jun 10 '10 at 05:23
  • @Chris_k Correct. Last time a disk failed I only found out by chance. On an enterprise system that is completely unacceptable. I have a backup system, but that is not the point of having a mirror. – NitroxDM Jun 10 '10 at 22:41
  • With info like that I guess now is a good time to test out that script. Windows for the win! – NitroxDM Jun 25 '10 at 17:35
  • I'm working on a solution using both of the answers listed here. – NitroxDM Aug 05 '10 at 17:38

6 Answers

5

I had the same question a while ago. The first thing I thought of was using WMI, but for some weird reason, WMI doesn't expose the health of a RAID volume through any of the normal Win32_* classes.

I eventually stumbled across the script in this article and made a few modifications to suit my requirements. It parses the output of diskpart.exe's "LIST VOLUME" command. This may seem a little dirty and ugly, but right now it's the best option I've seen.

The script as it appears on the linked page is ready to be used with Nagios / NSClient++. If you know a bit of VBScript it's easy enough to modify this to send e-mail instead of printing status information.

If you don't know VBScript, I'll gladly give you a modified version which will do whatever you want it to.
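
As a rough illustration of the parsing approach described above, here is a minimal Python sketch (it is not the VBScript from the linked article). It assumes English diskpart output laid out like the table in the question, and the names parse_list_volume and unhealthy_mirrors are made up for this example. It slices each row at the positions given by the dashed ruler line, so an empty Label column does not shift the other fields.

```python
import re

# Sample rows copied from the DISKPART LIST VOLUME output in the question.
SAMPLE = """\
Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
----------  ---  -----------  -----  ----------  -------  ---------  --------
Volume 0     X   xDrive       NTFS   Mirror       233 GB  Failed Rd
Volume 1     C                NTFS   Simple        57 GB  Healthy    System
"""


def parse_list_volume(text):
    """Return one dict per volume row, keyed by the header names."""
    lines = text.splitlines()
    # The dashed ruler line marks where each fixed-width column starts and ends.
    ruler = next((i for i, line in enumerate(lines)
                  if re.fullmatch(r"\s*-+(\s+-+)+\s*", line)), None)
    if ruler is None or ruler == 0:
        return []
    spans = [m.span() for m in re.finditer(r"-+", lines[ruler])]
    names = [lines[ruler - 1][a:b].strip() for a, b in spans]
    rows = []
    for line in lines[ruler + 1:]:
        if not line.strip():
            break  # the table ends at the first blank line
        rows.append({name: line[a:b].strip() for name, (a, b) in zip(names, spans)})
    return rows


def unhealthy_mirrors(rows):
    """Mirror volumes whose Status is not "Healthy" (English output assumed)."""
    return [r for r in rows if r["Type"] == "Mirror" and r["Status"] != "Healthy"]


for vol in unhealthy_mirrors(parse_list_volume(SAMPLE)):
    print(f"{vol['Ltr']}: ({vol['Label']}) is {vol['Status']}")
```

On a real server the text would come from piping "list volume" into diskpart, which needs administrative rights.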

ThatGraemeGuy
  • VBScript not so much. C# on the other hand ;) The script doesn't look too bad. – NitroxDM Jun 11 '10 at 06:25
  • Another article on this topic and how to work-around this problem: http://www.eventlogblog.com/blog/2012/02/how-to-make-the-windows-softwa.html – Lucky Luke Feb 28 '12 at 22:53
  • Those scripts (@LuckyLuke's and ThatGraemeGuy's) are great, but they lack language support. Both of my servers are in English, so that's fine, but my download machine is in French. From ThatGraemeGuy's script I've been able to figure out `RE0.Pattern = "Healthy|Sain" RE1.Pattern = "Mirror|RAID-5|Miroir"`, but not `RE2` and `RE3`, which are "Failed|At risk" and "Rebuild". Unfortunately, those, especially `RE2`, are the important ones. Do you know where I could get them translated into French, or is there another approach that doesn't rely on the language? – Master DJon Mar 02 '17 at 03:53
  • Good point - but it would be very time consuming to install Windows in every language and observe the strings. If I were you, I would install a French Windows in a VM and simulate a RAID failure with virtual disks. You can probably extract the strings from a DLL somewhere, but that would probably be equally time consuming. – Lucky Luke Mar 02 '17 at 12:56
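
One way to lean less on translations, offered only as a sketch: instead of listing every failure string, treat any status that is not a known healthy string as a problem. Then only the healthy word has to be known per language. The strings below are the ones quoted in this thread ("Sain" from the comment above, "Fehlerfre" from the German script in a later answer), and the function name is hypothetical.

```python
# Healthy status strings quoted in this thread: English, French, truncated German.
HEALTHY_STRINGS = {"Healthy", "Sain", "Fehlerfre"}


def status_is_ok(status):
    """True only for a known healthy string; everything else counts as a problem."""
    return status.strip() in HEALTHY_STRINGS


print(status_is_ok("Sain"))
print(status_is_ok("Failed Rd"))
```

The trade-off is that a rebuilding mirror, or any status word nobody has listed yet, is reported as a problem too.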
3
for /f "tokens=4,9 delims= " %a IN ('echo list volume ^| diskpart ^| find "SSD"') do echo %a %b

Replace find "SSD" with find "Mirror" (or "Stripe", or whatever) or with your volume name; my volumes are named SSD1 and SSD2. Note that find is case-sensitive unless you add /I.

Put it in a batch file with @echo off and you're done. :)

@echo off
for /f "tokens=4,9 delims= " %%a IN ('echo list volume ^| diskpart ^| find "SSD"') do echo %%a %%b

The line above, with the percent signs doubled (%%a), is the form needed in a batch file. =)

Notes

  • The volume needs a label for this to work. Without one, the status is the 8th token, so change tokens=4,9 to tokens=8 and echo only the first variable.
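
To see why the token number depends on the label, here is a small Python sketch that imitates how for /f with a space delimiter splits a row. It is only an illustration (nth_token is a made-up name), using rows from the question's output.

```python
def nth_token(row, n):
    """Imitate for /f "tokens=n delims= ": split on runs of spaces, 1-based index."""
    tokens = row.split()
    return tokens[n - 1] if n <= len(tokens) else ""


labelled = "Volume 0     X   xDrive       NTFS   Mirror       233 GB  Failed Rd"
unlabelled = "Volume 1     C                NTFS   Simple        57 GB  Healthy    System"
two_word_label = "Volume 4     E   New Volume   NTFS   Partition    932 GB  Healthy"

print(nth_token(labelled, 9))        # status of a labelled volume
print(nth_token(unlabelled, 8))      # without a label the status is token 8
print(nth_token(two_word_label, 9))  # a label containing a space shifts it again
```

The last row shows the weak spot of counting tokens: a label such as "New Volume" adds a token, so token 9 is no longer the status.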
Starfish
Mindfart
2

I use this ugly batch file to check the mirror status on more than one hundred servers, and the result is lovely. It is an NSClient++ plugin that runs as a passive check every four hours and sends the result to the Nagios server.

check_mirror.bat

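@echo off
rem Save the mirror rows of "list volume" in a temporary file named H.
echo list volume | diskpart | find "Mirror" > H
rem M = mirror volumes, H = healthy mirror volumes, risk = volumes "At Risk".
for /f %%i in ('type H ^| find /c "Mirror"') do set /a M=%%i
for /f %%i in ('type H ^| find "Mirror" ^| find /c "Health"') do set /a H=%%i
for /f %%i in ('type H ^| find /c "Risk"') do set /a risk=%%i
del H /q
rem echo M=%M%, H=%H%, Risk=%risk%
if %risk% GTR 0 goto err
if %M%.==0. goto nomirror
if %M% EQU %H% goto mirrorok

rem Falls through to :err when at least one mirror is not healthy.
:err
echo CRITICAL: Something Wrong.
rem Nagios plugin convention: exit code 2 means CRITICAL.
exit /B 2

:mirrorok
echo OK: Mirror Health.
exit /B 0

:nomirror
echo OK: No Mirror Found.
rem Exit code 0 matches the OK text; use 1 if a missing mirror should raise a WARNING.
exit /B 0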
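
The same counting logic, restated as a Python sketch for readers who find the batch syntax hard to follow. It assumes English diskpart output and the usual Nagios plugin exit codes (0 = OK, 2 = CRITICAL). The name check_mirror is hypothetical and not part of NSClient++.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # conventional Nagios plugin exit codes


def check_mirror(diskpart_output):
    """Return (exit_code, message) from the text printed by "list volume"."""
    mirror_lines = [line for line in diskpart_output.splitlines() if "Mirror" in line]
    healthy = sum("Health" in line for line in mirror_lines)
    at_risk = sum("Risk" in line for line in mirror_lines)
    # Any mirror that is at risk, or simply not healthy, is treated as critical.
    if at_risk or healthy != len(mirror_lines):
        return CRITICAL, "CRITICAL: Something Wrong."
    if not mirror_lines:
        return OK, "OK: No Mirror Found."
    return OK, "OK: Mirror Health."


failed = "Volume 0     X   xDrive       NTFS   Mirror       233 GB  Failed Rd"
code, message = check_mirror(failed)
print(code, message)
```

A wrapper script would pass the returned code to sys.exit so the monitoring agent sees it as the plugin's exit status.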
dawud
user191549
1

Here is a PowerShell script to monitor the health of RAID arrays on Windows, by https://gist.github.com/schakko, at https://gist.github.com/schakko/4713248

I tested it on a Windows 10 software raid and a Dell PERC H700 hardware raid on Server 2016. Worked well. We use MegaRaid for notifications.

I'm reproducing it here just in case the Gist goes away, like the target on the accepted answer https://serverfault.com/a/150089/2985

The "Fehlerfre" and "Fehlerhaf" in the script are the German status strings ("Fehlerfrei", healthy, and "Fehlerhaft", faulty), cut off by diskpart's fixed-width Status column.

# A simple PowerShell script for retrieving the RAID status of volumes with help of diskpart.
# The nicer solution would be using WMI (which does not contain the RAID status in the Status field of Win32_DiskDrive, Win32_LogicalDisk or Win32_Volume for unknown reason)
# or using the new PowerShell API introduced with Windows 8 (wrong target system as our customer uses a Windows 7 architecture).
# 
# diskpart requires administrative privileges so this script must be executed under an administrative account if it is executed standalone.
# check_mk has these privileges, so the script only needs to be copied to your check_mk/plugins directory and you are done.
#
# Christopher Klein <ckl[at]neos-it[dot]de>
# This script is distributed under the GPL v2 license.

$dp = "list volume" | diskpart | ? { $_ -match "^  [^-]" }

echo `<`<`<local`>`>`>
foreach ($row in $dp) {
    # skip first line
    if (!$row.Contains("Volume ###")) {
        # best match RegExp from http://www.eventlogblog.com/blog/2012/02/how-to-make-the-windows-softwa.html
        if ($row -match "\s\s(Volume\s\d+)\s+([A-Z])\s+(.*)\s\s(NTFS|FAT)\s+(Mirror|RAID-5|Stripe|Spiegel|Spiegelung|Übergreifend|Spanned)\s+(\d+)\s+(..)\s\s([A-Za-z]*\s?[A-Za-z]*)(\s\s)*.*")  {
            $disk = $matches[2]
            # 0 = OK, 1 = WARNING, 2 = CRITICAL
            $statusCode = 1
            $status = "WARNING"
            # Default text, used when none of the status patterns below match.
            $statusText = "could not be parsed: $row"
            $line = $row
            
            if ($line -match "Fehlerfre |OK|Healthy") {
                $statusText = "is healthy"
                $statusCode = 0
                $status = "OK"
            }
            elseif ($line -match "Rebuild") {
                $statusText = "is rebuilding"
                $statusCode = 1
            }
            elseif ($line -match "Failed|At Risk|Fehlerhaf") {
                $statusText = "failed"
                $statusCode = 2
                $status = "CRITICAL"
            }
        
            echo "$statusCode microsoft_software_raid - $status - Software RAID on disk ${disk}:\ $statusText"
        }
    }
}

This version from https://gist.github.com/LionRelaxe also emails error results (https://gist.github.com/schakko/4713248#gistcomment-2256715)

# A simple PowerShell script for retrieving the RAID status of volumes with help of diskpart.
# The nicer solution would be using WMI (which does not contain the RAID status in the Status field of Win32_DiskDrive, Win32_LogicalDisk or Win32_Volume for unknown reason)
# or using the new PowerShell API introduced with Windows 8 (wrong target system as our customer uses a Windows 7 architecture).
# 
# diskpart requires administrative privileges so this script must be executed under an administrative account if it is executed standalone.
# check_mk has these privileges, so the script only needs to be copied to your check_mk/plugins directory and you are done.
#
# Christopher Klein <ckl[at]neos-it[dot]de>
# This script is distributed under the GPL v2 license.

#Volumes:
$dpV = "list volume" | diskpart | ? { $_ -match "^  [^-]" }
foreach ($row in $dpV) {
    $OutString = $OutString+$row+"`r`n"
    # skip first line
    if (!$row.Contains("Volume ###")) {
        # best match RegExp from http://www.eventlogblog.com/blog/2012/02/how-to-make-the-windows-softwa.html
        if ($row -match "\s\s(Volume\s\d+)\s+([A-Z])\s+(.*)\s\s(NTFS|FAT)\s+(Mirror|RAID-5|Stripe|Spiegel|Spiegelung|Übergreifend|Spanned)\s+(\d+)\s+(..)\s\s([A-Za-z]*\s?[A-Za-z]*)(\s\s)*.*")  {
            $disk = $matches[2]
            # 0 = OK, 1 = WARNING, 2 = CRITICAL
            $statusCode = 1
            $status = "WARNING"
            # Default text, used when none of the status patterns below match.
            $statusText = "could not be parsed: $row"
            $line = $row
            
            if ($line -match "Fehlerfre |OK|Healthy") {
                $statusText = "is healthy"
                $statusCode = 0
                $status = "OK"
            }
            elseif ($line -match "Rebuild") {
                $statusText = "is rebuilding"
                $statusCode = 1
                $VolumeErrorFound = 1
            }
            elseif ($line -match "Failed|At Risk|Fehlerhaf") {
                $statusText = "failed"
                $statusCode = 2
                $status = "CRITICAL"
                $VolumeErrorFound = 1
            }
        
            #echo "$statusCode microsoft_software_raid - $status - Software RAID on disk ${disk}:\ $statusText"
        }
    }
}
$OutString = $OutString+"`r`n"

#Disk:
$dpD = "list disk" | diskpart | ? { $_ -match "^  [^-]" }
foreach ($row in $dpD) {
    # "list disk" prints a "Disk ###" header rather than "Volume ###", so this test keeps every row, header included
    if (!$row.Contains("Volume ###")) {
        $OutString = $OutString+$row+"`r`n"
        # flag any disk whose row reports "Errors"
        if ($row -match "Errors") {
            #echo "$row"
            $DiskErrorFound = 1
        }
    }
}
if (($DiskErrorFound) -Or ($VolumeErrorFound)) {
    $SMTPServer = "your.smtp.server"
    $SMTPPort = 25
    $EmailTo = "ENTERYOUR@EMAIL.HERE"
    $EmailFrom = "ENTERYOUR@EMAIL.HERE"
    $EmailSubject = "Disk or Volume Error on $env:computername"
    $EmailBody = $OutString
    Send-MailMessage -To $EmailTo -From $EmailFrom -Subject $EmailSubject -Body $EmailBody -SmtpServer $SMTPServer -Port $SMTPPort
}
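
For comparison, a minimal Python sketch of the same email step using the standard library's smtplib. The server name and addresses are placeholders, as in the script above, and build_alert and send_alert are names invented for this example.

```python
import smtplib
import socket
from email.message import EmailMessage


def build_alert(report, sender, recipient, host=None):
    """Build the alert mail; the subject mirrors the PowerShell script above."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Disk or Volume Error on {host or socket.gethostname()}"
    msg.set_content(report)
    return msg


def send_alert(msg, smtp_server, smtp_port=25):
    """Send over plain SMTP (no TLS or login; assumes an internal relay)."""
    with smtplib.SMTP(smtp_server, smtp_port) as smtp:
        smtp.send_message(msg)


alert = build_alert("Volume 0  X  Mirror  Failed Rd",
                    "ENTERYOUR@EMAIL.HERE", "ENTERYOUR@EMAIL.HERE", host="SERVER1")
print(alert["Subject"])
# send_alert(alert, "your.smtp.server")  # placeholder server name, left commented out
```

Building the message separately from sending it makes the text easy to check without a mail server.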
Chris Vesper
1

Smartmontools (http://sourceforge.net/apps/trac/smartmontools/wiki) has a Windows version, but I don't know whether it runs on 2K8.
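
A hedged sketch of how smartmontools' smartctl could be driven from a script, assuming it is installed and on the PATH. The flags -H (overall health) and -t short (start a short self-test) are standard smartctl options. The parser assumes the ATA-style "overall-health self-assessment test result" line, since SCSI and other drive types word it differently, and the function names are hypothetical.

```python
import re
import subprocess


def smart_health_passed(smartctl_output):
    """True/False from the ATA health line, or None if the line is absent."""
    m = re.search(r"overall-health self-assessment test result:\s*(\w+)",
                  smartctl_output)
    return None if m is None else m.group(1) == "PASSED"


def check_health(device):
    """Run `smartctl -H <device>` and interpret its output."""
    result = subprocess.run(["smartctl", "-H", device],
                            capture_output=True, text=True)
    return smart_health_passed(result.stdout)


def start_short_test(device):
    """Ask the drive to start a short self-test; results are read later."""
    return subprocess.run(["smartctl", "-t", "short", device],
                          capture_output=True, text=True).returncode


# Illustrative line only, not captured from a real drive in this thread.
print(smart_health_passed("SMART overall-health self-assessment test result: PASSED"))
```

The device name to pass (for example /dev/sda) depends on how smartctl enumerates drives on the machine, so it is left as a parameter.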

lrosa
0

While all of these answers will get you the status, none of them is the correct answer.

In an enterprise environment, you should be using enterprise-grade server and network monitoring tools. Pretty much all of the monitoring tools I have used automatically monitor the health of any supported RAID array, software or hardware. They also monitor other things that you should be aware of, such as temperature, free disk space, etc. Do you really want to create a custom script for every possible thing that needs to be monitored?

Do yourself a favor and skip all this shoe-string and bubble-gum stuff and use the right tool.

longneck
  • Tools like what? – NitroxDM Sep 26 '12 at 22:46
  • SolarWinds, N-able, WhatsUp, Spiceworks, even HP Insight Manager – longneck Sep 27 '12 at 02:31
  • I don't see how your answer is relevant or helpful; it's just an opinion. There are a lot more capable tools than SolarWinds (N-able is from SolarWinds, btw). Things have also changed, and software RAID is not "shoe-string" anymore: http://www.smbitjournal.com/2016/12/the-software-raid-inflection-point/ – Lucky Luke Mar 02 '17 at 12:50
  • I didn't say that software raid is shoe-string. I said developing shoe-string processes or procedures for monitoring was a bad idea. (And at the time I wrote my answer, SolarWinds hadn't bought n-able yet.) – longneck Mar 02 '17 at 14:11