Intel SSD DC P3600 1.2TB performance

I just got a workstation with an Intel SSD DC P3600 1.2TB on an Asus X99-E WS motherboard. I booted Ubuntu 15.04 from a live CD and ran the Disks (gnome-disks) application to benchmark the SSD. The disk appears as /dev/nvme0n1. I ran the default benchmark (100 samples of 10 MB each, read from random locations across the entire disk) and the results are disappointing: the average read rate is 720 MB/s, the average write rate is 805 MB/s (higher than the read rate!?), and the average access time is 0.12 ms. Furthermore, the only information about the disk that Disks shows is its size; there is no model name or any other information.
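For a more controlled baseline than the Disks defaults, a sequential direct-I/O read test with fio would be one option (a sketch only; it assumes fio can be installed, and the --readonly flag keeps it from writing to the device):

sudo fio --name=seqread --filename=/dev/nvme0n1 --readonly --rw=read --bs=1M --direct=1 --ioengine=libaio --iodepth=32 --runtime=30 --time_based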

I am unable to connect this machine to the network before it is set up, due to corporate policy, so I cannot use any diagnostic tools beyond what is preinstalled (I wanted to follow the official documentation). The documentation states that the NVMe driver is included in Linux kernel 3.19, and Ubuntu 15.04 has 3.19.0-15-generic, so that should not be the problem. The

dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct

command from the documentation gives me a write rate of about 620 MB/s and

hdparm -tT --direct /dev/nvme0n1

gives 657 MB/s O_DIRECT cached reads and 664 MB/s O_DIRECT disk reads.
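The documented dd command only tests writes; a direct-I/O read with dd gives a rough read-side counterpart (a sketch; the count is arbitrary and covers only the first ~10 GB of the device):

sudo dd if=/dev/nvme0n1 of=/dev/null bs=1M count=10000 iflag=direct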

In the BIOS, I forced the PCIe slot hosting the disk to PCIe v3.0 mode, and I do not use UEFI boot.

Edit 1: The PC supplier connected the SSD to the mainboard using Hot-swap Backplane PCIe Combination Drive Cage Kit for P4000 Server Chassis FUP8X25S3NVDK (2.5in NVMe SSD).

The device is physically plugged into a PCIe 3.0 x16 slot, but lspci under CentOS 7 and Ubuntu 15.04 reports the link as PCIe v1.0 x4 (LnkSta shows 2.5 GT/s, which is the PCIe v1.0 signalling rate):

[user@localhost ~]$ sudo lspci -vvv -s 6:0.0
06:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01) (prog-if 02 [NVM Express])
    Subsystem: Intel Corporation DC P3600 SSD [2.5" SFF]
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 40
    Region 0: Memory at fb410000 (64-bit, non-prefetchable) [size=16K]
    Expansion ROM at fb400000 [disabled] [size=64K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI-X: Enable+ Count=32 Masked-
        Vector table: BAR=0 offset=00002000
        PBA: BAR=0 offset=00003000
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <4us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <4us, L1 <4us
            ClockPM- Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
        CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
        AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
    Capabilities: [150 v1] Virtual Channel
        Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb:    Fixed- WRR32- WRR64- WRR128-
        Ctrl:   ArbSelect=Fixed
        Status: InProgress-
        VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Capabilities: [180 v1] Power Budgeting <?>
    Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Capabilities: [270 v1] Device Serial Number 55-cd-2e-40-4b-fa-80-bc
    Capabilities: [2a0 v1] #19
    Kernel driver in use: nvme
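To watch just the negotiated link without the full dump, the relevant lines can be filtered out of lspci (06:00.0 is the SSD's address from the output above):

sudo lspci -vvv -s 06:00.0 | grep -E 'LnkCap:|LnkSta:'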

Edit 2:

I tested the drive under CentOS 7 and the performance is identical to what I got on Ubuntu. I have to mention that the official documentation states that Intel tested this SSD on CentOS 6.7, which does not seem to exist; after 6.6 came CentOS 7.

Another source of confusion: benchmark results vary depending on the physical PCIe slot I connect the drive to. Slots 1-3 give the described performance, while on slots 4-7 the SSD achieves 100 MB/s higher read speed.
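To see which upstream port each slot actually hangs off (CPU lanes versus the X99 chipset), the PCI topology tree can be inspected, for example with:

sudo lspci -tv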

The only other PCIe device in the computer is an EVGA Nvidia GT 210 GPU with 512 MB of RAM, which appears to be a PCIe 2.0 x16 device; however, its LnkSta also indicates PCIe v1.0 (2.5 GT/s), at width x8:

[user@localhost ~]$ sudo lspci -vvv -s a:0.0
0a:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 210] (rev a2) (prog-if 00 [VGA controller])
    Subsystem: eVga.com. Corp. Device 1313
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 114
    Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
    Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
    Region 5: I/O ports at e000 [size=128]
    Expansion ROM at fb000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee005f8  Data: 0000
    Capabilities: [78] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #8, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
             EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Capabilities: [100 v1] Virtual Channel
        Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb:    Fixed- WRR32- WRR64- WRR128-
        Ctrl:   ArbSelect=Fixed
        Status: InProgress-
        VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Capabilities: [128 v1] Power Budgeting <?>
    Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nouveau

Edit 3:

I have now connected the workstation to the network, installed Intel's Solid-State Drive Data Center Tool (isdct), and updated the firmware, but the benchmark results have not changed. What is interesting is the tool's output:

[user@localhost ~]$ sudo isdct show -a -intelssd 
ls: cannot access /dev/sg*: No such file or directory
- IntelSSD CVMD5130002L1P2HGN -
AggregationThreshold: 0
Aggregation Time: 0
ArbitrationBurst: 0
AsynchronousEventConfiguration: 0
Bootloader: 8B1B012F
DevicePath: /dev/nvme0n1
DeviceStatus: Healthy
EnduranceAnalyzer: 17.22 Years
ErrorString: 
Firmware: 8DV10151
FirmwareUpdateAvailable: Firmware is up to date as of this tool release.
HighPriorityWeightArbitration: 0
Index: 0
IOCompletionQueuesRequested: 30
IOSubmissionQueuesRequested: 30
LBAFormat: 0
LowPriorityWeightArbitration: 0
ProductFamily: Intel SSD DC P3600 Series
MaximumLBA: 2344225967
MediumPriorityWeightArbitration: 0
MetadataSetting: 0
ModelNumber: INTEL SSDPE2ME012T4
NativeMaxLBA: 2344225967
NumErrorLogPageEntries: 63
NumLBAFormats: 6
NVMePowerState: 0
PCILinkGenSpeed: 1
PCILinkWidth: 4
PhysicalSize: 1200243695616
PowerGovernorMode: 0 (25W)
ProtectionInformation: 0
ProtectionInformationLocation: 0
RAIDMember: False
SectorSize: 512
SerialNumber: CVMD5130002L1P2HGN
SystemTrimEnabled: 
TempThreshold: 85 degree C
TimeLimitedErrorRecovery: 0
TrimSupported: True
WriteAtomicityDisableNormal: 0

Specifically, it lists PCILinkGenSpeed as 1 and PCILinkWidth as 4. I have not found out what an NVMePowerState of 0 means.
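As a cross-check of the reported link, newer kernels also expose the negotiated speed and width in sysfs (these attributes may not be present on kernel 3.19, in which case the lspci output above is the fallback; 0000:06:00.0 is the SSD's PCI address):

cat /sys/bus/pci/devices/0000:06:00.0/current_link_speed
cat /sys/bus/pci/devices/0000:06:00.0/current_link_width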

My question:

  1. How do I make the SSD run at PCIe v3.0 x4 speed?

nedim

Posted 2015-06-09T10:10:35.097

Reputation: 184

Can you take pictures of the mystery card? And...it's Ubuntu; I wouldn't expect it to perform well. – Michael Hampton – 2015-06-09T14:43:07.287

Give us your lspci output, please. – Konrad Gajewski – 2015-06-09T15:10:40.727

@Michael you shouldn't expect Ubuntu to perform well out of the box in a few situations. ;-) – Konrad Gajewski – 2015-06-09T15:13:30.057

I cannot take pictures of the card and I've inspected the lspci output again and there is nothing there about any kind of SSD or storage device. Unfortunately this machine is off the network so I cannot post the output either. I know it makes the problem more difficult, sorry about that. – nedim – 2015-06-09T15:22:10.473

I have some updated info, hope it helps. – nedim – 2015-06-10T11:52:15.217

For context, here is the Intel SSD DC P3600 spec sheet. Quoted performance is up to 2600 MB/s sequential read and up to 1700 MB/s sequential write, typical sequential latency of 20 µs for both read and write, and random 4 KB IOPS of up to 450K read and 56K write (?!). 700-800 MB/s for 10 MB reads is well below what should be expected.

– a CVn – 2015-06-13T18:53:00.243

8 GT/s on PCIe indicates PCIe v3.x, which allows for up to 985 MB per second per lane. With x4, you should be able to get up to 3940 MB/s on-bus bandwidth. It's almost as if only a single lane is being used... what other PCIe devices do you have in that system?

– a CVn – 2015-06-13T18:58:30.607

@MichaelKjörling so Capabilities [60] Express (v2) Endpoint, MSI 00 does not mean PCIe v2.0? It is strange that only one lane is being used and that the slot is shown as x4 when the motherboard supports up to 4 v3.0 x16 slots. The only other device is the Nvidia GPU I just described in Edit 2. – nedim – 2015-06-14T19:35:01.373

@KonradGajewski I have now connected the workstation to the network and updated the post with new information, I hope it helps – nedim – 2015-06-15T12:34:34.743

Reading the specs of the board you posted I find:

7 "x16 slots". If we want those physical x16 slots connected with x16 electrical lanes all the time then we need 112 PCI-e lanes.

The same page lists 40-lane and 28-lane CPUs and the X99 chipset (8 PCI-e v2 lanes). (See http://www.intel.com/content/dam/www/public/us/en/images/diagrams/rwd/x99-chipset-block-diagram-rwd.jpg/jcr:content/renditions/intel.web.720.405.jpg for the chipset).

– Hennes – 2015-07-22T08:42:02.747

That probably means that some of the slots are not electrically x16 all the time. Either they share lanes (e.g. x16/unused or x8/x8) or they are always hardwired to x8, x4 or similar.

In addition to that, PCI-e to memory performance may be slower if it needs to go via the chipset to the CPU and from there to memory (even if both use the same PCI-e version).

So in short:

  1. Performance difference between slots is expected.
  2. Can you look up how the slots are wired?
  3. < – Hennes – 2015-07-22T08:42:06.010

@Hennes I already checked this and populated the PCIe slots in the correct order as indicated in the motherboard manual. Anyway, the issue is now settled as there is a solution. – nedim – 2015-07-22T09:39:30.703

Answers

This is a hardware issue.

The Hot-swap Backplane PCIe Combination Drive Cage Kit for P4000 Server Chassis FUP8X25S3NVDK (2.5in NVMe SSD) seems to be incompatible with the Asus X99-E WS motherboard. The solution is to connect the SSD using the Asus HyperKit. However, this solution requires a cable between the HyperKit and the SSD which is not bundled with either of them and is also not available for purchase at this time. Such a cable is bundled with the Intel SSD 750 Series (2.5" form factor) and our supplier was able to deliver one as a special service.

Beware of hardware incompatibility issues.

nedim

Posted 2015-06-09T10:10:35.097

Reputation: 184