I just got a workstation with an Intel SSD DC P3600 1.2TB on an Asus X99-E WS motherboard. I started Ubuntu 15.04 from a live CD and ran the Disks (gnome-disks) application to benchmark the SSD. The disk shows up as /dev/nvme0n1.
. I ran the default benchmark (using 100 samples of 10 MB each, sampled randomly from the entire disk) and the results are disappointing: average read rate is 720 MB/s, average write rate is 805 MB/s (greater than the read rate!?) and average access time is 0.12 ms. Furthermore, the only information about the disk that Disks shows is its size - there is no model name or any other info.
I am unable to connect this machine to the network before it is set up, due to corporate policy, so I cannot use any diagnostic tools (I wanted to follow the official documentation) apart from what is preinstalled. The documentation states that the NVMe driver is preinstalled in Linux kernel 3.19, and Ubuntu 15.04 has 3.19.0-15-generic, so that should not be the problem.
The dd if=/dev/zero of=/dev/nvme0n1 bs=1M oflag=direct command from the documentation gives me a write rate of about 620 MB/s, and hdparm -tT --direct /dev/nvme0n1 gives 657 MB/s O_DIRECT cached reads and 664 MB/s O_DIRECT disk reads.
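For what it's worth, both dd and hdparm issue one I/O at a time (queue depth 1), which tends to understate what an NVMe drive can deliver even on a correctly trained link. If a benchmarking tool can be preinstalled or copied onto the machine, something like the following fio run (a sketch; fio is not part of the stock live CD, and the job parameters are my choice, not from the documentation) gives a more representative sequential read number:

```shell
#!/bin/sh
# Sketch: sequential read benchmark with direct I/O and queue depth 32.
# Guarded so it is a no-op on machines without fio or the device.
if command -v fio >/dev/null 2>&1 && [ -e /dev/nvme0n1 ]; then
    fio --name=seqread --filename=/dev/nvme0n1 --rw=read --bs=1M \
        --ioengine=libaio --iodepth=32 --direct=1 \
        --runtime=30 --time_based
fi
```

With --direct=1 the page cache is bypassed, so the reported bandwidth reflects the device and the bus rather than RAM.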
In the BIOS, I pinned the PCIe slot the disk is connected to to PCIe v3.0 mode, and I do not use UEFI boot.
Edit 1: The PC supplier connected the SSD to the mainboard using a Hot-swap Backplane PCIe Combination Drive Cage Kit for P4000 Server Chassis FUP8X25S3NVDK (2.5in NVMe SSD).
The device is physically plugged into a PCIe 3.0 x16 slot, but lspci under CentOS 7 and Ubuntu 15.04 lists it as running at PCIe v1.0 x4 (LnkSta reports 2.5 GT/s, which is the PCIe v1.0 rate):
[user@localhost ~]$ sudo lspci -vvv -s 6:0.0
06:00.0 Non-Volatile memory controller: Intel Corporation PCIe Data Center SSD (rev 01) (prog-if 02 [NVM Express])
Subsystem: Intel Corporation DC P3600 SSD [2.5" SFF]
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 40
Region 0: Memory at fb410000 (64-bit, non-prefetchable) [size=16K]
Expansion ROM at fb400000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI-X: Enable+ Count=32 Masked-
Vector table: BAR=0 offset=00002000
PBA: BAR=0 offset=00003000
Capabilities: [60] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <4us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
MaxPayload 256 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <4us, L1 <4us
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [150 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
Status: NegoPending- InProgress-
Capabilities: [180 v1] Power Budgeting <?>
Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
ARICap: MFVC- ACS-, Next Function: 0
ARICtl: MFVC- ACS-, Function Group: 0
Capabilities: [270 v1] Device Serial Number 55-cd-2e-40-4b-fa-80-bc
Capabilities: [2a0 v1] #19
Kernel driver in use: nvme
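The LnkCap/LnkSta mismatch above (capable of 8 GT/s, trained at 2.5 GT/s) can also be cross-checked from sysfs. Note this is a sketch for a newer system: the current_link_speed/max_link_speed attributes are not exposed by the 3.19 kernel, and 0000:06:00.0 is the SSD's bus address taken from the lspci output above.

```shell
#!/bin/sh
# Sketch: compare the negotiated PCIe link against the device's capability.
DEV=0000:06:00.0                 # the SSD's bus address from lspci above
SYS=/sys/bus/pci/devices/$DEV

# Map a sysfs "<rate> GT/s" string to a PCIe generation number.
gen_from_speed() {
    case "$1" in
        "2.5 GT/s"*) echo 1 ;;
        "5.0 GT/s"*) echo 2 ;;
        "8.0 GT/s"*) echo 3 ;;
        *)           echo "unknown" ;;
    esac
}

if [ -e "$SYS/current_link_speed" ]; then
    echo "negotiated: gen $(gen_from_speed "$(cat "$SYS/current_link_speed")") x$(cat "$SYS/current_link_width")"
    echo "capable of: gen $(gen_from_speed "$(cat "$SYS/max_link_speed")") x$(cat "$SYS/max_link_width")"
fi
```

On this machine one would expect it to report a negotiated gen 1 x4 link against a gen 3 x4 capability, matching the LnkSta and LnkCap lines.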
Edit 2:
I tested the drive under CentOS 7 and the performance is identical to what I got on Ubuntu. I have to mention that the official documentation states that Intel tested this SSD on CentOS 6.7, which does not seem to exist: after 6.6 came CentOS 7.
Another source of confusion: benchmark results vary depending on the physical PCIe slot I connect the drive to. Slots 1-3 give the described performance, while on slots 4-7 the SSD achieves 100 MB/s higher read speed.
The only other PCIe device in the computer is an EVGA Nvidia GT 210 GPU with 512 MB RAM, which seems to be a PCIe 2.0 x16 device; however, its LnkSta indicates PCIe v1.0 (2.5 GT/s) x8:
[user@localhost ~]$ sudo lspci -vvv -s a:0.0
0a:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 210] (rev a2) (prog-if 00 [VGA controller])
Subsystem: eVga.com. Corp. Device 1313
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 114
Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
Region 5: I/O ports at e000 [size=128]
Expansion ROM at fb000000 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Address: 00000000fee005f8 Data: 0000
Capabilities: [78] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #8, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 128 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [b4] Vendor Specific Information: Len=14 <?>
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
Status: NegoPending- InProgress-
Capabilities: [128 v1] Power Budgeting <?>
Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Kernel driver in use: nouveau
Edit 3:
I have now connected the workstation to the network, installed Intel's Solid-State Drive Data Center Tool (isdct) and updated the firmware, but the benchmark results haven't changed. What is interesting is its output:
[user@localhost ~]$ sudo isdct show -a -intelssd
ls: cannot access /dev/sg*: No such file or directory
- IntelSSD CVMD5130002L1P2HGN -
AggregationThreshold: 0
Aggregation Time: 0
ArbitrationBurst: 0
AsynchronousEventConfiguration: 0
Bootloader: 8B1B012F
DevicePath: /dev/nvme0n1
DeviceStatus: Healthy
EnduranceAnalyzer: 17.22 Years
ErrorString:
Firmware: 8DV10151
FirmwareUpdateAvailable: Firmware is up to date as of this tool release.
HighPriorityWeightArbitration: 0
Index: 0
IOCompletionQueuesRequested: 30
IOSubmissionQueuesRequested: 30
LBAFormat: 0
LowPriorityWeightArbitration: 0
ProductFamily: Intel SSD DC P3600 Series
MaximumLBA: 2344225967
MediumPriorityWeightArbitration: 0
MetadataSetting: 0
ModelNumber: INTEL SSDPE2ME012T4
NativeMaxLBA: 2344225967
NumErrorLogPageEntries: 63
NumLBAFormats: 6
NVMePowerState: 0
PCILinkGenSpeed: 1
PCILinkWidth: 4
PhysicalSize: 1200243695616
PowerGovernorMode: 0 (25W)
ProtectionInformation: 0
ProtectionInformationLocation: 0
RAIDMember: False
SectorSize: 512
SerialNumber: CVMD5130002L1P2HGN
SystemTrimEnabled:
TempThreshold: 85 degree C
TimeLimitedErrorRecovery: 0
TrimSupported: True
WriteAtomicityDisableNormal: 0
Specifically, it lists PCILinkGenSpeed as 1 and PCILinkWidth as 4. I haven't found out what an NVMePowerState of 0 means.
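A gen 1 x4 link is also consistent with the benchmark numbers: a back-of-envelope check (the figures below are the usual textbook per-lane payload rates, not measurements from this machine) shows the link itself caps throughput near 1 GB/s.

```shell
#!/bin/sh
# Back-of-envelope PCIe bandwidth: per-lane payload rate in MB/s by generation.
# Gen1/2 use 8b/10b encoding (20% overhead); gen3 uses 128b/130b.
pcie_lane_mbs() {
    case "$1" in
        1) echo 250 ;;   # 2.5 GT/s * 8/10 / 8 bits
        2) echo 500 ;;   # 5.0 GT/s * 8/10 / 8 bits
        3) echo 985 ;;   # 8.0 GT/s * 128/130 / 8 bits
        *) echo 0 ;;
    esac
}

echo "gen1 x4 ceiling: $(( $(pcie_lane_mbs 1) * 4 )) MB/s"   # what the link trains at now
echo "gen3 x4 ceiling: $(( $(pcie_lane_mbs 3) * 4 )) MB/s"   # what the drive's slot supports
```

So 720-805 MB/s is roughly what a gen 1 x4 link can carry once protocol overhead is subtracted; at gen 3 x4 the bus would no longer be the bottleneck for this drive.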
My question:
- How do I make the SSD run at PCIe v3.0 x4 speed?
Can you take pictures of the mystery card? And...it's Ubuntu; I wouldn't expect it to perform well. – Michael Hampton – 2015-06-09T14:43:07.287
Give us your lspci please. – Konrad Gajewski – 2015-06-09T15:10:40.727
@Michael you shouldn't expect Ubuntu to perform well out of the box in a few situations. ;-) – Konrad Gajewski – 2015-06-09T15:13:30.057
I cannot take pictures of the card, and I've inspected the lspci output again and there is nothing there about any kind of SSD or storage device. Unfortunately this machine is off the network so I cannot post the output either. I know it makes the problem more difficult, sorry about that. – nedim – 2015-06-09T15:22:10.473
I have some updated info, hope it helps. – nedim – 2015-06-10T11:52:15.217
For context, here is the Intel SSD DC P3600 spec sheet. Quoted performance is up to 2600 MB/s read and up to 1700 MB/s write in sequential workloads, typical sequential latency of 20 µs for both read and write, and random 4 KB IOPS of up to 450K read and 56K write. 700-800 MB/s for 10 MB reads is well below what should be expected.
– a CVn – 2015-06-13T18:53:00.243
8 GT/s on PCIe indicates PCIe v3.x, which allows for up to 985 MB per second per lane. With x4, you should be able to get up to 3940 MB/s on-bus bandwidth. It's almost as if only a single lane is being used... what other PCIe devices do you have in that system? – a CVn – 2015-06-13T18:58:30.607
@MichaelKjörling so Capabilities: [60] Express (v2) Endpoint, MSI 00 does not mean PCIe v2.0? It is strange that only one lane is being used and that the slot is shown as x4 when the motherboard supports up to 4 v3.0 x16 slots. The only other device is the Nvidia GPU I just described in Edit 2. – nedim – 2015-06-14T19:35:01.373
@KonradGajewski I have now connected the workstation to the network and updated the post with new information, I hope it helps. – nedim – 2015-06-15T12:34:34.743
Reading the specs of the board you posted I find:
7 "x16 slots". If we want those physical x16 slots connected with x16 electrical lanes all the time then we need 112 PCI-e lanes.
The same page lists 40-lane and 28-lane CPUs and the X99 chipset (8 PCI-e v2 lanes). (See http://www.intel.com/content/dam/www/public/us/en/images/diagrams/rwd/x99-chipset-block-diagram-rwd.jpg/jcr:content/renditions/intel.web.720.405.jpg for the chipset).
That probably means that some of the slots are not electrically x16 all the time. Either they share lanes (e.g. x16/unused or x8/x8) or they are always hardwired to x8, x4 or similar.
In addition to that, PCI-e-to-memory performance may be slower if traffic needs to go via the chipset to the CPU and from there to memory (even if both use the same PCI-e version).
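One quick way to see how the lanes actually got distributed across the slots is to print the negotiated LnkSta line for every device at once (a sketch; the awk filter is mine, and lspci needs root to show the Express capability):

```shell
#!/bin/sh
# Sketch: one "device  LnkSta" line per PCIe device, so you can see at a
# glance which slot trained at which speed and width.
sudo lspci -vv 2>/dev/null |
    awk '/^[0-9a-f]/ { dev = $1 }        # remember the current device address
         /LnkSta:/   { print dev, $0 }'  # print its negotiated link status
```

Moving the SSD between slots and re-running this shows immediately whether a given slot trains it at x4 gen1 or better.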
So in short: check the motherboard manual to see how many electrical lanes each slot actually gets with your CPU, and make sure the SSD sits in a slot whose lanes come directly from the CPU rather than from the chipset.
@Hennes I already checked this and populated the PCIe slots in the correct order as indicated in the motherboard manual. Anyway, the issue is now settled as there is a solution. – nedim – 2015-07-22T09:39:30.703