Which CPUs support 1GB pages?


Some Intel CPUs support 1GB pages. This is identified by looking at CPUID 0x80000001, EDX bit 26. The Linux kernel exposes this via /proc/cpuinfo as the pdpe1gb flag.
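
For reference, here is a minimal sketch of that check in C, using GCC/Clang's __get_cpuid helper from <cpuid.h> (on Linux, grep pdpe1gb /proc/cpuinfo reports the same bit):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned int eax, ebx, ecx, edx;

        /* Extended leaf 0x80000001: extended processor feature bits. */
        if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx)) {
            fprintf(stderr, "CPUID leaf 0x80000001 not available\n");
            return 1;
        }

        /* EDX bit 26 is the 1GB-page feature bit (pdpe1gb in /proc/cpuinfo). */
        printf("1GB pages %ssupported\n", (edx & (1u << 26)) ? "" : "NOT ");
        return 0;
    }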

Where do we find out which CPUs support this, and which ones do not? Or which product lines support this feature? There's nothing on these Intel ARK pages that indicates support for it.

CPUs that do support 1GB pages:

Other CPUs that do not support 1GB pages:

Jonathon Reinhart

Posted 2014-02-03T19:11:09.123

Related: How to use Intel Westmere 1GB pages on Linux?

– Jonathon Reinhart – 2015-01-20T07:33:37.320

Answers

According to this page:

In addition to their standard 4 KB pages, newer x86-64 processors, such as AMD's newer AMD64 processors and Intel's Westmere and later processors, can use 1 GB pages in long mode.

This seems to be true, as 1 GB pages were a new feature of Westmere CPUs.

Viacheslav Rodionov

Posted 2014-02-03T19:11:09.123

And what about Haswell? Does it have 1 GB pages in long mode? Specifically in the consumer segments (i5 and lower), medium-range to low? – shirish – 2015-03-05T13:28:14.330

@JonathonReinhart: the huge downside to hugepages for general use, esp. 1G pages, is that the entire hugepage ties up that much physical RAM. If a process allocates 1GiB normally, only the parts that it's ever touched actually take up physical memory. (Overcommit even allows allocations that the kernel doesn't have enough swap space to handle.) Linux can't page hugepages to disk even when a process is stopped, so a hugepage allocation effectively pins / locks that much physical memory. – Peter Cordes – 2015-09-11T07:01:10.027
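
To make that pinning concrete, here is a minimal sketch of a 1 GiB hugepage allocation, assuming a Linux 3.8+ kernel booted with something like hugepagesz=1G hugepages=1 (MAP_HUGE_1GB encodes log2(1 GiB) = 30 at MAP_HUGE_SHIFT = 26):

    #define _GNU_SOURCE            /* for MAP_HUGETLB on glibc */
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #ifndef MAP_HUGE_1GB
    #define MAP_HUGE_1GB (30 << 26)    /* log2(1 GiB) << MAP_HUGE_SHIFT */
    #endif

    int main(void)
    {
        size_t len = 1UL << 30;        /* one 1 GiB page */

        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB | MAP_HUGE_1GB,
                       -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");            /* ENOMEM if no 1 GiB pages are reserved */
            return 1;
        }

        /* Faulting the page in commits the entire gigabyte of physical RAM,
         * and the kernel cannot swap any of it out. */
        memset(p, 0x5a, len);

        munmap(p, len);
        return 0;
    }

Once the memset faults the page in, the full gigabyte stays resident until munmap or exit; the kernel has no way to reclaim it piecemeal.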

2M hugepages make sense when they won't be left half empty (e.g. when you know for sure that you are going to write every 4k of the 2M anyway), but the lack of paging is a big deal. Designing general-purpose software to work horribly badly on memory-constrained desktops is not a good idea. I don't think you can even mmap a file on disk with 2M hugepages, but it would be a bad idea for executables because there will be some 4k pages within a 2M block that aren't touched. These can get evicted from the pagecache (assuming they were prefetched), freeing RAM. – Peter Cordes – 2015-09-11T07:04:56.440

With current CPUs having multi-level TLBs, the total time spent on TLB misses is probably not too bad, is it? I haven't profiled big stuff like firefox. I'd be interested to see a readable summary of how much time it spends on TLB misses, (esp. page-walks), and stuff like L1 I-cache misses. I know I could just point perf at it... Even if you did want to use 2M hugepages for firefox, I'd guess a lot of its internal data is allocated in smaller chunks than that. There'd be overhead to making sure you minimized external fragmentation of allocations inside a hugepage buffer. – Peter Cordes – 2015-09-11T07:08:42.303

@shirish I've updated my question with more CPUs. Yes, the i5-4250U, a lower-power Haswell, does support pdpe1gb. – Jonathon Reinhart – 2016-01-20T21:20:36.727

TLB misses are expensive on high-memory, random-access operations, such as many database applications. Huge pages make a significant difference - but even there, they're talking about 2MB pages, not 1GB. The OS is the most likely user of 1GB pages through direct mapping of the whole physical address space. – GreenReaper – 2016-09-03T13:24:48.160

@Peter Your comments are valid for userspace. One place where 1GB pages are especially useful is in the Linux kernel itself, where all of physical memory is mapped into a range of the kernel VA space. This means only a handful of pages are required to maintain this mapping, and fewer TLB entries. This is important when kernel heap allocations come directly from this direct-map area. – Jonathon Reinhart – 2018-04-11T21:14:48.890

@JonathonReinhart: Yeah, kernel direct-mapping of all physical RAM is the killer app for 1G hugepages, because it's OK that these pages contain data from multiple different processes. The mapping doesn't also have to serve as access control / permissions / allocation+reservation. 1G hugepages would be good for big user-space processes that need more than 1G allocated, like maybe a big JVM, if they aren't very concerned with handing back unused chunks of it to the OS. – Peter Cordes – 2018-04-11T21:23:15.840
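
On a running x86 Linux system you can see how the kernel's direct map is actually backed by reading the DirectMap lines in /proc/meminfo; here is a minimal sketch (DirectMap1G only appears when the kernel is using 1GB pages, i.e. the CPU has pdpe1gb and they weren't disabled with nogbpages):

    /* Print the DirectMap* lines from /proc/meminfo, which show how much of
     * the kernel's direct map is backed by 4k, 2M, and 1G pages (x86 only). */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        FILE *f = fopen("/proc/meminfo", "r");
        char line[256];

        if (!f) {
            perror("/proc/meminfo");
            return 1;
        }
        while (fgets(line, sizeof line, f)) {
            if (strncmp(line, "DirectMap", 9) == 0)
                fputs(line, stdout);   /* e.g. "DirectMap1G:  33554432 kB" */
        }
        fclose(f);
        return 0;
    }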

Yes, I definitely looked at all of those Wikipedia pages! However, this answer is not correct. Sandy Bridge is newer than Westmere, and I now have two Sandy Bridge CPUs that do not support it. – Jonathon Reinhart – 2014-02-27T14:39:37.393

I don't think it's true for desktop and mobile CPUs, just for the servers where this feature makes sense. – Viacheslav Rodionov – 2014-02-27T15:05:31.793

For mobile it does not. But why not desktop? As I understand it, there are two benefits to this feature: 1. Reduced page-table size: you don't need as many entries to map all of that memory. 2. Reduced TLB misses: with one entry covering 1GB of virt->phys translations, fewer TLB entries are needed, improving the hit rate. For any desktop with lots of RAM this feature would be beneficial; see the rough numbers below. – Jonathon Reinhart – 2014-02-27T22:52:42.890
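
To put illustrative numbers on point 1: mapping 64 GiB with 4 KiB pages takes 64 GiB / 4 KiB = 16,777,216 last-level PTEs (about 128 MiB of page tables at 8 bytes per entry), whereas the same 64 GiB is covered by just 64 entries when using 1 GiB pages.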

Well, it's not a scientific answer, but I think it's the same reason why Linux is not the number-one OS and Windows is still alive: geeks are a minority, and from what I saw you can get an advantage of nearly 5% from 1GB pages in specific applications which need a lot of RAM. It's not what pretty blond girls usually do with their computers ;) – Viacheslav Rodionov – 2014-02-28T08:50:35.367