
We have several servers in a cluster, and we want to know, in general, in which cases we need to configure huge pages.

I also have a few questions:

  1. Is the "memory page size" the same thing as the huge page size?

On my Linux server I ran the following commands to check the default memory page size and the huge page size:

grep Hugepagesize /proc/meminfo

Hugepagesize: 2048 kB

getconf PAGESIZE

4096
  2. As you can see, the two commands give different results; why?

  3. What are the risks of using huge pages?

  4. Does disabling Transparent Huge Pages (THP) mean disabling the huge pages option?

shalom

2 Answers


Hugepages are interesting when applications need large mappings that they will do random accesses to, because that is the worst possible case for the Translation Lookaside Buffer (TLB). You trade off mapping granularity for TLB entries.

Pages, including hugepages, can only be mapped to a physical memory block of the same size, aligned to that size. So a 2 MB hugepage needs to be mapped to a 2 MB boundary in physical RAM, and a 1 GB hugepage needs to be mapped to a 1 GB boundary, because the lower bits address data inside the page, and no offset can be added here.

This means that hugepages are usually reserved at system start, when physical memory isn't fragmented yet. Applications that are hugepage aware can use hugetlbfs to allocate them.
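As a minimal sketch (the page count and mount point are just examples, and the default hugepage size is assumed to be 2 MB), a pool can also be reserved at runtime, although on a long-running system the allocation may fall short once physical memory is fragmented:

# reserve 1024 hugepages of the default size (2 MB each, i.e. 2 GB total)
sysctl -w vm.nr_hugepages=1024

# verify how many pages were actually allocated
grep HugePages_Total /proc/meminfo

# mount hugetlbfs so hugepage-aware applications can map memory from it
mkdir -p /dev/hugepages
mount -t hugetlbfs none /dev/hugepages

To make the reservation persistent, the same value would typically go into /etc/sysctl.conf or onto the kernel command line.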

You have to decide with a kernel parameter whether hugepages should be 2 MB or 1 GB in size; you cannot mix these. Normal 4 kB pages are always available.
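As a hedged example of selecting 1 GB pages on the kernel command line (the GRUB file location and the page count depend on the distribution, and 1 GB pages need CPU support, the pdpe1gb flag on x86_64):

# check whether the CPU supports 1 GB pages
grep -o pdpe1gb /proc/cpuinfo | sort -u

# in /etc/default/grub, append to GRUB_CMDLINE_LINUX something like:
#   default_hugepagesz=1G hugepagesz=1G hugepages=8
# then regenerate the GRUB config and reboot:
update-grub                                    # Debian/Ubuntu
grub2-mkconfig -o /boot/grub2/grub.cfg         # RHEL-like systems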

The most common use case is virtual machines (qemu/kvm can use hugepages), where this makes it possible to keep the VM's entire memory mapping in a small number of TLB entries that are never evicted, so memory accesses inside the VM only require a page table lookup in the guest context.
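As a rough illustration (the memory size is a placeholder, and production setups usually go through libvirt's <memoryBacking><hugepages/></memoryBacking> element rather than a raw command line), qemu can be pointed at a hugetlbfs mount like this:

# back the guest's 4 GB of RAM with hugepages from /dev/hugepages
qemu-system-x86_64 -enable-kvm -m 4096 \
    -mem-prealloc -mem-path /dev/hugepages \
    ...

The reserved pool has to be large enough to hold the whole guest, otherwise qemu fails to start.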

Some database systems also support hugepages, but this is generally only useful if you work with large datasets and indexes.
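PostgreSQL is a common example; a short sketch of its setting (huge_pages exists since PostgreSQL 9.4, and sizing the pool to fit shared_buffers remains the administrator's job):

# in postgresql.conf:
#   huge_pages = try     # use hugepages if available, silently fall back otherwise
#   huge_pages = on      # refuse to start if the hugepage pool is too small

# after a restart, reserved pages show up in /proc/meminfo:
grep HugePages_Rsvd /proc/meminfo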

The questions:

  1. There are normal (4 kB) pages and huge (2 MB or 1 GB) pages. When you query the page size, you get the size for normal pages; when you query the huge page size, you get the setting for the huge pages. Both normal and huge pages can be used in parallel, but you cannot mix different huge page sizes.

  2. You get different results because these are two different things. The size of normal pages is fixed in hardware, so it is not a setting.

  3. Huge pages need to be allocated early, and while that memory is technically "free", it cannot be used for anything but hugepage aware applications, so every other application has less memory available. That's usually not a problem, because you'd use hugepages on machines that are dedicated to memory-hungry applications like VMs or databases.

  4. Transparent hugepages try to make the memory available as buffers and caches (contrary to #3), and try to give hugepages to applications that map large memory blocks, so applications that are not hugepage aware can still profit from them -- basically, an application that requests a 2 MB/1 GB block of memory will be given a hugepage if possible. Whether this helps or hurts performance depends on your applications. If you have a hugepage aware app and you want to assign the memory manually, you need to disable THP (see the commands after this list), while a system running a database app that doesn't understand hugepages would likely benefit from leaving THP on.
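For completeness, a small sketch of inspecting and disabling THP at runtime (the change does not survive a reboot; making it permanent is distro-specific, e.g. via the transparent_hugepage=never kernel parameter):

# show the current THP policy; the bracketed value is the active one
cat /sys/kernel/mm/transparent_hugepage/enabled

# disable THP until the next reboot
echo never > /sys/kernel/mm/transparent_hugepage/enabled

Explicitly allocated hugepages (vm.nr_hugepages, hugetlbfs) keep working regardless of this setting.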

Simon Richter

An obvious use case for huge pages is when PageTables (visible in /proc/meminfo) grows to tens of GB. That means large memory and CPU overhead just for tracking your memory. It happens with giant chunks of memory, a large number of processes, or both, often in database applications.
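A quick check on a running host (the "tens of GB" figure is a rule of thumb, not a hard threshold):

grep PageTables /proc/meminfo
# if this reports tens of GB, hugepages are worth considering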

Huge pages significantly cut down on that overhead because a single page table entry now addresses much more memory, say 2,048 kB instead of 4 kB. (Other platforms have different sizes; AIX on POWER supports 16 MB large pages, for example.)

Huge pages on Linux cannot be used for file caching, and they are awkward and inefficient for handing a couple of MB to malloc() for non-shared memory. So the administrator has to allocate huge page pools that can only be used for certain purposes. This is a drawback of using huge pages.
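One way to see how much of such a pool is actually used (field names are as they appear in /proc/meminfo; a high Free count with nothing Reserved means the pool is sitting idle):

grep HugePages_ /proc/meminfo
# HugePages_Total:  pages in the pool
# HugePages_Free:   pages nobody has mapped yet
# HugePages_Rsvd:   pages promised to applications but not yet faulted in
# HugePages_Surp:   surplus pages beyond the configured pool size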

Transparent huge pages (THP) try to make administration less annoying by automatically "defragmenting" contiguous memory into huge pages. The idea was that this makes pre-allocated huge pages optional. The benefits are highly workload-specific; it is possible THP spends too much CPU time to be worth it. Disabling THP means you can still allocate huge pages manually. Sometimes it is worthwhile to turn off THP and just put the database's shared memory segments in huge pages.

One last gripe about Linux huge pages: I find managing it annoying.

  • Shared memory uses one interface, but for the others you use a hugetlbfs library and file system.
  • You can "waste" memory by allocating too many huge pages and not configuring your application to use them.
  • The number of pages has to be scaled to each host's size, because it is a count of pages and not a percentage of memory.
  • Often the ability to allocate huge pages is restricted to members of one group (see the sketch after this list), so switching database users may lead to surprise memory waste.
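As a hedged example of that group restriction (the gid and mount point are placeholders; vm.hugetlb_shm_group only covers SysV shared memory allocated with SHM_HUGETLB, while the mount options cover the hugetlbfs path):

# allow only members of gid 5001 (e.g. the database group) to get hugepage shared memory
sysctl -w vm.hugetlb_shm_group=5001

# give the same group a dedicated hugetlbfs mount
mount -t hugetlbfs -o gid=5001,mode=1770 none /hugepages

A database started by a user outside that group can then fail to get hugepages and, depending on its configuration, silently fall back to normal pages, which is one way the surprise memory waste happens.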
John Mahowald