Out of MTRRs at boot

6

1

My laptop (an Acer Aspire One AOA150) seems to run out of MTRRs at boot. I've searched for the problem and found that people recommend turning on MTRR sanitizing to fix it, but the error still occurs with it enabled. I'm running Arch Linux (but that shouldn't matter). You can see that I have enabled MTRR sanitization here:

[chris@helios ~]$ zgrep 'SANITIZER' /proc/config.gz
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1

Here is my relevant system info:

uname -a:

Linux helios 2.6.38-aao-light #1 SMP PREEMPT Fri Apr 1 03:02:37 BST 2011 i686 Intel(R) Atom(TM) CPU N270 @ 1.60GHz GenuineIntel GNU/Linux

dmesg potential warnings:

[    0.000000] Notice: NX (Execute Disable) protection missing in CPU!
[    0.000000]  RCU-based detection of stalled CPUs is disabled.
[    0.000000]  Verbose stalled-CPUs detection is disabled.
[    0.157222] ACPI Error: [CAPB] Namespace lookup failure, AE_ALREADY_EXISTS (20110112/dsfield-143)
[    0.157393] ACPI Error: Method parse/execution failed [\_SB_.PCI0._OSC] (Node f682fca8), AE_ALREADY_EXISTS (20110112/psparse-536)
[    0.157578] ACPI: Marking method _OSC as Serialized because of AE_ALREADY_EXISTS error
[    0.160212] pci 0000:00:1b.0: PME# disabled
[    0.160354] pci 0000:00:1c.0: PME# disabled
[    0.160498] pci 0000:00:1c.1: PME# disabled
[    0.160642] pci 0000:00:1c.2: PME# disabled
[    0.160787] pci 0000:00:1c.3: PME# disabled
[    0.161441] pci 0000:00:1d.7: PME# disabled
[    0.162043] pci 0000:00:1f.2: PME# disabled
[    0.162677] pci 0000:02:00.0: PME# disabled
[    0.166197] pci 0000:00:1e.0:   bridge window [io  0xf000-0x0000] (disabled)
[    0.166207] pci 0000:00:1e.0:   bridge window [mem 0xfff00000-0x000fffff] (disabled)
[    0.166219] pci 0000:00:1e.0:   bridge window [mem 0xfff00000-0x000fffff pref] (disabled)
[    0.167559] ACPI Error: [CAPB] Namespace lookup failure, AE_ALREADY_EXISTS (20110112/dsfield-143)
[    0.167729] ACPI Error: Method parse/execution failed [\_SB_.PCI0._OSC] (Node f682fca8), AE_ALREADY_EXISTS (20110112/psparse-536)
[    0.168113] ACPI Error: [CAPB] Namespace lookup failure, AE_ALREADY_EXISTS (20110112/dsfield-143)
[    0.168279] ACPI Error: Method parse/execution failed [\_SB_.PCI0._OSC] (Node f682fca8), AE_ALREADY_EXISTS (20110112/psparse-536)
[    0.178246] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 9 10 11 12) *0, disabled.
[    0.178612] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 10 11 12) *0, disabled.
[    0.179008] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 9 10 11 12) *0, disabled.
[    0.179375] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 11 12) *0, disabled.
[    0.190814] pnp 00:04: [irq 0 disabled]
[    0.235056] pci 0000:00:1e.0:   bridge window [io  disabled]
[    0.235135] pci 0000:00:1e.0:   bridge window [mem disabled]
[    0.235212] pci 0000:00:1e.0:   bridge window [mem pref disabled]
[    0.254489] Marking TSC unstable due to TSC halts in idle states deeper than C2
[    0.332589] [drm] MTRR allocation failed.  Graphics performance may suffer.

zgrep -i MTRR /proc/config.gz

CONFIG_MTRR=y
CONFIG_MTRR_SANITIZER=y
CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=1
CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=2

/proc/iomem:

00000000-0000ffff : reserved
00010000-0009fbff : System RAM
0009fc00-0009ffff : reserved
000a0000-000bffff : PCI Bus 0000:00
  000a0000-000bffff : Video RAM area
000c0000-000c7fff : Video ROM
000cf000-000cffff : Adapter ROM
000e0000-000fffff : reserved
  000f0000-000fffff : System ROM
00100000-3f375fff : System RAM
  01000000-0148cffe : Kernel code
  0148cfff-015e417f : Kernel data
  01646000-016b2fff : Kernel bss
3f376000-3f3befff : reserved
3f3bf000-3f46cfff : System RAM
3f46d000-3f4befff : ACPI Non-volatile Storage
3f4bf000-3f4effff : System RAM
3f4f0000-3f4fefff : ACPI Tables
3f4ff000-3f4fffff : System RAM
3f500000-3fffffff : reserved
40000000-febfffff : PCI Bus 0000:00
  40000000-4fffffff : 0000:00:02.0
  50000000-50ffffff : PCI Bus 0000:01
  51000000-520fffff : PCI Bus 0000:02
    51000000-5100ffff : 0000:02:00.0
      51000000-5100ffff : r8169
    51010000-51010fff : 0000:02:00.0
      51010000-51010fff : r8169
    51020000-5103ffff : 0000:02:00.0
  52100000-530fffff : PCI Bus 0000:03
  53100000-540fffff : PCI Bus 0000:04
  54100000-551fffff : PCI Bus 0000:04
  55200000-562fffff : PCI Bus 0000:03
    55200000-5520ffff : 0000:03:00.0
      55200000-5520ffff : ath5k
  56300000-572fffff : PCI Bus 0000:02
  57300000-583fffff : PCI Bus 0000:01
  58400000-5847ffff : 0000:00:02.1
  58480000-584fffff : 0000:00:02.0
  58500000-5853ffff : 0000:00:02.0
  58540000-58543fff : 0000:00:1b.0
    58540000-58543fff : ICH HD audio
  58544400-585447ff : 0000:00:1d.7
    58544400-585447ff : ehci_hcd
  58545000-58545fff : Intel Flush Page
  e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
    e0000000-efffffff : reserved
      e0000000-efffffff : pnp 00:01
fec00000-fec00fff : reserved
  fec00000-fec003ff : IOAPIC 0
fed00000-fed003ff : HPET 0
fed14000-fed19fff : reserved
  fed14000-fed17fff : pnp 00:01
  fed18000-fed18fff : pnp 00:01
  fed19000-fed19fff : pnp 00:01
fed1c000-fed1ffff : reserved
  fed1c000-fed1ffff : pnp 00:01
fee00000-fee00fff : Local APIC
  fee00000-fee00fff : reserved
    fee00000-fee00fff : pnp 00:01
fff00000-ffffffff : reserved

/proc/cpuinfo:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model       : 28
model name  : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
stepping    : 2
cpu MHz     : 1600.000
cache size  : 512 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 1
apicid      : 0
initial apicid  : 0
fdiv_bug    : no
hlt_bug     : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 xtpr pdcm movbe lahf_lm dts
bogomips    : 3192.06
clflush size    : 64
cache_alignment : 64
address sizes   : 32 bits physical, 32 bits virtual
power management:

processor   : 1
vendor_id   : GenuineIntel
cpu family  : 6
model       : 28
model name  : Intel(R) Atom(TM) CPU N270   @ 1.60GHz
stepping    : 2
cpu MHz     : 1600.000
cache size  : 512 KB
physical id : 0
siblings    : 2
core id     : 0
cpu cores   : 1
apicid      : 1
initial apicid  : 1
fdiv_bug    : no
hlt_bug     : no
f00f_bug    : no
coma_bug    : no
fpu     : yes
fpu_exception   : yes
cpuid level : 10
wp      : yes
flags       : fpu vme de tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 xtpr pdcm movbe lahf_lm dts
bogomips    : 3191.83
clflush size    : 64
cache_alignment : 64
address sizes   : 32 bits physical, 32 bits virtual
power management:

/proc/mtrr:

reg00: base=0x0fffe0000 ( 4095MB), size=  128KB, count=1: write-protect
reg01: base=0x0fffc0000 ( 4095MB), size=  128KB, count=1: uncachable
reg02: base=0x000000000 (    0MB), size=  512MB, count=1: write-back
reg03: base=0x020000000 (  512MB), size=  512MB, count=1: write-back
reg04: base=0x03f800000 ( 1016MB), size=    8MB, count=1: uncachable
reg05: base=0x03f600000 ( 1014MB), size=    2MB, count=1: uncachable
reg06: base=0x03f500000 ( 1013MB), size=    1MB, count=1: uncachable
reg07: base=0x000000000 (    0MB), size=  128KB, count=1: uncachable
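
To make the listing above easier to reason about, here is a minimal Python sketch that parses it and totals the bytes per memory type. The names `parse_mtrr` and `bytes_by_type` are made up for illustration, and the table is hard-coded from the question rather than read from /proc/mtrr. The totals are raw sums that ignore overlapping ranges.

```python
import re
from collections import Counter

# /proc/mtrr as posted in the question (hard-coded so the sketch runs anywhere).
SAMPLE = """\
reg00: base=0x0fffe0000 ( 4095MB), size=  128KB, count=1: write-protect
reg01: base=0x0fffc0000 ( 4095MB), size=  128KB, count=1: uncachable
reg02: base=0x000000000 (    0MB), size=  512MB, count=1: write-back
reg03: base=0x020000000 (  512MB), size=  512MB, count=1: write-back
reg04: base=0x03f800000 ( 1016MB), size=    8MB, count=1: uncachable
reg05: base=0x03f600000 ( 1014MB), size=    2MB, count=1: uncachable
reg06: base=0x03f500000 ( 1013MB), size=    1MB, count=1: uncachable
reg07: base=0x000000000 (    0MB), size=  128KB, count=1: uncachable
"""

# One line of /proc/mtrr: register number, hex base, size with KB/MB unit, type.
LINE_RE = re.compile(
    r"reg(\d+): base=0x([0-9a-fA-F]+) \(\s*\d+MB\), "
    r"size=\s*(\d+)([KM])B, count=\d+: ([\w-]+)"
)
UNIT = {"K": 1 << 10, "M": 1 << 20}


def parse_mtrr(text):
    """Return a list of (register, base, size_in_bytes, type) tuples."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            reg, base, size, unit, mem_type = m.groups()
            entries.append((int(reg), int(base, 16), int(size) * UNIT[unit], mem_type))
    return entries


def bytes_by_type(entries):
    """Sum the sizes per memory type (overlaps are not subtracted)."""
    totals = Counter()
    for _reg, _base, size, mem_type in entries:
        totals[mem_type] += size
    return totals


if __name__ == "__main__":
    entries = parse_mtrr(SAMPLE)
    print(f"{len(entries)} registers in use")
    for mem_type, total in bytes_by_type(entries).items():
        print(f"{mem_type:>14}: {total // 1024} KB")
```

On this table all eight registers are taken, and two of them (reg02 and reg03) are adjacent 512MB write-back ranges that together describe the first 1GB.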

lshw:

helios
    description: Computer
    product: AOA150 (Napa_Fab5)
    vendor: Acer
    version: 1
    serial: LUS050A0748370F93B2535
    width: 32 bits
    capabilities: smbios-2.4 dmi-2.4
    configuration: boot=normal family=Intel_Mobile sku=Napa_Fab5 uuid=E081A8C2-6CE0-D411-BB0B-001E68E42BCA
  *-core
       description: Motherboard
       vendor: Acer
       physical id: 0
       version: Base Board Version
       serial: Base Board Serial Number
       slot: Base Board Chassis Location
     *-firmware
          description: BIOS
          vendor: Acer
          physical id: 0
          version: v0.3301
          date: 05/09/2008
          size: 1MiB
          capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppynec int13floppytoshiba int13floppy360 int13floppy1200 int13floppy720 int13floppy2880 int9keyboard int10video acpi usb
     *-memory
          description: System Memory
          physical id: 14
          slot: System board or motherboard
          size: 1GiB
        *-bank:0
             description: DIMM DDR2 Synchronous 533 MHz (1.9 ns)
             product: HYMP164S64CP6-Y5
             vendor: Hynix Semiconductor (Hyundai Electronics)
             physical id: 0
             serial: 0x00000000
             slot: J2
             size: 512MiB
             width: 64 bits
             clock: 533MHz (1.9ns)
        *-bank:1
             description: DIMM DDR2 Synchronous 533 MHz (1.9 ns)
             product: HYMP164S64CP6-Y5
             vendor: Hynix Semiconductor (Hyundai Electronics)
             physical id: 1
             serial: 0x00002337
             slot: J6H2
             size: 512MiB
             width: 64 bits
             clock: 533MHz (1.9ns)
     *-cpu
          description: CPU
          product: Intel(R) Atom(TM) CPU N270   @ 1.60GHz
          vendor: Intel Corp.
          physical id: 1c
          bus info: cpu@0
          version: 6.12.2
          serial: 0001-06C2-0000-0000-0000-0000
          slot: CPU
          size: 800MHz
          capacity: 1600MHz
          width: 32 bits
          clock: 533MHz
          capabilities: fpu fpu_exception wp vme de tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe constant_tsc arch_perfmon pebs bts aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 xtpr pdcm movbe lahf_lm cpufreq
          configuration: id=0
        *-cache:0
             description: L2 cache
             physical id: 1d
             slot: Unknown
             size: 512KiB
             capacity: 512KiB
             capabilities: synchronous internal write-back unified
        *-cache:1
             description: L1 cache
             physical id: 1e
             slot: Unknown
             size: 32KiB
             capacity: 32KiB
             capabilities: synchronous internal write-back instruction
        *-logicalcpu:0
             description: Logical CPU
             physical id: 0.1
             width: 32 bits
             capabilities: logical
        *-logicalcpu:1
             description: Logical CPU
             physical id: 0.2
             width: 32 bits
             capabilities: logical
     *-pci
          description: Host bridge
          product: Mobile 945GME Express Memory Controller Hub
          vendor: Intel Corporation
          physical id: 100
          bus info: pci@0000:00:00.0
          version: 03
          width: 32 bits
          clock: 33MHz
          configuration: driver=agpgart-intel
          resources: irq:0
        *-display:0
             description: VGA compatible controller
             product: Mobile 945GME Express Integrated Graphics Controller
             vendor: Intel Corporation
             physical id: 2
             bus info: pci@0000:00:02.0
             version: 03
             width: 32 bits
             clock: 33MHz
             capabilities: msi pm vga_controller bus_master cap_list rom
             configuration: driver=i915 latency=0
             resources: irq:16 memory:58480000-584fffff ioport:60c0(size=8) memory:40000000-4fffffff memory:58500000-5853ffff
        *-display:1 UNCLAIMED
             description: Display controller
             product: Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller
             vendor: Intel Corporation
             physical id: 2.1
             bus info: pci@0000:00:02.1
             version: 03
             width: 32 bits
             clock: 33MHz
             capabilities: pm bus_master cap_list
             configuration: latency=0
             resources: memory:58400000-5847ffff
        *-multimedia
             description: Audio device
             product: N10/ICH 7 Family High Definition Audio Controller
             vendor: Intel Corporation
             physical id: 1b
             bus info: pci@0000:00:1b.0
             version: 02
             width: 64 bits
             clock: 33MHz
             capabilities: pm msi pciexpress bus_master cap_list
             configuration: driver=HDA Intel latency=0
             resources: irq:45 memory:58540000-58543fff
        *-pci:0
             description: PCI bridge
             product: N10/ICH 7 Family PCI Express Port 1
             vendor: Intel Corporation
             physical id: 1c
             bus info: pci@0000:00:1c.0
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: pci pciexpress msi pm normal_decode bus_master cap_list
             configuration: driver=pcieport
             resources: irq:40 ioport:5000(size=4096) memory:57300000-583fffff ioport:50000000(size=16777216)
        *-pci:1
             description: PCI bridge
             product: N10/ICH 7 Family PCI Express Port 2
             vendor: Intel Corporation
             physical id: 1c.1
             bus info: pci@0000:00:1c.1
             version: 02
             width: 32 bits
             clock: 33MHz
             capabilities: pci pciexpress msi pm normal_decode bus_master cap_list
             configuration: driver=pcieport
             resources: irq:41 ioport:3000(size=8192) memory:56300000-572fffff ioport:51000000(size=17825792)
           *-network DISABLED
                description: Ethernet interface
                product: RTL8101E/RTL8102E PCI Express Fast Ethernet controller
                vendor: Realtek Semiconductor Co., Ltd.
                physical id: 0
                bus info: pci@0000:02:00.0
                logical name: eth0
                version: 02
                serial: 00:1e:68:e4:2b:ca
                size: 10Mbit/s
                capacity: 100Mbit/s
                width: 64 bits
                clock: 33MHz

Matthieu Cartier

Posted 2011-04-01T13:35:17.547

Reputation: 3 422

Very strange - I have the same model with 1.5GB RAM running Fedora 14 absolutely fine for ages. Has it just started to show the problem? – Linker3000 – 2011-04-01T14:30:49.680

The problem has been there for a while; I just got around to posting it here. If you run dmesg | fgrep mtrr, do you see the same (or similar) error messages? If not, could you please run zcat /proc/config.gz and pastebin it along with cat /proc/mtrr somewhere for me? Thanks! :) – Matthieu Cartier – 2011-04-01T15:03:58.017

@Linker3000: Check neurolysis' comment, he forgot to use the mention syntax. – Tamara Wijsman – 2011-04-03T23:00:14.053

@neurolysis: dmesg contains 'no more MTRRS', but no errors or warnings. There's no config.gz. /proc/mtrr at http://pastebin.com/Lh2yBCEc Linux: Linux aa1.localdomain 2.6.35.11-83.fc14.i686.PAE #1 SMP Mon Feb 7 06:57:55 UTC 2011 i686 i686 i386 GNU/Linux – Linker3000 – 2011-04-04T20:21:44.333

@Linker3000 Thanks. Intriguing... – Matthieu Cartier – 2011-04-05T16:01:15.820

Answers

3

Thanks to harrymc, I discovered that you can actually rewrite /proc/mtrr. I put the following in /etc/rc.local, rebooted, and my MTRR table was correct.

echo "disable=7" > /proc/mtrr
echo "disable=6" > /proc/mtrr
echo "disable=5" > /proc/mtrr
echo "disable=4" > /proc/mtrr
echo "disable=3" > /proc/mtrr
echo "disable=2" > /proc/mtrr
echo "disable=1" > /proc/mtrr
echo "disable=0" > /proc/mtrr
echo "base=0x000000000 size=0x40000000 type=write-back" > /proc/mtrr
echo "base=0x03f500000 size=0x00100000 type=uncachable" > /proc/mtrr
echo "base=0x03f600000 size=0x00200000 type=uncachable" > /proc/mtrr
echo "base=0x03f800000 size=0x00800000 type=write-back" > /proc/mtrr
echo "base=0x040000000 size=0x10000000 type=write-combining" > /proc/mtrr
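
As a rough cross-check of the table above, the following Python sketch verifies the constraint I understand variable MTRRs to have on Intel CPUs: each size must be a power of two (at least 4KB) and each base must be a multiple of its size. The helper names `is_valid_range` and `mtrr_command` are hypothetical, and the sketch only prints the commands; it never touches /proc/mtrr.

```python
# Proposed layout, copied from the echo commands above: (base, size, type).
PROPOSED = [
    (0x00000000, 0x40000000, "write-back"),
    (0x3f500000, 0x00100000, "uncachable"),
    (0x3f600000, 0x00200000, "uncachable"),
    (0x3f800000, 0x00800000, "write-back"),
    (0x40000000, 0x10000000, "write-combining"),
]


def is_valid_range(base, size):
    """True if size is a power of two (>= 4KB) and base is aligned to it."""
    power_of_two = size >= 0x1000 and (size & (size - 1)) == 0
    return power_of_two and base % size == 0


def mtrr_command(base, size, mem_type):
    """Format one entry in the same style as the commands above."""
    return f'echo "base=0x{base:09x} size=0x{size:08x} type={mem_type}" > /proc/mtrr'


if __name__ == "__main__":
    for base, size, mem_type in PROPOSED:
        status = "ok" if is_valid_range(base, size) else "INVALID"
        print(f"{status:7} {mtrr_command(base, size, mem_type)}")
    print(f"{len(PROPOSED)} of 8 registers used")
```

Under that constraint the reserved block at 3f500000-3fffffff from /proc/iomem cannot be described by one register: 11MB is not a power of two, and 0x3f500000 is not 2MB-aligned. That is presumably why it is split into 1MB, 2MB and 8MB pieces. The final write-combining entry matches the 40000000-4fffffff range that /proc/iomem assigns to the graphics device 0000:00:02.0.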

Also, after talking to a few people involved in kernel development, I was told that CONFIG_MTRR_SANITIZER has been broken for the past few kernel releases, which would explain why it worked for others in the past.

Matthieu Cartier

Nice bit of digging by you two - will have to see what I can do on mine running Fedora – Linker3000 – 2011-04-06T16:06:15.607

2

Quoting from the answer to your own question on the Arch Linux forums:

From the dmesg, it is easy to see that it runs out of mtrr during i915/drm graphics initialization. I have no specific experience with this problem, but here's my suggestions:

  1. Boot with 'mtrr_spare_reg_nr=2'; you may also need 'enable_mtrr_cleanup=1' (add these to the kernel line in /boot/grub/menu.lst).
  2. Try kernel 2.6.38 from [testing].

From the look of your /proc/mtrr, your Atom CPU has only 8 MTRRs, and they are truly all used up. However, the way memory is broken up into such small fragments is puzzling. In general, such a problem may be caused by:

  • The BIOS - look for parameters that cause memory allocation for devices.
  • The graphics card which might have shared memory with the CPU and which the BIOS might cause to be allocated brutally in the middle of the memory.
  • The graphics card driver - search for the latest version.
  • A misconfigured kernel.
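
To see what the fragments in the posted /proc/mtrr actually do, here is a small Python sketch that resolves the memory type for an address when several variable MTRRs overlap. It assumes the precedence rules as I understand them from Intel's documentation (uncachable wins over everything; write-through wins over write-back; other mixed overlaps are undefined) and ignores the fixed-range MTRRs. The function name `effective_type` and the "default" and "undefined" markers are made up for illustration.

```python
# BIOS-provided table from the question: (base, size, type).
BIOS_TABLE = [
    (0xfffe0000, 128 << 10, "write-protect"),
    (0xfffc0000, 128 << 10, "uncachable"),
    (0x00000000, 512 << 20, "write-back"),
    (0x20000000, 512 << 20, "write-back"),
    (0x3f800000, 8 << 20, "uncachable"),
    (0x3f600000, 2 << 20, "uncachable"),
    (0x3f500000, 1 << 20, "uncachable"),
    (0x00000000, 128 << 10, "uncachable"),
]


def effective_type(addr, table=BIOS_TABLE):
    """Resolve the type for addr from all variable ranges that contain it."""
    types = {t for base, size, t in table if base <= addr < base + size}
    if not types:
        return "default"        # no register matches; the CPU's default type applies
    if "uncachable" in types:
        return "uncachable"     # UC overrides any overlapping type
    if types == {"write-back", "write-through"}:
        return "write-through"  # WT overrides WB
    if len(types) == 1:
        return next(iter(types))
    return "undefined"          # other mixed overlaps are not defined


if __name__ == "__main__":
    for addr in (0x00000000, 0x10000000, 0x3f4fffff, 0x3f500000, 0x40000000):
        print(f"0x{addr:08x}: {effective_type(addr)}")
```

Read this way, the three small uncachable entries carve 3f500000-3fffffff out of the write-back range, which lines up with the reserved block at the same addresses in /proc/iomem.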

The greatest puzzle I can see is that /proc/mtrr says you have 8GB. But in /proc/cpuinfo the 'flags' entry doesn't contain 'lm', which the Arch64 FAQ says is required for the processor to be x86_64 compatible. The FAQ further says:

Note that Arch32 does not support more than 3GB of RAM by default: you have to turn to Arch64 if you have more.

So it seems that you have Arch32 and 8GB of RAM, which contradicts the documentation.
Could you maybe shed some light on this puzzle?

harrymc

Reputation: 306 093

No luck with either (I already had enable_mtrr_cleanup=1, which just changes CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT, but booting with mtrr_spare_reg_nr=2 doesn't fix it either). I get a different, but similar, error: [ 0.329894] mtrr: no more MTRRs available. – Matthieu Cartier – 2011-04-04T18:50:44.030

Actually, that exists in the first dmesg too... so increasing the spare number seems to remove the drm warning, but not the "no more MTRRs" warning. – Matthieu Cartier – 2011-04-04T19:25:52.313

Added some more detailed analysis, as your computer presents some real puzzles. – harrymc – 2011-04-04T19:54:13.043

To answer all points you made: this computer only has 1GB of RAM. The BIOS only has very basic options (time, etc) -- nothing to allow modifying memory allocation. The graphics card driver is intel and is the latest (stable) version. There is no option in the BIOS to modify the amount of memory being reserved for graphics. This happens when using the following kernels: make defconfig, the standard Ubuntu kernel (stable, 32-bit), and 2.6.37-ARCH (stable, 32-bit), so I don't think it's a misconfiguration. – Matthieu Cartier – 2011-04-04T21:52:01.680

Interestingly, this blog entry documents the same laptop that I have, and despite taking the same steps I have, he ends up with a completely different (and, by the look of it, correct) MTRR table afterwards. Is there some way to manually specify these values? http://blog.jolexa.net/2009/11/22/buggy-mtrr-on-acer-aspire-one-zg5/ – Matthieu Cartier – 2011-04-04T21:54:06.267

What is your take on zgrep -i MTRR /proc/config.gz ? – harrymc – 2011-04-05T06:32:36.690

zgrep 'SANITIZER' /proc/config.gz was already in OP, but I put this one in too. – Matthieu Cartier – 2011-04-05T16:00:02.840

This thread says to use mtrr_spare_reg_nr=3, so you might try this bit of black magic. – harrymc – 2011-04-05T19:07:35.403

Doing so results in the drm warning, the no more MTRRs warning, and a new, seemingly more severe message, [drm] failed to find VBIOS tables. cat /proc/mtrr still yields the same output as in the OP. – Matthieu Cartier – 2011-04-05T21:59:40.053

I was asked on the Arch forums to "boot with nopat on the kernel commandline" to see if I was using PAT, but I see absolutely no difference in performance with nopat being passed to the kernel. – Matthieu Cartier – 2011-04-06T04:14:34.650

I think your problem is with the BIOS, which stupidly allocates device memory beyond the 1GB of RAM, one MTRR per device. (This is why I thought you had more RAM than you really did.) You can either try to upgrade the BIOS (if an update exists), or maybe get more RAM (not sure that will help), or even chuck out some unrequired device (if possible). Some people have managed to rewrite /proc/mtrr, but I don't know enough about it to help. – harrymc – 2011-04-06T07:51:40.003

Thank you for that suggestion -- I looked into it some more, and also have managed to rewrite the MTRR table. I will submit the way I fixed it as an answer, but will give you the bounty since it was ultimately you that led me to the solution. Thanks a lot! – Matthieu Cartier – 2011-04-06T13:19:18.067