
There's a fairly complex application that runs on two VMs (on Xen). Both VMs run CentOS 6.2 with exactly the same packages and the same configuration for every application (apart from networking, which differs). SELinux is disabled on both.

On machine A the application builds perfectly. On machine B, running some of the tests produces:

ruby[2010] trap invalid opcode ip:7ff9d2944c30 sp:7fff9797e0f8 error:0 in ld-2.12.so[7ff9d2930000+20000]

Digging a bit more to find out where the machines differ, machine A has:

model name : Six-Core AMD Opteron(tm) Processor 2423 HE

and machine B:

model name : AMD Opteron(TM) Processor 6272

I've tried booting machine B with cpuid_mask_cpu=fam_10_rev_c in grub, but it did not help.

Any advice on how to deal with this, or on how to approach the hosting provider about running this VM on another physical machine, will be greatly appreciated.

adamo

1 Answer


Apparently there is an issue with libc on Xen machines whose CPU supports AVX that can cause this error, which fits the trap being raised inside ld-2.12.so. Please see this trouble ticket from Chef, another Ruby application, and a related issue for volk. Finally, this ArchLinux thread helped me understand the issue further.

Lo and behold, the AMD Opteron(TM) Processor 6272 supports AVX while the 2423 HE does not.
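To confirm the difference from inside each guest, you can look at the flags line in /proc/cpuinfo. Here is a minimal sketch in Python; the helper names `cpu_flags` and `has_avx` are mine, and it assumes the usual Linux x86 `flags : ...` layout. It only tells you what the guest is being told about the CPU, not whether AVX is actually usable there.

```python
import os


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx(cpuinfo_text):
    """True if the exact flag 'avx' is listed (avx2 alone does not count)."""
    return "avx" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only; skipped elsewhere
    if os.path.exists(path):
        with open(path) as f:
            print("AVX advertised to this guest:", has_avx(f.read()))
```

If machine B reports the flag and machine A does not, that is consistent with the explanation above.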

So... you can either ask to be moved to a host with a different processor, or you can recompile libc with --disable-multi-arch, which will make it ignore AVX. I would think that you could also tell Xen to hide AVX support from the guest, but I can't find how. Maybe someone smarter than me can tell you.

Nada