
I am getting the following error from time to time. The process is different each time. What do you think about the error? What do you suggest?

Kernel is

Linux version 3.5.3 (developer@devel) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-50)) #1 SMP

Aug 18 04:24:06 2013 kernel: [6349586.289190] Modules linked in: xt_iprange xt_pkttype xt_length xt_state xt_addrtype xt_set xt_LOG xt_tcpudp xt_connlimit xt_hashlimit xt_NFQUEUE xt_connmark xt_mark xt_multiport iptable_raw iptable_mangle iptable_nat nf_nat_ftp nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack_ftp nf_conntrack xt_recent iptable_filter ip_tables nfnetlink_queue rmd160 sha1_generic crypto_null camellia_generic lzo cast6 cast5 deflate zlib_deflate cts ctr gcm ccm serpent_generic blowfish_generic blowfish_common twofish_generic twofish_i586 twofish_common xcbc sha256_generic sha512_generic des_generic geode_aes aesni_intel cryptd aes_i586 xfrm_user ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel tunnel4 xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm_ipcomp xfrm6_tunnel tunnel6 af_key xfrm_algo tun ipt_ULOG ip_set_hash_net ip_set nfnetlink x_tables cls_route cls_u32 cls_fw sch_sfq sch_htb bonding binfmt_misc raid1 video lp nvram evbug ixgbe mdio e1000e pcspkr serio_raw i7core_edac parport_pc edac_core parport lpc_ich mfd_core ioatdma tpm_tis tpm tpm_bios i2c_i801 dca microcode usb_storage [last unloaded: nf_conntrack]
Aug 18 04:24:06 2013 kernel: [6349587.948936]
Aug 18 04:24:06 2013 kernel: [6349587.950082] Pid: 25938, comm: rateup Tainted: G        W    3.5.3 #1 Intel Thurley/Greencity
Aug 18 04:24:06 2013 kernel: [6349588.915284] EIP: 0060:[<c04d3bab>] EFLAGS: 00010246 CPU: 2
Aug 18 04:24:06 2013 kernel: [6349588.949140] EIP is at bdi_position_ratio+0x15b/0x1e0
Aug 18 04:24:06 2013 kernel: [6349588.979872] EAX: 00236415 EBX: 00000000 ECX: 25448580 EDX: 00000000
Aug 18 04:24:06 2013 kernel: [6349589.018397] ESI: 00000000 EDI: 00236415 EBP: d2743cfc ESP: d2743cc4
Aug 18 04:24:06 2013 kernel: [6349589.056924]  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Aug 18 04:24:06 2013 kernel: [6349589.090255] CR0: 80050033 CR2: 0967c000 CR3: 1205b000 CR4: 000007f0
Aug 18 04:24:06 2013 kernel: [6349589.128781] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
Aug 18 04:24:06 2013 kernel: [6349589.167305] DR6: ffff0ff0 DR7: 00000400
Aug 18 04:24:06 2013 kernel: [6349589.191295] Process rateup (pid: 25938, ti=d2742000 task=e3b4f110 task.ti=d2742000)
Aug 18 04:24:06 2013 kernel: [6349589.238126] Stack:
Aug 18 04:24:06 2013 kernel: [6349589.251209]  d2743cec 0e38e200 00000000 00000000 00000000 00000021 00000010 00000001
Aug 18 04:24:06 2013 kernel: [6349589.539680]  0c6c2c80 000bcc07 00000000 00000000 00000000 00000000 d2743da4 c04d421e
Aug 18 04:24:06 2013 kernel: [6349589.699231]  00000017 00000000 0000000c d2743d54 c05d9291 00000000 eb7f9614 ec140798
Aug 18 04:24:06 2013 kernel: [6349589.747052] Call Trace:
Aug 18 04:24:06 2013 kernel: [6349589.762739]  [<c04d421e>] balance_dirty_pages_ratelimited_nr+0x15e/0x6f0
Aug 18 04:24:06 2013 kernel: [6349589.803856]  [<c05d9291>] ? journal_stop+0x121/0x290
Aug 18 04:24:06 2013 kernel: [6349589.834593]  [<c05353e9>] ? __mark_inode_dirty+0x29/0x1c0
Aug 18 04:24:06 2013 kernel: [6349589.977634]  [<c04cb3b8>] ? unlock_page+0x18/0x20
Aug 18 04:24:06 2013 kernel: [6349590.006812]  [<c04cb0a5>] generic_file_buffered_write+0x165/0x210
Aug 18 04:24:06 2013 kernel: [6349590.044300]  [<c04cd546>] __generic_file_aio_write+0x236/0x530
Aug 18 04:24:06 2013 kernel: [6349590.080230]  [<c050f424>] ? __mem_cgroup_commit_charge+0x74/0x230
Aug 18 04:24:06 2013 kernel: [6349590.117716]  [<c04d66d6>] ? lru_cache_add_lru+0x16/0x30
Aug 18 04:24:06 2013 kernel: [6349590.150009]  [<c04f2a46>] ? page_add_new_anon_rmap+0x56/0x70
Aug 18 04:24:06 2013 kernel: [6349590.184901]  [<c04cd892>] generic_file_aio_write+0x52/0xb0
Aug 18 04:24:06 2013 kernel: [6349590.218754]  [<c05130cb>] do_sync_write+0xbb/0x100
Aug 18 04:24:06 2013 kernel: [6349590.248453]  [<c0609649>] ? security_file_permission+0x19/0x90
Aug 18 04:24:06 2013 kernel: [6349590.284381]  [<c051326d>] ? rw_verify_area+0x5d/0x110
Aug 18 04:24:06 2013 kernel: [6349590.315636]  [<c05135b6>] vfs_write+0x96/0x160
Aug 18 04:24:06 2013 kernel: [6349590.343259]  [<c0513010>] ? do_sync_readv_writev+0xd0/0xd0
Aug 18 04:24:06 2013 kernel: [6349590.377111]  [<c0513dbd>] sys_write+0x3d/0x70
Aug 18 04:24:06 2013 kernel: [6349590.404215]  [<c08bd48c>] sysenter_do_call+0x12/0x22
89 fa 77 08 89 f8 31 d2 <f7> f6 89 c3 89 c8 f7 f6 89 de 89 c3 8b 45 e4 d1 e8 39 45 10 73
Aug 18 04:24:06 2013 kernel: [6349590.552236] EIP: [<c04d3bab>] bdi_position_ratio+0x15b/0x1e0 SS:ESP 0068:d2743cc4
Aug 18 04:24:06 2013 kernel: [6349590.598936] ---[ end trace 4ea20832b85a6756 ]---
seaquest
  • After this EIP error, the system stays up but does nothing useful. Even the logs are empty from that minute onward. – seaquest Aug 19 '13 at 08:35
  • Heh, "EIP error"? EIP is the x86 program counter, not a type of error. You've got a kernel panic it looks like. – Falcon Momot Aug 19 '13 at 09:18
  • If it's always in filesystem/journal-related code then there is a chance that `fsck` will help. P.S. It's also better to cut and paste starting from the line `------------[ cut here ]------------` – SaveTheRbtz Aug 19 '13 at 19:26
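
To capture a whole report as the last comment suggests, a small script can slice the log from the `cut here` marker to the `end trace` line. This is a minimal sketch, assuming the log is plain text and that the report is delimited by both markers (not every oops prints a `cut here` line); the function name `extract_oops_blocks` is illustrative and not part of any existing tool.

```python
import re

# Markers the kernel prints around a report (assumed to both be present).
START = re.compile(r"-+\[ cut here \]-+")
END = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_oops_blocks(lines):
    """Return a list of reports, each a list of lines from START through END.

    A report that starts but never reaches an END marker is dropped.
    """
    blocks = []
    current = None  # lines of the report being collected, or None
    for line in lines:
        if current is None:
            if START.search(line):
                current = [line]
        else:
            current.append(line)
            if END.search(line):
                blocks.append(current)
                current = None
    return blocks
```

Passing an open log file (for example the syslog file that contains the lines above) to `extract_oops_blocks` yields each report as a list of lines that can be pasted in full.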

2 Answers


I would suspect some kind of hardware fault; bad RAM, bad CPU, something else. Have you tried to run memtest86?

Janne Pikkarainen

This is a kernel panic. It's similar to what a BSOD is in Windows-land.

If you notice it happens all the time, but is completely unpredictable and presents differently each time, you almost certainly have a hardware failure. Use something like memtest86 and test your RAM, as this is the most likely cause. If you've got a warranty or a support contract you should open a call with the vendor. It can be the motherboard or a CPU as well, though, or more rarely any component at all.

If you recently updated the kernel and you don't have a hardware failure, revert. Kernel bugs can cause this too, but that code doesn't often get past testing.

It's remotely possible that you have a corrupt kernel, a corrupt kernel module, or need to reflash your BIOS.

Falcon Momot
  • Actually, on this machine I have kernel.panic=10, which means it should reboot if this is a panic. However, it does not reboot. I think this is something like a panic. – seaquest Aug 19 '13 at 13:08
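
One possible explanation for the missing reboot, offered as an assumption rather than a diagnosis: `kernel.panic` only takes effect on a real panic, and an oops is normally escalated to a panic only when `kernel.panic_on_oops` is non-zero. The sketch below reads both values from `/proc/sys/kernel`; the helper names `read_panic_settings` and `reboots_after_oops`, and the `root` parameter, are illustrative.

```python
from pathlib import Path


def read_panic_settings(root="/proc/sys/kernel"):
    """Read kernel.panic and kernel.panic_on_oops as integers (None if unreadable)."""
    settings = {}
    for name in ("panic", "panic_on_oops"):
        try:
            settings[name] = int((Path(root) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


def reboots_after_oops(settings):
    """True only if an oops becomes a panic AND a panic triggers a reboot."""
    # Both sysctls must be set and non-zero for an automatic reboot.
    return bool(settings.get("panic")) and bool(settings.get("panic_on_oops"))
```

With `kernel.panic=10` and `kernel.panic_on_oops=0`, `reboots_after_oops` returns `False`, which would match a machine that stays up in a broken state after the oops.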