Hello everyone,
Here is my problem: for the second time now, one of our virtualization hosts has crashed hard. The logs show:
Oct 29 07:57:39 kch03 kernel: WARNING: at /build/linux-X2rDfB/linux-3.2.57/kernel/watchdog.c:241 watchdog_overflow_callback+0x93/0x9e()
Oct 29 07:57:39 kch03 kernel: [773113.285217] Hardware name: PowerEdge T610
Oct 29 07:57:39 kch03 kernel: [773113.285218] Watchdog detected hard LOCKUP on cpu 12
followed by a whole series of entries like this one:
Oct 29 08:24:40 kch03 kernel: [774775.837511] NMI backtrace for cpu 14
Oct 29 08:24:40 kch03 kernel: [774775.837519] CPU 14
Oct 29 08:24:40 kch03 kernel: [774775.837525] Modules linked in: tun ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables x_tables fuse rpcsec_gss_krb5 nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc bonding bridge stp loop kvm_intel kvm snd_pcm snd_page_alloc snd_timer snd i7core_edac psmouse serio_raw edac_core iTCO_wdt iTCO_vendor_support soundcore evdev pcspkr dcdbas processor button thermal_sys acpi_power_meter coretemp crc32c_intel ext4 crc16 jbd2 mbcache dm_mod sd_mod crc_t10dif sg sr_mod cdrom ata_generic usbhid hid uhci_hcd ata_piix libata ehci_hcd mpt2sas usbcore raid_class scsi_transport_sas scsi_mod usb_common netxen_nic bnx2 [last unloaded: scsi_wait_scan]
Oct 29 08:24:40 kch03 kernel: [774775.837788]
Oct 29 08:24:40 kch03 kernel: [774775.837795] Pid: 0, comm: swapper/14 Tainted: G W 3.2.0-4-amd64 #1 Debian 3.2.57-3 Dell Inc. PowerEdge T610/0CX0R0
Oct 29 08:24:40 kch03 kernel: [774775.837819] RIP: 0010:[] [] intel_idle+0xb9/0x119
Oct 29 08:24:40 kch03 kernel: [774775.837836] RSP: 0018:ffff880632efde68 EFLAGS: 00000046
Oct 29 08:24:40 kch03 kernel: [774775.837844] RAX: 0000000000000010 RBX: 0000000000000004 RCX: 0000000000000001
Oct 29 08:24:40 kch03 kernel: [774775.837853] RDX: 0000000000000000 RSI: ffff880632efdfd8 RDI: 13a18da0dc383000
Oct 29 08:24:40 kch03 kernel: [774775.837861] RBP: 0000000000000002 R08: 0000000000000000 R09: 00000000000001c1
Oct 29 08:24:40 kch03 kernel: [774775.837867] R10: 0000000000000293 R11: 0000000000000293 R12: ffff88063fcf9d70
Oct 29 08:24:40 kch03 kernel: [774775.837875] R13: 0000000000000010 R14: 13a18da105f673bc R15: 000000000000000e
Oct 29 08:24:40 kch03 kernel: [774775.837884] FS: 0000000000000000(0000) GS:ffff88063fce0000(0000) knlGS:0000000000000000
Oct 29 08:24:40 kch03 kernel: [774775.837911] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct 29 08:24:40 kch03 kernel: [774775.837918] CR2: 00002aaaaf2abe68 CR3: 0000000001605000 CR4: 00000000000026e0
Oct 29 08:24:40 kch03 kernel: [774775.837926] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Oct 29 08:24:40 kch03 kernel: [774775.837934] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Oct 29 08:24:40 kch03 kernel: [774775.837943] Process swapper/14 (pid: 0, threadinfo ffff880632efc000, task ffff880632ee5800)
Oct 29 08:24:40 kch03 kernel: [774775.837975] ffffffff81013726 ffffffff8127161a 0000000000000000 000000000006dbd3
Oct 29 08:24:40 kch03 kernel: [774775.837996] 0000000000000000 0000000e0006dbd3 ffff88063fcf9d70 ffffffff81646510
Oct 29 08:24:40 kch03 kernel: [774775.838017] 00000000fffffff0 0000000000000002 ffffffff816465c0 ffffffff81270739
Oct 29 08:24:40 kch03 kernel: [774775.838068] [] ? read_tsc+0x5/0x14
Oct 29 08:24:40 kch03 kernel: [774775.838078] [] ? menu_select+0x198/0x2c1
Oct 29 08:24:40 kch03 kernel: [774775.838089] [] ? cpuidle_idle_call+0xec/0x179
Oct 29 08:24:40 kch03 kernel: [774775.838099] [] ? cpu_idle+0xa5/0xf2
Oct 29 08:24:40 kch03 kernel: [774775.838109] [] ? start_secondary+0x1d5/0x1db
Oct 29 08:24:40 kch03 kernel: [774775.838449] Call Trace:
Oct 29 08:24:40 kch03 kernel: [774775.838461] [] ? read_tsc+0x5/0x14
Oct 29 08:24:40 kch03 kernel: [774775.838474] [] ? menu_select+0x198/0x2c1
Oct 29 08:24:40 kch03 kernel: [774775.838487] [] ? cpuidle_idle_call+0xec/0x179
Oct 29 08:24:40 kch03 kernel: [774775.838502] [] ? cpu_idle+0xa5/0xf2
Oct 29 08:24:40 kch03 kernel: [774775.838515] [] ? start_secondary+0x1d5/0x1db
The same backtrace appears for every CPU in the machine. This went on for about 25 minutes, so the machine was “down” at 08:25, but my impression is that it was only in some kind of “standby”: it was still powered on, but the screen was black, and so on.
I have tried to read up on this, but I have not found much yet. I cannot work out whether the cause is hardware or software, and I am rather stuck on that.
If anyone has an idea, that would be great! :D
Have a nice day
