Kernel 2.6.26 bug?

yassinegtr4 · February 21, 2016, 12:16am

Hello everyone,

I have several Debian VMs in production that started crashing around the same time (or rather, freezing completely: the VM appears to be running, but no service responds). Rebooting is my only workaround for now.

Here is an excerpt from /var/log/kern.log concerning the apache2 service; the file is full of similar sections about the other services:

Dec 10 16:02:23 Prod-FO1 kernel: [13771.852161] INFO: task apache2:4623 blocked for more than 120 seconds.
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852172] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852179] apache2       D ffffffff8044af00     0  4623   2415
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852189]  ffff8802da027b98 0000000000000286 0000000000000000 ffff8802ff701200
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852199]  ffff8802fda7d180 ffffffff804ff460 ffff8802fda7d400 0000000000008f0d
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852210]  00000000ffffffff ffffffff803fa4c4 ffff880193b7f280 ffffffff80268d62
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852221] Call Trace:
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852236]  [<ffffffff803fa4c4>] tcp_transmit_skb+0x731/0x76e
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852245]  [<ffffffff80268d62>] free_hot_cold_page+0x14c/0x1ba
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852271]  [<ffffffffa01e994a>] :ocfs2:ocfs2_wait_for_recovery+0x6d/0x83
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852280]  [<ffffffff8023f6ad>] autoremove_wake_function+0x0/0x2e
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852290]  [<ffffffff803feed6>] tcp_v4_do_rcv+0x2c8/0x49d
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852310]  [<ffffffffa01d8158>] :ocfs2:ocfs2_inode_lock_full+0x176/0xd88
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852320]  [<ffffffff8029add4>] dput+0x21/0x13e
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852343]  [<ffffffffa01e323b>] :ocfs2:ocfs2_permission+0x66/0x153
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852351]  [<ffffffff8029223d>] permission+0xb5/0x118
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852359]  [<ffffffff80293846>] __link_path_walk+0x145/0xe0c
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852368]  [<ffffffff802a1d3d>] mnt_want_write+0x31/0x86
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852377]  [<ffffffff80293ced>] __link_path_walk+0x5ec/0xe0c
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852389]  [<ffffffff80294553>] path_walk+0x46/0x8b
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852397]  [<ffffffff8029487f>] do_path_lookup+0x158/0x1ce
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852404]  [<ffffffff802934f7>] getname+0x140/0x1a7
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852412]  [<ffffffff802951ed>] __user_walk_fd+0x37/0x4c
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852421]  [<ffffffff802894c7>] sys_faccessat+0xbc/0x186
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852430]  [<ffffffff802a0721>] mntput_no_expire+0x20/0x169
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852437]  [<ffffffff80288c0b>] filp_close+0x5d/0x65
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852447]  [<ffffffff8020b528>] system_call+0x68/0x6d
Dec 10 16:02:23 Prod-FO1 kernel: [13771.852455]  [<ffffffff8020b4c0>] system_call+0x0/0x6d
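
Since the log contains many similar reports for other services, a short script can show which tasks are reported as blocked most often. This is only a minimal sketch: the function name count_hung_tasks is made up for illustration, and the pattern assumes the exact "INFO: task NAME:PID blocked for more than N seconds" wording seen in the excerpt above.

```python
import re
from collections import Counter

# Matches the hung-task header line as it appears in the excerpt, e.g.
# "... INFO: task apache2:4623 blocked for more than 120 seconds."
# (assumption: other kernel versions may word this message differently)
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def count_hung_tasks(lines):
    """Return a Counter mapping task name to its number of hung-task reports.

    `lines` is any iterable of log lines (a list, or an open file object).
    """
    counts = Counter()
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            counts[match.group("name")] += 1
    return counts
```

To use it on the real file, pass an open handle, for example count_hung_tasks(open("/var/log/kern.log", errors="replace")), then print the result of .most_common().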

Kernel version: 2.6.26-2-xen-amd64
Is there a fix for this kernel problem: a patch, an update...?

Thanks in advance for your help,

ggoodluck47 · February 21, 2016, 12:16am

Hi,

With a normal sources.list file, a safe-upgrade would give you a better answer than we can.

yassinegtr4 · February 21, 2016, 12:16am

Hello,

Thanks for the feedback!

Here is my sources.list:

deb http://security.debian.org/ lenny/updates main
deb-src http://security.debian.org/ lenny/updates main
deb http://ftp.fr.debian.org/debian/ lenny main
deb-src http://ftp.fr.debian.org/debian/ lenny main

I ran aptitude safe-upgrade... I'll wait two or three days to see whether everything runs fine and the bug is fixed.

Thanks

fran.b · February 21, 2016, 12:16am

Apparently the freeze happens during the transmission of queued TCP packets. It could come from the kernel, or from the virtual machine itself (a Xen bug). I suggest updating the machine's kernel.
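
To help tell which part of the kernel each hung task was in, the frames of a call trace can be grouped by the module they belong to. The sketch below is only an illustration under a few assumptions: the function name frames_by_module and the "kernel" label for frames without a module prefix are invented here, and the pattern relies on the frame layout seen in the posted log, where module frames are written as :module:symbol+0x.... The grouping is a hint about where to look, not a diagnosis.

```python
import re
from collections import defaultdict

# Frame layout as seen in the posted 2.6.26 log:
#   [<ffffffffa01e994a>] :ocfs2:ocfs2_wait_for_recovery+0x6d/0x83   (module frame)
#   [<ffffffff803fa4c4>] tcp_transmit_skb+0x731/0x76e               (no module prefix)
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(?P<module>\w+):)?(?P<symbol>[\w.]+)\+0x"
)


def frames_by_module(lines):
    """Group call-trace symbols by module name.

    Frames without a ":module:" prefix are filed under the hypothetical
    label "kernel". Lines that are not stack frames are ignored.
    """
    groups = defaultdict(list)
    for line in lines:
        match = FRAME.search(line)
        if match:
            module = match.group("module") or "kernel"
            groups[module].append(match.group("symbol"))
    return dict(groups)
```

Feeding it the lines of one trace from /var/log/kern.log returns a dictionary such as {"kernel": [...], "ocfs2": [...]}, which makes it easy to compare the traces of the different blocked services.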