Server crash: diagnosis?

Something odd happened yesterday at 13:38 on the main server.
The setup: a server running Debian woody with kernel 2.2.19, current uptime 72 days, maximum uptime about 250 days. In short, solid but old: the machine was installed in 1998 under Bo and has been upgraded from time to time. It is never powered off, of course, and effective scripts restart the daemons if they crash (which let the incident go unnoticed for a day).

The symptom: I was told that some services had stopped working, and I found that inetd was frozen. So I looked at the logs, and I have trouble understanding what happened. Here are roughly the interesting parts. Note that lookatpid.sh is a homemade script that checks that several essential processes exist and restarts them (the "Relance de…" lines, i.e. "Restarting…"), and piderror.log contains the error messages produced when the daemons are restarted.

[quote]kern.log:
Mar 27 12:48:08 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 12:48:08 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 12:48:08 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 12:48:08 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 12:48:08 yoda kernel: No module symbols loaded.
Mar 27 13:38:21 yoda kernel: Unable to load interpreter /lib/ld-linux.so.2
Mar 27 13:38:54 yoda last message repeated 206 times
Mar 27 13:39:02 yoda last message repeated 234 times
Mar 27 13:41:22 yoda kernel: Unable to load interpreter /lib/ld-linux.so.2
Mar 27 13:42:08 yoda last message repeated 483 times
Mar 27 13:44:09 yoda kernel: Unable to load interpreter /lib/ld-linux.so.2
Mar 27 13:44:11 yoda last message repeated 34 times
Mar 27 13:47:04 yoda kernel: Unable to load interpreter /lib/ld-linux.so.2
Mar 27 13:47:15 yoda last message repeated 10 times
Mar 27 13:48:07 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:48:07 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:48:07 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:48:07 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:48:08 yoda kernel: No module symbols loaded.
Mar 27 13:49:30 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:49:30 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:49:30 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:49:30 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:49:30 yoda kernel: No module symbols loaded.
Mar 27 13:53:08 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:53:08 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:53:08 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:53:08 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:53:08 yoda kernel: No module symbols loaded.

syslog:
[…]
Mar 27 13:37:29 yoda mountd[311]: NFS mount of /var/spool/mail attempted from 192.168.1.214
Mar 27 13:37:29 yoda mountd[311]: /var/spool/mail has been mounted by 192.168.1.214
Mar 27 13:37:51 yoda mountd[311]: NFS mount of /var/spool/mail attempted from 192.168.1.202
Mar 27 13:37:52 yoda mountd[311]: /var/spool/mail has been mounted by 192.168.1.202
Mar 27 13:37:56 yoda mountd[311]: NFS mount of /var/spool/mail attempted from 192.168.1.212
Mar 27 13:37:56 yoda mountd[311]: /var/spool/mail has been mounted by 192.168.1.212
Mar 27 13:38:02 yoda CRON[31176]: PAM unable to dlopen(/lib/security/pam_unix.so)
Mar 27 13:38:02 yoda CRON[31176]: PAM [dlerror: /lib/security/pam_unix.so: failed to map segment from shared object: Cannot allocate memory]
Mar 27 13:38:02 yoda CRON[31176]: PAM adding faulty module: /lib/security/pam_unix.so
Mar 27 13:38:02 yoda CRON[31177]: PAM unable to dlopen(/lib/security/pam_unix.so)
Mar 27 13:38:02 yoda CRON[31177]: PAM [dlerror: /lib/security/pam_unix.so: failed to map segment from shared object: Cannot allocate memory]
Mar 27 13:38:02 yoda CRON[31177]: PAM adding faulty module: /lib/security/pam_unix.so
Mar 27 13:38:02 yoda CRON[31176]: PAM unable to dlopen(/lib/security/pam_env.so)
Mar 27 13:38:02 yoda CRON[31177]: PAM [dlerror: /lib/security/pam_unix.so: failed to map segment from shared object: Cannot allocate memory]
Mar 27 13:38:02 yoda CRON[31177]: PAM adding faulty module: /lib/security/pam_unix.so
Mar 27 13:38:02 yoda CRON[31176]: PAM unable to dlopen(/lib/security/pam_env.so)
Mar 27 13:38:02 yoda CRON[31176]: PAM [dlerror: /lib/security/pam_env.so: failed to map segment from shared object: Cannot allocate memory]
Mar 27 13:38:02 yoda CRON[31176]: PAM adding faulty module: /lib/security/pam_env.so
Mar 27 13:38:02 yoda CRON[31177]: PAM unable to dlopen(/lib/security/pam_env.so)
Mar 27 13:38:02 yoda CRON[31177]: PAM [dlerror: /lib/security/pam_env.so: failed to map segment from shared object: Cannot allocate memory]
Mar 27 13:38:02 yoda CRON[31177]: PAM adding faulty module: /lib/security/pam_env.so
Mar 27 13:38:02 yoda CRON[31176]: PAM unable to dlopen(/lib/security/pam_deny.so)
Mar 27 13:38:02 yoda CRON[31176]: PAM [dlerror: /lib/security/pam_deny.so: failed to map segment from shared object: Cannot allocate memory]
Mar 27 13:38:02 yoda CRON[31176]: PAM adding faulty module: /lib/security/pam_deny.so
Mar 27 13:38:02 yoda CRON[31176]: Module is unknown
Mar 27 13:38:02 yoda CRON[31177]: PAM unable to dlopen(/lib/security/pam_deny.so)
Mar 27 13:38:02 yoda CRON[31177]: PAM [dlerror: /lib/security/pam_deny.so: failed to map segment from shared object: Cannot allocate memory]
Mar 27 13:38:02 yoda CRON[31177]: PAM adding faulty module: /lib/security/pam_deny.so
Mar 27 13:38:02 yoda CRON[31177]: Module is unknown
Mar 27 13:38:11 yoda mountd[311]: NFS mount of /var/spool/mail attempted from 192.168.1.208
Mar 27 13:38:12 yoda mountd[311]: /var/spool/mail has been mounted by 192.168.1.208
Mar 27 13:38:21 yoda kernel: Unable to load interpreter /lib/ld-linux.so.2
[…]
Mar 27 13:47:05 yoda last message repeated 2 times
Mar 27 13:47:06 yoda out of memory [17243out of ]
Mar 27 13:47:06 yoda kernel: Unable to load interpreter /lib/ld-linux.so.2
Mar 27 13:47:15 yoda last message repeated 7 times
Mar 27 13:47:34 yoda ipop3d[17328]: connect from xxxxxx.edu.nerim.net
Mar 27 13:47:35 yoda ipop3d[17328]: pop3 service init from 212.x.y.z
Mar 27 13:47:37 yoda ipop3d[17328]: Login user=xxx host=xxxxx.edu.nerim.net [213.41.201.82] nmsgs=0/0
Mar 27 13:47:38 yoda ipop3d[17328]: Logout user=xxx host=xxxxx.edu.nerim.net [213.41.201.82] nmsgs=0 ndele=0
Mar 27 13:48:01 yoda /USR/SBIN/CRON[17335]: (root) CMD (^I/usr/local/bin/lookatpid.sh >> /var/log/lookatpid.log 2>> /var/log/piderror.log)
Mar 27 13:48:01 yoda /USR/SBIN/CRON[17336]: (root) CMD ( test -f /etc/ipac.conf -a -f /usr/sbin/fetchipac -a -f /proc/net/ip_acct && /usr/sbin/fetchipac)
Mar 27 13:48:03 yoda logger: Relance de
Mar 27 13:48:03 yoda logger: Relance de sysklogd
Mar 27 13:48:03 yoda exiting on signal 15
Mar 27 13:48:07 yoda syslogd 1.3-3#33.1: restart.
Mar 27 13:48:07 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:48:07 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:48:07 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:48:07 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:48:08 yoda kernel: No module symbols loaded.
Mar 27 13:48:08 yoda logger: Relance de nis
Mar 27 13:49:07 yoda /USR/SBIN/CRON[17819]: (root) CMD (^I/usr/local/bin/lookatpid.sh >> /var/log/lookatpid.log 2>> /var/log/piderror.log)
Mar 27 13:49:23 yoda logger: Relance de sysklogd
Mar 27 13:49:25 yoda exiting on signal 15
Mar 27 13:49:30 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:49:30 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:49:30 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:49:30 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:49:30 yoda kernel: No module symbols loaded.
Mar 27 13:50:02 yoda /USR/SBIN/CRON[18197]: (root) CMD (^I/usr/local/bin/lookatpid.sh >> /var/log/lookatpid.log 2>> /var/log/piderror.log)
Mar 27 13:50:02 yoda /USR/SBIN/CRON[18198]: (root) CMD (^I^Icp /home/Master/up.jpg /var/www/p/upm.jpg > /tmp/ggg 2>&1)
Mar 27 13:50:02 yoda /USR/SBIN/CRON[18203]: (root) CMD (cp -dpRf /ftp/pub/incoming/.??* /home/boisson/FTP_curieux/ > /dev/null 2> /dev/null ; rm -R /ftp/pub/incoming/.??* > /dev/null 2> /dev/null ; cp -dpRf /ftp/pub/incoming/~* /home/boisson/FTP_curieux/ > /dev/null 2> /dev/null ; rm -R /ftp/pub/incoming/~* > /dev/null 2> /dev/null)
Mar 27 13:50:02 yoda /USR/SBIN/CRON[18202]: (www-data) CMD ([ -x /usr/lib/cgi-bin/awstats.pl -a -f /etc/awstats/awstats.conf -a -r /var/log/apache/access.log ] && /usr/lib/cgi-bin/awstats.pl -config=awstats -update >/dev/null)
[…]

messages:
Mar 27 12:48:08 yoda syslogd 1.3-3#33.1: restart.
Mar 27 12:48:08 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 12:48:08 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 12:48:08 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 12:48:08 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 12:48:08 yoda kernel: No module symbols loaded.
Mar 27 12:57:09 yoda logger: Relance de
Mar 27 12:57:10 yoda last message repeated 2 times
Mar 27 13:08:08 yoda -- MARK --
Mar 27 13:28:08 yoda -- MARK --
Mar 27 13:44:31 yoda out of memory [17128out of ]
Mar 27 13:44:31 yoda out of memory [17128out of ]
Mar 27 13:45:13 yoda out of memory [17136out of ]
Mar 27 13:45:13 yoda out of memory [17136out of ]
Mar 27 13:47:05 yoda logger: Relance de
Mar 27 13:47:05 yoda last message repeated 8 times
Mar 27 13:47:06 yoda out of memory [17243out of ]
Mar 27 13:48:03 yoda logger: Relance de
Mar 27 13:48:03 yoda logger: Relance de sysklogd
Mar 27 13:48:03 yoda exiting on signal 15
Mar 27 13:48:07 yoda syslogd 1.3-3#33.1: restart.
Mar 27 13:48:07 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:48:07 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:48:07 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:48:07 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:48:08 yoda kernel: No module symbols loaded.
Mar 27 13:48:08 yoda logger: Relance de nis
Mar 27 13:49:23 yoda logger: Relance de sysklogd
Mar 27 13:49:25 yoda exiting on signal 15
Mar 27 13:49:30 yoda syslogd 1.3-3#33.1: restart.
Mar 27 13:49:30 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:49:30 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:49:30 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:49:30 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:49:30 yoda kernel: No module symbols loaded.
Mar 27 13:53:04 yoda logger: Relance de sysklogd
Mar 27 13:53:04 yoda exiting on signal 15
Mar 27 13:53:05 yoda syslogd 1.3-3#33.1: restart.
Mar 27 13:53:05 yoda logger: Relance de bind
Mar 27 13:53:05 yoda logger: Relance de nis
Mar 27 13:53:06 yoda logger: Relance de nis
Mar 27 13:53:08 yoda logger: Relance de sysklogd
Mar 27 13:53:08 yoda exiting on signal 15
Mar 27 13:53:08 yoda syslogd 1.3-3#33.1: restart.
Mar 27 13:53:08 yoda kernel: klogd 1.3-3#33.1, log source = /proc/kmsg started.
Mar 27 13:53:08 yoda kernel: Inspecting /boot/System.map-2.2.19
Mar 27 13:53:08 yoda logger: Relance de bind
Mar 27 13:53:08 yoda kernel: Loaded 8525 symbols from /boot/System.map-2.2.19.
Mar 27 13:53:08 yoda kernel: Symbols match kernel version 2.2.19.
Mar 27 13:53:08 yoda kernel: No module symbols loaded.
Mar 27 13:53:10 yoda logger: Relance de nis
Mar 27 14:13:08 yoda -- MARK --
Mar 27 14:33:08 yoda -- MARK --
Mar 27 14:53:08 yoda -- MARK --
Mar 27 15:13:08 yoda -- MARK --
Mar 27 15:33:08 yoda -- MARK --

[b]piderror.log[/b]
Invalid command ‘ExtendedStatus’, perhaps mis-spelled or defined by a module not included in the server configuration
/usr/local/bin/lookatpid.sh: line 1: 20300 Segmentation fault ps $1 >/dev/null
/usr/local/bin/lookatpid.sh: xmalloc: cannot allocate 2524 bytes (0 bytes allocated)
/usr/local/bin/lookatpid.sh: xmalloc: cannot allocate 8 bytes (0 bytes allocated)
[…]
awk: error while loading shared libraries: libc.so.6: failed to map segment from shared object: Cannot allocate memory
[…]
grep: error while loading shared libraries: libc.so.6: failed to map segment from shared object: Cannot allocate memory
[…]
awk: line 0: out of memory
/usr/local/bin/lookatpid.sh: /etc/init.d/: is a directory
/usr/local/bin/lookatpid.sh: /etc/init.d/: is a directory
/usr/local/bin/lookatpid.sh: line 1: 17238 Segmentation fault ps $1 >/dev/null
grep: Memory exhausted
/usr/local/bin/lookatpid.sh: /etc/init.d/: is a directory
/usr/local/bin/lookatpid.sh: /etc/init.d/: is a directory
cat: memory exhausted
cat: memory exhausted
cat: memory exhausted
[…]
[/quote]
Evidently the machine's memory was exhausted at 13:38, and I don't understand why:

  • No memory leak, given the uptime
  • No attacks of any kind; I checked the logs (I log everything that isn't Web traffic on the gateway)
  • No heavy spam burst and no clamav overload at that time
  • A class had logged in, and was in fact able to carry on once the 10 minutes of the incident had passed (apparently there was one latecomer).

The machine has 256 MB of RAM plus 600 MB of swap.

My theory is a disk failure affecting the swap. The console shows nothing but

[quote]Unable to load interpreter /lib/ld-linux.so.2 [/quote]repeated on every line. Running smartctl on the disk in question shows that it is not very recent, but nothing more. I am at a loss. Has anyone had a similar experience?
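
To get a clearer timeline out of the logs, the kernel messages can be tallied per minute, expanding syslogd's "last message repeated N times" lines. This is only a minimal Python sketch, assuming the usual syslog line layout seen in the excerpts above; the name tally_per_minute is made up for illustration.

```python
import re
from collections import Counter

# "Mar 27 13:38:21 yoda kernel: ..." -> minute stamp, host, message.
LINE_RE = re.compile(r"^(\w{3}\s+\d+ \d\d:\d\d):\d\d (\S+) (.*)$")
REPEAT_RE = re.compile(r"last message repeated (\d+) times")


def tally_per_minute(lines, needle):
    """Count messages containing `needle`, grouped by minute.

    A "last message repeated N times" line adds N to the minute in which
    it was logged, but only if the last real message contained `needle`.
    """
    counts = Counter()
    previous_matched = False
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # wrapped or garbled line: ignore it
        minute, _host, message = m.groups()
        rep = REPEAT_RE.search(message)
        if rep:
            if previous_matched:
                counts[minute] += int(rep.group(1))
            continue  # a repeat line does not change the "last message"
        previous_matched = needle in message
        if previous_matched:
            counts[minute] += 1
    return counts
```

It could be called as tally_per_minute(open("/var/log/kern.log"), "Unable to load interpreter"), the path being an assumption about where kern.log lives, and the same function works for the "out of memory" lines in messages.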

Hello,

If you think it's the swap, why not try disabling it for a while to see… Check that you have enough free RAM first… definitely a test to run outside production hours.
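
To make the "enough free RAM" check concrete, here is a rough Python sketch that compares the swap currently in use with the RAM that could absorb it. It assumes /proc/meminfo exposes MemFree, Buffers, Cached, SwapTotal and SwapFree lines in kB; the function names and the 16 MB safety margin are arbitrary choices for illustration, and counting buffers and cache as reclaimable is optimistic.

```python
import re

# Matches "Name:   12345 kB" lines; any other line layout is skipped.
FIELD_RE = re.compile(r"^(\w+):\s+(\d+) kB\s*$", re.MULTILINE)


def parse_meminfo(text):
    """Return a {field name: value in kB} dict from /proc/meminfo text."""
    return {name: int(value) for name, value in FIELD_RE.findall(text)}


def swap_fits_in_ram(text, margin_kb=16 * 1024):
    """Rough check: could the swap in use be pulled back into RAM?

    margin_kb is a hypothetical safety margin, not a recommended value.
    """
    info = parse_meminfo(text)
    swap_used = info["SwapTotal"] - info["SwapFree"]
    # Optimistic estimate: free memory plus buffers and page cache.
    reclaimable = info["MemFree"] + info["Buffers"] + info["Cached"]
    return swap_used + margin_kb <= reclaimable
```

If swap_fits_in_ram(open("/proc/meminfo").read()) comes back False, that would be a good reason not to run swapoff on a live server.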

[quote=“gagarine”]Hello,

If you think it's the swap, why not try disabling it for a while to see… Check that you have enough free RAM first… definitely a test to run outside production hours.[/quote]
That's actually tricky: the server is in use around the clock and has little RAM. Can't you think of any other explanation?