Dead disk or mount error?

Hello,

I have a rather annoying problem on my server: the filesystem randomly switches to read-only!

A little over a week ago, following an install mistake (/var was on / (9 GB) instead of having its own partition), I moved /var/ into /home/ and bind-mounted /home/var/ onto /var/.
I had no problems until today, when I started getting errors and the filesystem was remounted read-only.
I can't tell whether the problem comes from the bind mount, which might have made the filesystem unstable, or from the hard disk (it is fairly old and I don't trust it).

(A reboot fixes the problem temporarily.) Here are the messages I collected:

[102954.948190] sd 0:0:0:0: [sda] CDB:
[102954.948191] Write(10): 2a 00 5b ae 86 e0 00 00 20 00
[102954.948198] EXT4-fs warning (device sda6): ext4_end_bio:317: I/O error -5 writing to inode 46402967 (offset 0 size 16384 starting block 192270560)
[102954.948215] sd 0:0:0:0: [sda] Unhandled error code
[102954.948218] sd 0:0:0:0: [sda]
[102954.948219] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[102954.948221] sd 0:0:0:0: [sda] CDB:
[102954.948222] Write(10): 2a 00 5b ae 87 c0 00 00 40 00
[102954.948229] EXT4-fs warning (device sda6): ext4_end_bio:317: I/O error -5 writing to inode 46402958 (offset 0 size 20480 starting block 192270589)
[102954.948235] EXT4-fs warning (device sda6): ext4_end_bio:317: I/O error -5 writing to inode 46402965 (offset 0 size 4096 starting block 192270590)
[102954.948240] EXT4-fs warning (device sda6): ext4_end_bio:317: I/O error -5 writing to inode 46402961 (offset 0 size 8192 starting block 192270592)
[102954.948252] sd 0:0:0:0: [sda] Unhandled error code
[102954.948255] sd 0:0:0:0: [sda]
[102954.948256] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[102954.948258] sd 0:0:0:0: [sda] CDB:
[102954.948259] Write(10): 2a 00 5b ae 88 08 00 00 08 00
[102954.948275] sd 0:0:0:0: [sda] Unhandled error code
[102954.948277] sd 0:0:0:0: [sda]
[102954.948278] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[102954.948280] sd 0:0:0:0: [sda] CDB:
[102954.948281] Write(10): 2a 00 5b ae 88 80 00 00 08 00
[102954.948299] sd 0:0:0:0: [sda] Unhandled error code
[102954.948301] sd 0:0:0:0: [sda]

... and more lines like these ...
[102954.949164] sd 0:0:0:0: [sda] CDB: 
[102954.949166] Write(10): 2a 00 00 8b 99 60 00 00 08 00
[102954.949173] EXT4-fs warning (device sda1): ext4_end_bio:317: I/O error -5 writing to inode 260631 (offset 0 size 4096 starting block 1143597)
[102954.949185] sd 0:0:0:0: [sda] Unhandled error code
[102954.949187] sd 0:0:0:0: [sda]  
[102954.949189] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
....

[102954.949713] sd 0:0:0:0: [sda] CDB:
[102954.949714] Write(10): 2a 00 5b 37 a2 f0 00 00 08 00
[102954.997387] sd 0:0:0:0: [sda] Unhandled error code
[102954.997397] sd 0:0:0:0: [sda]
[102954.997400] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[102954.997402] sd 0:0:0:0: [sda] CDB:
[102954.997405] Write(10): 2a 00 00 00 08 00 00 00 08 00
[102954.997415] Buffer I/O error on device sda1, logical block 0
[102954.997476] lost page write due to I/O error on sda1
[102954.997511] EXT4-fs error (device sda1): ext4_journal_check_start:56: Detected aborted journal
[102954.997587] EXT4-fs (sda1): Remounting filesystem read-only
[102954.997635] EXT4-fs (sda1): previous I/O error to superblock detected
[102954.997702] sd 0:0:0:0: [sda] Unhandled error code
[102954.997705] sd 0:0:0:0: [sda]
[102954.997706] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[102954.997708] sd 0:0:0:0: [sda] CDB:
[102954.997709] Write(10): 2a 00 00 00 08 00 00 00 08 00
[102954.997715] Buffer I/O error on device sda1, logical block 0
[102954.997763] lost page write due to I/O error on sda1
[103152.185492] sd 0:0:0:0: [sda] Unhandled error code
[103152.185502] sd 0:0:0:0: [sda]
[103152.185504] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[103152.185507] sd 0:0:0:0: [sda] CDB:
[103152.185510] Read(10): 28 00 00 57 28 c8 00 00 08 00
[103152.185518] blk_update_request: 62 callbacks suppressed
[103152.185520] end_request: I/O error, dev sda, sector 5712072
[103152.185647] sd 0:0:0:0: [sda] Unhandled error code
[103152.185649] sd 0:0:0:0: [sda]
[103152.185651] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[103152.185653] sd 0:0:0:0: [sda] CDB:
[103152.185654] Read(10): 28 00 00 57 28 c8 00 00 08 00
[103152.185660] end_request: I/O error, dev sda, sector 5712072

During the reboot:

Mar 15 19:21:07 2emeDeBian kernel: [  203.532342] sd 0:0:0:0: [sda] CDB: 
Mar 15 19:21:07 2emeDeBian kernel: [  203.532343] Read(10): 28 00 04 aa 00 e8 00 01 00 00
Mar 15 19:21:07 2emeDeBian kernel: [  203.532349] end_request: I/O error, dev sda, sector 78250216
Mar 15 19:21:07 2emeDeBian kernel: [  203.534134] Buffer I/O error on device sda6, logical block 3145757

I'm torn :fearful: I don't really know what to think…

fdisk -l

Device     Boot    Start        End    Sectors   Size Id Type
/dev/sda1  *        2048   19531775   19529728   9,3G 83 Linux
/dev/sda2       19533822 1953523711 1933989890 922,2G  5 Extended
/dev/sda5       19533824   53082111   33548288    16G 82 Linux swap / Solaris
/dev/sda6       53084160 1953523711 1900439552 906,2G 83 Linux

df -h

Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       9,1G  1,9G  6,7G  22% /
udev             10M     0   10M   0% /dev
tmpfs           1,6G  8,6M  1,6G   1% /run
tmpfs           4,0G     0  4,0G   0% /dev/shm
tmpfs           5,0M     0  5,0M   0% /run/lock
tmpfs           4,0G     0  4,0G   0% /sys/fs/cgroup
/dev/sda6       892G  2,3G  845G   1% /var
tmpfs           801M     0  801M   0% /run/user/1000

Thanks for your help :slight_smile:

Hello,

Unfortunately, this looks like a disk that is dying :frowning:.

To confirm, check the disk's SMART data with:
sudo smartctl -a /dev/sda

You will probably need to install the relevant package and enable SMART on the disk first:
sudo apt-get install smartmontools
sudo smartctl --smart=on --offlineauto=on --saveauto=on /dev/sda

For more information, the ubuntu-fr documentation covers the topic quite thoroughly: https://doc.ubuntu-fr.org/smartmontools
In particular, it would probably be worth running a disk self-test if the first command shows nothing conclusive.
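
As a side note, the kernel log can already be cross-checked against the partition table, which suggests the failures happen at raw disk addresses on both sda1 and sda6 rather than in anything specific to the bind mount. Here is a minimal sketch, assuming 512-byte logical sectors and 4 KiB ext4 blocks (neither is stated in the thread, but the numbers below only line up with those values). The function names are made up for illustration; the partition boundaries are copied from the fdisk -l output above.

```shell
#!/usr/bin/env bash
# Decode the logical block address from a Read(10)/Write(10) CDB as printed
# by the kernel. Bytes 2..5 (0-based) hold the big-endian LBA.
cdb_lba() {
    # $1: the CDB bytes as one string, e.g. "28 00 04 aa 00 e8 00 01 00 00"
    set -- $1                      # split the string into one byte per argument
    echo $(( 0x$3$4$5$6 ))
}

# Map an absolute disk sector to a partition, using the Start/End columns
# from the fdisk -l output in this thread.
sector_to_partition() {
    local s=$1
    if   [ "$s" -ge 2048 ]     && [ "$s" -le 19531775 ];   then echo sda1
    elif [ "$s" -ge 19533824 ] && [ "$s" -le 53082111 ];   then echo sda5
    elif [ "$s" -ge 53084160 ] && [ "$s" -le 1953523711 ]; then echo sda6
    else echo unknown
    fi
}

# Convert an absolute sector to a block number relative to the partition
# start (assumption: 8 sectors of 512 bytes per 4096-byte block).
fs_block() {
    local sector=$1 part_start=$2
    echo $(( (sector - part_start) / 8 ))
}

lba=$(cdb_lba "28 00 04 aa 00 e8 00 01 00 00")
echo "$lba"                        # 78250216
sector_to_partition "$lba"         # sda6
fs_block "$lba" 53084160           # 3145757
```

The three values match the reboot log: "I/O error, dev sda, sector 78250216" followed by "Buffer I/O error on device sda6, logical block 3145757". The same decoding on "2a 00 00 00 08 00 00 00 08 00" gives sector 2048, the first sector of sda1, which is the "logical block 0" write that failed just before the remount read-only.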

Thanks for the quick and effective reply!

Well… that's just what I was afraid of :persevere:

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: FAILED!
Drive failure expected in less than 24 hours. SAVE ALL DATA.
Failed Attributes:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   002   002   036    Pre-fail  Always   FAILING_NOW 4015

Thanks in any case for the help and for this command, which I didn't know existed :smile:
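
For anyone who wants to spot this kind of failure without reading the whole report, here is a small sketch that filters a smartctl attribute table. It assumes the column layout shown above (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, ...) and treats an attribute as failing when its normalized VALUE is at or below a non-zero THRESH. The function name is invented for illustration.

```shell
#!/usr/bin/env bash
# Read a SMART attribute table on stdin and print the name of every
# attribute whose normalized VALUE ($4) is <= THRESH ($6).
# Rows with THRESH 0 are skipped, since such attributes cannot trip.
failing_attributes() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && ($6 + 0) > 0 && ($4 + 0) <= ($6 + 0) { print $2 }'
}

# Demo on the row quoted in this thread (VALUE 002, THRESH 036):
printf '%s\n' \
  '  5 Reallocated_Sector_Ct   0x0033   002   002   036    Pre-fail  Always   FAILING_NOW 4015' \
  | failing_attributes
# prints: Reallocated_Sector_Ct
```

On a real system the input would come from something like `sudo smartctl -A /dev/sda | failing_attributes`. Header lines are ignored because their first field is not a number.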