Hello everyone,
For a project I plan to present in class, I am experimenting with DRBD and heartbeat to learn the basics of high availability.
I am a beginner on Linux, and even more so on Debian (I started with Ubuntu).
I followed several tutorials to set up DRBD and heartbeat.
DRBD is not causing any problems.
Heartbeat, on the other hand, looks simple, but nothing works.
I am also using the V1-style configuration, so it is easy to read.
When I start heartbeat with:
frontal1:~# /etc/init.d/heartbeat restart
I get the following message:
Stopping High-Availability services:
Done.
Waiting to allow resource takeover to complete:
Done.
Starting High-Availability services:
2010/05/31_22:12:33 INFO: Resource is stopped
Done.
What bothers me is "Resource is stopped".
The IP alias does not come up either.
The layout looks like this:
LAN0 on eth0: 192.168.0.0/24 # User LAN + heartbeat: not working.
LAN1 on eth1: 192.168.1.0/30 # DRBD replication LAN: working.
LAN2 on eth2: 192.168.2.0/28 # Application server LAN: not used here yet.
Heartbeat consists of 2 nodes: frontal1 and frontal2
frontal1 eth1 ------DRBD------ eth1 frontal2
frontal1 eth0 ---------------- eth0 frontal2
frontal1 eth0: 192.168.0.2
frontal2 eth0: 192.168.0.3
heartbeat ip alias eth0:0: 192.168.0.1
frontal1 eth1: 192.168.1.1
frontal2 eth1: 192.168.1.2
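The addressing plan above can be sanity-checked with a few lines of Python. This is only a sketch: the names `LANS`, `HOSTS` and `check_plan` are made up for the illustration, and frontal2's 192.168.1.2 is assumed to sit on the DRBD replication LAN (LAN1). It checks that every address belongs to its LAN, is not the network or broadcast address, and is not used twice.

```python
import ipaddress

# Subnets copied from the post; LANS, HOSTS and check_plan are illustrative names.
LANS = {
    "LAN0": ipaddress.ip_network("192.168.0.0/24"),
    "LAN1": ipaddress.ip_network("192.168.1.0/30"),
    "LAN2": ipaddress.ip_network("192.168.2.0/28"),
}

# (owner, LAN, address). Assumption: frontal2's 192.168.1.2 is on LAN1.
HOSTS = [
    ("frontal1", "LAN0", "192.168.0.2"),
    ("frontal2", "LAN0", "192.168.0.3"),
    ("heartbeat alias", "LAN0", "192.168.0.1"),
    ("frontal1", "LAN1", "192.168.1.1"),
    ("frontal2", "LAN1", "192.168.1.2"),
]


def check_plan(lans, hosts):
    """Return a list of human-readable problems found in the plan."""
    problems = []
    owners = {}
    for name, lan, addr in hosts:
        ip = ipaddress.ip_address(addr)
        net = lans[lan]
        if ip not in net:
            problems.append(f"{name}: {ip} is outside {lan} ({net})")
        elif ip in (net.network_address, net.broadcast_address):
            problems.append(f"{name}: {ip} is the network or broadcast address of {lan}")
        if ip in owners:
            problems.append(f"{name}: {ip} is already used by {owners[ip]}")
        owners.setdefault(ip, name)
    return problems


for problem in check_plan(LANS, HOSTS):
    print(problem)
```

An empty result means the plan is at least internally consistent. It says nothing about whether heartbeat can actually bring the alias up.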
On the software side, I am using Debian 5.0 Lenny and heartbeat 2.1.3-6lenny4.
I followed to the letter, and/or took inspiration from, the tutorials below, without success:
http://howtoforge.net/highly-available-nfs-server-using-drbd-and-heartbeat-on-debian-5.0-lenny
http://doc.ubuntu-fr.org/tutoriel/mirroring_sur_deux_serveurs
http://www.drbd.org/users-guide/ch-heartbeat.html
http://www.linux-ha.org/doc/
Here are my config files, command output and logs:
Running BasicSanityCheck (2), I can see that there is a problem with IPaddr.
Yet if I run the IPaddr or IPaddr2 script manually, everything works perfectly: the IP alias comes up and is reachable on the network. See (5) before IPaddr and (6) after IPaddr.
I have scoured English- and French-language forums without finding a solution to my problem.
I have run out of leads.
If you need more information, do not hesitate to ask.
Thanks in advance for your help.
[b](1) vim /etc/ha.d/ha.cf[/b]
[code]autojoin none
mcast eth0 239.0.0.43 694 1 0
warntime 5
deadtime 5
initdead 15
keepalive 2
node frontal1
node frontal2[/code]
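For readers unfamiliar with the `mcast` directive, the sketch below splits it into named fields. The field order (interface, group, port, ttl, loop) follows what heartbeat itself reports in the log further down ("group 239.0.0.43 port 694 interface eth0 (ttl=1 loop=0)"). The `Mcast` type and `parse_mcast` function are hypothetical helpers written for this illustration, not part of heartbeat.

```python
import ipaddress
from typing import NamedTuple


class Mcast(NamedTuple):
    """Fields of an ha.cf 'mcast' line (illustrative type, not a heartbeat API)."""
    interface: str
    group: ipaddress.IPv4Address
    port: int
    ttl: int
    loop: int


def parse_mcast(line):
    """Split 'mcast <dev> <group> <port> <ttl> <loop>' into named fields."""
    keyword, dev, group, port, ttl, loop = line.split()
    if keyword != "mcast":
        raise ValueError(f"not an mcast directive: {line!r}")
    address = ipaddress.IPv4Address(group)
    if not address.is_multicast:
        raise ValueError(f"{address} is not a multicast group address")
    return Mcast(dev, address, int(port), int(ttl), int(loop))


print(parse_mcast("mcast eth0 239.0.0.43 694 1 0"))
```

The multicast check only confirms that the group lies in 224.0.0.0/4. Whether the switch or the virtual network between the two nodes actually forwards multicast is a separate question.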
[b](2) sh /usr/share/heartbeat/BasicSanityCheck[/b]
[code]RTNETLINK answers: Network is unreachable
Using interface: eth0
Should not run tests with heartbeat already running.
Starting base64 and md5 algorithm tests
base64 and md5 algorithm tests succeeded.
Starting Resource Agent tests
Testing RA: Dummy
Testing RA: IPaddr
ERROR: IPaddr RA failed
Starting IPC tests
That’s weird. Heartbeat seems to be running…
Stopping heartbeat
Stopping High-Availability services:
Done.
Starting heartbeat
Starting High-Availability services:
2010/05/31_22:16:04 INFO: Resource is stopped
Done.
Does not look like we ARPed the address
Looks like monitor operation failed
Reloading heartbeat
Reloading heartbeat
Stopping heartbeat
Stopping High-Availability services:
Done.
Checking STONITH basic sanity.
Performing apphbd success case tests
Performing apphbd failure case tests
Starting LRM tests
Starting heartbeat
Starting High-Availability services:
2010/05/31_22:18:25 INFO: Resource is stopped
Done.
[/code]
[b](3) sh /usr/share/heartbeat/ResourceManager listkeys frontal1[/b]
192.168.0.1
[b](4) sh /usr/share/heartbeat/ResourceManager listkeys frontal2[/b]
[b](5) ip addr show[/b]
[code]
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:0c:29:cb:86:45 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.2/24 brd 192.168.0.255 scope global eth0
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000
link/ether 00:0c:29:cb:86:4f brd ff:ff:ff:ff:ff:ff
inet 192.168.1.1/30 brd 192.168.1.3 scope global eth1
4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:0c:29:cb:86:59 brd ff:ff:ff:ff:ff:ff[/code]
[b](6) /etc/ha.d/resource.d/IPaddr 192.168.0.1 start[/b]
2010/05/31_22:30:37 INFO: Success
[b](7) /etc/ha.d/resource.d/IPaddr2 192.168.0.1 start[/b]
[code]2010/05/31_22:30:24 INFO: Using calculated nic for 192.168.0.1: eth0
2010/05/31_22:30:24 INFO: Using calculated netmask for 192.168.0.1: 255.255.255.0
2010/05/31_22:30:25 INFO: eval ifconfig eth0:0 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255
2010/05/31_22:30:25 INFO: Success
INFO: Success[/code]
[b](8) cat /etc/ha.d/haresources[/b]
frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255
OR frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server
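To make the V1 `haresources` syntax easier to follow, here is a rough sketch of how such a line turns into script calls. It is modelled on the "Running ..." entries in the heartbeat log in this post: `::` separates a script name from its arguments, scripts under /etc/ha.d/resource.d are used when they exist, and anything else is run from /etc/init.d. The function `expand_haresources` and the `known` set are assumptions made for the example. The real ResourceManager checks the filesystem itself.

```python
RESOURCE_D = "/etc/ha.d/resource.d"
INIT_D = "/etc/init.d"


def expand_haresources(line, resource_d_scripts, action="start"):
    """Expand one haresources line into (node, list of commands).

    resource_d_scripts stands in for a directory listing of
    /etc/ha.d/resource.d (an assumption of this sketch).
    """
    node, *resources = line.split()
    commands = []
    for resource in resources:
        name, *args = resource.split("::")
        directory = RESOURCE_D if name in resource_d_scripts else INIT_D
        commands.append(" ".join([f"{directory}/{name}", *args, action]))
    if action == "stop":
        # The log shows a group being released in reverse order.
        commands.reverse()
    return node, commands


known = {"IPaddr2", "drbddisk", "Filesystem"}
line = "frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server"
node, commands = expand_haresources(line, known)
print(node)
for command in commands:
    print(command)
```

Because the resources of a group are started left to right and the whole group is given up when one of them fails, the order of entries on the line matters.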
[b](9) cat /var/log/heartbeat/log[/b]
[code]heartbeat[7179]: 2010/05/31_22:38:40 info: Version 2 support: false
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Deprecated ‘legacy’ auto_failback option selected.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Please convert to ‘auto_failback on’.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: See documentation for conversion details.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[7179]: 2010/05/31_22:38:40 info: **************************
heartbeat[7179]: 2010/05/31_22:38:40 info: Configuration validated. Starting heartbeat 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: heartbeat: version 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: Heartbeat generation: 1275221613
heartbeat[7180]: 2010/05/31_22:38:40 info: glib: UDP multicast heartbeat started for group 239.0.0.43 port 694 interface eth0 (ttl=1 loop=0)
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[7180]: 2010/05/31_22:38:40 info: Local status now set to: ‘up’
heartbeat[7180]: 2010/05/31_22:38:41 info: Link frontal2:eth0 up.
heartbeat[7180]: 2010/05/31_22:38:41 info: Status update for node frontal2: status active
harc[7188]: 2010/05/31_22:38:41 info: Running /etc/ha.d/rc.d/status status
heartbeat[7180]: 2010/05/31_22:38:42 info: Comm_now_up(): updating status to active
heartbeat[7180]: 2010/05/31_22:38:42 info: Local status now set to: ‘active’
IPaddr2[7242]: 2010/05/31_22:38:42 INFO: Resource is stopped
heartbeat[7204]: 2010/05/31_22:38:42 info: Local Resource acquisition completed.
harc[7337]: 2010/05/31_22:39:06 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7337]: 2010/05/31_22:39:06 received ip-request-resp IPaddr2::192.168.0.1/24/eth0/192.168.0.255 OK no
ResourceManager[7356]: 2010/05/31_22:39:06 info: Acquiring resource group: frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255
IPaddr2[7382]: 2010/05/31_22:39:06 INFO: Resource is stopped
ResourceManager[7356]: 2010/05/31_22:39:06 info: Running /etc/ha.d/resource.d/IPaddr2 192.168.0.1/24/eth0/192.168.0.255 start
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip -f inet addr add 192.168.0.1/24 brd 192.168.0.255 dev eth0
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip link set eth0 up
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.0.1 eth0 192.168.0.1 auto not_used not_used
IPaddr2[7462]: 2010/05/31_22:39:07 INFO: Success
heartbeat[7180]: 2010/05/31_22:39:07 info: Initial resource acquisition complete (ip-request-resp)
harc[7549]: 2010/05/31_22:39:07 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7549]: 2010/05/31_22:39:07 received ip-request-resp drbddisk::r0 OK no
ResourceManager[7568]: 2010/05/31_22:39:07 info: Acquiring resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/drbddisk r0 start
Filesystem[7633]: 2010/05/31_22:39:07 INFO: Resource is stopped
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 start
Filesystem[7711]: 2010/05/31_22:39:07 INFO: Running start for /dev/drbd1 on /serveur
Filesystem[7700]: 2010/05/31_22:39:07 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/init.d/dhcp3-server start
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa start
ResourceManager[7568]: 2010/05/31_22:39:09 ERROR: Return code 71 from /etc/init.d/tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 CRIT: Giving up resources due to failure of tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Releasing resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/dhcp3-server stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 stop
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Running stop for /dev/drbd1 on /serveur
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Trying to unmount /serveur
Filesystem[7898]: 2010/05/31_22:39:10 INFO: unmounted /serveur successfully
Filesystem[7887]: 2010/05/31_22:39:10 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:10 info: Running /etc/ha.d/resource.d/drbddisk r0 sto[/code]
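A log of this size is easier to read with the noise filtered out. The sketch below keeps only the ERROR and CRIT entries. It assumes each entry sits on a single line in the `name[pid]: timestamp LEVEL: message` shape seen above, so entries wrapped by the terminal need to be rejoined first. The `failures` function and the sample list are illustrative, with the sample lines copied from the log in this post.

```python
import re

# name[pid]: timestamp LEVEL: message
LOG_ENTRY = re.compile(
    r"^(?P<source>[^\[\s]+)\[(?P<pid>\d+)\]:\s+(?P<stamp>\S+)\s+"
    r"(?P<level>\w+):\s+(?P<message>.*)$"
)


def failures(lines, levels=("ERROR", "CRIT")):
    """Return (source, level, message) for every entry at one of the given levels."""
    found = []
    for line in lines:
        match = LOG_ENTRY.match(line.strip())
        if match and match.group("level").upper() in levels:
            found.append((match.group("source"), match.group("level"), match.group("message")))
    return found


sample = [
    "ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa start",
    "ResourceManager[7568]: 2010/05/31_22:39:09 ERROR: Return code 71 from /etc/init.d/tftpd-hpa",
    "ResourceManager[7568]: 2010/05/31_22:39:09 CRIT: Giving up resources due to failure of tftpd-hpa",
    "Filesystem[7887]: 2010/05/31_22:39:10 INFO: Success",
]

for source, level, message in failures(sample):
    print(f"{level:5} {source}: {message}")
```

To run it against the real file, pass `open("/var/log/heartbeat/log")` instead of `sample`. Adding `"WARN"` to `levels` also shows the warnings.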