SD: Heartbeat not working: "Resource is stopped"

Hello everyone,

For a project I plan to present in class, I am experimenting with DRBD and heartbeat to learn the basics of high availability.
I am a beginner on Linux, and even more so on Debian (I started with Ubuntu).
I followed several tutorials to set up DRBD and heartbeat.
DRBD is not causing any problems.
Heartbeat, on the other hand, looks simple, but nothing works.
I am also using the V1-style configuration, so it should be easy to read.

When I start heartbeat with:

frontal1:~# /etc/init.d/heartbeat restart

I get the following message:

Stopping High-Availability services:

Done.

Waiting to allow resource takeover to complete:

Done.

Starting High-Availability services:

2010/05/31_22:12:33 INFO: Resource is stopped

Done.

What bothers me is "Resource is stopped".
The IP alias does not come up either.

The setup looks like this:

LAN0 on eth0: 192.168.0.0 /24 # User LAN + heartbeat: not working.
LAN1 on eth1: 192.168.1.0 /30 # DRBD replication LAN: working.
LAN2 on eth2: 192.168.2.0 /28 # Application server LAN: not used here yet.

The heartbeat cluster has 2 nodes: frontal1 and frontal2

Frontal1  eth1------DRBD------eth1  frontal2

eth0⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯ eth0
⎯⎯⎯⎯⎯⎯⎯⎯⎯ :049 ⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯⎯

frontal1 eth0: 192.168.0.2
frontal2 eth0: 192.168.0.3
heartbeat ip alias eth0:0: 192.168.0.1

frontal1 eth1: 192.168.1.1
frontal2 eth1: 192.168.1.2

On the software side I am using Debian 5.0 Lenny and heartbeat 2.1.3-6lenny4.

I followed to the letter, or took inspiration from, the tutorials below, without success:

http://howtoforge.net/highly-available-nfs-server-using-drbd-and-heartbeat-on-debian-5.0-lenny
http://doc.ubuntu-fr.org/tutoriel/mirroring_sur_deux_serveurs
http://www.drbd.org/users-guide/ch-heartbeat.html
http://www.linux-ha.org/doc/

Here are my config files, command output and logs:

Running BasicSanityCheck (2) makes it clear that there is a problem with IPaddr.

Yet if I run the IPaddr or IPaddr2 script manually, everything works perfectly: the IP alias comes up and is reachable on the network. (5) is before IPaddr, (6) is after IPaddr.

I have scoured the English- and French-language forums without finding a solution to my problem.
I am out of leads.

If you need more information, do not hesitate to let me know.

Thanks in advance for your help.

[b]vim /etc/ha.d/ha.cf[/b]

[code]autojoin none

mcast eth0 239.0.0.43 694 1 0

warntime 5

deadtime 5

initdead 15

keepalive 2

node frontal1

node frontal2[/code]

[b](2) sh /usr/share/heartbeat/BasicSanityCheck[/b]

[code]RTNETLINK answers: Network is unreachable

Using interface: eth0

Should not run tests with heartbeat already running.

Starting base64 and md5 algorithm tests

base64 and md5 algorithm tests succeeded.

Starting Resource Agent tests

Testing RA: Dummy

Testing RA: IPaddr

ERROR: IPaddr RA failed

Starting IPC tests

That’s weird. Heartbeat seems to be running…

Stopping heartbeat

Stopping High-Availability services:

Done.

Starting heartbeat

Starting High-Availability services:

2010/05/31_22:16:04 INFO: Resource is stopped

Done.

Does not look like we ARPed the address

Looks like monitor operation failed

Reloading heartbeat

Reloading heartbeat

Stopping heartbeat

Stopping High-Availability services:

Done.

Checking STONITH basic sanity.

Performing apphbd success case tests

Performing apphbd failure case tests

Starting LRM tests

Starting heartbeat

Starting High-Availability services:

2010/05/31_22:18:25 INFO: Resource is stopped

Done.
[/code]

[b]sh /usr/share/heartbeat/ResourceManager listkeys frontal1[/b]

192.168.0.1

[b]sh /usr/share/heartbeat/ResourceManager listkeys frontal2[/b]

[b](5) ip addr show[/b]

[code]
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN

link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

inet 127.0.0.1/8 scope host lo

2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000

link/ether 00:0c:29:cb:86:45 brd ff:ff:ff:ff:ff:ff

inet 192.168.0.2/24 brd 192.168.0.255 scope global eth0

3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN qlen 1000

link/ether 00:0c:29:cb:86:4f brd ff:ff:ff:ff:ff:ff

inet 192.168.1.1/30 brd 192.168.1.3 scope global eth1

4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000

link/ether 00:0c:29:cb:86:59 brd ff:ff:ff:ff:ff:ff[/code]
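
To compare the state before and after running IPaddr without reading the whole output by eye, a small check like the one below could be used. This is only a sketch: `has_address` is a hypothetical helper of mine, not something shipped with heartbeat, and the demo runs on a line copied from the output above instead of on a live interface.

```shell
#!/bin/sh
# has_address is a hypothetical helper (not part of heartbeat or iproute2).
# It reads "ip addr show" output on standard input and succeeds if the
# given IPv4 address is configured.
has_address() {
    # -F: fixed string, so the dots in the address are not regex wildcards.
    # The leading "inet " and the trailing "/" prevent partial matches,
    # e.g. 192.168.0.1 must not match 192.168.0.10/24 or the brd field.
    grep -qF "inet $1/"
}

# Demo on the eth0 entry copied from the output above (before IPaddr).
sample='    inet 192.168.0.2/24 brd 192.168.0.255 scope global eth0'

for addr in 192.168.0.2 192.168.0.1; do
    if printf '%s\n' "$sample" | has_address "$addr"; then
        echo "$addr is configured"
    else
        echo "$addr is missing"
    fi
done
```

On a real node the same function would be fed from `ip -f inet addr show dev eth0 | has_address 192.168.0.1`, once before and once after starting the resource.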

[b](6) /etc/ha.d/resource.d/IPaddr 192.168.0.1 start[/b]

2010/05/31_22:30:37 INFO:  Success

[b]/etc/ha.d/resource.d/IPaddr2 192.168.0.1 start[/b]

[code]2010/05/31_22:30:24 INFO: Using calculated nic for 192.168.0.1: eth0

2010/05/31_22:30:24 INFO: Using calculated netmask for 192.168.0.1: 255.255.255.0

2010/05/31_22:30:25 INFO: eval ifconfig eth0:0 192.168.0.1 netmask 255.255.255.0 broadcast 192.168.0.255

2010/05/31_22:30:25 INFO: Success

INFO: Success[/code]

[b]cat /etc/ha.d/haresources[/b]

frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255

OR frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server
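
To make the haresources syntax concrete, here is a minimal sketch of how I understand one resource entry to be turned into a command, judging from the "Running ..." lines in my log. `resource_to_command` is a hypothetical helper of mine, not part of heartbeat, and it only prints the script name and its arguments; it makes no assumption about which directory heartbeat finds the script in.

```shell
#!/bin/sh
# resource_to_command is a hypothetical helper (not part of heartbeat).
# It splits one haresources resource entry of the form
#   Script::arg1::arg2
# into the script name and its arguments, then prints the "start" call.
resource_to_command() {
    entry=$1
    script=${entry%%::*}            # text before the first "::"
    case $entry in
        *::*) args=$(printf '%s\n' "${entry#*::}" | sed 's/::/ /g') ;;
        *)    args= ;;              # plain service name, no arguments
    esac
    # ${args:+$args } expands to "args " only when args is non-empty
    printf '%s %sstart\n' "$script" "${args:+$args }"
}

resource_to_command "IPaddr2::192.168.0.1/24/eth0/192.168.0.255"
resource_to_command "Filesystem::/dev/drbd1::/serveur::ext3"
resource_to_command "dhcp3-server"
```

The three calls use entries taken from my two haresources variants, so the printed lines can be compared with what ResourceManager reports running.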

[b]cat /var/log/heartbeat/log[/b]

[code]heartbeat[7179]: 2010/05/31_22:38:40 info: Version 2 support: false
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Deprecated 'legacy' auto_failback option selected.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Please convert to 'auto_failback on'.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: See documentation for conversion details.
heartbeat[7179]: 2010/05/31_22:38:40 WARN: Logging daemon is disabled --enabling logging daemon is recommended
heartbeat[7179]: 2010/05/31_22:38:40 info: **************************
heartbeat[7179]: 2010/05/31_22:38:40 info: Configuration validated. Starting heartbeat 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: heartbeat: version 2.1.3
heartbeat[7180]: 2010/05/31_22:38:40 info: Heartbeat generation: 1275221613
heartbeat[7180]: 2010/05/31_22:38:40 info: glib: UDP multicast heartbeat started for group 239.0.0.43 port 694 interface eth0 (ttl=1 loop=0)
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_TriggerHandler: Added signal manual handler
heartbeat[7180]: 2010/05/31_22:38:40 info: G_main_add_SignalHandler: Added signal handler for signal 17
heartbeat[7180]: 2010/05/31_22:38:40 info: Local status now set to: 'up'
heartbeat[7180]: 2010/05/31_22:38:41 info: Link frontal2:eth0 up.
heartbeat[7180]: 2010/05/31_22:38:41 info: Status update for node frontal2: status active
harc[7188]: 2010/05/31_22:38:41 info: Running /etc/ha.d/rc.d/status status
heartbeat[7180]: 2010/05/31_22:38:42 info: Comm_now_up(): updating status to active
heartbeat[7180]: 2010/05/31_22:38:42 info: Local status now set to: 'active'
IPaddr2[7242]: 2010/05/31_22:38:42 INFO: Resource is stopped
heartbeat[7204]: 2010/05/31_22:38:42 info: Local Resource acquisition completed.
harc[7337]: 2010/05/31_22:39:06 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7337]: 2010/05/31_22:39:06 received ip-request-resp IPaddr2::192.168.0.1/24/eth0/192.168.0.255 OK no
ResourceManager[7356]: 2010/05/31_22:39:06 info: Acquiring resource group: frontal1 IPaddr2::192.168.0.1/24/eth0/192.168.0.255
IPaddr2[7382]: 2010/05/31_22:39:06 INFO: Resource is stopped
ResourceManager[7356]: 2010/05/31_22:39:06 info: Running /etc/ha.d/resource.d/IPaddr2 192.168.0.1/24/eth0/192.168.0.255 start
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip -f inet addr add 192.168.0.1/24 brd 192.168.0.255 dev eth0
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: ip link set eth0 up
IPaddr2[7491]: 2010/05/31_22:39:07 INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/run/heartbeat/rsctmp/send_arp/send_arp-192.168.0.1 eth0 192.168.0.1 auto not_used not_used
IPaddr2[7462]: 2010/05/31_22:39:07 INFO: Success
heartbeat[7180]: 2010/05/31_22:39:07 info: Initial resource acquisition complete (ip-request-resp)
harc[7549]: 2010/05/31_22:39:07 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp
ip-request-resp[7549]: 2010/05/31_22:39:07 received ip-request-resp drbddisk::r0 OK no
ResourceManager[7568]: 2010/05/31_22:39:07 info: Acquiring resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/drbddisk r0 start
Filesystem[7633]: 2010/05/31_22:39:07 INFO: Resource is stopped
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 start
Filesystem[7711]: 2010/05/31_22:39:07 INFO: Running start for /dev/drbd1 on /serveur
Filesystem[7700]: 2010/05/31_22:39:07 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:07 info: Running /etc/init.d/dhcp3-server start
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa start
ResourceManager[7568]: 2010/05/31_22:39:09 ERROR: Return code 71 from /etc/init.d/tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 CRIT: Giving up resources due to failure of tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Releasing resource group: frontal1 drbddisk::r0 Filesystem::/dev/drbd1::/serveur::ext3 dhcp3-server tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/dhcp3-server stop
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /serveur ext3 stop
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Running stop for /dev/drbd1 on /serveur
Filesystem[7898]: 2010/05/31_22:39:09 INFO: Trying to unmount /serveur
Filesystem[7898]: 2010/05/31_22:39:10 INFO: unmounted /serveur successfully
Filesystem[7887]: 2010/05/31_22:39:10 INFO: Success
ResourceManager[7568]: 2010/05/31_22:39:10 info: Running /etc/ha.d/resource.d/drbddisk r0 sto[/code]
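
Since the log is long, a filter like the following could help pull out only the entries marked ERROR or CRIT. It is a sketch under my own assumptions: `show_failures` is a hypothetical helper, not a heartbeat tool, and it relies on the severity appearing as " ERROR: " or " CRIT: " as it does in the entries above.

```shell
#!/bin/sh
# show_failures is a hypothetical helper (not part of heartbeat).
# It keeps only log entries whose severity field is ERROR or CRIT.
# With no argument it reads standard input; otherwise it reads the
# files given as arguments.
show_failures() {
    grep -E ' (ERROR|CRIT): ' "$@"
}

# Demo on three entries copied from the log above.
show_failures <<'EOF'
ResourceManager[7568]: 2010/05/31_22:39:09 info: Running /etc/init.d/tftpd-hpa start
ResourceManager[7568]: 2010/05/31_22:39:09 ERROR: Return code 71 from /etc/init.d/tftpd-hpa
ResourceManager[7568]: 2010/05/31_22:39:09 CRIT: Giving up resources due to failure of tftpd-hpa
EOF
```

On the node itself it would be called as `show_failures /var/log/heartbeat/log`.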

up