[DRBD-user] online-verify crashed drbd-resource

From: joseph <joseph_at_nospam>
Date: Fri Mar 26 2010 - 11:36:04 GMT
To: drbd-user@lists.linbit.com

Hello drbd-users,

I started the verify command on the primary of one of my DRBD resources
(a MySQL database on drbd2: 63G, of which 1.2G are used). It never actually
started verifying (/proc/drbd stayed at 0%); instead the load climbed to
over 50, so I immediately disconnected the resource. It happened so
fast that I didn't notice whether one of the drbd2_* processes or mysql
was responsible for the load.

Anyhow, drbd2_receiver on my secondary is still running and can't
even be killed with kill -9. Does that mean that without rebooting I
won't be able to reconnect the two resources? Or does anyone have
another idea?
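
A thread that ignores kill -9 is often sitting in uninterruptible sleep
(state D), where signals are not acted on until it wakes up. The sketch
below is one way to check this by reading /proc/<pid>/stat. The helper
names parse_stat and drbd_threads are made up for illustration, and
matching on a "drbd" name prefix is an assumption based on the thread
names in this post.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def drbd_threads(proc_root="/proc"):
    """List (pid, comm, state) for every task whose name starts with 'drbd'."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError):
            continue  # task exited while we were scanning
        if comm.startswith("drbd"):
            found.append((pid, comm, state))
    return found


if __name__ == "__main__":
    for pid, comm, state in drbd_threads():
        note = "  <- uninterruptible sleep" if state == "D" else ""
        print(f"{pid:>7} {state} {comm}{note}")
```

A drbd2_receiver line showing state D would be consistent with the
receiver waiting on something the crashed worker was supposed to finish.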

Here is the output of dmesg on my secondary:

[3465134.224587] block drbd2: Online Verify start sector: 0
[3465134.232913] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000030
[3465134.232913] IP: [<ffffffffa02308b2>] :drbd:w_e_end_ov_req+0x32/0x114
[3465134.232913] PGD 0
[3465134.232913] Oops: 0000 [1] SMP
[3465134.232913] CPU: 5
[3465134.232913] Modules linked in: tcp_diag inet_diag fuse ext2
nls_utf8 cifs nls_base sha1_generic vzethdev vznetdev simfs vzrst vzcpt
tun vzdquota vzmon vzdev xt_length ipt_ttl xt_tcpmss xt_multiport
xt_dscp ipt_MASQUERADE xt_TCPMSS xt_tcpudp xt_state ipt_REJECT ipt_LOG
xt_limit iptable_mangle iptable_nat nf_nat iptable_filter
nf_conntrack_ftp nf_conntrack_irc nf_conntrack_ipv4 nf_conntrack
ip_tables x_tables acpi_cpufreq cpufreq_powersave cpufreq_ondemand
cpufreq_userspace cpufreq_conservative cpufreq_stats ocfs2_dlmfs
ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs
ipv6 f71882fg drbd cn loop snd_pcm snd_timer snd pcspkr soundcore wmi
i2c_i801 snd_page_alloc evdev i2c_core button ext3 jbd mbcache dm_mirror
dm_log dm_snapshot dm_mod e1000 ehci_hcd uhci_hcd sd_mod thermal fan
r8168 freq_table processor thermal_sys raid10 raid456 async_xor
async_memcpy async_tx xor raid1 raid0 md_mod atiixp ahci sata_nv
sata_sil sata_via libata dock via82cxxx ide_core 3w_9xxx 3w_xxxx
scsi_mod [last unloaded: scsi_wait_scan]
[3465134.232913] Pid: 7984, comm: drbd2_worker Not tainted
2.6.26-2-openvz-amd64 #1 036test001
[3465134.232913] RIP: 0010:[<ffffffffa02308b2>] [<ffffffffa02308b2>]
:drbd:w_e_end_ov_req+0x32/0x114
[3465134.232913] RSP: 0018:ffff810313c73e90 EFLAGS: 00010202
[3465134.232913] RAX: 0000000000000000 RBX: ffff81031f887000 RCX:
ffff81033d9ce000
[3465134.232913] RDX: 0000000000000000 RSI: 0000000000000010 RDI:
ffff81031f887000
[3465134.232913] RBP: ffff81031f887000 R08: ffff81005b2881d0 R09:
0000000000000004
[3465134.232913] R10: ffff81031f887108 R11: ffff81031f887000 R12:
ffff81031f887630
[3465134.232913] R13: ffff8103374f80d0 R14: ffffffffa0254be2 R15:
ffff81031f887640
[3465134.232913] FS: 0000000000000000(0000) GS:ffff81033d9bb0c0(0000)
knlGS:0000000000000000
[3465134.232913] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[3465134.232913] CR2: 0000000000000030 CR3: 0000000000201000 CR4:
00000000000006e0
[3465134.232913] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[3465134.232913] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[3465134.232913] Process drbd2_worker (pid: 7984, veid=0, threadinfo
ffff810313c72000, task ffff810048193810)
[3465134.232913] Stack: ffff81031f887128 ffff8103374f80d0
ffff81031f887000 ffff81031f887630
[3465134.232913] 0000000000000000 ffffffffa022f10e ffff810313c73ec0
ffff810313c73ec0
[3465134.232913] ffffffff80423446 0000000164627264 0000000000646165
ffff81031f887630
[3465134.232913] Call Trace:
[3465134.232913] [<ffffffffa022f10e>] ? :drbd:drbd_worker+0x23e/0x409
[3465134.232913] [<ffffffff80423446>] ? schedule_timeout+0x85/0xad
[3465134.232913] [<ffffffffa02458c6>] ?
:drbd:drbd_thread_setup+0x124/0x1bb
[3465134.232913] [<ffffffff8020d048>] ? child_rip+0xa/0x12
[3465134.232913] [<ffffffffa02457a2>] ? :drbd:drbd_thread_setup+0x0/0x1bb
[3465134.232913] [<ffffffff8020d03e>] ? child_rip+0x0/0x12
[3465134.232913]
[3465134.232913]
[3465134.232913] Code: 55 53 48 89 fb 48 83 ec 08 85 d2 0f 85 ac 00 00
00 48 8b 46 20 f6 40 18 01 0f 84 9e 00 00 00 48 8b 87 d8 05 00 00 be 10
00 00 00 <44> 8b 60 30 49 63 fc e8 9f b8 06 e0 48 85 c0 48 89 c5 74 7e 49
[3465134.232913] RIP [<ffffffffa02308b2>] :drbd:w_e_end_ov_req+0x32/0x114
[3465134.232913] RSP <ffff810313c73e90>
[3465134.232913] CR2: 0000000000000030
[3465134.232913] ---[ end trace 1a320c0fb997ccd3 ]---

[3465665.496416] block drbd2: Online Verify reached sector 0
[3465665.497035] block drbd2: drbd_pp_alloc interrupted!
[3465665.497035] block drbd2: alloc_ee: Allocation of a page failed
[3465665.497035] block drbd2: error receiving OVRequest, l: 24!
[3465665.499844] block drbd2: asender terminated
[3465665.499844] block drbd2: Terminating asender thread
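
For anyone reading the trace above: the byte marked <44> in the Code:
line starts the faulting instruction. 44 8b 60 30 decodes to
mov 0x30(%rax),%r12d, and the preceding 48 8b 87 d8 05 00 00 loads RAX
from 0x5d8(%rdi). With RAX = 0 this is a read from address 0x30, which
matches CR2. Which structure member lives at offset 0x5d8 cannot be told
from the log alone. The sketch below pulls those values out of a
mail-wrapped oops. The function name fault_summary is made up for
illustration, and SAMPLE is an excerpt of the lines posted above.

```python
import re

SAMPLE = """\
[3465134.232913] RAX: 0000000000000000 RBX: ffff81031f887000 RCX:
ffff81033d9ce000
[3465134.232913] Code: 55 53 48 89 fb 48 83 ec 08 85 d2 0f 85 ac 00 00
00 48 8b 46 20 f6 40 18 01 0f 84 9e 00 00 00 48 8b 87 d8 05 00 00 be 10
00 00 00 <44> 8b 60 30 49 63 fc e8 9f b8 06 e0 48 85 c0 48 89 c5 74 7e 49
[3465134.232913] CR2: 0000000000000030
"""


def fault_summary(oops_text):
    """Return (cr2, rax, first four bytes of the faulting instruction)."""
    flat = " ".join(oops_text.split())  # undo the mail client's line wrapping
    cr2 = int(re.search(r"CR2: ([0-9a-f]{16})", flat).group(1), 16)
    rax = int(re.search(r"RAX: ([0-9a-f]{16})", flat).group(1), 16)
    code = re.search(r"Code: ((?:<?[0-9a-f]{2}>? ?)+)", flat).group(1).split()
    # The kernel wraps the first byte of the faulting instruction in <>.
    start = next(i for i, b in enumerate(code) if b.startswith("<"))
    insn = [b.strip("<>") for b in code[start:start + 4]]
    return cr2, rax, insn


if __name__ == "__main__":
    cr2, rax, insn = fault_summary(SAMPLE)
    print(f"fault address {cr2:#x}, RAX {rax:#x}, bytes {' '.join(insn)}")
```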

This is what the primary logged at the same time:

[4432776.531522] block drbd2: conn( Connected -> VerifyS )
[4432776.531522] block drbd2: Starting Online Verify from sector 0
[4433306.255700] block drbd2: peer( Secondary -> Unknown ) conn( VerifyS
-> TearDown ) pdsk( UpToDate -> DUnknown )
[4433306.255700] block drbd2: Online Verify reached sector 0
[4433306.255852] block drbd2: Creating new current UUID
[4433306.256527] block drbd2: meta connection shut down by peer.
[4433306.256527] block drbd2: asender terminated
[4433306.256527] block drbd2: Terminating asender thread
[4433306.284947] block drbd2: Connection closed
[4433306.284947] block drbd2: conn( TearDown -> Unconnected )
[4433306.284947] block drbd2: receiver terminated
[4433306.284947] block drbd2: Restarting receiver thread
[4433306.284947] block drbd2: receiver (re)started
[4433306.284947] block drbd2: conn( Unconnected -> WFConnection )

Thanks a lot for reading,

Joe
_______________________________________________
drbd-user mailing list
drbd-user@lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user