drbd-user March 2010 archive

Re: [DRBD-user] Problems with oos Sectors after verify

From: Lars Ellenberg <lars.ellenberg_at_nospam>
Date: Thu Mar 25 2010 - 19:28:15 GMT
To: drbd-user@lists.linbit.com

On Mon, Mar 22, 2010 at 07:39:38AM +0000, Henning Bitsch wrote:
> Hi Lars,
>
> >
>
> thanks for your replies.
>
> After your first message, I pushed our supplier to replace the complete
> hardware again (except the disks themselves) but without any improvement -
> except they probably hate me now and think in a PEBKAC direction :-I.
>
> Lars Ellenberg <lars.ellenberg@...> writes:
>
> > Anyways, nothing you can "tune away" in DRBD.
> > Data _is_ changing.
>
> I am wondering why only a few sectors change every time, even when I
> do loads of write operations (> 100 GByte) before starting the verify
> again.
>
> > if it is changing when it already reached the secondary,
> > but is not yet on local disk,
>
> Could this happen when the secondary is under IO pressure?
>
> > or happens to fool the crc,
>
> I tried all of the csum algs, same symptom.

Of course. I just meant to say, in theory, you can have different data
pass the same checksum ("collision"), and in case that happens at all,
using the same algorithm for verify and integrity (or for verify and
csums based resync) makes it impossible for verify to detect it.

That of course has nothing to do with data changes being detected by the
verify algorithm already.
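
To make the collision point concrete, here is a small sketch (Python, purely illustrative; DRBD itself uses kernel crypto digests, not this code) that brute-forces two different byte strings with the same CRC-32. If verify and data-integrity used the same algorithm, such a pair would sail through both checks:

```python
# Birthday search for a CRC-32 collision: two *different* data blocks
# that produce the same checksum.  With a 32-bit digest this takes only
# on the order of 2^16 random samples.
import os
import zlib

def find_crc32_collision():
    """Brute-force two distinct byte strings with the same CRC-32."""
    seen = {}
    while True:
        data = os.urandom(8)
        crc = zlib.crc32(data)
        other = seen.setdefault(crc, data)  # keeps the first data seen per crc
        if other != data:
            return other, data, crc

a, b, crc = find_crc32_collision()
assert a != b and zlib.crc32(a) == zlib.crc32(b)
print(f"collision: {a.hex()} vs {b.hex()} -> crc 0x{crc:08x}")
```

For a cryptographic digest like sha1 the same search would be astronomically expensive, which is why mixing algorithms (as suggested below) is only a "btw", not a practical fix.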

> > -> "silent" data divergence, detected on next verify run.
>
> Just to get the impact straight: At the moment when I switch roles
> and shutdown the secondary afterwards, I will most likely have
> some corrupted data somewhere on the disk, right?
>
> *urgh*

It is _different_ on local and remote disk,
because it has been modified in flight.
Depending on what causes the modification,
it is not necessarily corrupted.
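
The "modified in flight" scenario can be sketched like this (Python, with illustrative names; this is not DRBD code, just the shape of the race with zero-copy sends, where the application can redirty a page after the digest was taken but before the data leaves the box):

```python
# Sketch: the sender digests a page, the application rewrites the page
# while it is still queued, and the receiver then sees data that no
# longer matches the digest -> "Digest integrity check FAILED."
import hashlib

def sender_prepare(page: bytearray) -> bytes:
    """Compute the data-integrity digest over the page as it is *now*."""
    return hashlib.md5(bytes(page)).digest()

def receiver_check(page: bytes, digest: bytes) -> bool:
    """Recompute the digest over the data actually received."""
    return hashlib.md5(page).digest() == digest

page = bytearray(b"A" * 4096)      # a dirty page queued for replication
digest = sender_prepare(page)      # digest taken at send time
page[0:4] = b"BBBB"                # page redirtied "in flight"
ok = receiver_check(bytes(page), digest)
print("Digest integrity check", "passed" if ok else "FAILED.")  # prints FAILED.
```

Note that in this race the local disk may well end up with the *new* data, so the two sides differ without either holding garbage.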

> > This is expected: the "digest" is calculated over the data packets,
> > which naturally flows from primary to secondary.
>
> What does the "l:" number mean? Are those bytes?

The "length" of the data packet on the wire,
including checksum information and other header information.

> [ 4514.958814] block drbd1: Digest integrity check FAILED.
> [ 4514.958814] block drbd1: error receiving Data, l: 20508!
>
> [37004.824665] block drbd1: Digest integrity check FAILED.
> [37004.824722] block drbd1: error receiving Data, l: 4124!
>
> [116754.075758] block drbd1: Digest integrity check FAILED.
> [116754.075811] block drbd1: error receiving Data, l: 4136!
>
> > not block but bytelevel changes, random bit patterns,
> > no obvious pattern or grouping
>
> Those devices are ext3 volumes used by Xen domUs. Would it help to
> find out to which files the sectors belong? Maybe this gives a hint.

Go ahead, maybe you find something.

> > to detect things that fooled the data-integrity-check, you should use a
> > different alg for verify. to detect things that fooled ("collided") the
> > csums alg, you should best have integrity, verify, and csums all
> > different.
>
> Did that:
> data-integrity-alg md5;
> verify-alg sha1;
> csums-alg crc32c;
>
> Same behavior.

Yes of course. See above. That was just a "btw".

> BTW: The current rate of Digest messages is about one every 3 hours.
> Shouldn't this be much more frequent in case of a hardware-related problem?

It depends...
as always ;)

> > What is the usage pattern?
>
> domUs (Debian inside, basically LAMP stacks with some extra stuff) with two
> devices each (one for system, one for data). The digest and oos errors in
> combination only appear on the system devices. I tried to force the Digest
> error on the data devices by doing heavy IO (tiobench, dd, ...) but failed.
>
> > If that's all in the _swap_ of the Xen domUs, this may even be "legal".
>
> Since I don't want to do live migration, swap partitions are not on drbd
> devices. The only purpose of this setup is to reduce recovery time in case
> of a complete system loss (fatal hardware error etc.). Because of the large
> amount of data (approx. 2 TB), restoring the complete set from an offsite
> backup connected via fibre would take almost 32h.
>
> Thanks for your help and ideas.

You probably read up on the other threads we had regarding "this issue",
namely DRBD reporting data integrity check failed.

Maybe certain MySQL access patterns are simply likely to trigger a
subtle bug scenario? This may be related to some racy generic page
redirtying, or pages of "deleted" files are handled differently
(not redirtied "properly" -- because they are deleted anyways).
Note that many "temporary" files are "deleted", but held open,
and only accessed through their file descriptor.
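
The "deleted but held open" pattern looks like this on any POSIX system (a generic sketch; whether the kernel redirties such pages differently is exactly the open question above):

```python
# A temp file is unlinked right after creation: the directory entry is
# gone, but the inode stays alive as long as the file descriptor is
# open, and all I/O continues to work through that descriptor.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.unlink(path)                  # name is gone; inode lives while fd is open
os.write(fd, b"scratch data")    # still fully usable through the descriptor
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 12)
assert data == b"scratch data"
assert not os.path.exists(path)  # no directory entry remains
os.close(fd)                     # now the inode and its blocks are released
print("read back:", data)
```

MySQL uses this pattern for its temporary tables, which would fit the observation that only the system/LAMP devices show the errors.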

see also this report (that was on reiserfs,
but that does not mean similar mechanisms are not at work in ext3):
http://archives.free.net.ph/message/20080508.110544.7a340aa1.html

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.

__
please don't Cc me, but send to list -- I'm subscribed