Compatibility

Drbd works fine on top of IDE and SCSI partitions and whole drives. It does not work on top of the loop block device. (If you dare to try it, it will deadlock.)

Drbd also does not like the loop-back network device. (You will observe a nice deadlock here as well: all requests are occupied by the sending device, and the sending process is blocked in sock_sendmsg(). The receiving thread fetches a block from the network and tries to put it into the cache, but unfortunately the system may decide to write some blocks from the cache out to the disk. This happens in the context of the receiver, and since all requests are already occupied, the receiver blocks.)

Implementation details

At first I wanted to use UDP so that I could benefit from UDP's multicasting abilities to implement clusters of more than two nodes. But after I finished the first experimental UDP implementation, it turned out that the kernel only stores up to 64 KB of incoming data on a UDP socket. With faster networks it happens quite easily that your Intel box (timer interrupt 100 times per second) does not schedule your receiving process often enough, and you lose an enormous number of packets. (Alpha did a lot better; there you have 1024 interrupts per second.)
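The numbers above can be turned into a rough back-of-the-envelope bound. The sketch below is only an illustration and rests on a simplifying assumption that is not part of drbd: the receiving process gets to run exactly once per timer tick and empties the whole socket buffer each time. The function name max_lossless_rate is made up for this example.

```c
#include <assert.h>

/* Rough upper bound on the receive rate (in bytes per second) that a
 * socket can sustain without dropping packets.
 * ASSUMPTION (illustration only): the receiver is scheduled exactly once
 * per timer tick and drains the complete socket buffer each time. */
long max_lossless_rate(long buffer_bytes, long ticks_per_second)
{
    assert(buffer_bytes >= 0 && ticks_per_second >= 0);
    return buffer_bytes * ticks_per_second;
}
```

Under this assumption a 64 KB buffer at 100 ticks per second allows about 6.5 million bytes per second, which is less than the 12.5 million bytes per second a 100 Mbit/s link can deliver. At 1024 ticks per second the same buffer allows about 67 million bytes per second.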

UPDATE: In the meantime I have learned that it is possible to change the buffer sizes of a socket: sysctl(2) for /proc/sys/net/core/rmem_max and setsockopt(2) with SO_RCVBUF will do the trick. But writing a reliable multicast transport protocol is postponed (for now).
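A minimal user-space sketch of the setsockopt(2) part might look like this. It is not taken from drbd; the helper name request_rcvbuf is made up for illustration. Note that the kernel limits the request to rmem_max, and that Linux reports back twice the value it accepted (it reserves room for bookkeeping), so the returned size will usually differ from the requested one.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Ask the kernel for a receive buffer of `bytes` bytes on socket `fd`.
 * Returns the size the kernel reports afterwards, or -1 on error.
 * The kernel silently caps the request at /proc/sys/net/core/rmem_max. */
int request_rcvbuf(int fd, int bytes)
{
    int granted = 0;
    socklen_t len = sizeof(granted);

    assert(bytes > 0);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;

    /* Read the value back to see what was actually granted. */
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &granted, &len) < 0)
        return -1;

    return granted;
}
```

A caller would create the socket with socket(AF_INET, SOCK_DGRAM, 0), call request_rcvbuf(fd, 256 * 1024) before receiving, and compare the result with what it asked for. If the result is too small, rmem_max has to be raised first.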

So for now I am using TCP.

The lower level block device is accessed with a temporary copy of the buffer_head and a call to ll_rw_block. Thus you should never access the lower level block device directly while a drbd is running on top of it!

I am using /dev/nb0 (and major 43) because of this code in the kernel:

~linux-2.2.7/drivers/block/ll_rw_blk.c line ~447 :
--snip--
    /* Loop uses two requests, 1 for loop and 1 for the real device.
     * Cut max_req in half to avoid running out and deadlocking. */
    if ((major == LOOP_MAJOR) || (major == NBD_MAJOR))
        max_req >>= 1;
--snap--

Drbd will deadlock if you use any major number other than one of these two!
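Why the halving matters can be shown with a very simplified user-space model. This is a sketch, not the kernel's actual allocator: it assumes one shared pool of request slots, and that every request on the stacked device (loop, nbd, or drbd) needs one more slot for the real device underneath. OTHER_MAJOR and the function names are hypothetical and exist only for this example.

```c
#include <assert.h>

#define LOOP_MAJOR   7    /* loop block device */
#define NBD_MAJOR    43   /* network block device, borrowed by drbd */
#define OTHER_MAJOR  240  /* hypothetical major without the special case */

/* How many slots of the shared pool a device with this major may occupy,
 * mirroring the rule from the kernel snippet quoted above. */
int stacked_limit(int major, int pool_size)
{
    int max_req = pool_size;

    assert(pool_size > 0);
    if (major == LOOP_MAJOR || major == NBD_MAJOR)
        max_req >>= 1;
    return max_req;
}

/* Worst case: the stacked device has filled all the slots it is allowed.
 * The real device can only make progress if a slot is still free. */
int lower_can_progress(int upper_limit, int pool_size)
{
    return pool_size - upper_limit > 0;
}
```

In this model, with a pool of 64 slots, a device on NBD_MAJOR may hold at most 32, so the real device always finds a free slot. A device on OTHER_MAJOR may hold all 64, which leaves nothing for the real device, and both wait forever.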

Philipp Reisner
Last modified: Thu Mar 9 15:29:09 CET 2000