<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Apr 10, 2014 at 11:02 AM, Raghavendra Gowdappa <span dir="ltr"><<a href="mailto:rgowdapp@redhat.com" target="_blank">rgowdapp@redhat.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi all,<br>
<br>
I was trying to come up with some consistency issues. I am not sure whether case 5 is a valid one, since lookup would succeed and mkdir would fail with EEXIST (scroll down to the case for more detailed explanation).<br></blockquote>
<div><br></div><div>Case 5 is a valid one. This comment was based on an earlier test case which seemed to be invalid. Sorry about the confusion.<br> <br></div><blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<br>
We are considering a distribute of 3 bricks - b1, b2, b3.<br>
<br>
Case 1:<br>
=======<br>
<br>
Operation: rename (src, dst) - dst does not exist<br>
<br>
T0: rename successful on Hashed subvol but not on other bricks<br>
T1: Snapshot on b1, b2, b3<br>
<br>
Result: After the snapshot is restored and healing is complete on src and dst, we end up with two directories, src and dst, both having the gfid of src.<br>
<br>
Case 2:<br>
=======<br>
<br>
Operation: Two parallel rename (src, dst) and rename (dst, src). Both src and dst exist and hash to b1 and b2 respectively<br>
<br>
T0: rename (src, dst) successful on b1<br>
T1: rename (dst, src) successful on b2<br>
T2: Snapshot on b1, b2, b3<br>
<br>
Result:<br>
After restore, if lookup happens on src and is healed to b1 from b2, gfids of src on each brick will be,<br>
b1 - (src, dst-gfid)<br>
b2 - (src, dst-gfid)<br>
b3 - (src, src-gfid)<br>
<br>
Case 3:<br>
=======<br>
<br>
Operation: Parallel rename and two mkdirs. Only src exists. Both hash to same brick b1.<br>
<br>
T0: two lookups triggered as part of application mkdir1 and mkdir2 complete with ENOENT.<br>
T1: mkdir2 goes ahead and creates directory with gfid, gfid1<br>
T2: rename (src, dst) on b1<br>
T3: mkdir1 (src) on b1<br>
T4: snapshot on b1, b2 and b3<br>
<br>
Result:<br>
After restore and healing of src and dst, we end up with,<br>
b1 - (src, gfid2) and (dst, gfid1)<br>
b2 - (src, gfid1) and (dst, gfid1)<br>
b3 - (src, gfid1) and (dst, gfid1)<br>
<br>
Another reason for this inconsistency is that dht doesn't consider mkdir failures with EEXIST on subvols as failures. More details can be found in [2].<br>
<br>
Case 4:<br>
=======<br>
<br>
Operation: Parallel rename (src, dst) and rmdir (src). Both src and dst exist with gfids gfid1 and gfid2 respectively<br>
<br>
T0: rename (src, dst) on b1<br>
T1: rmdir (src) on b2 and b3<br>
T2: snapshot on b1, b2 and b3<br>
<br>
Result: After restore and healing,<br>
b1 - (dst, gfid1)<br>
b2 - (dst, gfid2)<br>
b3 - (dst, gfid2)<br>
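<br>
The Case 4 timeline can be replayed with a toy model, given here only as a sketch. It assumes each brick is a plain name-to-gfid map and that rename onto an existing directory replaces it; the helper names are hypothetical and are not GlusterFS APIs.<br>
<pre>
```python
# Toy model: each brick is a dict mapping directory name to gfid.
# All function names are illustrative assumptions, not GlusterFS code.

def rename(brick, src, dst):
    """Rename src to dst on one brick; dst ends up with the gfid of src."""
    brick[dst] = brick.pop(src)

def rmdir(brick, name):
    """Remove a directory entry from one brick."""
    del brick[name]

def snapshot_case4():
    # Before the race: src and dst exist on every brick.
    bricks = {b: {"src": "gfid1", "dst": "gfid2"} for b in ("b1", "b2", "b3")}

    rename(bricks["b1"], "src", "dst")  # T0: rename has reached only b1
    rmdir(bricks["b2"], "src")          # T1: rmdir has reached b2 and b3
    rmdir(bricks["b3"], "src")

    # T2: the snapshot captures each brick exactly as it is at this point.
    return {b: dict(entries) for b, entries in bricks.items()}

snap = snapshot_case4()
for brick_name, entries in sorted(snap.items()):
    print(brick_name, entries)
```
</pre>
In this model the snapshot holds dst with gfid1 on b1 and dst with gfid2 on b2 and b3, which is the mismatch listed in the result above.<br>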
<br>
Case 5:<br>
=======<br>
<br>
This bug was hit and a fix is being reviewed at [1].<br>
<br>
Operation: Two rmdirs and two mkdirs in parallel. Directory dir does not exist to start with.<br>
<br>
T0: two lookups triggered as part of application mkdir1 and mkdir2 complete with ENOENT.<br>
T1: mkdir2 goes ahead and creates directory with gfid, gfid1<br>
T2: rmdir1 (dir) on b1<br>
T3: lookup (dir) triggered as part of rmdir2 (or any name based operation), heals dir on b1 with gfid, gfid2<br>
T4: mkdir1 (dir, gfid2) on b2 and b3<br>
T5: snapshots on b1, b2 and b3<br>
<br>
Result:<br>
b1 - (dir, gfid1)<br>
b2 - (dir, gfid2)<br>
b3 - (dir, gfid2)<br>
<br>
Considering all these issues, the following set of fixes has been proposed:<br>
<br>
1. In posix, if we receive mkdir (dir1) for a gfid that already exists (under the name dir2), posix will convert mkdir (dir1) into rename (dir2, dir1). This solves case 1.<br>
<br>
2. In case of rename (src, dst), if dst already exists, rmdir (dst), so that we don't introduce inconsistency into the dst gfid space. This solves all the cases of inconsistencies in dst gfid with rename failing.<br>
<br>
3. Hold entrylks in directory heal (part of lookup) and rmdir. This solves consistency issues caused by races between mkdir and rmdir.<br>
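<br>
To illustrate fix 1 against Case 1, here is a minimal sketch under stated assumptions: bricks are name-to-gfid maps, b1 is taken to be the hashed subvol, and heal simply issues mkdir with the known gfid on every brick that lacks the name. The functions mkdir_naive, mkdir_checked and heal are hypothetical names, not the posix translator's real interface.<br>
<pre>
```python
# Toy model of fix 1. Every name below is an illustrative assumption.

def mkdir_naive(brick, name, gfid):
    """Create the name with the given gfid, ignoring any existing entry
    that already carries the same gfid."""
    brick[name] = gfid

def mkdir_checked(brick, name, gfid):
    """If the gfid already exists under another name, rename that entry
    to the requested name instead of creating a second one."""
    existing = [n for n, g in brick.items() if g == gfid and n != name]
    if existing:
        brick[name] = brick.pop(existing[0])
    else:
        brick[name] = gfid

def heal(bricks, name, gfid, mkdir):
    """Create name on every brick that is missing it."""
    for brick in bricks.values():
        if name not in brick:
            mkdir(brick, name, gfid)

def restored_case1():
    # State after restore: rename reached only b1 (assumed hashed subvol).
    return {
        "b1": {"dst": "gfid-src"},
        "b2": {"src": "gfid-src"},
        "b3": {"src": "gfid-src"},
    }

# Without the fix, healing dst and then src leaves both names everywhere.
naive = restored_case1()
heal(naive, "dst", "gfid-src", mkdir_naive)
heal(naive, "src", "gfid-src", mkdir_naive)

# With the fix, healing dst turns the stale src entries into dst.
fixed = restored_case1()
heal(fixed, "dst", "gfid-src", mkdir_checked)

print("naive:", naive)
print("fixed:", fixed)
```
</pre>
In the naive run every brick ends up with both src and dst carrying the gfid of src, as in the Case 1 result; in the checked run every brick holds only dst.<br>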
<br>
[1] <a href="http://review.gluster.org/#/c/4846/" target="_blank">http://review.gluster.org/#/c/4846/</a><br>
[2] <a href="http://review.gluster.org/4459" target="_blank">http://review.gluster.org/4459</a><br>
<br>
regards,<br>
Raghavendra.<br>
<br>
</blockquote></div><br><br clear="all"><br>-- <br>Raghavendra G<br>
</div></div>