<div dir="ltr">On Fri, Sep 27, 2013 at 7:47 PM, Brian Foster <span dir="ltr">&lt;<a href="mailto:bfoster@redhat.com" target="_blank">bfoster@redhat.com</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im">On 09/26/2013 01:27 PM, M. Mohan Kumar wrote:<br>

&gt; In BD xlator cloning refers to full copy of the file (after the copy<br>

&gt; there is no relationship between these 2 files). Snapshot refers to COW<br>

&gt; of the file. I guess these terminologies need to be generalized. I can<br>

&gt; choose &quot;copy&quot; for full copy functionality and either &quot;clone&quot; or<br>

&gt; &quot;snapshot&quot; for COW functionality.<br>

&gt;<br>

<br>

</div>That sounds reasonable to me. I&#39;m wondering if there are ways to<br>

multiplex the commands in a more general way (i.e., clone means one<br>

thing on a bd device, another on a regular file), but for now it might<br>

be better to just get all of the pieces in place.<br></blockquote><div><br></div><div style>Ok, qcow2 patches uses  <span style="color:rgb(53,53,53);font-size:9pt">trusted.glusterfs.block-format </span>xattr to create a clone/snapshot. IIUC normal users can&#39;t operate on trusted namespace xattrs, it needs a change.</div>

<div style>  </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im">

&gt; Sorry, I could not go through complete qemu-block xlator code. If I<br>

&gt; understand correctly your patch<br>

&gt; (<a href="http://review.gluster.com/#/c/5967/" target="_blank">http://review.gluster.com/#/c/5967/</a>) also expects destination file to<br>

&gt; exist before cloning? ie in &#39;setxattr -n trusted.glusterfs.block-format<br>

&gt; -v &quot;qcow2:10GB:&lt;bimg&gt;&quot; ./newimage&#39;,  the file newimage exists before<br>

&gt; calling this operation, isn&#39;t it?<br>

&gt;<br>

<br>

</div>Yes, that&#39;s correct. BTW, my understanding of the bd case is that the<br>

destination file has to exist in the same manner, correct?<br></blockquote><div style><br></div><div style>Correct </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<div class="im"><br>

&gt; BD patches allow users to specify full path of the destination file as<br>

&gt; part of xattr value because thats how management tools pass the path for<br>

&gt; creating snapshot or complete copy of a KVM image. Current interface for<br>

&gt; offloaded functionality for BD: # setfattr -n [clone|snapshot] -v<br>

&gt; &quot;/path/to/destination-file&quot; path-to-source-file. So IIUC in both the<br>

&gt; approaches we need to validate both source and destination files or am I<br>

&gt; missing something here?<br>

&gt;<br>

<br>

</div>Yeah, I&#39;m not disputing that the resolving code is necessary. I&#39;m<br>

suggesting it could be independently useful to bd snapshots or<br>

qemu-block (file) snapshots. Why make each snapshot capable translator<br>

mechanism solve the same problem?<br>

<br>

My thought was generally:<br>

<br>

- &quot;snapman&quot; xlator sits high up in the graph and implements the setxattr<br>

interface<br>

- independent lower level cow/snap mechanism translators work on a lower<br>

level (possibly common) interface, such as ioctl<br>

<br>

Of course, ioctl support doesn&#39;t exist at the moment and it&#39;s not<br>

necessarily given that it&#39;s the right approach (perhaps a fop, or<br>

something else?). So perhaps the right thing for now is to emulate this<br>

kind of thing with another internal setxattr that encapsulates all the<br>

work the interface translator has done (i.e., pass down the gfid&#39;s<br>

instead of paths, etc.).<br>

<br></blockquote><div style><br></div><div style>Looks like long way to go, may be I will post my current set of patches to Gerrit</div><div style>for review. Offload part can wait for these standardization process.</div>

<div style> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

The longer term idea is that an application can choose to use that lower<br>

level interface. Think &#39;cp --reflink,&#39; for example. In that case, an<br>

ioctl is a requirement, but the broader point is that by virtue of the<br>

interface the application has already been required to resolve the paths<br>

(to open the files) and the higher level interface is not necessary.<br>

Thoughts?<br></blockquote><div style><br></div><div style>Ideally  cp --reflink should be used for creating COW files(I even documented</div><div style>this in my bd_map xlator presentation). I guess cp --reflink needs a patch to </div>

<div style>work with FUSE (and GlusterFS). Also glusterfs needs new reflink FOP. But </div><div style>upstream effort for making reflink syscall still pending.</div><div style><br></div><div style>Also there is recent effort to provide copy functionality through splice interface,</div>

<div style>once it stabilizes FUSE/GlusterFS need to support this FOP also. Copy functionality</div><div style>is commonly used in VM environment and this offloaded copy will be really useful</div><div style>in this situation.</div>

<div style><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im"><br>

&gt; Also how about listing snapshots/origin for a given file and ability to<br>

&gt; merge the external snapshot file to the origin? BD xlator has the<br>

&gt; capability to list origin (ie gfid) for a given snapshot, but listing<br>

&gt; snapshots for a given file is not implemented.<br>

&gt;<br>

<br>

</div>I started thinking about this a bit for the qemu-block case, but tbh I&#39;m<br>

not sure how generic we can make this. Perhaps the interface translator<br>

can be responsible for this just using a generic set of xattrs?<br>

<br>

IMO, I don&#39;t think everything has to be completely worked out and<br>

genericized up front. I just think if we move in that general direction<br>

as far as how the code is broken up, it will be easier to line things up<br>

cleanly as they come in (as long as there is an understanding that<br>

interfaces might change and whatnot).<br>

<br></blockquote><div style><br></div><div style>I have merge functionality in BD xlator and also listing origin for a given</div><div style>snapshot, but nothing &#39;standardized&#39;.</div><div style> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

<br><div class="im">

&gt; Anand, shall I post the patches to gerrit, may be next round we can<br>

&gt; decide about the interfaces for offload operations?<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; On Wed, Sep 25, 2013 at 6:20 PM, Brian Foster &lt;<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a><br>

</div><div><div class="h5">&gt; &lt;mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a>&gt;&gt; wrote:<br>

&gt;<br>

&gt;     On 09/25/2013 12:27 AM, M. Mohan Kumar wrote:<br>

&gt;     &gt; Here is the right link: <a href="http://review.gluster.org/#/c/5626/" target="_blank">http://review.gluster.org/#/c/5626/</a><br>

&gt;     &gt;<br>

&gt;<br>

&gt;     Thanks guys. I haven&#39;t taken a deep look at the code, but some initial<br>

&gt;     high-level comments...<br>

&gt;<br>

&gt;     The first thing I notice is that we take the opposite approach in the<br>

&gt;     associated qemu-block command. The target of the clone command is the<br>

&gt;     new file (referencing the source) rather than the original file passing<br>

&gt;     in a name of the target. Personally, I find the former more natural as a<br>

&gt;     core interface. The error handling is more straightforward (i.e.,<br>

&gt;     ENOENT) and it matches more closely with native primitives that provide<br>

&gt;     this kind of functionality (i.e., correct me if wrong, but I think we<br>

&gt;     observed that btrfs clone works via ioctl on the target fd, providing<br>

&gt;     the source fd as a parameter).<br>

&gt;<br>

&gt;     That said, I&#39;m not sure if that is considered more user-friendly or not.<br>

&gt;     If that&#39;s a concern, could we change the low level interface to work as<br>

&gt;     described (i.e., user issues command on source file, high level code<br>

&gt;     converts into command on target file)? IOW, I think a nice goal going<br>

&gt;     forward would be to have the low level mechanisms standardize on some<br>

&gt;     kind of ioctl, and the higher level code become convenience commands<br>

&gt;     that simply exercise the ioctl (and what actually happens after that<br>

&gt;     depends on the type of file, what translators are loaded, etc.). I guess<br>

&gt;     that&#39;s hand wavy at the moment, but the idea is that all of this path<br>

&gt;     resolving and whatnot becomes generic and independent rather than<br>

&gt;     specific to and duplicated across each snapshot/clone mechanism we<br>

&gt;     provide.<br>

&gt;<br>

&gt;     Secondarily, but somewhat related... does the path resolving code that<br>

&gt;     is there now have to be buried in fuse-bridge? Avati and I have briefly<br>

&gt;     discussed this idea of separating the management here into an<br>

&gt;     independent translator, and I think this falls in as a perfect candidate<br>

&gt;     for something like that. The resolving code is non-trivial, however, so<br>

&gt;     I&#39;m not sure if there are serious technical hurdles for that kind of<br>

&gt;     approach. For example, is it possible/reasonable to push this into a new<br>

&gt;     translator beneath fuse (or perhaps library code?) and just skip linking<br>

&gt;     the inode into the parent table until/unless that happens naturally?<br>

&gt;     Thoughts?<br>

&gt;<br>

&gt;     Brian<br>

&gt;<br>

&gt;     &gt;<br>

&gt;     &gt; On Wed, Sep 25, 2013 at 6:53 AM, M. Mohan Kumar<br>

</div></div>&gt;     &lt;<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a> &lt;mailto:<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a>&gt;&gt;wrote:<br>

<div class="im">&gt;     &gt;<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt;<br>

&gt;     <a href="http://review.gluster.org/#/q/owner:%22M.+Mohan+Kumar+%253Cmohan%2540in.ibm.com%253E%22,n,z" target="_blank">http://review.gluster.org/#/q/owner:%22M.+Mohan+Kumar+%253Cmohan%2540in.ibm.com%253E%22,n,z</a><br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt; I also replied to your other comments.<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt; On Wednesday, September 25, 2013, Anand Avati &lt;<a href="mailto:avati@gluster.org">avati@gluster.org</a><br>

</div><div class="im">&gt;     &lt;mailto:<a href="mailto:avati@gluster.org">avati@gluster.org</a>&gt;&gt; wrote:<br>

&gt;     &gt;&gt;&gt; Adding Brian Foster (and gluster-devel) for the discussion of<br>

&gt;     unified UI<br>

&gt;     &gt;&gt; for snapshotting.<br>

&gt;     &gt;&gt;&gt; Mohan, I must have missed your comment. Can you please point to the<br>

&gt;     &gt;&gt; specific patch where you posted your comment?<br>

&gt;     &gt;&gt;&gt; Avati<br>

&gt;     &gt;&gt;&gt;<br>

&gt;     &gt;&gt;&gt; On Tue, Sep 24, 2013 at 9:29 AM, M. Mohan Kumar<br>

</div>&gt;     &lt;<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a> &lt;mailto:<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a>&gt;&gt;<br>

<div class=""><div class="h5">&gt;     &gt;&gt; wrote:<br>

&gt;     &gt;&gt;&gt;&gt;<br>

&gt;     &gt;&gt;&gt;&gt; Hi Avati,<br>

&gt;     &gt;&gt;&gt;&gt; I am ready with V5 of BD xlator patches (I consolidated the<br>

&gt;     patches to<br>

&gt;     &gt;&gt; 5). Before posting them I wanted your opinion about the<br>

&gt;     interfaces I use<br>

&gt;     &gt;&gt; for creating clone and snapshot. I posted them on Gerrit few days<br>

&gt;     back.<br>

&gt;     &gt;&gt; Could you please respond to that?<br>

&gt;     &gt;&gt;&gt;&gt;<br>

&gt;     &gt;&gt;&gt;&gt; --<br>

&gt;     &gt;&gt;&gt;&gt; Regards,<br>

&gt;     &gt;&gt;&gt;&gt; Mohan.<br>

&gt;     &gt;&gt;&gt;<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;&gt; --<br>

&gt;     &gt;&gt; Regards,<br>

&gt;     &gt;&gt; Mohan.<br>

&gt;     &gt;&gt;<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;     &gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt;<br>

&gt; --<br>

&gt; Regards,<br>

&gt; Mohan.<br>

<br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Regards,<br>Mohan.

</div></div>