<div dir="ltr">On Fri, Sep 27, 2013 at 7:47 PM, Brian Foster <span dir="ltr"><<a href="mailto:bfoster@redhat.com" target="_blank">bfoster@redhat.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im">On 09/26/2013 01:27 PM, M. Mohan Kumar wrote:<br>
> In BD xlator cloning refers to full copy of the file (after the copy<br>
> there is no relationship between these 2 files). Snapshot refers to COW<br>
> of the file. I guess these terminologies need to be generalized. I can<br>
> choose "copy" for full copy functionality and either "clone" or<br>
> "snapshot" for COW functionality.<br>
><br>
<br>
</div>That sounds reasonable to me. I'm wondering if there are ways to<br>
multiplex the commands in a more general way (i.e., clone means one<br>
thing on a bd device, another on a regular file), but for now it might<br>
be better to just get all of the pieces in place.<br></blockquote><div><br></div><div style>Ok, qcow2 patches uses <span style="color:rgb(53,53,53);font-size:9pt">trusted.glusterfs.block-format </span>xattr to create a clone/snapshot. IIUC normal users can't operate on trusted namespace xattrs, it needs a change.</div>
<div style> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im">
> Sorry, I could not go through complete qemu-block xlator code. If I<br>
> understand correctly your patch<br>
> (<a href="http://review.gluster.com/#/c/5967/" target="_blank">http://review.gluster.com/#/c/5967/</a>) also expects destination file to<br>
> exist before cloning? ie in 'setxattr -n trusted.glusterfs.block-format<br>
> -v "qcow2:10GB:<bimg>" ./newimage', the file newimage exists before<br>
> calling this operation, isn't it?<br>
><br>
<br>
</div>Yes, that's correct. BTW, my understanding of the bd case is that the<br>
destination file has to exist in the same manner, correct?<br></blockquote><div style><br></div><div style>Correct </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div class="im"><br>
> BD patches allow users to specify full path of the destination file as<br>
> part of xattr value because thats how management tools pass the path for<br>
> creating snapshot or complete copy of a KVM image. Current interface for<br>
> offloaded functionality for BD: # setfattr -n [clone|snapshot] -v<br>
> "/path/to/destination-file" path-to-source-file. So IIUC in both the<br>
> approaches we need to validate both source and destination files or am I<br>
> missing something here?<br>
><br>
<br>
</div>Yeah, I'm not disputing that the resolving code is necessary. I'm<br>
suggesting it could be independently useful to bd snapshots or<br>
qemu-block (file) snapshots. Why make each snapshot capable translator<br>
mechanism solve the same problem?<br>
<br>
My thought was generally:<br>
<br>
- "snapman" xlator sits high up in the graph and implements the setxattr<br>
interface<br>
- independent lower level cow/snap mechanism translators work on a lower<br>
level (possibly common) interface, such as ioctl<br>
<br>
Of course, ioctl support doesn't exist at the moment and it's not<br>
necessarily given that it's the right approach (perhaps a fop, or<br>
something else?). So perhaps the right thing for now is to emulate this<br>
kind of thing with another internal setxattr that encapsulates all the<br>
work the interface translator has done (i.e., pass down the gfid's<br>
instead of paths, etc.).<br>
<br></blockquote><div style><br></div><div style>Looks like long way to go, may be I will post my current set of patches to Gerrit</div><div style>for review. Offload part can wait for these standardization process.</div>
<div style> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
The longer term idea is that an application can choose to use that lower<br>
level interface. Think 'cp --reflink,' for example. In that case, an<br>
ioctl is a requirement, but the broader point is that by virtue of the<br>
interface the application has already been required to resolve the paths<br>
(to open the files) and the higher level interface is not necessary.<br>
Thoughts?<br></blockquote><div style><br></div><div style>Ideally cp --reflink should be used for creating COW files(I even documented</div><div style>this in my bd_map xlator presentation). I guess cp --reflink needs a patch to </div>
<div style>work with FUSE (and GlusterFS). Also glusterfs needs new reflink FOP. But </div><div style>upstream effort for making reflink syscall still pending.</div><div style><br></div><div style>Also there is recent effort to provide copy functionality through splice interface,</div>
<div style>once it stabilizes FUSE/GlusterFS need to support this FOP also. Copy functionality</div><div style>is commonly used in VM environment and this offloaded copy will be really useful</div><div style>in this situation.</div>
<div style><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class="im"><br>
> Also how about listing snapshots/origin for a given file and ability to<br>
> merge the external snapshot file to the origin? BD xlator has the<br>
> capability to list origin (ie gfid) for a given snapshot, but listing<br>
> snapshots for a given file is not implemented.<br>
><br>
<br>
</div>I started thinking about this a bit for the qemu-block case, but tbh I'm<br>
not sure how generic we can make this. Perhaps the interface translator<br>
can be responsible for this just using a generic set of xattrs?<br>
<br>
IMO, I don't think everything has to be completely worked out and<br>
genericized up front. I just think if we move in that general direction<br>
as far as how the code is broken up, it will be easier to line things up<br>
cleanly as they come in (as long as there is an understanding that<br>
interfaces might change and whatnot).<br>
<br></blockquote><div style><br></div><div style>I have merge functionality in BD xlator and also listing origin for a given</div><div style>snapshot, but nothing 'standardized'.</div><div style> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<br><div class="im">
> Anand, shall I post the patches to gerrit, may be next round we can<br>
> decide about the interfaces for offload operations?<br>
><br>
><br>
><br>
> On Wed, Sep 25, 2013 at 6:20 PM, Brian Foster <<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a><br>
</div><div><div class="h5">> <mailto:<a href="mailto:bfoster@redhat.com">bfoster@redhat.com</a>>> wrote:<br>
><br>
> On 09/25/2013 12:27 AM, M. Mohan Kumar wrote:<br>
> > Here is the right link: <a href="http://review.gluster.org/#/c/5626/" target="_blank">http://review.gluster.org/#/c/5626/</a><br>
> ><br>
><br>
> Thanks guys. I haven't taken a deep look at the code, but some initial<br>
> high-level comments...<br>
><br>
> The first thing I notice is that we take the opposite approach in the<br>
> associated qemu-block command. The target of the clone command is the<br>
> new file (referencing the source) rather than the original file passing<br>
> in a name of the target. Personally, I find the former more natural as a<br>
> core interface. The error handling is more straightforward (i.e.,<br>
> ENOENT) and it matches more closely with native primitives that provide<br>
> this kind of functionality (i.e., correct me if wrong, but I think we<br>
> observed that btrfs clone works via ioctl on the target fd, providing<br>
> the source fd as a parameter).<br>
><br>
> That said, I'm not sure if that is considered more user-friendly or not.<br>
> If that's a concern, could we change the low level interface to work as<br>
> described (i.e., user issues command on source file, high level code<br>
> converts into command on target file)? IOW, I think a nice goal going<br>
> forward would be to have the low level mechanisms standardize on some<br>
> kind of ioctl, and the higher level code become convenience commands<br>
> that simply exercise the ioctl (and what actually happens after that<br>
> depends on the type of file, what translators are loaded, etc.). I guess<br>
> that's hand wavy at the moment, but the idea is that all of this path<br>
> resolving and whatnot becomes generic and independent rather than<br>
> specific to and duplicated across each snapshot/clone mechanism we<br>
> provide.<br>
><br>
> Secondarily, but somewhat related... does the path resolving code that<br>
> is there now have to be buried in fuse-bridge? Avati and I have briefly<br>
> discussed this idea of separating the management here into an<br>
> independent translator, and I think this falls in as a perfect candidate<br>
> for something like that. The resolving code is non-trivial, however, so<br>
> I'm not sure if there are serious technical hurdles for that kind of<br>
> approach. For example, is it possible/reasonable to push this into a new<br>
> translator beneath fuse (or perhaps library code?) and just skip linking<br>
> the inode into the parent table until/unless that happens naturally?<br>
> Thoughts?<br>
><br>
> Brian<br>
><br>
> ><br>
> > On Wed, Sep 25, 2013 at 6:53 AM, M. Mohan Kumar<br>
</div></div>> <<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a> <mailto:<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a>>>wrote:<br>
<div class="im">> ><br>
> >><br>
> >><br>
> >><br>
> <a href="http://review.gluster.org/#/q/owner:%22M.+Mohan+Kumar+%253Cmohan%2540in.ibm.com%253E%22,n,z" target="_blank">http://review.gluster.org/#/q/owner:%22M.+Mohan+Kumar+%253Cmohan%2540in.ibm.com%253E%22,n,z</a><br>
> >><br>
> >> I also replied to your other comments.<br>
> >><br>
> >><br>
> >><br>
> >><br>
> >><br>
> >> On Wednesday, September 25, 2013, Anand Avati <<a href="mailto:avati@gluster.org">avati@gluster.org</a><br>
</div><div class="im">> <mailto:<a href="mailto:avati@gluster.org">avati@gluster.org</a>>> wrote:<br>
> >>> Adding Brian Foster (and gluster-devel) for the discussion of<br>
> unified UI<br>
> >> for snapshotting.<br>
> >>> Mohan, I must have missed your comment. Can you please point to the<br>
> >> specific patch where you posted your comment?<br>
> >>> Avati<br>
> >>><br>
> >>> On Tue, Sep 24, 2013 at 9:29 AM, M. Mohan Kumar<br>
</div>> <<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a> <mailto:<a href="mailto:mohankumar.m@gmail.com">mohankumar.m@gmail.com</a>>><br>
<div class=""><div class="h5">> >> wrote:<br>
> >>>><br>
> >>>> Hi Avati,<br>
> >>>> I am ready with V5 of BD xlator patches (I consolidated the<br>
> patches to<br>
> >> 5). Before posting them I wanted your opinion about the<br>
> interfaces I use<br>
> >> for creating clone and snapshot. I posted them on Gerrit few days<br>
> back.<br>
> >> Could you please respond to that?<br>
> >>>><br>
> >>>> --<br>
> >>>> Regards,<br>
> >>>> Mohan.<br>
> >>><br>
> >><br>
> >> --<br>
> >> Regards,<br>
> >> Mohan.<br>
> >><br>
> ><br>
> ><br>
> ><br>
><br>
><br>
><br>
><br>
> --<br>
> Regards,<br>
> Mohan.<br>
<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Regards,<br>Mohan.
</div></div>