<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi Shishir, <br>

    &nbsp;&nbsp;&nbsp; Thank you for sending out your paper.&nbsp; Here are some comments I

    have (written in markdown format):<br>

    <br>

    # Review of Online Snapshot Support for GlusterFS<br>

    <br>

    ## Section: Introduction<br>

    * Primary use case should have a better explanation.&nbsp; It does not

    explain how the user currently compensating for not currently having

    the technology in their environment, nor the benefits of having the

    feature.<br>

    * Last sentence should explain why it is the same.&nbsp; Why would it

    be?&nbsp; No benefits can be gained from having this feature for non-vm

    image environments?&nbsp; If not, then the name should be changed to

    vmsnapshots or something that discourages usage in environments

    other than VM image storage.<br>

    <br>

    ## Section: Snapshot Architecture<br>

    * The architecture section does not talk about architecture, but

    instead focuses on certain modes of operation. Please explain how a

    user from either a client or something like OpenStack interface

    interact with the snapshots.&nbsp; Also describe in good detail all

    aspects of operation (delete,create,etc.).&nbsp; Describe here the

    concept of Barriers instead of at the end of the document.<br>

    * I'm new to GlusterFS, but I am confused on what is meant by bullet

    #3: "The planned support is for GlusterFS Volume based

    snapshots...".&nbsp; Seems like the sentence is not finished.&nbsp; Do you

    mean "The planned support is for snapshots of GlusterFS

    volumes..."?&nbsp; Also, how is brick coherency kept across multiple AFR

    nodes?<br>

    * Snapshot Consistency section is confusing, please reword the

    description.&nbsp; Maybe change the format to paragraphs instead of

    bullets<br>

    * Please explain why there is a snapshot limit of 256.&nbsp; Are we using

    only one byte for tracking a snapshot id?<br>

    * When the CLI executes multiple volume snapshots, is it possible to

    execute them in parallel?&nbsp; Why do they need to be serially

    processed?<br>

    * What happens when `restore` is executed?&nbsp; How does the volume

    state change?&nbsp; Does the .gluster directory change in any way?<br>

    * What happens when `delete` is executed?&nbsp; When we have the

    following snaps `A-&gt;B-&gt;C-&gt;D`, and we delete `B`, what

    happens to the state of the volume?&nbsp; Do the changes from `B` get

    merged to `A` so that it provided the dependencies needed by `C`?<br>

    * Using the example above, can I branch or clone from `B` to `B'`

    and create a *new* volume?&nbsp; I am guessing that the LVM technology

    would probably not allow this, but maybe btrfs would.<br>

    <br>

    ## Section: Data Flow<br>

    * This section is confusing.&nbsp; Why are they bullets if they read as a

    sequence?&nbsp; This seems to me more like a project requirements list

    than a data flow description.<br>

    * What are the side effects of acquiring the cluster wide lock?&nbsp;

    What benefits/concerns should it have on the system with N nodes?<br>

    * What is the average amount of time the CLI will expect to be

    blocked before it returns?<br>

    &nbsp;&nbsp;&nbsp; * I am not sure if we have something like this already, but we

    may want to discuss the concept of a JOB manager.&nbsp; For example, here

    the CLI will send a request which may take longer than 3 secs.&nbsp; In

    such a situation, the CLI will be returned a JOB ticket number.&nbsp; The

    user can then query the JOB manager and provide the ticket number

    for status, or provide a callback mechanism (which is a little

    harder, but possible to do).&nbsp; In any case, I think this JOB manager

    falls outside the scope of this paper, but is something we should

    revisit if we do not already posses.<br>

    * The bullet "Once barrier is on, initiate back-end snapshot."

    should explain in greater detail what is meant by "back-end

    snapshot".<br>

    <br>

    ## Section: CLI Interface<br>

    * Each one of these commands should be explained in the architecture

    section in fine detail on how they affect volume state changes and

    side effects.<br>

    <br>

    ## Section: Snapshot Design<br>

    * Does the amount of content in a brick affect the create, delete,

    list, or restore snapshot time?<br>

    * The paper only describes `create` in the first part of the

    section.&nbsp; There probably should be a subsection for each of the

    commands supported, each describing in detail how they are planned

    to be implemented.<br>

    * Could there be a section showing how JSON/XML interfaces would be

    supporting this feature?<br>

    <br>

    ### Subsection: Stage-1 Prepare<br>

    * Are barriers on multiple bricks executed serially?&nbsp; What is the

    maximum number of bricks supported by the snapshot feature before

    taking an unusual amount of time to execute?&nbsp; Should brick barriers

    be done in parallel?<br>

    * This again seems like a requirement list and sometimes like a

    sequence.&nbsp; Please reword section.<br>

    <br>

    ## Section: Barrier<br>

    * Paragraph states "unless Asynchronous IO is used".&nbsp; How does that

    affect the barrier and snapshots?&nbsp; Paper does not describe this

    situation.<br>

    * A description of the planned Barrier design will help understand

    what is meant by queuing of fops.<br>

    * Will the barrier be implemented as a new xlator which will be

    interested on the fly when a snapshot is requested, or will it

    require changes to existing xlators?&nbsp; If it is not planned to be a

    xlator, should it be implemented as such to provide code isolation?<br>

    * Why are `write` and `unlink` not fops to be barriered?&nbsp; Barriers

    still allow disk changes?&nbsp; Maybe the paper should describe why it

    allows certain calls to affect the disk and how these changes may or

    may not affect the snapshot or the volume state.<br>

    <br>

    ## Section: Snapshot management<br>

    * Item `#2` is confusing, please reword.<br>

    * Item `#3` says that individual snapshots will not be supported.&nbsp;

    If that is true, then what does `delete` do?<br>

    * Item `#7` is confusing.&nbsp; Please reword.&nbsp; The paper should state

    why the user and developer need to know this information.<br>

    * Item `#8` is confusing.&nbsp; Is the item stating that the user can

    only do certain commands on a volume snapshot restore?&nbsp; If this is

    true, are volume snapshot restores not a true volume restore where

    the volume is back to a previous state?&nbsp; What is the benefit of this

    feature to the user?<br>

    * Item `#9` seems like an outline for the `delete` design.&nbsp; There

    needs to be more information here in greater detail as discussed

    above.<br>

    * Item `#10` needs to describe why it is proposing that a restored

    snapshot is shown as a snap shot volume.&nbsp; Is a volume with snapshot

    not identified as a snap volume also?<br>

    <br>

    ## Section: Error Scenarios<br>

    * Please reword item `#3`.<br>

    <br>

    ## Section: Open-ended issues<br>

    * Item `#4` is confusing.&nbsp; Please reword.<br>

    * Item `#6` suggests that snapshot volumes can be mounted.&nbsp; Can a

    snapshot *and* the latest volume be mounted at the same time?&nbsp; If

    the volume is `reverted` to a previous snapshot so that the user can

    inspect the volume state, I highly suggest on keeping all snapshot

    mounts as Read-Only.&nbsp; If the user wants to write to that mount, they

    should delete all snapshots to that point.&nbsp; I highly discourage this

    feature from dealing with merges.<br>

    * Item `#8` does not describe what will happen if a re-balance is

    initiated.&nbsp; Will snaps be deleted?&nbsp; I do not think these constraints

    are a good alternative.&nbsp; In my opinion, the snapshot features should

    support all GlusterFS high availability features.<br>

    * Item `#9` does not describe what the `master` volume is.&nbsp; Does it

    mean what the user cannot revert to a previous snapshot?&nbsp; If this is

    true, does that not violate the original requirement?<br>

    <br>

    ## Section: Upgrade/Downgrade<br>

    * This section describes that snap state will be maintained in

    `/var/lib/glusterd...`.&nbsp; The paper needs to describe snapshot state

    in greated detail in the `Design` section.&nbsp; For example, what state

    is kept in `/var/lib/glusterd...` and what state is read from the

    underlying file system snapshot technology?&nbsp; What happens when the

    underlying file system snapshot technology has one state and

    `/var/lib/glusterd...` has another?<br>

    <br>

    Look forward to your reply.<br>

    <br>

    - Luis<br>

    <br>

    <div class="moz-cite-prefix">On 08/02/2013 02:26 AM, Shishir Gowda

      wrote:<br>

    </div>

    <blockquote

      cite="mid:1485495815.10402866.1375424816058.JavaMail.root@redhat.com"

      type="cite">

      <pre wrap="">Hi All,

We propose to implement snapshot support for glusterfs volumes in release-3.6.

Attaching the design document in the mail thread.

Please feel free to comment/critique.

With regards,

Shishir</pre>

      <br>

      <fieldset class="mimeAttachmentHeader"></fieldset>

      <br>

      <pre wrap="">_______________________________________________

Gluster-devel mailing list

<a class="moz-txt-link-abbreviated" href="mailto:Gluster-devel@nongnu.org">Gluster-devel@nongnu.org</a>

<a class="moz-txt-link-freetext" href="https://lists.nongnu.org/mailman/listinfo/gluster-devel">https://lists.nongnu.org/mailman/listinfo/gluster-devel</a>

</pre>

    </blockquote>

    <br>

  </body>

</html>