<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <div class="moz-cite-prefix">On 04/09/13 02:55, Anand Avati wrote:<br>

    </div>

    <blockquote

cite="mid:CAFboF2w2BTVefSNMaBDexQrVqSYFmjwLsVWj2uUXFXjU3dwqdQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra"><br>

          <div class="gmail_quote">On Tue, Sep 3, 2013 at 1:42 AM,

            Xavier Hernandez <span dir="ltr">&lt;<a

                moz-do-not-send="true"

                href="mailto:xhernandez@datalab.es" target="_blank">xhernandez@datalab.es</a>&gt;</span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF">

                <div>On 03/09/13 09:33, Anand Avati wrote:<br>

                </div>

                <div class="im">

                  <blockquote type="cite">

                    <div dir="ltr">On Mon, Sep 2, 2013 at 7:24 AM,

                      Xavier Hernandez <span dir="ltr">&lt;<a

                          moz-do-not-send="true"

                          href="mailto:xhernandez@datalab.es"

                          target="_blank">xhernandez@datalab.es</a>&gt;</span>

                      wrote:<br>

                      <div class="gmail_extra">

                        <div class="gmail_quote">

                          <blockquote class="gmail_quote"

                            style="margin:0 0 0 .8ex;border-left:1px

                            #ccc solid;padding-left:1ex">Hi,<br>

                            <br>

                            dict_t structures are widely used in

                            glusterfs. I've some ideas that could

                            improve its performance.<br>

                            <br>

                            * On delete operations, return the current

                            value if it exists.<br>

                            <br>

                            This is very useful when we want to get a

                            value and remove it from the dictionary.

                            This way it can be done by accessing and
                            locking the dict_t only once (and it is
                            atomic).<br>

                          </blockquote>

                          <div><br>

                          </div>

                          <div>Makes sense.</div>

                          <div><br>

                          </div>

                          <div>&nbsp;</div>

                          <blockquote class="gmail_quote"

                            style="margin:0 0 0 .8ex;border-left:1px

                            #ccc solid;padding-left:1ex"> * On add

                            operations, return the previous value if it

                            existed.<br>

                            <br>

                            This avoids a lookup followed by a
                            conditional add (and it is atomic).<br>

                          </blockquote>

                          <div><br>

                          </div>

                          <div>Do you mean dict_set()? If so, how do you

                            propose we differentiate between "failure"

                            and "previous value did not exist"? Do you

                            propose setting the previous value into a

                            pointer to pointer, and retain the return

                            value as is today?</div>

                        </div>

                      </div>

                    </div>

                  </blockquote>

                </div>

                Yes, I'm thinking of something similar to dict_set() (by
                the way, I would remove the dict_add() function).</div>

            </blockquote>

            <div><br>

            </div>

            <div style="">dict_add() is used in unserialization routines

              where dict_set() for a big set of keys guaranteed not to

              repeat is very expensive (unserializing would otherwise

              have a quadratic function as its asymptote). What is the

              reason you intend to remove it?</div>

          </div>

        </div>

      </div>

    </blockquote>

    Yes, but it is only used inside dict.c itself, so it should not be
    exported in dict.h. This is only a cosmetic change, but I think
    it would be better. Any use of dict_add() by a translator will
    be incorrect unless it is done with great care, which is not
    desirable for future maintenance.<br>

    <br>
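    To illustrate the cost difference mentioned above, here is a minimal
    sketch. It uses a toy linked-list dictionary; toy_dict_t, toy_add()
    and toy_set() are made-up names for this example and are not the
    glusterfs implementation. It only models a "set" that must search
    for the key first against an "add" that links the pair blindly:<br>
    <br>
    <pre>#include &lt;stddef.h&gt;
#include &lt;string.h&gt;

/* Hypothetical toy dictionary: a singly linked list of pairs.
 * This is NOT the glusterfs dict_t; it only models the cost. */
typedef struct toy_pair {
    struct toy_pair *next;
    const char      *key;
    int              value;
} toy_pair_t;

typedef struct {
    toy_pair_t *head;
    size_t      compares;   /* key comparisons done so far */
} toy_dict_t;

/* Like dict_add(): link the pair without checking for duplicates. */
static void toy_add(toy_dict_t *d, toy_pair_t *p)
{
    p-&gt;next = d-&gt;head;
    d-&gt;head = p;
}

/* Like dict_set(): search for the key, then replace or add. */
static void toy_set(toy_dict_t *d, toy_pair_t *p)
{
    toy_pair_t *cur;

    for (cur = d-&gt;head; cur != NULL; cur = cur-&gt;next) {
        d-&gt;compares++;
        if (strcmp(cur-&gt;key, p-&gt;key) == 0) {
            cur-&gt;value = p-&gt;value;
            return;
        }
    }
    toy_add(d, p);
}
</pre>
    <br>
    In this model, inserting n distinct keys with toy_set() performs
    0 + 1 + ... + (n - 1) = n(n - 1)/2 comparisons (4950 for n = 100),
    while toy_add() performs none. It also shows why toy_add() is only
    safe when the caller can guarantee that keys never repeat.<br>
    <br>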

    <blockquote

cite="mid:CAFboF2w2BTVefSNMaBDexQrVqSYFmjwLsVWj2uUXFXjU3dwqdQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div>&nbsp;</div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF"> What you propose
                would be the simplest solution right now. However, I
                think it would be interesting to change the return value
                to an error code. This would supply more detailed
                information in case of failure, and we could use EEXIST
                to know if the value already existed. In fact, I think it
                would be interesting to progressively replace the -1
                return code of many functions with an error code. The
                pointer to pointer argument could be NULL if the
                previous value is not needed.<br>
                <br>
                Of course this would change the function signature,
                breaking a lot of existing code. Another possibility
                could be to create a dict_replace() function, and
                possibly make it fail if the value didn't exist.</div>

            </blockquote>

            <div><br>

            </div>

            <div style="">It is best we do not change the meaning of

              existing APIs, and just add new APIs instead. The new API

              can be:</div>

            <div style=""><br>

            </div>

            <div style="">int dict_replace (dict_t *dict, const char

              *key, data_t *newval, data_t **oldval);</div>

            <div style=""><br>

            </div>

            <div style="">.. and leave dict_set() as is.</div>

            <div>&nbsp;</div>

          </div>

        </div>

      </div>

    </blockquote>

    That would be good. I would have dict_replace() set *oldval
    to NULL to indicate that the key did not exist, and keep the
    0/-1 return code to indicate success or failure. This keeps it
    consistent with the other APIs, though I still think that an error
    code would be more useful in general (but that is a separate topic).<br>

    <br>
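    Just to make the semantics concrete, this is a rough sketch of what
    I mean. The toy_* types and toy_dict_replace() are hypothetical
    stand-ins written for this mail (a plain linked list, no locking, no
    reference counting), not the real dict_t code:<br>
    <br>
    <pre>#include &lt;stddef.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;

/* Hypothetical stand-ins; the real dict_t and data_t differ. */
typedef struct toy_data { int value; } toy_data_t;

typedef struct toy_pair {
    struct toy_pair *next;
    const char      *key;
    toy_data_t      *value;
} toy_pair_t;

typedef struct { toy_pair_t *head; } toy_dict_t;

/* Store newval under key.
 * Returns 0 on success, -1 on invalid arguments or allocation failure.
 * On success, if oldval is not NULL, *oldval receives the previous
 * value, or NULL when the key did not exist. On failure *oldval is
 * left untouched. */
static int toy_dict_replace(toy_dict_t *dict, const char *key,
                            toy_data_t *newval, toy_data_t **oldval)
{
    toy_pair_t *pair;

    if (dict == NULL || key == NULL || newval == NULL)
        return -1;

    for (pair = dict-&gt;head; pair != NULL; pair = pair-&gt;next) {
        if (strcmp(pair-&gt;key, key) == 0) {
            if (oldval != NULL)
                *oldval = pair-&gt;value;
            pair-&gt;value = newval;
            return 0;
        }
    }

    pair = malloc(sizeof(*pair));
    if (pair == NULL)
        return -1;
    pair-&gt;key = key;        /* the sketch borrows the caller's string */
    pair-&gt;value = newval;
    pair-&gt;next = dict-&gt;head;
    dict-&gt;head = pair;
    if (oldval != NULL)
        *oldval = NULL;
    return 0;
}
</pre>
    <br>
    A caller that does not care about the previous value passes NULL as
    oldval. A real implementation would do the search and the update
    under the dict lock, which is what makes the operation atomic.<br>
    <br>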

    <blockquote

cite="mid:CAFboF2w2BTVefSNMaBDexQrVqSYFmjwLsVWj2uUXFXjU3dwqdQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF">

                <div class="im"> <br>

                  <blockquote type="cite">

                    <div dir="ltr">

                      <div class="gmail_extra">

                        <div class="gmail_quote">

                          <div>&nbsp;</div>

                          <blockquote class="gmail_quote"

                            style="margin:0 0 0 .8ex;border-left:1px

                            #ccc solid;padding-left:1ex"> * Always

                            return the data_pair_t structure instead of

                            data_t or the data itself.<br>

                            <br>

                            This can be useful to avoid future lookups

                            or other operations on the same element.

                            Macros can be created to simplify writing

                            code to access the actual value.<br>

                          </blockquote>

                          <div><br>

                          </div>

                          <div>The use case is not clear. A more
                            concrete example will help.</div>

                          <div><br>

                          </div>

                        </div>

                      </div>

                    </div>

                  </blockquote>

                </div>

                Having a data_pair_t could help to navigate from an
                existing element (getting the next or previous one), or
                to remove or replace it without needing another lookup.
                Navigation would be really interesting if dict were
                implemented using a sorted structure like a trie, since
                it would allow processing a set of similar entries very
                fast, like the trusted.afr.&lt;brick&gt; values for
                example. A more detailed analysis would be needed to see
                how to handle race conditions.<br>
                <br>
                By the way, is the dict_t structure really used
                concurrently? I haven't analyzed all the code deeply,
                but it seems to me that every dict_t is only accessed
                from a single place at a time.</div>

            </blockquote>

            <div><br>

            </div>

            <div style="">There have been instances of dict_t getting

              used concurrently, when used as xdata and in xattrop (by

              AFR). There have been bugs in the past with concurrent

              dict access.</div>

            <div>&nbsp;</div>

          </div>

        </div>

      </div>

    </blockquote>

    I missed that. Sorry.<br>

    <br>

    <blockquote

cite="mid:CAFboF2w2BTVefSNMaBDexQrVqSYFmjwLsVWj2uUXFXjU3dwqdQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF">

                <div class="im">

                  <blockquote type="cite">

                    <div dir="ltr">

                      <div class="gmail_extra">

                        <div class="gmail_quote">

                          <div>&nbsp;</div>

                          <blockquote class="gmail_quote"

                            style="margin:0 0 0 .8ex;border-left:1px

                            #ccc solid;padding-left:1ex"> * Use a trie

                            instead of a hash.<br>

                            <br>

                            A trie structure is a bit more complex than

                            a hash, but only processes the key once and

                            does not need to compute the hash. A test

                            implementation I made with a trie shows a

                            significant improvement in dictionary

                            operations.<br>

                          </blockquote>

                          <div><br>

                          </div>

                          <div>There is already an implementation of

                            trie in libglusterfs/src/trie.c. Though it

                            does not compact (collapse) single-child

                            nodes upwards into the parent. In any case,

                            let's avoid having two implementations of

                            tries.</div>

                        </div>

                      </div>

                    </div>

                  </blockquote>

                </div>

                I know. The current implementation wastes a lot of
                memory because it uses an array of 256 pointers, and in
                some places it needs to traverse the array. Not a big
                deal, but if it is done many times it could be
                noticeable. In my test I used a trie with 4 child
                pointers (with single-child nodes collapsed) that runs
                a bit faster than the 256-pointer implementation and
                uses much less memory. I tried 2, 4, 16 and 256 children
                per node, and 4 seems to be the best (at least for
                dictionary structures), though there is very little
                difference between 4 and 16 in terms of speed.<br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div style="">The 256 child pointers give you constant time

              lookup for the next level child with just an offset

              indirection. With smaller fan-out, do you search through

              the list? Can you show an example of this? Collapsing

              single child node upwards is badly needed though.</div>

          </div>

        </div>

      </div>

    </blockquote>

    For the case of 4 children, I split each byte into 4 elements of 2
    bits. This makes it very easy to access the next child: it only
    needs some basic logic operations. This is the current
    implementation of the lookup function, which shows how the children
    are accessed:<br>

    <br>

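<pre>#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/types.h&gt;

/* TRIE_DIMENSION values: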

 *

 *    1: 1 bit per element, 8 elements per byte

 *    2: 2 bits per element, 4 elements per byte

 *    3: 4 bits per element, 2 elements per byte

 *    4: 8 bits per element, 1 element per byte

 */

#define TRIE_DIMENSION      2

#define TRIE_INDEX_BITS     (4 - TRIE_DIMENSION)

#define TRIE_ELEM_BITS      (1 &lt;&lt; (TRIE_DIMENSION - 1))

#define TRIE_CHILDS         (1 &lt;&lt; TRIE_ELEM_BITS)

#define TRIE_ELEMS_PER_BYTE (1 &lt;&lt; TRIE_INDEX_BITS)

#define TRIE_INDEX_MASK     (TRIE_ELEMS_PER_BYTE - 1)

#define TRIE_ELEM_MASK      (TRIE_CHILDS - 1)

#define sys_trie_value(_data, _offs) \

    (((_data) &gt;&gt; (((_offs) &amp; TRIE_INDEX_MASK) * TRIE_ELEM_BITS)) &amp; \

     TRIE_ELEM_MASK)

#define sys_trie_value_idx(_data, _offs) \

    ({ \

        off_t __tmp_offs = (_offs); \

        sys_trie_value(((_data)[(__tmp_offs) &gt;&gt; TRIE_INDEX_BITS]), \

                       __tmp_offs); \

    })

typedef struct _sys_trie_node

{

    struct _sys_trie_node * childs[TRIE_CHILDS];

    struct _sys_trie_node * parent;

    void *                  data;

    uint32_t                count;

    uint32_t                length;

    uint8_t                 key[0];

} sys_trie_node_t;

typedef struct _sys_trie

{

    sys_trie_node_t root;

} sys_trie_t;

sys_trie_node_t * sys_trie_lookup(sys_trie_t * trie,

                                  const uint8_t * key,

                                  size_t length)

{

    sys_trie_node_t * node;

    size_t len;

    len = length &lt;&lt; TRIE_INDEX_BITS;

    for (node = trie-&gt;root.childs[sys_trie_value(*key, 0)];

         (node != NULL) &amp;&amp; (node-&gt;length &lt; len);

         node = node-&gt;childs[sys_trie_value_idx(key, node-&gt;length)]);

    if ((node != NULL) &amp;&amp; (node-&gt;length == len) &amp;&amp; (node-&gt;data != NULL) &amp;&amp;

        (memcmp(node-&gt;key, key, length) == 0))

    {

        return node;

    }

    return NULL;

}

</pre>

    <br>
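    As a small worked example of the 2-bit split, the following sketch
    reuses the macros above and adds a helper, trie_key_path(), which is
    a hypothetical name that exists only for this illustration. It lists
    the child index chosen for every element of a key:<br>
    <br>
    <pre>#include &lt;stddef.h&gt;
#include &lt;stdint.h&gt;

#define TRIE_DIMENSION      2
#define TRIE_INDEX_BITS     (4 - TRIE_DIMENSION)
#define TRIE_ELEM_BITS      (1 &lt;&lt; (TRIE_DIMENSION - 1))
#define TRIE_CHILDS         (1 &lt;&lt; TRIE_ELEM_BITS)
#define TRIE_ELEMS_PER_BYTE (1 &lt;&lt; TRIE_INDEX_BITS)
#define TRIE_INDEX_MASK     (TRIE_ELEMS_PER_BYTE - 1)
#define TRIE_ELEM_MASK      (TRIE_CHILDS - 1)

#define sys_trie_value(_data, _offs) \
    (((_data) &gt;&gt; (((_offs) &amp; TRIE_INDEX_MASK) * TRIE_ELEM_BITS)) &amp; \
     TRIE_ELEM_MASK)

/* Hypothetical helper: store the child index of every element of
 * 'key' into 'out', which must hold length * TRIE_ELEMS_PER_BYTE
 * entries. Returns the number of elements written. */
static size_t trie_key_path(const uint8_t *key, size_t length,
                            uint8_t *out)
{
    size_t offs;
    size_t total = length &lt;&lt; TRIE_INDEX_BITS;

    for (offs = 0; offs &lt; total; offs++)
        out[offs] = (uint8_t)sys_trie_value(key[offs &gt;&gt; TRIE_INDEX_BITS],
                                            offs);

    return total;
}
</pre>
    <br>
    For the single byte 'a' (0x61, binary 01 10 00 01) the elements are
    taken starting from the least significant pair of bits, so the child
    indexes are 1, 0, 2, 1. Each step is only a shift and a mask, which
    is why no search through the child array is needed.<br>
    <br>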

    It's true that the collapsing feature is not really used for
    dictionaries. However, I think it would be interesting to have it so
    that tries can be used for other purposes that require a long-lived
    data structure in which items are added and removed over time.<br>

    <br>

    <blockquote

cite="mid:CAFboF2w2BTVefSNMaBDexQrVqSYFmjwLsVWj2uUXFXjU3dwqdQ@mail.gmail.com"

      type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div>&nbsp;</div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF"> I agree that it is

                not good to maintain two implementations of the same

                thing. Maybe we could change the trie implementation. It

                should be transparent.</div>

            </blockquote>

            <div><br>

            </div>

            <div style="">Yes, I believe the current API can accommodate

              such internal changes.</div>

            <div>&nbsp;</div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF">

                <div class="im"><br>

                  <br>

                  <blockquote type="cite">

                    <div dir="ltr">

                      <div class="gmail_extra">

                        <div class="gmail_quote">

                          <blockquote class="gmail_quote"

                            style="margin:0 0 0 .8ex;border-left:1px

                            #ccc solid;padding-left:1ex"> * Implement

                            dict_foreach() as a macro (similar to

                            kernel's list_for_each()).<br>

                            <br>

                            This gives more control and avoids the need
                            for helper functions.<br>

                          </blockquote>

                          <div><br>

                          </div>

                          <div>This makes sense too, but there are quite

                            a few users of dict_foreach in the existing

                            style. Moving them all over might be a pain.</div>

                        </div>

                      </div>

                    </div>

                  </blockquote>

                </div>

                Maybe we could create a differently named macro to
                implement this feature and allow developers to migrate
                slowly. The old implementation could be flagged as
                deprecated, and the new one used for new code. Old code
                would have enough time to change before the old
                implementation is eventually removed.<br>
                <br>
                If we make important changes to the dict_t structure, we
                could replace the current functions with macros that use
                the new implementation but simulate the old behavior.</div>

            </blockquote>

            <div><br>

            </div>

            <div style="">Sounds OK.</div>

            <div>&nbsp;</div>

            <blockquote class="gmail_quote" style="margin:0 0 0

              .8ex;border-left:1px #ccc solid;padding-left:1ex">

              <div text="#000000" bgcolor="#FFFFFF">

                <div class="im"> <br>

                  <blockquote type="cite">

                    <div dir="ltr">

                      <div class="gmail_extra">

                        <div class="gmail_quote">

                          <div>&nbsp;</div>

                          <blockquote class="gmail_quote"

                            style="margin:0 0 0 .8ex;border-left:1px

                            #ccc solid;padding-left:1ex"> Additionally,

                            I think it's possible to redefine structures

                            to reduce the number of allocations and

                            pointers used for each element (actual data,

                            data_t, data_pair_t and key).<br>

                          </blockquote>

                          <div><br>

                          </div>

                          <div>This is highly desirable. There was some

                            effort from Amar in the past (<a

                              moz-do-not-send="true"

                              href="http://review.gluster.org/3910"

                              target="_blank">http://review.gluster.org/3910</a>)

                            but it has been in need of attention for

                            some time. It would be interesting to know if

                            you were thinking along similar lines?</div>

                          <div><br>

                          </div>

                        </div>

                      </div>

                    </div>

                  </blockquote>

                </div>

                Yes, it is quite similar, though I should analyze it more
                deeply. I would also try to remove some fields that are
                used in very few places, add complexity and can be
                replaced easily, such as extra_free and extra_stdfree in
                dict_t.

                <div class="im"><br>

                </div>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div style="">Thanks,</div>

            <div style="">Avati</div>

            <div style="">&nbsp;</div>

          </div>

        </div>

      </div>


    </blockquote>

    <br>
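    Regarding the dict_foreach() macro discussed earlier in the thread,
    this is a minimal sketch of the idea in the style of the kernel's
    list_for_each(). The toy_* types and toy_dict_for_each() are
    hypothetical names over a plain linked list, and locking and removal
    during iteration are deliberately left out:<br>
    <br>
    <pre>#include &lt;stddef.h&gt;

/* Hypothetical toy dictionary, not the glusterfs dict_t. */
typedef struct toy_pair {
    struct toy_pair *next;
    const char      *key;
    int              value;
} toy_pair_t;

typedef struct { toy_pair_t *head; } toy_dict_t;

/* Iteration macro: expands to a plain for statement, so the loop
 * body is written in place instead of in a callback. */
#define toy_dict_for_each(_pair, _dict) \
    for ((_pair) = (_dict)-&gt;head; (_pair) != NULL; \
         (_pair) = (_pair)-&gt;next)

/* Sum all values. No helper function or context structure is needed,
 * and 'break' or 'return' work inside the loop body. */
static int toy_dict_sum(const toy_dict_t *dict)
{
    const toy_pair_t *pair;
    int sum = 0;

    toy_dict_for_each(pair, dict) {
        sum += pair-&gt;value;
    }
    return sum;
}
</pre>
    <br>
    With a callback-based foreach, the same function would need a
    separate helper plus a pointer to pass the running sum, which is the
    extra code the macro form avoids.<br>
    <br>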

  </body>

</html>