<div dir="ltr">OK, done.<div>This time there were no disconnects, and at least all of the VMs are working, but I got some mails from the VM about IO writes again.</div><div><span style="font-size:11pt;font-family:Calibri,sans-serif"><br></span></div><div><span style="font-size:11pt;font-family:Calibri,sans-serif">WARNINGs: Read IO Wait time is 1.45 (outside
range [0:1]).</span><br></div><div><span style="font-size:11pt;font-family:Calibri,sans-serif"><br></span></div><div>here is the output</div><div><br></div><div><div>root@stor1:~# gluster volume profile HA-WIN-TT-1T info</div><div>Brick: stor1:/exports/NFS-WIN/1T</div><div>--------------------------------</div><div>Cumulative Stats:</div><div>   Block Size:             131072b+              262144b+</div><div> No. of Reads:                    0                     0</div><div>No. of Writes:              7372798                     1</div><div> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop</div><div> ---------   -----------   -----------   -----------   ------------        ----</div><div>      0.00       0.00 us       0.00 us       0.00 us             25     RELEASE</div><div>      0.00       0.00 us       0.00 us       0.00 us             16  RELEASEDIR</div><div>      0.00      64.00 us      52.00 us      76.00 us              2     ENTRYLK</div><div>      0.00      73.50 us      51.00 us      96.00 us              2       FLUSH</div><div>      0.00      68.43 us      30.00 us     135.00 us              7      STATFS</div><div>      0.00      54.31 us      44.00 us     109.00 us             16     OPENDIR</div><div>      0.00      50.75 us      16.00 us      74.00 us             24       FSTAT</div><div>      0.00      47.77 us      19.00 us     119.00 us             26    GETXATTR</div><div>      0.00      59.21 us      21.00 us      89.00 us             24        OPEN</div><div>      0.00      59.39 us      22.00 us     296.00 us             28     READDIR</div><div>      0.00    4972.00 us    4972.00 us    4972.00 us              1      CREATE</div><div>      0.00      97.42 us      19.00 us     184.00 us             62      LOOKUP</div><div>      0.00      89.49 us      20.00 us     656.00 us            324    FXATTROP</div><div>      3.91 1255944.81 us     127.00 us 23397532.00 us            189       FSYNC</div><div>      
7.40 3406275.50 us      17.00 us 23398013.00 us            132     INODELK</div><div>     34.96   94598.02 us       8.00 us 23398705.00 us          22445    FINODELK</div><div>     53.73     442.66 us      79.00 us 3116494.00 us        7372799       WRITE</div><div><br></div><div>    Duration: 7813 seconds</div><div>   Data Read: 0 bytes</div><div>Data Written: 966367641600 bytes</div><div><br></div><div>Interval 0 Stats:</div><div>   Block Size:             131072b+              262144b+</div><div> No. of Reads:                    0                     0</div><div>No. of Writes:              7372798                     1</div><div> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop</div><div> ---------   -----------   -----------   -----------   ------------        ----</div><div>      0.00       0.00 us       0.00 us       0.00 us             25     RELEASE</div><div>      0.00       0.00 us       0.00 us       0.00 us             16  RELEASEDIR</div><div>      0.00      64.00 us      52.00 us      76.00 us              2     ENTRYLK</div><div>      0.00      73.50 us      51.00 us      96.00 us              2       FLUSH</div><div>      0.00      68.43 us      30.00 us     135.00 us              7      STATFS</div><div>      0.00      54.31 us      44.00 us     109.00 us             16     OPENDIR</div><div>      0.00      50.75 us      16.00 us      74.00 us             24       FSTAT</div><div>      0.00      47.77 us      19.00 us     119.00 us             26    GETXATTR</div><div>      0.00      59.21 us      21.00 us      89.00 us             24        OPEN</div><div>      0.00      59.39 us      22.00 us     296.00 us             28     READDIR</div><div>      0.00    4972.00 us    4972.00 us    4972.00 us              1      CREATE</div><div>      0.00      97.42 us      19.00 us     184.00 us             62      LOOKUP</div><div>      0.00      89.49 us      20.00 us     656.00 us            324    FXATTROP</div><div>      
3.91 1255944.81 us     127.00 us 23397532.00 us            189       FSYNC</div><div>      7.40 3406275.50 us      17.00 us 23398013.00 us            132     INODELK</div><div>     34.96   94598.02 us       8.00 us 23398705.00 us          22445    FINODELK</div><div>     53.73     442.66 us      79.00 us 3116494.00 us        7372799       WRITE</div><div><br></div><div>    Duration: 7813 seconds</div><div>   Data Read: 0 bytes</div><div>Data Written: 966367641600 bytes</div><div><br></div><div>Brick: stor2:/exports/NFS-WIN/1T</div><div>--------------------------------</div><div>Cumulative Stats:</div><div>   Block Size:             131072b+              262144b+</div><div> No. of Reads:                    0                     0</div><div>No. of Writes:              7372798                     1</div><div> %-latency   Avg-latency   Min-Latency   Max-Latency   No. of calls         Fop</div><div> ---------   -----------   -----------   -----------   ------------        ----</div><div>      0.00       0.00 us       0.00 us       0.00 us             25     RELEASE</div><div>      0.00       0.00 us       0.00 us       0.00 us             16  RELEASEDIR</div><div>      0.00      61.50 us      46.00 us      77.00 us              2     ENTRYLK</div><div>      0.00      82.00 us      67.00 us      97.00 us              2       FLUSH</div><div>      0.00     265.00 us     265.00 us     265.00 us              1      CREATE</div><div>      0.00      57.43 us      30.00 us      85.00 us              7      STATFS</div><div>      0.00      61.12 us      37.00 us     107.00 us             16     OPENDIR</div><div>      0.00      44.04 us      12.00 us      86.00 us             24       FSTAT</div><div>      0.00      41.42 us      24.00 us      96.00 us             26    GETXATTR</div><div>      0.00      45.93 us      24.00 us     133.00 us             28     READDIR</div><div>      0.00      57.17 us      25.00 us     147.00 us             24        OPEN</div><div>      0.00   
  145.28 us      31.00 us     288.00 us             32    READDIRP</div><div>      0.00      39.50 us      10.00 us     152.00 us            132     INODELK</div><div>      0.00     330.97 us      20.00 us   14280.00 us             62      LOOKUP</div><div>      0.00      79.06 us      19.00 us     851.00 us            430    FXATTROP</div><div>      0.02      29.32 us       7.00 us   28154.00 us          22568    FINODELK</div><div>      7.80 1313096.68 us     125.00 us 23281862.00 us            189       FSYNC</div><div>     92.18     397.92 us      76.00 us 1838343.00 us        7372799       WRITE</div><div><br></div><div>    Duration: 7811 seconds</div><div>   Data Read: 0 bytes</div><div>Data Written: 966367641600 bytes</div><div><br></div><div>Interval 0 Stats:</div><div>   Block Size:             131072b+              262144b+</div><div> No. of Reads:                    0                     0</div><div>No. of Writes:              7372798                     1</div><div> %-latency   Avg-latency   Min-Latency   Max-Latency   No. 
of calls         Fop</div><div> ---------   -----------   -----------   -----------   ------------        ----</div><div>      0.00       0.00 us       0.00 us       0.00 us             25     RELEASE</div><div>      0.00       0.00 us       0.00 us       0.00 us             16  RELEASEDIR</div><div>      0.00      61.50 us      46.00 us      77.00 us              2     ENTRYLK</div><div>      0.00      82.00 us      67.00 us      97.00 us              2       FLUSH</div><div>      0.00     265.00 us     265.00 us     265.00 us              1      CREATE</div><div>      0.00      57.43 us      30.00 us      85.00 us              7      STATFS</div><div>      0.00      61.12 us      37.00 us     107.00 us             16     OPENDIR</div><div>      0.00      44.04 us      12.00 us      86.00 us             24       FSTAT</div><div>      0.00      41.42 us      24.00 us      96.00 us             26    GETXATTR</div><div>      0.00      45.93 us      24.00 us     133.00 us             28     READDIR</div><div>      0.00      57.17 us      25.00 us     147.00 us             24        OPEN</div><div>      0.00     145.28 us      31.00 us     288.00 us             32    READDIRP</div><div>      0.00      39.50 us      10.00 us     152.00 us            132     INODELK</div><div>      0.00     330.97 us      20.00 us   14280.00 us             62      LOOKUP</div><div>      0.00      79.06 us      19.00 us     851.00 us            430    FXATTROP</div><div>      0.02      29.32 us       7.00 us   28154.00 us          22568    FINODELK</div><div>      7.80 1313096.68 us     125.00 us 23281862.00 us            189       FSYNC</div><div>     92.18     397.92 us      76.00 us 1838343.00 us        7372799       WRITE</div><div><br></div><div>    Duration: 7811 seconds</div><div>   Data Read: 0 bytes</div><div>Data Written: 966367641600 bytes</div><div><br></div></div><div>does it make something more clear?</div></div><div class="gmail_extra"><br><div 
class="gmail_quote">2014-10-13 20:40 GMT+03:00 Roman <span dir="ltr">&lt;<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I think I may know what the issue was. There was an iscsitarget service running that was exporting this generated block device, so maybe my colleague&#39;s Windows server picked it up and mounted it :) I&#39;ll see if it happens again.</div><div class="gmail_extra"><div><div class="h5"><br><div class="gmail_quote">2014-10-13 20:27 GMT+03:00 Roman <span dir="ltr">&lt;<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">So may I restart the volume and start the test, or do you need something else for this issue?</div><div class="gmail_extra"><div><div><br><div class="gmail_quote">2014-10-13 19:49 GMT+03:00 Pranith Kumar Karampuri <span dir="ltr">&lt;<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF"><span>
    <br>
    <div>On 10/13/2014 10:03 PM, Roman wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">Hmm,
        <div>this seems to be another strange issue. I have seen it before and had to
          restart the volume to get my free space back.</div>
        <div>
          <div>root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# ls -l</div>
          <div>total 943718400</div>
          <div>-rw-r--r-- 1 root root 966367641600 Oct 13 16:55 disk</div>
          <div>root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# rm disk</div>
          <div>root@glstor-cli:/srv/nfs/HA-WIN-TT-1T# df -h</div>
          <div>Filesystem                                            
             Size  Used Avail Use% Mounted on</div>
          <div>rootfs                                                
             282G  1.1G  266G   1% /</div>
          <div>udev                                                    
            10M     0   10M   0% /dev</div>
          <div>tmpfs                                                  
            1.4G  228K  1.4G   1% /run</div>
          <div>/dev/disk/by-uuid/c62ee3c0-c0e5-44af-b0cd-7cb3fbcc0fba
             282G  1.1G  266G   1% /</div>
          <div>tmpfs                                                  
            5.0M     0  5.0M   0% /run/lock</div>
          <div>tmpfs                                                  
            5.2G     0  5.2G   0% /run/shm</div>
          <div>stor1:HA-WIN-TT-1T                                    
            1008G  901G   57G  95% /srv/nfs/HA-WIN-TT-1T</div>
        </div>
        <div><br>
        </div>
        <div>The file is gone, but the used space is still 901G.</div>
        <div>Both servers show the same.</div>
        <div>Do I really have to restart the volume to fix that?</div>
      </div>
    </blockquote></span>
    IMO this can happen if there is an fd leak. open-fd is the only
    variable that can change with volume restart. How do you re-create
    the bug?<span><font color="#888888"><br>
    <br>
    Pranith</font></span><div><div><br>
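    <div>To illustrate the fd-leak idea: on POSIX systems, a file that is removed while some process still holds an open descriptor keeps its blocks allocated until the last descriptor is closed. Below is a minimal Python sketch of that behaviour on a local temporary file, not on a Gluster volume. The function name unlink_while_open is made up for illustration.</div>
<pre>
```python
import os
import tempfile


def unlink_while_open(payload_size=4096):
    """Delete a file while a descriptor to it is still open (POSIX only).

    Returns (path_exists, link_count, size_via_fd) as seen after the unlink.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * payload_size)
        os.unlink(path)          # same effect as `rm disk`
        info = os.fstat(fd)      # the inode is still reachable through fd
        return os.path.exists(path), info.st_nlink, info.st_size
    finally:
        os.close(fd)             # only after this can the blocks be freed


if __name__ == "__main__":
    exists, links, size = unlink_while_open()
    print("path exists:", exists)
    print("link count:", links)
    print("bytes still held via fd:", size)
```
</pre>
    <div>If a process on the bricks were leaking a descriptor for the deleted image, the file would be in this state, which would be consistent with df only recovering after a volume restart.</div>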
    <blockquote type="cite">
      <div class="gmail_extra"><br>
        <div class="gmail_quote">2014-10-13 19:30 GMT+03:00 Roman <span dir="ltr">&lt;<a href="mailto:romeo.r@gmail.com" target="_blank">romeo.r@gmail.com</a>&gt;</span>:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">Sure.
              <div>I&#39;ll let it run tonight.</div>
            </div>
            <div class="gmail_extra">
              <div>
                <div><br>
                  <div class="gmail_quote">2014-10-13 19:19 GMT+03:00
                    Pranith Kumar Karampuri <span dir="ltr">&lt;<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div text="#000000" bgcolor="#FFFFFF"> hi Roman,<br>
                             Do you think we can run this test again?
                        This time, could you enable profiling with &#39;gluster volume
                        profile &lt;volname&gt; start&#39;, run the same
                        test, and then provide the output of &#39;gluster volume profile
                        &lt;volname&gt; info&#39; and the logs after the test?<span><font color="#888888"><br>
                            <br>
                            Pranith</font></span>
                        <div>
                          <div><br>
                            <div>On 10/13/2014 09:45 PM, Roman wrote:<br>
                            </div>
                            <blockquote type="cite">
                              <div dir="ltr">Sure !
                                <div><br>
                                </div>
                                <div>
                                  <div>root@stor1:~# gluster volume info</div>
                                  <div><br>
                                  </div>
                                  <div>Volume Name:
                                    HA-2TB-TT-Proxmox-cluster</div>
                                  <div>Type: Replicate</div>
                                  <div>Volume ID:
                                    66e38bde-c5fa-4ce2-be6e-6b2adeaa16c2</div>
                                  <div>Status: Started</div>
                                  <div>Number of Bricks: 1 x 2 = 2</div>
                                  <div>Transport-type: tcp</div>
                                  <div>Bricks:</div>
                                  <div>Brick1:
                                    stor1:/exports/HA-2TB-TT-Proxmox-cluster/2TB</div>
                                  <div>Brick2:
                                    stor2:/exports/HA-2TB-TT-Proxmox-cluster/2TB</div>
                                  <div>Options Reconfigured:</div>
                                  <div>nfs.disable: 0</div>
                                  <div>network.ping-timeout: 10</div>
                                  <div><br>
                                  </div>
                                  <div>Volume Name: HA-WIN-TT-1T</div>
                                  <div>Type: Replicate</div>
                                  <div>Volume ID:
                                    2937ac01-4cba-44a8-8ff8-0161b67f8ee4</div>
                                  <div>Status: Started</div>
                                  <div>Number of Bricks: 1 x 2 = 2</div>
                                  <div>Transport-type: tcp</div>
                                  <div>Bricks:</div>
                                  <div>Brick1: stor1:/exports/NFS-WIN/1T</div>
                                  <div>Brick2: stor2:/exports/NFS-WIN/1T</div>
                                  <div>Options Reconfigured:</div>
                                  <div>nfs.disable: 1</div>
                                  <div>network.ping-timeout: 10</div>
                                  <div><br>
                                  </div>
                                  <div><br>
                                  </div>
                                </div>
                              </div>
                              <div class="gmail_extra"><br>
                                <div class="gmail_quote">2014-10-13
                                  19:09 GMT+03:00 Pranith Kumar
                                  Karampuri <span dir="ltr">&lt;<a href="mailto:pkarampu@redhat.com" target="_blank">pkarampu@redhat.com</a>&gt;</span>:<br>
                                  <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                    <div text="#000000" bgcolor="#FFFFFF"> Could you give
                                      your &#39;gluster volume info&#39; output?<br>
                                      <br>
                                      Pranith
                                      <div>
                                        <div><br>
                                          <div>On 10/13/2014 09:36 PM,
                                            Roman wrote:<br>
                                          </div>
                                        </div>
                                      </div>
                                      <blockquote type="cite">
                                        <div>
                                          <div>
                                            <div dir="ltr">Hi,
                                              <div><br>
                                              </div>
                                              <div>I&#39;ve got this kind of
                                                setup (servers run
                                                replica)</div>
                                              <div><br>
                                              </div>
                                              <div><br>
                                              </div>
                                              <div>@ 10G backend</div>
                                              <div>gluster storage1</div>
                                              <div>gluster storage2</div>
                                              <div>gluster client1</div>
                                              <div><br>
                                              </div>
                                              <div>@1g backend</div>
                                              <div>other gluster clients</div>
                                              <div><br>
                                              </div>
                                              <div>Servers got HW RAID5
                                                with SAS disks.</div>
                                              <div><br>
                                              </div>
                                              <div>So today I&#39;ve decided
                                                to create a 900GB file
                                                for an iSCSI target, located
                                                on a separate GlusterFS
                                                volume, using dd (just a
                                                dummy file filled with
                                                zeros, bs=1G count=900)</div>
                                              <div>First of all,
                                                the process took quite
                                                a lot of time; the
                                                write speed was 130
                                                MB/sec (the client port was
                                                2 Gbps, the server ports
                                                were running at 1 Gbps).</div>
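                                              <div>As a rough sanity check of the numbers above, here is a small Python sketch. It assumes that dd&#39;s bs=1G means 1 GiB blocks and that the reported MB/sec uses decimal megabytes; the function names are made up for illustration.</div>
<pre>
```python
GIB = 2 ** 30   # dd's bs=1G is a 1 GiB block
MB = 10 ** 6    # assumption: the reported MB/sec is in decimal megabytes


def image_size_bytes(count, block_size=GIB):
    """Size of the file written by dd with the given block count."""
    return count * block_size


def hours_to_write(size_bytes, mb_per_s):
    """Time needed to write size_bytes at a steady rate of mb_per_s."""
    return size_bytes / (mb_per_s * MB) / 3600


if __name__ == "__main__":
    size = image_size_bytes(900)
    print(size, "bytes")
    print(round(hours_to_write(size, 130), 2), "hours at 130 MB/sec")
```
</pre>
                                              <div>The computed size, 966367641600 bytes, matches what ls -l shows for the disk file, and at a steady 130 MB/sec the write needs roughly two hours.</div>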
                                              <div>Then it reported
                                                something like &quot;endpoint
                                                is not connected&quot; and
                                                all of my VMs on the
                                                other volume started to
                                                give me IO errors.</div>
                                              <div>Servers load was
                                                around 4,6 (total 12
                                                cores)</div>
                                              <div><br>
                                              </div>
                                              <div>Maybe it was due to the
                                                timeout of 2 secs, so
                                                I&#39;ve made it a bit
                                                higher, 10 sec.</div>
                                              <div><br>
                                              </div>
                                              <div>Also, during the dd
                                                image creation, the VMs
                                                very often reported
                                                that their disks were
                                                slow, like this:</div>
                                              <div>
                                                <p>WARNINGs: Read IO
                                                  Wait time is -0.02
                                                  (outside range [0:1]).</p>
                                                <p>Is 130 MB/sec the
                                                  maximum bandwidth for
                                                  all of the volumes in
                                                  total? Then why would
                                                  we need 10G backends?</p>
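                                                <p>A back-of-the-envelope check of this question, as a sketch only. It assumes that the native GlusterFS client sends every write to both bricks of the replica, and it ignores protocol overhead; the function names are made up for illustration.</p>
<pre>
```python
def link_mb_per_s(gbps):
    """Raw line rate in MB/s (10**6 bytes), ignoring protocol overhead."""
    return gbps * 1000 / 8


def replicated_write_ceiling(client_gbps, server_gbps, replicas):
    """Upper bound on client write speed if each write goes to all replicas."""
    client_share = link_mb_per_s(client_gbps) / replicas
    return min(client_share, link_mb_per_s(server_gbps))


if __name__ == "__main__":
    # Values from this thread: 2 Gbps client port, 1 Gbps server ports, replica 2.
    print(replicated_write_ceiling(2, 1, 2), "MB/s")
```
</pre>
                                                <p>Under these assumptions the ceiling is 125 MB/s, in the same range as the 130 MB/sec observed, so the 1 Gbps server ports rather than the disks may be the limit.</p>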
                                                <p>The local speed of the HW RAID
                                                  is 300 MB/sec, so it
                                                  should not be an
                                                  issue. Any ideas or
                                                  advice?</p>
                                                <p><br>
                                                </p>
                                                <p>Maybe someone has an
                                                  optimized sysctl.conf
                                                  for a 10G backend?</p>
                                                <p>Mine is pretty
                                                  simple, the kind that can be
                                                  found by googling.</p>
                                                <p><br>
                                                </p>
                                                <p>Just to mention:
                                                  those VMs were
                                                  connected using a
                                                  separate 1 Gbps
                                                  interface, which
                                                  means they should not
                                                  be affected by the
                                                  client with the 10G
                                                  backend.</p>
                                                <p><br>
                                                </p>
                                                <p>The logs are pretty
                                                  useless; they just say
                                                  this during the
                                                  outage:</p>
                                                <p><br>
                                                </p>
                                                <p>[2014-10-13
                                                  12:09:18.392910] W
                                                  [client-handshake.c:276:client_ping_cbk]
                                                  0-HA-2TB-TT-Proxmox-cluster-client-0:
                                                  timer must have
                                                  expired</p>
                                                <p>[2014-10-13
                                                  12:10:08.389708] C
                                                  [client-handshake.c:127:rpc_client_ping_timer_expired]
                                                  0-HA-2TB-TT-Proxmox-cluster-client-0:
                                                  server <a href="http://10.250.0.1:49159" target="_blank">10.250.0.1:49159</a> has
                                                  not responded in the
                                                  last 2 seconds,
                                                  disconnecting.</p>
                                                <p>[2014-10-13
                                                  12:10:08.390312] W
                                                  [client-handshake.c:276:client_ping_cbk]
                                                  0-HA-2TB-TT-Proxmox-cluster-client-0:
                                                  timer must have
                                                  expired</p>
                                              </div>
                                              <div>so I decided to set
                                                the timeout a bit higher.</div>
                                              <div>
                                                <div><br>
                                                </div>
                                                <div>So it seems to me
                                                  that under high load
                                                  GlusterFS is not
                                                  usable? 130 MB/s is
                                                  not so much that it
                                                  should cause timeouts
                                                  or make the system
                                                  so slow that the VMs
                                                  start to
                                                  struggle.</div>
                                                <div><br>
                                                </div>
                                                <div>Of course, after
                                                  the disconnection, the
                                                  healing process
                                                  started, but as the VMs
                                                  had lost their connection to
                                                  both servers, it
                                                  was pretty useless:
                                                  they could not run
                                                  anymore. And BTW, when
                                                  you load the server with
                                                  such a huge job (dd of
                                                  900GB), the healing
                                                  process goes soooooo
                                                  slow :)</div>
                                                <div><br>
                                                </div>
                                                <div><br>
                                                </div>
                                                <div><br>
                                                </div>
                                                -- <br>
                                                Best regards,<br>
                                                Roman. </div>
                                            </div>
                                            <br>
                                            <fieldset></fieldset>
                                            <br>
                                          </div>
                                        </div>
                                        <pre>_______________________________________________
Gluster-users mailing list
<a href="mailto:Gluster-users@gluster.org" target="_blank">Gluster-users@gluster.org</a>
<a href="http://supercolony.gluster.org/mailman/listinfo/gluster-users" target="_blank">http://supercolony.gluster.org/mailman/listinfo/gluster-users</a></pre>
                                      </blockquote>
                                      <br>
                                    </div>
                                  </blockquote>
                                </div>
                                <br>
                                <br clear="all">
                                <div><br>
                                </div>
                                -- <br>
                                Best regards,<br>
                                Roman. </div>
                            </blockquote>
                            <br>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </div>
                  <br>
                  <br clear="all">
                  <div><br>
                  </div>
                </div>
              </div>
              <span><font color="#888888">-- <br>
                  Best regards,<br>
                  Roman.
                </font></span></div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <div><br>
        </div>
        -- <br>
        Best regards,<br>
        Roman.
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br><br clear="all"><div><br></div></div></div><span><font color="#888888">-- <br>Best regards,<br>Roman.
</font></span></div>
</blockquote></div><br><br clear="all"><div><br></div></div></div><span class="HOEnZb"><font color="#888888">-- <br>Best regards,<br>Roman.
</font></span></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Best regards,<br>Roman.
</div>