This is the core file from the crash just now:

[root@psanaoss213 /]# ls -al core*
-rw------- 1 root root 4073594880 Jun  8 15:05 core.22682

From yesterday:

[root@psanaoss214 /]# ls -al core*
-rw------- 1 root root 4362727424 Jun  8 00:58 core.13483
-rw------- 1 root root 4624773120 Jun  8 03:21 core.8792
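
For reference, a backtrace can be pulled out of one of these cores with
gdb. The binary path below is an assumption based on the
/opt/glusterfs/3.2.6 prefix in the crash log; adjust it to wherever the
brick's glusterfsd actually lives:

# gdb /opt/glusterfs/3.2.6/sbin/glusterfsd core.22682
(gdb) bt
(gdb) thread apply all bt

"bt" shows the thread that hit the SIGABRT; "thread apply all bt" dumps
every thread, which helps when the abort comes out of a teardown path
like rpc_transport_destroy.
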
    On 06/08/2012 04:34 PM, Anand Avati wrote:
Is it possible the system was running low on memory? I see you have 48GB,
but a memory registration failure is typically because the system limit on
the number of pinnable pages in RAM was hit. Can you tell us the size of
your core dump files after the crash?

Avati
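
One quick way to check that limit, assuming it is the locked-memory
rlimit that matters here (my assumption) and that the bricks run as
root: compare the shell's limit with the one actually in effect for a
running brick process (substitute a real brick PID):

# ulimit -l
# grep -i 'locked' /proc/<brick-pid>/limits

Note that daemons started from an init script do not read
/etc/security/limits.conf, so if the limit turns out too low it usually
has to be raised in the init script itself (e.g. "ulimit -l unlimited"
before the daemon starts).
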
        <div class="gmail_quote">On Fri, Jun 8, 2012 at 4:22 PM, Ling Ho
          <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:ling@slac.stanford.edu" target="_blank">ling@slac.stanford.edu</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hello,

I have a brick that crashed twice today, and a different brick that
crashed just a while ago.

This is what I see in one of the brick logs:

patchset: git://git.gluster.com/glusterfs.git
patchset: git://git.gluster.com/glusterfs.git
signal received: 6
signal received: 6
time of crash: 2012-06-08 15:05:11
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.2.6
/lib64/libc.so.6[0x34bc032900]
/lib64/libc.so.6(gsignal+0x35)[0x34bc032885]
/lib64/libc.so.6(abort+0x175)[0x34bc034065]
/lib64/libc.so.6[0x34bc06f977]
/lib64/libc.so.6[0x34bc075296]
/opt/glusterfs/3.2.6/lib64/libglusterfs.so.0(__gf_free+0x44)[0x7f1740ba25e4]
/opt/glusterfs/3.2.6/lib64/libgfrpc.so.0(rpc_transport_destroy+0x47)[0x7f1740956967]
/opt/glusterfs/3.2.6/lib64/libgfrpc.so.0(rpc_transport_unref+0x62)[0x7f1740956a32]
/opt/glusterfs/3.2.6/lib64/glusterfs/3.2.6/rpc-transport/rdma.so(+0xc135)[0x7f173ca27135]
/lib64/libpthread.so.0[0x34bc8077f1]
/lib64/libc.so.6(clone+0x6d)[0x34bc0e5ccd]
---------

And somewhere before these, there is also:

[2012-06-08 15:05:07.512604] E [rdma.c:198:rdma_new_post] 0-rpc-transport/rdma: memory registration failed
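
If the OS limit checks out, the HCA-side registration limits can also be
read with the ibverbs utilities, assuming they are installed:

# ibv_devinfo -v | grep -i max_mr

This prints max_mr (how many memory regions the adapter can have
registered at once) and max_mr_size; steady registration failures with
plenty of free RAM may mean one of those caps is being hit instead.
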
I have 48GB of memory on the system:

# free
             total       used       free     shared    buffers     cached
Mem:      49416716   34496648   14920068          0      31692   28209612
-/+ buffers/cache:    6255344   43161372
Swap:      4194296       1740    4192556

# uname -a
Linux psanaoss213 2.6.32-220.7.1.el6.x86_64 #1 SMP Fri Feb 10 15:22:22 EST 2012 x86_64 x86_64 x86_64 GNU/Linux

The server gluster version is 3.2.6-1. I have both rdma and tcp clients
over a 10Gb/s network.

Any suggestions on what I should look for?

Is there a way to restart just the brick, and not glusterd on the
server? I have 8 bricks on the server.
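
(For what it is worth: each brick is served by its own glusterfsd
process, and my understanding, worth verifying on a test volume first,
is that restarting glusterd leaves running bricks untouched and only
respawns missing ones. So after a brick process dies:

# ps axww | grep '[g]lusterfsd'
# /etc/init.d/glusterd restart

The first command lists the surviving brick processes with their full
command lines; after the restart their PIDs should be unchanged, with a
new process started for the crashed brick.)
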
Thanks,
...
ling

Here's the volume info:

# gluster volume info

Volume Name: ana12
Type: Distribute
Status: Started
Number of Bricks: 40
Transport-type: tcp,rdma
Bricks:
Brick1: psanaoss214:/brick1
Brick2: psanaoss214:/brick2
Brick3: psanaoss214:/brick3
Brick4: psanaoss214:/brick4
Brick5: psanaoss214:/brick5
Brick6: psanaoss214:/brick6
Brick7: psanaoss214:/brick7
Brick8: psanaoss214:/brick8
Brick9: psanaoss211:/brick1
Brick10: psanaoss211:/brick2
Brick11: psanaoss211:/brick3
Brick12: psanaoss211:/brick4
Brick13: psanaoss211:/brick5
Brick14: psanaoss211:/brick6
Brick15: psanaoss211:/brick7
Brick16: psanaoss211:/brick8
Brick17: psanaoss212:/brick1
Brick18: psanaoss212:/brick2
Brick19: psanaoss212:/brick3
Brick20: psanaoss212:/brick4
Brick21: psanaoss212:/brick5
Brick22: psanaoss212:/brick6
Brick23: psanaoss212:/brick7
Brick24: psanaoss212:/brick8
Brick25: psanaoss213:/brick1
Brick26: psanaoss213:/brick2
Brick27: psanaoss213:/brick3
Brick28: psanaoss213:/brick4
Brick29: psanaoss213:/brick5
Brick30: psanaoss213:/brick6
Brick31: psanaoss213:/brick7
Brick32: psanaoss213:/brick8
Brick33: psanaoss215:/brick1
Brick34: psanaoss215:/brick2
Brick35: psanaoss215:/brick4
Brick36: psanaoss215:/brick5
Brick37: psanaoss215:/brick7
Brick38: psanaoss215:/brick8
Brick39: psanaoss215:/brick3
Brick40: psanaoss215:/brick6
Options Reconfigured:
performance.io-thread-count: 16
performance.write-behind-window-size: 16MB
performance.cache-size: 1GB
nfs.disable: on
performance.cache-refresh-timeout: 1
network.ping-timeout: 42
performance.cache-max-file-size: 1PB