[Gluster-devel] glusterfs coredump

Lian, George (NSB - CN/Hangzhou) george.lian at nokia-sbell.com
Wed Jan 23 05:53:19 UTC 2019


Hi, GlusterFS expert,

We have recently encountered a coredump of the client process “glusterfs”; it reproduces more easily when the IO load and CPU/memory load are high during stability testing.
Our GlusterFS release is 3.12.2.
I have copied the call trace of the core dump below and have some questions; I hope you can help.


1)     Have you encountered a related issue? From the call trace, we can see that the fd variable looks abnormal in its “refcount” and “inode” fields.

For wb_inode->this, the value has become the invalid 0xffffffffffffff00. Is “0xffffffffffffff00” some meaningful value? In every coredump so far, the value of inode->this has been this same 0xffffffffffffff00.
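One way to dig further (a sketch of standard GDB commands, not output from this exact core) is to dump the raw words of the structure; a repeating pointer-shaped pattern across the whole allocation would suggest the memory was freed and reused, rather than a single corrupted field:

(gdb) p sizeof (wb_inode_t)
(gdb) x/24gx wb_inode

Here x/24gx prints 24 giant (8-byte) words in hex starting at wb_inode, which makes such patterns easier to spot than the field-by-field print shown below.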



2)     When I checked the source code, I found that in the function wb_enqueue_common, __wb_request_unref is used instead of wb_request_unref; in fact, although wb_request_unref is defined, it is never used! That seems strange in itself. Secondly, wb_request_unref takes a lock to avoid race conditions, while __wb_request_unref has no such protection, and __wb_request_unref is the variant called in most places. Could this lead to a race?
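My understanding, assuming the usual GlusterFS convention (I have not verified every call site), is that double-underscore functions such as __wb_request_unref are meant to be called with wb_inode->lock already held, and the plain wb_request_unref is simply a locked wrapper around the unlocked variant. A minimal standalone sketch of that pattern, using demo names rather than the actual write-behind.c code:

/* Sketch of the locked/unlocked pairing convention; demo_req_t and
 * its lock stand in for wb_request_t and wb_inode->lock. */
#include <pthread.h>
#include <stdio.h>

typedef struct demo_req {
        pthread_mutex_t lock;      /* stands in for wb_inode->lock */
        int             refcount;
} demo_req_t;

/* "__" variant: caller MUST already hold req->lock. */
static int
__demo_request_unref (demo_req_t *req)
{
        return --req->refcount;
}

/* Plain variant: locked wrapper for callers that do not hold the lock. */
static int
demo_request_unref (demo_req_t *req)
{
        int ref;

        pthread_mutex_lock (&req->lock);
        {
                ref = __demo_request_unref (req);
        }
        pthread_mutex_unlock (&req->lock);

        return ref;
}

int
main (void)
{
        demo_req_t req = { PTHREAD_MUTEX_INITIALIZER, 2 };

        /* The unlocked variant is safe only under the lock, which is
         * what I would expect wb_enqueue_common to hold around its call. */
        pthread_mutex_lock (&req.lock);
        __demo_request_unref (&req);
        pthread_mutex_unlock (&req.lock);

        demo_request_unref (&req);    /* wrapper locks internally */

        printf ("refcount = %d\n", req.refcount);
        return 0;
}

If that convention holds, the unlocked calls are only safe when every caller of __wb_request_unref really does hold wb_inode->lock; a call site that uses the unlocked variant without the lock would be exactly the kind of race I am worried about.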

[Current thread is 1 (Thread 0x7f54e82a3700 (LWP 6078))]
(gdb) bt
#0  0x00007f54e623197c in wb_fulfill (wb_inode=0x7f54d4066bd0, liabilities=0x7f54d0824440) at write-behind.c:1155
#1  0x00007f54e6233662 in wb_process_queue (wb_inode=0x7f54d4066bd0) at write-behind.c:1728
#2  0x00007f54e6234039 in wb_writev (frame=0x7f54d406d6c0, this=0x7f54e0014b10, fd=0x7f54d8019d70, vector=0x7f54d0018000, count=1, offset=33554431, flags=32770, iobref=0x7f54d021ec20, xdata=0x0)
    at write-behind.c:1842
#3  0x00007f54e6026fcb in du_writev_resume (ret=0, frame=0x7f54d0002260, opaque=0x7f54d0002260) at disk-usage.c:490
#4  0x00007f54ece07160 in synctask_wrap () at syncop.c:377
#5  0x00007f54eb3a2660 in ?? () from /lib64/libc.so.6
#6  0x0000000000000000 in ?? ()
(gdb) p wb_inode
$6 = (wb_inode_t *) 0x7f54d4066bd0
(gdb) p wb_inode->this
$1 = (xlator_t *) 0xffffffffffffff00
(gdb) frame 1
#1  0x00007f54e6233662 in wb_process_queue (wb_inode=0x7f54d4066bd0) at write-behind.c:1728
1728         in write-behind.c
(gdb) p wind_failure
$2 = 0
(gdb) p *wb_inode
$3 = {window_conf = 35840637416824320, window_current = 35840643167805440, transit = 35839681019027968, all = {next = 0xb000, prev = 0x7f54d4066bd000}, todo = {next = 0x7f54deadc0de00,
    prev = 0x7f54e00489e000}, liability = {next = 0x7f54000000a200, prev = 0xb000}, temptation = {next = 0x7f54d4066bd000, prev = 0x7f54deadc0de00}, wip = {next = 0x7f54e00489e000, prev = 0x7f54000000a200},
  gen = 45056, size = 35840591659782144, lock = {spinlock = 0, mutex = {__data = {__lock = 0, __count = 8344798, __owner = 0, __nusers = 8344799, __kind = 41472, __spins = 21504, __elision = 127, __list = {
          __prev = 0xb000, __next = 0x7f54d4066bd000}},
      __size = "\000\000\000\000\336T\177\000\000\000\000\000\337T\177\000\000\242\000\000\000T\177\000\000\260\000\000\000\000\000\000\000\320k\006\324T\177", __align = 35840634501726208}},
  this = 0xffffffffffffff00, dontsync = -1}
(gdb) frame 2
#2  0x00007f54e6234039 in wb_writev (frame=0x7f54d406d6c0, this=0x7f54e0014b10, fd=0x7f54d8019d70, vector=0x7f54d0018000, count=1, offset=33554431, flags=32770, iobref=0x7f54d021ec20, xdata=0x0)
    at write-behind.c:1842
1842         in write-behind.c
(gdb) p fd
$4 = (fd_t *) 0x7f54d8019d70
(gdb) p *fd
$5 = {pid = 140002378149040, flags = -670836240, refcount = 32596, inode_list = {next = 0x7f54d8019d80, prev = 0x7f54d8019d80}, inode = 0x0, lock = {spinlock = -536740032, mutex = {__data = {
        __lock = -536740032, __count = 32596, __owner = -453505333, __nusers = 32596, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},
      __size = "@\377\001\340T\177\000\000\313\016\370\344T\177", '\000' <repeats 25 times>, __align = 140002512207680}}, _ctx = 0xffffffff, xl_count = 0, lk_ctx = 0x0, anonymous = (unknown: 3623984496)}
(gdb)


Thanks & Best Regards,
George
