<br><br>On Monday, November 10, 2014, Vijay Bellur &lt;<a href="mailto:vbellur@redhat.com">vbellur@redhat.com</a>&gt; wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 11/08/2014 03:50 PM, Jaden Liang wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Hi all,<br>

<br>

We are testing the BD xlator to verify KVM running on gluster. After some<br>

simple tests, we encountered a glusterfs coredump caused by liblvm2app.so.<br>

We hope someone here can offer some advice on this issue.<br>

<br>

We have been debugging for some time and found that this coredump is triggered by a<br>

thread-safety issue. In the core file, the top function is _update_mda(),<br>

called with an invalid pointer that comes from lvmcache_foreach_mda(). As we<br>

know, glusterfsd<br>

uses several io threads to simulate async io, so more than one<br>

thread can run<br>

bd_statfs_cbk() at the same time. In liblvm2app.so, _text_read() looks up an info item<br>

in a hash<br>

table named _pvid_hash. If no info item exists, it allocates a new<br>

one. However,<br>

there isn&#39;t any lock to protect these operations! liblvm2app.so<br>

crashes under<br>

multiple threads with a sequence like this:<br>

<br>

Threads A and B enter bd_statfs_cbk() at the same time:<br>

1. A allocates a new info node, puts it into _pvid_hash, and calls<br>

lvmcache_foreach_mda().<br>

2. B looks up the info node created by A in _pvid_hash and passes it to<br>

lvmcache_del_mdas(), which frees the info node.<br>

3. A keeps using the info node that B has freed.<br>

4. The use-after-free corrupts memory and glusterfs crashes.<br>

<br>

</blockquote>

<br>

Thanks for the report and the steps to recreate the problem.<br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

#9  0x00007f83b599753f in _lvm_vg_open (mode=0x7f83b5c8971e &quot;r&quot;,<br>

vgname=0x11d2c50 &quot;bd-vg&quot;, libh=0x11d3c40,<br>

     flags=&lt;optimized out&gt;) at lvm_vg.c:221<br>

#10 lvm_vg_open (libh=0x11d3c40, vgname=0x11d2c50 &quot;bd-vg&quot;,<br>

mode=mode@entry=0x7f83b5c8971e &quot;r&quot;, flags=flags@entry=0)<br>

     at lvm_vg.c:238<br>

#11 0x00007f83b5c7ee36 in bd_statfs_cbk (frame=0x7f83b95416e4,<br>

cookie=&lt;optimized out&gt;, this=0x119eb90, op_ret=0, op_errno=0,<br>

     buff=0x7f83b1d0ac70, xdata=0x0) at bd.c:353<br>

</blockquote>

<br>

<br>

One quick fix would be to serialize calls to lvm_vg_open() by holding a lock in the bd xlator. Have you tried that?</blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

-Vijay<br></blockquote><div><br></div><div>Thanks for reviewing this issue.</div><div><br></div><div>Yes, we also noticed this spot and have added a lock around lvm_vg_open() to test. It looks fine so far. We are </div><div>going to run more tests with a real KVM to verify this modification, and then submit a patch. </div><div><br></div>
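<div>For illustration only, here is a minimal sketch of the serialization idea discussed above. It is not the actual patch. It assumes a single process-wide pthread mutex, and it uses a hypothetical stand-in, unsafe_vg_open(), in place of the real lvm_vg_open() so that the example builds without liblvm2app. The names vg_open_lock, bd_vg_open_locked(), statfs_worker() and run_workers() are also made up for this sketch.</div><div><br></div>

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16
#define CALLS_PER_WORKER 1000

/* Hypothetical stand-in for lvm_vg_open(): it updates shared state with
 * no locking of its own, much like the _pvid_hash cache in the report. */
static int active_callers = 0;      /* threads currently inside the call */
static int max_active_callers = 0;  /* highest concurrency ever observed */
static int total_calls = 0;

static int unsafe_vg_open(const char *vgname)
{
    (void)vgname;
    active_callers++;
    if (active_callers > max_active_callers)
        max_active_callers = active_callers;
    total_calls++;
    active_callers--;
    return 0;
}

/* Assumption: one lock guards every call into the library. */
static pthread_mutex_t vg_open_lock = PTHREAD_MUTEX_INITIALIZER;

/* Only one thread at a time may be inside the non-thread-safe call. */
static int bd_vg_open_locked(const char *vgname)
{
    int ret;

    pthread_mutex_lock(&vg_open_lock);
    ret = unsafe_vg_open(vgname);
    pthread_mutex_unlock(&vg_open_lock);
    return ret;
}

/* Mimics several io threads reaching the statfs callback concurrently. */
static void *statfs_worker(void *arg)
{
    const char *vgname = arg;

    for (int i = 0; i < CALLS_PER_WORKER; i++)
        bd_vg_open_locked(vgname);
    return NULL;
}

/* Starts up to MAX_WORKERS threads, waits for them, returns total calls. */
static int run_workers(int nthreads)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0;

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tids[started], NULL, statfs_worker,
                           (void *)"bd-vg") == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return total_calls;
}
```

<div>With the lock held, max_active_callers never goes above 1, however many workers run. In the real xlator the lock would more likely live in the bd private data than in a global, and any other liblvm2app call that touches the same cache would need to take the same lock.</div>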