<html><head><meta http-equiv="Content-Type" content="text/html charset=GB2312"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi Avati,<div><br></div><div><span class="Apple-tab-span" style="white-space:pre">        </span>Your suggestion to remove the entries for page sizes less than 4KB (or even the 4KB entry) does not seem to help; the RES of smbd still keeps increasing.</div><div><br></div><div><br></div><div>Thanks</div><div>-Kane</div><div><br><div><div>On 2013-9-25, at 10:12 AM, Anand Avati <<a href="mailto:avati@gluster.org">avati@gluster.org</a>> wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite"><div dir="ltr">On Tue, Sep 24, 2013 at 6:37 PM, haiwei.xie-soulinfo <span dir="ltr"><<a href="mailto:haiwei.xie@soulinfo.com" target="_blank">haiwei.xie@soulinfo.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">hi,<br>
<br>
With our patch for this bug, things look good in testing: smbd is no longer killed by the OOM killer. But it is not the correct fix.<br>
<br>
git version: release-3.4/886021a31bdac83c2124d08d64b81f22d82039d6<br>
<br>
diff --git a/api/src/glfs-fops.c b/api/src/glfs-fops.c<br>
index 66e7d69..535ee53 100644<br>
--- a/api/src/glfs-fops.c<br>
+++ b/api/src/glfs-fops.c<br>
@@ -713,7 +713,9 @@ glfs_pwritev (struct glfs_fd *glfd, const struct iovec *iovec, int iovcnt,<br>
}<br>
<br>
size = iov_length (iovec, iovcnt);<br>
-<br>
+#define MIN_LEN 8 * 1024<br>
+ if (size < MIN_LEN)<br>
+ size = MIN_LEN;<br>
iobuf = iobuf_get2 (subvol->ctx->iobuf_pool, size);<br>
if (!iobuf) {<br>
ret = -1;<br>
<div><div class="h5"><br></div></div></blockquote><div><br></div><div><br></div><div>Ah, looks like we need to tune the page_size/num_pages table in libglusterfs/src/iobuf.c. The table allows pages that are too small. We should probably remove the entries for page sizes less than 4KB. Just doing that might fix your issue:</div>
<div><br></div><div><pre style="">diff --git a/libglusterfs/src/iobuf.c b/libglusterfs/src/iobuf.c
index a89e962..0269004 100644
--- a/libglusterfs/src/iobuf.c
+++ b/libglusterfs/src/iobuf.c
@@ -24,9 +24,7 @@
/* Make sure this array is sorted based on pagesize */
struct iobuf_init_config gf_iobuf_init_config[] = {
/* { pagesize, num_pages }, */
- {128, 1024},
- {512, 512},
- {2 * 1024, 512},
+ {4 * 1024, 256},
{8 * 1024, 128},
{32 * 1024, 64},
{128 * 1024, 32},</pre></div><div><br></div><div>Avati</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div><div class="h5">
> On 09/13/2013 06:03 PM, kane wrote:<br>
> > Hi<br>
> ><br>
> > We are running an I/O test through the Samba Gluster VFS module, and smbd on the<br>
> > Gluster server gets killed by the OOM killer.<br>
> > top shows the smbd process using over 15g of RES. Our simple test code is at<br>
> > the end of this mail:<br>
> ><br>
> > gluster server vfs --> smbd --> client mount dir "/mnt/vfs"--> execute<br>
> > vfs test program "$ ./vfs 1000"<br>
> ><br>
> > Then we watch the RES of smbd on the Gluster server with top:<br>
> ><br>
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND<br>
> > 4000 soul 20 0 5486m 4.9g 10m R 108.4 31.5 111:07.07 smbd<br>
> > 3447 root 20 0 1408m 44m 2428 S 44.4 0.3 59:11.55 glusterfsd<br>
> ><br>
> > io test code:<br>
> > =======================================<br>
> > #define _LARGEFILE64_SOURCE<br>
> ><br>
> > #include <stdio.h><br>
> > #include <unistd.h><br>
> > #include <string.h><br>
> > #include <pthread.h><br>
> > #include <stdlib.h><br>
> > #include <fcntl.h><br>
> > #include <sys/types.h><br>
> > #include <sys/stat.h><br>
> ><br>
> > int WT = 1;<br>
> ><br>
> > #define RND(x) ((x>0)?(genrand() % (x)):0)<br>
> > extern unsigned long genrand();<br>
> > extern void sgenrand();<br>
> ><br>
> > /* Period parameters */<br>
> > #define N 624<br>
> > #define M 397<br>
> > #define MATRIX_A 0x9908b0df /* constant vector a */<br>
> > #define UPPER_MASK 0x80000000 /* most significant w-r bits */<br>
> > #define LOWER_MASK 0x7fffffff /* least significant r bits */<br>
> ><br>
> > /* Tempering parameters */<br>
> > #define TEMPERING_MASK_B 0x9d2c5680<br>
> > #define TEMPERING_MASK_C 0xefc60000<br>
> > #define TEMPERING_SHIFT_U(y) (y >> 11)<br>
> > #define TEMPERING_SHIFT_S(y) (y << 7)<br>
> > #define TEMPERING_SHIFT_T(y) (y << 15)<br>
> > #define TEMPERING_SHIFT_L(y) (y >> 18)<br>
> ><br>
> > static unsigned long mt[N]; /* the array for the state vector */<br>
> > static int mti=N+1; /* mti==N+1 means mt[N] is not initialized */<br>
> ><br>
> > /* Initializing the array with a seed */<br>
> > void<br>
> > sgenrand(seed)<br>
> > unsigned long seed;<br>
> > {<br>
> > int i;<br>
> ><br>
> > for (i=0;i<N;i++) {<br>
> > mt[i] = seed & 0xffff0000;<br>
> > seed = 69069 * seed + 1;<br>
> > mt[i] |= (seed & 0xffff0000) >> 16;<br>
> > seed = 69069 * seed + 1;<br>
> > }<br>
> > mti = N;<br>
> > }<br>
> ><br>
> > unsigned long<br>
> > genrand()<br>
> > {<br>
> > unsigned long y;<br>
> > static unsigned long mag01[2]={0x0, MATRIX_A};<br>
> > /* mag01[x] = x * MATRIX_A for x=0,1 */<br>
> ><br>
> > if (mti >= N) { /* generate N words at one time */<br>
> > int kk;<br>
> ><br>
> > if (mti == N+1) /* if sgenrand() has not been called, */<br>
> > sgenrand(4357); /* a default initial seed is used */<br>
> ><br>
> > for (kk=0;kk<N-M;kk++) {<br>
> > y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);<br>
> > mt[kk] = mt[kk+M] ^ (y >> 1) ^ mag01[y & 0x1];<br>
> > }<br>
> > for (;kk<N-1;kk++) {<br>
> > y = (mt[kk]&UPPER_MASK)|(mt[kk+1]&LOWER_MASK);<br>
> > mt[kk] = mt[kk+(M-N)] ^ (y >> 1) ^ mag01[y & 0x1];<br>
> > }<br>
> > y = (mt[N-1]&UPPER_MASK)|(mt[0]&LOWER_MASK);<br>
> > mt[N-1] = mt[M-1] ^ (y >> 1) ^ mag01[y & 0x1];<br>
> ><br>
> > mti = 0;<br>
> > }<br>
> > y = mt[mti++];<br>
> > y ^= TEMPERING_SHIFT_U(y);<br>
> > y ^= TEMPERING_SHIFT_S(y) & TEMPERING_MASK_B;<br>
> > y ^= TEMPERING_SHIFT_T(y) & TEMPERING_MASK_C;<br>
> > y ^= TEMPERING_SHIFT_L(y);<br>
> ><br>
> > return y;<br>
> > }<br>
> ><br>
> > char *initialize_file_source(int size)<br>
> > {<br>
> > char *new_source;<br>
> > int i;<br>
> ><br>
> > if ((new_source=(char *)malloc(size))==NULL) /* allocate buffer */<br>
> > fprintf(stderr,"Error: failed to allocate source file of size %d\n",size);<br>
> > else<br>
> > for (i=0; i<size; i++) /* file buffer with junk */<br>
> > new_source[i]=32+RND(95);<br>
> ><br>
> > return(new_source);<br>
> > }<br>
> ><br>
> > void *tran_file(void *map)<br>
> > {<br>
> > int block_size = 512;<br>
> > char *read_buffer; /* temporary space for reading file data into */<br>
> > int fd = open((char *)map, O_RDWR | O_CREAT | O_TRUNC, 0644);<br>
> > if(fd == -1) {<br>
> > perror("open");<br>
> > return NULL;<br>
> > }<br>
> ><br>
> > //read_buffer=(char *)malloc(block_size);<br>
</div></div>> > //memset(read_buffer, 0x56, block_size);<br>
<div><div class="h5">> > read_buffer = initialize_file_source(block_size);<br>
> > while(WT){<br>
> > sleep(1);<br>
> > pwrite(fd, read_buffer, block_size, 0);<br>
> > pread(fd, read_buffer, block_size, 0);<br>
> > }<br>
> > free(read_buffer);<br>
> > close(fd);<br>
> > return NULL;<br>
> > }<br>
> ><br>
> > int main(int argc, char *argv[])<br>
> > {<br>
> > if(argc == 1) {<br>
> > printf("Set file size in GB!\n");<br>
> > return -1;<br>
> > }<br>
> ><br>
> > unsigned long long length = atoll(argv[1]) * 1024 * 1024 * 1024;<br>
> > unsigned long long index = 0;<br>
> > unsigned long long count = 0;<br>
> > int fd = 0;<br>
> > char path[32];<br>
> > char *swap = NULL;<br>
> > int psize = 65536;<br>
> > int ret = 0;<br>
> > int ret_i = 0;<br>
> > void *tret;<br>
> > pthread_t rtid_1;<br>
> > pthread_t rtid_2;<br>
> > pthread_t rtid_3;<br>
> > pthread_t rtid_4;<br>
> > int fd_i = 0;<br>
> > int a_size = 256;<br>
> > char *append_buffer; /* temporary space for reading file data into */<br>
> ><br>
> > swap = (char *)malloc(sizeof(char) * psize);<br>
> > if(swap == NULL) {<br>
> > printf("Malloc Swap Failed!\n");<br>
> > return -1;<br>
> > }<br>
> > memset(swap, 0x55, psize);<br>
> ><br>
> > append_buffer = initialize_file_source(a_size);<br>
> ><br>
> > pthread_create(&rtid_1, NULL, tran_file, "/mnt/vfs/a.map");<br>
> > pthread_create(&rtid_2, NULL, tran_file, "/mnt/vfs/b.map");<br>
> > pthread_create(&rtid_3, NULL, tran_file, "/mnt/vfs/c.map");<br>
> > pthread_create(&rtid_4, NULL, tran_file, "/mnt/vfs/d.map");<br>
> ><br>
> > while(index < length)<br>
> > {<br>
> > if((index % (1024*1024*1024)) == 0) {<br>
> > count++;<br>
> > sprintf(path, "/mnt/vfs/data/%08lld.dat", count);<br>
> > if((fd = open(path, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR)) < 0)<br>
> > {<br>
> > printf("Can not open io file: %s\n", path);<br>
> > perror("open");<br>
> > return -1;<br>
> > }<br>
> > sprintf(path, "/mnt/vfs/data/%08lld.index", count);<br>
> > if((fd_i = open(path, O_RDWR | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR))<br>
> > < 0)<br>
> > {<br>
> > printf("Can not open io file: %s\n", path);<br>
> > perror("open");<br>
> > return -1;<br>
> > }<br>
> > printf("...1GiB\n");<br>
> > }<br>
> ><br>
> > ret = pwrite(fd, swap, psize, (index % (1024*1024*1024)));<br>
> > if(ret != psize) {<br>
> > perror("write");<br>
> > printf("Write data file failed!\n");<br>
> > continue;<br>
> > }<br>
> > lseek64(fd_i, ((index*a_size)/65536), SEEK_SET);<br>
> > ret_i = write(fd_i, append_buffer, a_size);<br>
> > if(ret_i != a_size) {<br>
> > perror("write");<br>
> > printf("Write index file failed!\n");<br>
> > continue;<br>
> > }<br>
> ><br>
> > index += ret;<br>
> > if((index % (1024*1024*1024)) == 0) {<br>
> > close(fd);<br>
> > close(fd_i);<br>
> > }<br>
> > }<br>
> ><br>
> > WT = 0;<br>
> > pthread_join(rtid_1, &tret);<br>
> > pthread_join(rtid_2, &tret);<br>
> > pthread_join(rtid_3, &tret);<br>
> > pthread_join(rtid_4, &tret);<br>
> > printf("%s GB write\n", argv[1]);<br>
> > sync();<br>
> > free(swap);<br>
> > free(append_buffer);<br>
> ><br>
> > return 0;<br>
> > }<br>
> ><br>
> > =======================================<br>
> ><br>
> ><br>
> ><br>
> ><br>
> > kane<br>
> > ----------------------------------------------------------------<br>
</div></div>> > Email: <a href="mailto:kai.zhou@soulinfo.com">kai.zhou@soulinfo.com</a><br>
<div class=""><div class="h5">> > Tel: 0510-85385788-616<br>
> ><br>
> ><br>
> Thanks for the code. I will try to reproduce the issue using your code.<br>
><br>
> -Lala<br>
<br>
<br>
</div></div><span class=""><font color="#888888">--<br>
谢海威<br>
Software Project Manager<br>
Tel: <a href="tel:%2B86%2010-68920588" value="+861068920588">+86 10-68920588</a><br>
Mobile: <a href="tel:%2B86%2013911703586" value="+8613911703586">+86 13911703586</a><br>
Email: <a href="mailto:haiwei.xie@soulinfo.com">haiwei.xie@soulinfo.com</a><br>
<br>
_______________________________________________<br>
Gluster-devel mailing list<br>
<a href="mailto:Gluster-devel@nongnu.org">Gluster-devel@nongnu.org</a><br>
<a href="https://lists.nongnu.org/mailman/listinfo/gluster-devel" target="_blank">https://lists.nongnu.org/mailman/listinfo/gluster-devel</a><br>
</font></span></blockquote></div><br></div></div>
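<div>For readers following the page_size/num_pages discussion above, here is a minimal sketch of how a request size could map onto a sorted page-size table. It is an illustration only, not the code in libglusterfs/src/iobuf.c: the struct, the helper pick_page_size() and the COUNT macro are hypothetical names, and the tables contain only the rows visible in the diff above (the real table has further rows).</div>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one { pagesize, num_pages } row. */
struct page_class {
    size_t page_size;
    int    num_pages;
};

/* Rows visible in the diff before the change, sorted by page size. */
static const struct page_class table_before[] = {
    {128, 1024}, {512, 512}, {2 * 1024, 512},
    {8 * 1024, 128}, {32 * 1024, 64}, {128 * 1024, 32},
};

/* Rows visible in the diff after the change, sorted by page size. */
static const struct page_class table_after[] = {
    {4 * 1024, 256}, {8 * 1024, 128}, {32 * 1024, 64}, {128 * 1024, 32},
};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Return the smallest page size in the table that can hold `request`
 * bytes, or 0 if no row is large enough. This relies on the table
 * being sorted by page size, as the comment in the diff requires.
 */
size_t pick_page_size(const struct page_class *table, size_t rows,
                      size_t request)
{
    assert(table != NULL);
    for (size_t i = 0; i < rows; i++) {
        if (request <= table[i].page_size)
            return table[i].page_size;
    }
    return 0;
}
```

<div>With the write sizes used by the test program (256, 512 and 65536 bytes), this sketch picks 512, 512 and 131072 from the first table, and 4096, 4096 and 131072 from the second. It only illustrates which page class a small write would land in; it does not by itself explain the RES growth reported in this thread.</div>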
</blockquote></div><br></div></body></html>