[Gluster-devel] Can't use NFS with VMware ESXi

Fernando Frediani (Qube) fernando.frediani at qubenet.net
Fri May 25 20:32:25 UTC 2012


Anand,
Further to that, I managed to mount the Datastore and deploy machines there, but when I try to power them on I get an error as if a file could not be found. Has anyone seen this kind of error before? It could be a locking problem, but it doesn't seem to be. Permissions, maybe? Or the way the NFS share is exported (root_squash, no_root_squash, etc.)?

Here is the log:

An unexpected error was received from the ESX host while powering on VM vm-21112.
Failed to power on VM.
Unable to retrieve the current working directory: 0 (No such file or directory). Check if the directory has been deleted or unmounted. 
Unable to retrieve the current working directory: 0 (No such file or directory). Check if the directory has been deleted or unmounted. 
Unable to retrieve the current working directory: 0 (No such file or directory). Check if the directory has been deleted or unmounted.

Regards,

Fernando

-----Original Message-----
From: Fernando Frediani (Qube) 
Sent: 25 May 2012 20:34
To: 'Anand Babu Periasamy'
Cc: 'gluster-devel at nongnu.org'
Subject: RE: [Gluster-devel] Can't use NFS with VMware ESXi

Hi Anand,
Thanks for that. It actually worked using Distributed+Replicated.
However, I am testing version 3.3 for two main reasons. The first, and most important, is Granular Locking, which makes it better suited to running VMs. The second is the Repstr (Replicated + Striped (+ Distributed)) volume type: since VMDK files are normally large, I expected it to split each file into many chunks across several bricks, improving both read and write performance by spreading the IOPS across all the bricks and disks that hold chunks of the file. Also, if I understand correctly, a VM with a massive VMDK file (2TB, for example) on this new volume type would not be stored on a single brick, which would keep that brick from becoming unbalanced in free space compared to the others.

Am I right in my assumptions?
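
For illustration only, here is a minimal sketch of the striping assumption described above. It assumes a plain round-robin layout in which block n of a file lands on stripe n mod stripe_count, and it uses a 128 KiB block size chosen for the example. The function names are hypothetical and are not part of GlusterFS; replication and distribution are ignored, and the real stripe translator may behave differently.

```python
from collections import Counter

KIB = 1024


def stripe_index(offset, block_size, stripe_count):
    """Return which stripe holds the byte at `offset` (assumed round-robin layout)."""
    return (offset // block_size) % stripe_count


def bytes_per_stripe(file_size, block_size, stripe_count):
    """Return how many bytes of a file of `file_size` end up on each stripe."""
    totals = Counter()
    offset = 0
    while offset < file_size:
        # The last block of the file may be shorter than block_size.
        chunk = min(block_size, file_size - offset)
        totals[stripe_index(offset, block_size, stripe_count)] += chunk
        offset += chunk
    return [totals[i] for i in range(stripe_count)]


if __name__ == "__main__":
    # A 1 MiB file in 128 KiB blocks over 4 stripes: 8 blocks, 2 per stripe.
    print(bytes_per_stripe(1024 * KIB, 128 * KIB, 4))
```

Under these assumptions each stripe holds roughly file_size / stripe_count bytes, which is the even spread of space and I/O that the paragraph above expects.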

Also, with regard to the problem I reported below, what do you think it could be, and how can I get it working? Afterwards I wanted to make a performance comparison between both volume types.

Thanks
Regards,

Fernando

-----Original Message-----
From: Anand Babu Periasamy [mailto:abperiasamy at gmail.com]
Sent: 25 May 2012 18:09
To: Fernando Frediani (Qube)
Cc: gluster-devel at nongnu.org
Subject: Re: [Gluster-devel] Can't use NFS with VMware ESXi

On Fri, May 25, 2012 at 4:36 AM, Fernando Frediani (Qube) <fernando.frediani at qubenet.net> wrote:
> Actually, even on another Linux machine mounting NFS has the same behaviour.
> I am able to mount it with "mount -t nfs ." but when I try "ls" it 
> hangs as well.
>
> One particular thing of the Gluster servers is that they have two 
> networks, one for management with default gateway and another only for 
> storage. I am only able to mount on the storage network.
>
> The hosts file has all nodes' names with the IPs on the storage network.
>
>
>
> I tried this, but it didn't work either.
>
> gluster volume set VOLUME nfs.rpc-auth-allow 10.10.100.*
>
>
>
> Watching the nfs logs when I try a "ls" from the remote client it shows:
>
>
>
> pending frames:
>
> patchset: git://git.gluster.com/glusterfs.git
> signal received: 11
> time of crash: 2012-05-25 11:38:09
> configuration details:
> argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> epoll.h 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 3.3.0beta4
> /lib64/libc.so.6(+0x32900)[0x7f1c92d92900]
> /usr/lib64/glusterfs/3.3.0beta4/xlator/cluster/replicate.so(afr_lookup+0xa5)[0x7f1c8e7a6ac5]
> /usr/lib64/glusterfs/3.3.0beta4/xlator/cluster/stripe.so(stripe_readdirp_cbk+0x536)[0x7f1c8e543346]
> /usr/lib64/glusterfs/3.3.0beta4/xlator/cluster/replicate.so(afr_readdirp_cbk+0x1ca)[0x7f1c8e76269a]
> /usr/lib64/glusterfs/3.3.0beta4/xlator/protocol/client.so(client3_1_readdirp_cbk+0x170)[0x7f1c8e9dbbe0]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa2)[0x7f1c9388b302]
> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0xb6)[0x7f1c9388b516]
> /usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x27)[0x7f1c93886e17]
> /usr/lib64/glusterfs/3.3.0beta4/rpc-transport/socket.so(socket_event_poll_in+0x3f)[0x7f1c8f818c8f]
> /usr/lib64/glusterfs/3.3.0beta4/rpc-transport/socket.so(socket_event_handler+0x188)[0x7f1c8f818e38]
> /usr/lib64/libglusterfs.so.0(+0x3eb51)[0x7f1c93ad0b51]
> /usr/sbin/glusterfs(main+0x502)[0x406612]
> /lib64/libc.so.6(__libc_start_main+0xfd)[0x7f1c92d7ecdd]
> /usr/sbin/glusterfs[0x404399]
>
>
>
> Thanks
>
>
> Fernando
>
>
>
> From: Fernando Frediani (Qube)
> Sent: 25 May 2012 10:44
> To: 'gluster-devel at nongnu.org'
> Subject: Can't use NFS with VMware ESXi
>
>
>
> Hi,
>
>
>
> I've set up a Gluster environment using CentOS 6.2 and GlusterFS 3.3
> beta4 and the new striped + replicated volume type. My goal is to use
> it to run Virtual Machines (.vmdk files).
>
>
>
> The volume is created fine and the ESXi server mounts the Datastore using
> Gluster's built-in NFS; however, when trying to use the Datastore, or
> even read from it, it hangs.
>
>
>
> Looking at the Gluster NFS logs I see:    "[socket.c:195:__socket_rwv]
> 0-socket.nfs-server: readv failed (Connection reset by peer)"
>
>
>
> In order to get the rpm files installed I first had to install these
> two, because of some library dependencies:
> "compat-readline5-5.2-17.1.el6.x86_64.rpm"
> and "openssl098e-0.9.8e-17.el6.centos.x86_64.rpm". Not sure if it has
> anything to do with that.
>
>
>
> Has anyone ever used Gluster as backend storage for ESXi? Does it
> actually work?
>
>
>
> Regards,
>
>
>
> Fernando Frediani
> Lead Systems Engineer
>
> Qube Managed Services Limited
> 260-266 Goswell Road, London, EC1V 7EB, United Kingdom

Hi Fernando, can you please try distributed+replicated? I wouldn't recommend replicated-stripe for a VM environment. Stripe was largely developed for HPC pre- and post-processing jobs (a large number of clients reading/writing the same file). In any case, this looks like a bug in replicated-stripe.

--
Anand Babu Periasamy
Blog [http://www.unlocksmith.org]

Imagination is more important than knowledge --Albert Einstein



