[OmniOS-discuss] esxi 5.5 to omnios r151014 nfs server issue

Schweiss, Chip chip at innovates.com
Fri Apr 10 13:11:49 UTC 2015


On Fri, Apr 10, 2015 at 7:51 AM, Hafiz Rafiyev <rafibeyli at gmail.com> wrote:

> I tested all the suggested solutions (reset the filesystems to
> everyone@=modify, edited the hosts files on ESXi and OmniOS), but nothing
> changed.
>
> I still have the same problem of NFS shares randomly not connecting (I also
> patched ESXi 5.5 to the latest release, 2638301).
>
> I think something changed in the r151014 NFS stack, because when I reverted
> to r151012 everything ran smoothly.
>
> Now I'm back on r151012.
>
> Hafiz
>
>
Have you adjusted the NFS heartbeats on ESXi?  I can't seem to locate the
best practices paper I found a few years ago.  I think it was from Oracle or
Nexenta, but you need to adjust your heartbeats on ESXi or it will think
your storage is offline.  Here are my settings:

[image: Inline image 1]

My datastores are still on r151012, so I can't say for sure this has
anything to do with the problem you are seeing.  But I have definitely seen
your problems before.  I figured out the heartbeat problem by analyzing
tcpdumps while the datastores went offline and came back online.
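
A minimal sketch of the heartbeat tuning discussed above, assuming the three
ESXi advanced options NFS.HeartbeatFrequency, NFS.HeartbeatTimeout and
NFS.HeartbeatMaxFailures. The values used are illustrative placeholders often
quoted in vendor guides, not the settings from the attached screenshot, and
the outage formula is a commonly cited approximation rather than an exact
description of ESXi behaviour. The NfsHeartbeat class is a hypothetical helper
that only builds command strings; it does not touch a host.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NfsHeartbeat:
    """Hypothetical container for the ESXi NFS heartbeat options."""
    frequency: int     # NFS.HeartbeatFrequency: seconds between heartbeats
    timeout: int       # NFS.HeartbeatTimeout: seconds to wait for a reply
    max_failures: int  # NFS.HeartbeatMaxFailures: consecutive misses tolerated

    def seconds_until_unavailable(self) -> int:
        # Commonly cited approximation of how long the NFS server may stay
        # silent before ESXi marks the datastore unavailable.
        return self.frequency * self.max_failures + self.timeout

    def esxcli_commands(self) -> List[str]:
        # Build (but do not run) the esxcli commands that would apply
        # these values on an ESXi host.
        options = [
            ("HeartbeatFrequency", self.frequency),
            ("HeartbeatTimeout", self.timeout),
            ("HeartbeatMaxFailures", self.max_failures),
        ]
        return [
            f"esxcli system settings advanced set -o /NFS/{name} -i {value}"
            for name, value in options
        ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    hb = NfsHeartbeat(frequency=12, timeout=5, max_failures=10)
    print(f"Approximate tolerance: {hb.seconds_until_unavailable()} s")
    for cmd in hb.esxcli_commands():
        print(cmd)
```

Printing the commands rather than executing them keeps the sketch safe to run
anywhere; the output can be reviewed and pasted into an ESXi shell by hand.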

-Chip


>
>
>
> ----- Original Message -----
> From: omnios-discuss-request at lists.omniti.com
> To: "omnios-discuss" <omnios-discuss at lists.omniti.com>
> Sent: Monday, 6 April, 2015 16:41:22
> Subject: OmniOS-discuss Digest, Vol 37, Issue 16
>
> Send OmniOS-discuss mailing list submissions to
>         omnios-discuss at lists.omniti.com
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://lists.omniti.com/mailman/listinfo/omnios-discuss
> or, via email, send a message with subject or body 'help' to
>         omnios-discuss-request at lists.omniti.com
>
> You can reach the person managing the list at
>         omnios-discuss-owner at lists.omniti.com
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of OmniOS-discuss digest..."
>
>
> Today's Topics:
>
>    1. r151014 KVM crash (Johan Kragsterman)
>    2. Re: OmniOS r151014 is now out! (Natxo Asenjo)
>    3. pkgrecv r151014 (Al Slater)
>    4. esxi 5.5 to omnios r151014 nfs server issue (Hafiz Rafiyev)
>    5. Re: pkgrecv r151014 (Al Slater)
>    6. Re: esxi 5.5 to omnios r151014 nfs server issue (Günther Alka)
>    7. Re: All SSD pool advice (Chris Nagele)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 6 Apr 2015 10:20:56 +0200
> From: Johan Kragsterman <johan.kragsterman at capvert.se>
> To: "omnios-discuss at lists.omniti.com"
>         <omnios-discuss at lists.omniti.com>
> Subject: [OmniOS-discuss] r151014 KVM crash
> Message-ID:
>         <
> OF8C4EC57F.10747AAD-ONC1257E1F.0027F11B-C1257E1F.002DDCA2 at inse.com>
> Content-Type: text/plain;       charset=ISO-8859-1
>
> Hi!
>
> I switched one of my development machines over to r151014. On that
> machine I have a few KVM VMs.
>
> One of them is a Linux terminal server, and when I tried to
> update/upgrade it (both the general OS and the chroot environments in
> it), it crashed. I tried several times, and it crashed every time.
> It seems to run without problems when I don't do any heavy work on it, but
> with this update/upgrade it runs for about 5 minutes, then crashes. It
> can't be started again until I reboot the server.
>
> The following msg is from /var/adm/messages:
>
>
> 40b0000, id=1, base_msr= fee00000 PRIx64 base_address=fee00000
> Apr  4 20:45:45 omni2 kvm: [ID 710719 kern.info] vmcs revision_id = f
> Apr  4 20:45:45 omni2 kvm: [ID 420667 kern.info] kvm_lapic_reset: vcpu=ffffff06140a8000, id=2, base_msr= fee00000 PRIx64 base_address=fee00000
> Apr  4 20:45:45 omni2 kvm: [ID 710719 kern.info] vmcs revision_id = f
> Apr  4 20:45:45 omni2 kvm: [ID 420667 kern.info] kvm_lapic_reset: vcpu=ffffff0614236000, id=3, base_msr= fee00000 PRIx64 base_address=fee00000
> Apr  4 20:45:45 omni2 kvm: [ID 710719 kern.info] vmcs revision_id = f
> Apr  4 20:45:52 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x1010101 data fffffd7fffdfe8e0
> Apr  4 20:45:52 omni2 last message repeated 3 times
> Apr  4 20:45:52 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0xff3d0f9c data fffffd7fffdfe8b0
> Apr  4 20:45:52 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x0 data 0
> Apr  4 20:45:52 omni2 last message repeated 6 times
> Apr  4 20:45:52 omni2 kvm: [ID 291337 kern.info] vcpu 1 received sipi with vector # 10
> Apr  4 20:45:52 omni2 kvm: [ID 291337 kern.info] vcpu 2 received sipi with vector # 10
> Apr  4 20:45:52 omni2 kvm: [ID 291337 kern.info] vcpu 3 received sipi with vector # 10
> Apr  4 20:45:52 omni2 kvm: [ID 420667 kern.info] kvm_lapic_reset: vcpu=ffffff06140b0000, id=1, base_msr= fee00800 PRIx64 base_address=fee00000
> Apr  4 20:45:52 omni2 kvm: [ID 420667 kern.info] kvm_lapic_reset: vcpu=ffffff06140a8000, id=2, base_msr= fee00800 PRIx64 base_address=fee00000
>
>
> Then it goes on like this:
>
>
> Apr  4 20:46:25 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:25 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 800000001
> Apr  4 20:46:25 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:25 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 800000001
> Apr  4 20:46:25 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:25 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 800000001
> Apr  4 20:46:25 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:25 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 800000001
> Apr  4 20:46:25 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:25 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 800000001
> Apr  4 20:46:25 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:25 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 800000001
> Apr  4 20:46:34 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:34 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 2000000001
> Apr  4 20:46:34 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:34 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 2000000001
> Apr  4 20:46:34 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:34 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 2000000001
> Apr  4 20:46:34 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:34 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 2000000001
> Apr  4 20:46:34 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:34 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 2000000001
> Apr  4 20:46:34 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:46:34 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x525f2f data 2000000
>
>
>
> And like this:
>
>
>
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 8
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 8
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 8
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 8
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 8
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 10
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 10
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 10
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 10
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 10
> Apr  4 20:50:45 omni2 kvm: [ID 713435 kern.info] unhandled rdmsr: 0xff311c4c
> Apr  4 20:50:45 omni2 kvm: [ID 391722 kern.info] unhandled wrmsr: 0x526835 data 10
> Apr
>
>
> I switched back to r151012, and there everything is working fine.
>
> I rolled back the volumes I used for the chroots in the VM, because
> they had been messed up by the repeatedly interrupted upgrade attempts. I
> then ran new updates/upgrades on the chroots, and even built new ones, with
> no problems in r151012.
>
> So the problem seems to be exclusively in r151014.
>
> I got some messages on the OmniOS console after the VM crashed that I
> didn't record, unfortunately. From what I remember, it was complaining
> about a bus, and also about either ps or pthread.
>
> I will go back to r151014 again and run more tests like this, to get this
> clarified and record the exact messages on the console.
>
> Any suggestion?
>
>
>
> Best regards from/Med vänliga hälsningar från
>
> Johan Kragsterman
>
> Capvert
>
>
>
> ------------------------------
>
> Message: 2
> Date: Mon, 6 Apr 2015 11:32:16 +0200
> From: Natxo Asenjo <natxo.asenjo at gmail.com>
> To: omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: Re: [OmniOS-discuss] OmniOS r151014 is now out!
> Message-ID:
>         <
> CAHBEJzUN7-4L0AyYxBLTxUXQbUsjmb0N7oOtWpx0uGPL0i7S5Q at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> On Fri, Apr 3, 2015 at 3:58 AM, Dan McDonald <danmcd at omniti.com> wrote:
>
> > Say hello to OmniOS r151014:
> >
> >         http://omnios.omniti.com/wiki.php/ReleaseNotes/r151014
>
>
> upgrade succesful on my home microserver :-)
>
> Congrats on the good work!
>
> --
> regards,
> natxo
>
> ------------------------------
>
> Message: 3
> Date: Mon, 06 Apr 2015 11:03:47 +0100
> From: Al Slater <al.slater at scluk.com>
> To: omnios-discuss at lists.omniti.com
> Subject: [OmniOS-discuss] pkgrecv r151014
> Message-ID: <55225A03.7020102 at scluk.com>
> Content-Type: text/plain; charset=utf-8
>
> Hi,
>
> I am trying to pkgrecv r151014 into my own repository and keep bumping
> into this:
>
> pkgrecv: Invalid contentpath opt/sunstudio12.1/prod/lib/sys/libsunir.so:
> chash failure: expected: b251c238070b6fdbf392194e85319e2c954a5384
> computed: 17d9899f959ac5835569e8870f7e02eb14607242. (happened 4 times)
>
> Is there a problem with this package in the repository?
>
> --
> Al Slater
>
>
>
>
> ------------------------------
>
> Message: 4
> Date: Mon, 6 Apr 2015 12:50:00 +0300 (EEST)
> From: Hafiz Rafiyev <rafibeyli at gmail.com>
> To: omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: [OmniOS-discuss] esxi 5.5 to omnios r151014 nfs server issue
> Message-ID:
>         <
> 1070639246.2109450.1428313800852.JavaMail.zimbra at cantekstil.com.tr>
> Content-Type: text/plain; charset=windows-1254
>
>
> After upgrading from r151012 to r151014 I have an issue with the NFS server:
> after the upgrade, some of the ESXi 5.5 NFS datastores connect and some do
> not.
>
> It happens randomly: after restarting OmniOS, again some datastores
> connect and some do not.
>
> Looking at the OmniOS side, the NFS server is up and running.
>
> Note: before the upgrade all ESXi datastores were connected and running.
> OmniOS runs as a VM, with disks connected in HBA passthrough mode.
>
> The only log message I see on the OmniOS side is:
>
> nfs4cbd[468]: [ID 867284 daemon.notice] nfsv4 cannot determine local
> hostname binding for transport tcp6 - delegations will not be available on
> this transport
>
>
> regards
>
> Hafiz.
>
>
> ------------------------------
>
> Message: 5
> Date: Mon, 06 Apr 2015 11:24:30 +0100
> From: Al Slater <al.slater at scluk.com>
> To: omnios-discuss at lists.omniti.com
> Subject: Re: [OmniOS-discuss] pkgrecv r151014
> Message-ID: <55225EDE.2010902 at scluk.com>
> Content-Type: text/plain; charset=windows-1252
>
> On 06/04/15 11:03, Al Slater wrote:
> > Hi,
> >
> > I am trying to pkgrecv r151014 into my own repository and keep bumping
> > into this:
> >
> > pkgrecv: Invalid contentpath opt/sunstudio12.1/prod/lib/sys/libsunir.so:
> > chash failure: expected: b251c238070b6fdbf392194e85319e2c954a5384
> > computed: 17d9899f959ac5835569e8870f7e02eb14607242. (happened 4 times)
> >
> > Is there a problem with this package in the repository?
>
> Same happens with pkg install...
>
> # pkg install pkg:/developer/sunstudio12.1@12.1-0.151014
>            Packages to install:  1
>        Create boot environment: No
> Create backup boot environment: No
>
> DOWNLOAD                                PKGS         FILES    XFER (MB)   SPEED
> developer/sunstudio12.1                  0/1     5042/7006  203.1/256.3  3.0M/s
>
>
>
> Errors were encountered while attempting to retrieve package or file
> data for
> the requested operation.
> Details follow:
>
> Invalid contentpath opt/sunstudio12.1/prod/lib/sys/libsunir.so: chash
> failure: expected: b251c238070b6fdbf392194e85319e2c954a5384 computed:
> 17d9899f959ac5835569e8870f7e02eb14607242. (happened 4 times)
>
>
> regards
>
> --
> Al Slater
>
> Technical Director
> SCL
>
> Phone : +44 (0)1273 666607
> Fax   : +44 (0)1273 666601
> email : al.slater at scluk.com
>
> Stanton Consultancy Ltd
>
> Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
>
> Registered in England Company number: 1957652 VAT number: GB 760 2433 55
>
>
>
> ------------------------------
>
> Message: 6
> Date: Mon, 6 Apr 2015 12:34:30 +0200
> From: Günther Alka <alka at hfg-gmuend.de>
> To: omnios-discuss <omnios-discuss at lists.omniti.com>
> Subject: Re: [OmniOS-discuss] esxi 5.5 to omnios r151014 nfs server
>         issue
> Message-ID: <4700B3B3-2CED-407D-A131-62FE1E392B53 at hfg-gmuend.de>
> Content-Type: text/plain; charset=us-ascii
>
> Just to rule out a permission problem:
>
> Can you recursively reset the permissions of that filesystem to an
> everyone@=modify setting?
>
>
>
> > On 06.04.2015 at 11:50, Hafiz Rafiyev <rafibeyli at gmail.com> wrote:
> >
> >
> > After upgrading from r151012 to r151014 I have an issue with the NFS
> > server: after the upgrade, some of the ESXi 5.5 NFS datastores connect and
> > some do not.
> >
> > It happens randomly: after restarting OmniOS, again some datastores
> > connect and some do not.
> >
> > Looking at the OmniOS side, the NFS server is up and running.
> >
> > Note: before the upgrade all ESXi datastores were connected and running.
> > OmniOS runs as a VM, with disks connected in HBA passthrough mode.
> >
> > The only log message I see on the OmniOS side is:
> >
> > nfs4cbd[468]: [ID 867284 daemon.notice] nfsv4 cannot determine local
> > hostname binding for transport tcp6 - delegations will not be available on
> > this transport
> >
> >
> > regards
> >
> > Hafiz.
>
>
>
> ------------------------------
>
> Message: 7
> Date: Mon, 6 Apr 2015 09:41:19 -0400
> From: Chris Nagele <nagele at wildbit.com>
> To: "omnios-discuss at lists.omniti.com"
>         <omnios-discuss at lists.omniti.com>
> Subject: Re: [OmniOS-discuss] All SSD pool advice
> Message-ID:
>         <
> CAHfYOdUN_CWsmPVDCZGRh3pCUoSkRkWThwBj7khkj+ztiwC5Zg at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> Thanks everyone. Regarding the expanders, our 4U servers are on the
> following chassis:
>
> http://www.supermicro.com/products/chassis/4U/846/SC846E16-R1200.cfm
>
> We are using all SAS disks, except for the SSDs. How big is the risk
> here when it comes to SAS -> SATA conversion? Our newer servers have
> direct connections on each lane to the disk.
>
> Chris
>
> Chris Nagele
> Co-founder, Wildbit
> Beanstalk, Postmark, dploy.io
>
>
> On Sat, Apr 4, 2015 at 7:18 PM, Doug Hughes <doug at will.to> wrote:
> >
> > We have a couple of machines with all SSD pool (~6-10 Samsung 850 pro is
> > the current favorite). They work great for IOPS. Here's my take.
> > 1) you don't need a dedicated zil. Just let the zpool intersperse it
> > amongst the existing zpool devices. They are plenty fast enough.
> > 2) you don't need an L2arc for the same reason. a smaller number of
> > dedicated devices would likely cause more of a bottleneck than serving
> > off the existing pool devices (unless you were to put it on one of those
> > giant RDRAM things or similar, but that adds a lot of expense)
> >
> >
> >
> >
> >
> > On 4/4/2015 3:07 PM, Chris Nagele wrote:
> >
> > We've been running a few 4U Supermicro servers using ZeusRAM for zil and
> > SSDs for L2. The main disks are regular 1TB SAS.
> >
> > I'm considering moving to all SSD since the pricing has dropped so much.
> > What things should I know or do when moving to all SSD pools? I'm
> > assuming I don't need L2 and that I should keep the ZeusRAM. Should I only
> > use certain types of SSDs?
> >
> > Thanks,
> > Chris
> >
> >
> > --
> >
> > Chris Nagele
> > Co-founder, Wildbit
> > Beanstalk, Postmark, dploy.io
> >
> >
> >
> >
>
>
> ------------------------------
>
> End of OmniOS-discuss Digest, Vol 37, Issue 16
> **********************************************
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 8184 bytes
Desc: not available
URL: <https://omniosce.org/ml-archive/attachments/20150410/75fe4ca7/attachment-0001.png>

