From chip at innovates.com Tue Jan 2 14:40:58 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Tue, 2 Jan 2018 08:40:58 -0600
Subject: [OmniOS-discuss] rpcbind: t_bind failed
Message-ID:

About once every week or two I'm having NFS connections start to collapse
on one of my servers.  Clients will lose their connections over the course
of several hours.

The logs fill with these messages:

Dec 25 16:21:14 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
Dec 25 16:21:14 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
Dec 25 16:21:31 mir-zfs03 last message repeated 85 times
Dec 25 16:21:31 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
Dec 25 16:21:32 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
Dec 25 16:21:34 mir-zfs03 last message repeated 19 times
Dec 25 16:21:37 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 200/transport tcp) TLI error 5
Dec 25 16:22:17 mir-zfs03 last message repeated 116 times
Dec 25 16:22:21 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 206/transport tcp) TLI error 5
Dec 25 16:23:04 mir-zfs03 last message repeated 81 times

This is a fully updated OmniOS CE r151022.

I've tried restarting NFS services, but the only thing that has been
successful in restoring services has been rebooting.

I'm not finding anything useful via Google except the source code that
spits out this message.  HP-UX appears to have had the same issue, which
they patched years ago.  I'm guessing shared NFS/RPC code.

Any clue as to the cause of this and how to fix it?

-Chip

From chip at innovates.com Wed Jan 3 16:02:43 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Wed, 3 Jan 2018 10:02:43 -0600
Subject: [OmniOS-discuss] rpcbind: t_bind failed
In-Reply-To:
References:
Message-ID:

The problem occurred again starting last night.  I have another clue, but
I still don't know how it is occurring or how to fix it.

It looks like all the TCP ports are in "bound" state, but not being
released.

How can I isolate the cause of this?
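One quick way to narrow a problem like this down (a minimal sketch, not
from the original thread; the number of header lines skipped is an
assumption about the netstat layout shown below) is to tally TCP
endpoints by state -- a huge BOUND count means leaked port reservations
rather than a flood of live connections:

    # Count TCP endpoints per state; BOUND dwarfing everything else
    # points at ephemeral-port exhaustion from never-released binds.
    netstat -an -f inet -P tcp | awk 'NR > 4 { state[$NF]++ }
        END { for (s in state) print state[s], s }' | sort -rn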
# netstat -an UDP: IPv4 Local Address Remote Address State -------------------- -------------------- ---------- *.* Unbound *.57921 Idle *.42045 Idle *.* Unbound *.* Unbound *.* Unbound *.* Unbound *.33757 Idle *.37883 Idle *.40744 Idle *.4045 Idle *.111 Idle *.4045 Idle *.* Unbound *.60955 Idle *.42908 Idle *.56487 Idle *.111 Idle *.* Unbound *.60994 Idle *.* Unbound *.2049 Idle *.520 Idle *.2049 Idle *.46876 Idle *.50309 Idle *.123 Idle *.123 Idle 127.0.0.1.123 Idle 10.28.125.29.123 Idle *.57929 Idle *.64351 Idle *.63145 Idle *.39674 Idle *.65280 Idle *.52013 Idle *.47989 Idle *.* Unbound UDP: IPv6 Local Address Remote Address State If --------------------------------- --------------------------------- ---------- ----- *.57921 Idle *.* Unbound *.* Unbound *.40744 Idle *.111 Idle *.4045 Idle *.* Unbound *.60955 Idle *.42908 Idle *.* Unbound *.2049 Idle *.46876 Idle *.123 Idle ::1.123 Idle TCP: IPv4 Local Address Remote Address Swind Send-Q Rwind Recv-Q State -------------------- -------------------- ----- ------ ----- ------ ----------- *.60571 *.* 0 0 1057280 0 BOUND *.46344 *.* 0 0 1057280 0 BOUND *.39729 *.* 0 0 1057280 0 BOUND *.43531 *.* 0 0 1057280 0 BOUND *.49051 *.* 0 0 1057280 0 BOUND *.44876 *.* 0 0 1057280 0 BOUND *.65416 *.* 0 0 1057280 0 BOUND *.47714 *.* 0 0 1057280 0 BOUND *.59055 *.* 0 0 1057280 0 BOUND *.45033 *.* 0 0 1057280 0 BOUND *.63321 *.* 0 0 1057280 0 BOUND *.43896 *.* 0 0 1057280 0 BOUND *.46627 *.* 0 0 1057280 0 BOUND *.35555 *.* 0 0 1057280 0 BOUND *.36115 *.* 0 0 1057280 0 BOUND *.51969 *.* 0 0 1057280 0 BOUND *.63741 *.* 0 0 1057280 0 BOUND *.45747 *.* 0 0 1057280 0 BOUND *.33245 *.* 0 0 1057280 0 BOUND *.49925 *.* 0 0 1057280 0 BOUND *.63503 *.* 0 0 1057280 0 BOUND *.45319 *.* 0 0 1057280 0 BOUND *.39977 *.* 0 0 1057280 0 BOUND ....lots of lines deleted... 
*.37396 *.* 0 0 1057280 0 BOUND *.33735 *.* 0 0 1057280 0 BOUND *.35695 *.* 0 0 1057280 0 BOUND *.36589 *.* 0 0 1057280 0 BOUND *.41484 *.* 0 0 1057280 0 BOUND *.63428 *.* 0 0 1057280 0 BOUND *.54891 *.* 0 0 1057280 0 BOUND *.60222 *.* 0 0 1057280 0 BOUND *.40494 *.* 0 0 1057280 0 BOUND TCP: IPv6 Local Address Remote Address Swind Send-Q Rwind Recv-Q State If --------------------------------- --------------------------------- ----- ------ ----- ------ ----------- ----- *.54749 *.* 0 0 128000 0 LISTEN ::1.5999 *.* 0 0 128000 0 LISTEN *.4045 *.* 0 0 1049200 0 LISTEN *.111 *.* 0 0 128000 0 LISTEN *.* *.* 0 0 128000 0 IDLE *.45543 *.* 0 0 128000 0 LISTEN *.2049 *.* 0 0 1049200 0 LISTEN *.53926 *.* 0 0 128000 0 LISTEN *.50379 *.* 0 0 128000 0 LISTEN Active UNIX domain sockets Address Type Vnode Conn Local Addr Remote Addr ffffd063569210e8 stream-ord 0000000 0000000 ffffd06356921498 stream-ord 0000000 0000000 ffffd06356921bf8 stream-ord 0000000 0000000 ffffd063569260e0 stream-ord 0000000 0000000 ffffd06356926490 stream-ord ffffd0635646d300 0000000 private/scache ffffd06356926840 stream-ord 0000000 0000000 ffffd06356926bf0 stream-ord 0000000 0000000 ffffd063569290d8 stream-ord ffffd0635646d200 0000000 private/anvil ffffd06356929488 stream-ord 0000000 0000000 ffffd06356929838 stream-ord 0000000 0000000 ffffd06356929be8 stream-ord ffffd06356810480 0000000 private/lmtp ffffd0635692d0d0 stream-ord 0000000 0000000 ffffd0635692d480 stream-ord 0000000 0000000 ffffd0635692d830 stream-ord ffffd06356810280 0000000 private/virtual ffffd0635692dbe0 stream-ord 0000000 0000000 ffffd063569320c8 stream-ord 0000000 0000000 ffffd06356932478 stream-ord ffffd0635685aa00 0000000 private/local ffffd06356932828 stream-ord 0000000 0000000 ffffd06356932bd8 stream-ord 0000000 0000000 ffffd063569360c0 stream-ord ffffd0635685ad00 0000000 private/discard ffffd06356936470 stream-ord 0000000 0000000 ffffd06356936820 stream-ord 0000000 0000000 ffffd06356936bd0 stream-ord ffffd0635685ab00 0000000 private/retry ffffd0635693b0b8 stream-ord 0000000 0000000 ffffd0635693b468 stream-ord 0000000 0000000 ffffd0635693b818 stream-ord ffffd0635685ae00 0000000 private/error ffffd0635693bbc8 stream-ord 0000000 0000000 ffffd063568e10b0 stream-ord 0000000 0000000 ffffd063568e1460 stream-ord ffffd0635685a400 0000000 public/showq ffffd063568e1810 stream-ord 0000000 0000000 ffffd063568e1bc0 stream-ord 0000000 0000000 ffffd063568e60a8 stream-ord ffffd0635685a600 0000000 private/relay ffffd063568e6458 stream-ord 0000000 0000000 ffffd063568e6808 stream-ord 0000000 0000000 ffffd063568e6bb8 stream-ord ffffd0635685a900 0000000 private/smtp ffffd063568ea0a0 stream-ord 0000000 0000000 ffffd063568ea450 stream-ord 0000000 0000000 ffffd063568ea800 stream-ord ffffd0635646d100 0000000 private/proxywrite ffffd063568eabb0 stream-ord 0000000 0000000 ffffd063568f0098 stream-ord 0000000 0000000 ffffd063568f0448 stream-ord ffffd0635685ac00 0000000 private/proxymap ffffd063568f07f8 stream-ord 0000000 0000000 ffffd063568f0ba8 stream-ord 0000000 0000000 ffffd063568f2090 stream-ord ffffd0635685a200 0000000 public/flush ffffd063568f2440 stream-ord 0000000 0000000 ffffd063568f27f0 stream-ord 0000000 0000000 ffffd063568f2ba0 stream-ord ffffd0635685a500 0000000 private/verify ffffd063568f9088 stream-ord 0000000 0000000 ffffd063568f9438 stream-ord 0000000 0000000 ffffd063568f97e8 stream-ord ffffd06356810080 0000000 private/trace ffffd063568f9b98 stream-ord 0000000 0000000 ffffd063568fd080 stream-ord 0000000 0000000 ffffd063568fd430 stream-ord ffffd06356810180 0000000 
private/defer ffffd063568fd7e0 stream-ord 0000000 0000000 ffffd063568fdb90 stream-ord 0000000 0000000 ffffd06356840078 stream-ord ffffd0635685a700 0000000 private/bounce ffffd06356840428 stream-ord 0000000 0000000 ffffd063568407d8 stream-ord 0000000 0000000 ffffd06356840b88 stream-ord ffffd0635685a800 0000000 private/rewrite ffffd06356843070 stream-ord ffffd06356810380 0000000 private/tlsmgr ffffd06356843420 stream-ord 0000000 0000000 ffffd063568437d0 stream-ord 0000000 0000000 ffffd06356849068 stream-ord 0000000 0000000 ffffd06356849418 stream-ord 0000000 0000000 ffffd063568497c8 stream-ord ffffd0635685a000 0000000 public/qmgr ffffd06356849b78 stream-ord ffffd0635685a100 0000000 public/cleanup ffffd0635684d060 stream-ord 0000000 0000000 ffffd0635684d410 stream-ord 0000000 0000000 ffffd0635684db70 stream-ord 0000000 0000000 ffffd06355646058 stream-ord 0000000 0000000 ffffd06355646b68 stream-ord ffffd0635685a300 0000000 public/pickup ffffd063551bf3f8 stream-ord ffffd063193fe900 0000000 /var/run/.inetd.uds ffffd063550e7b50 dgram ffffd063550eb380 0000000 /var/run/in.rdisc_mib ffffd06355031798 dgram ffffd063536c8800 0000000 /var/run/in.ndpd_mib ffffd06355031b48 stream-ord ffffd063536c8c00 0000000 /var/run/in.ndpd_ipadm ffffd0635265a028 stream-ord 0000000 ffffd0634e4acd00 /var/run/dbus/system_bus_socket ffffd0635265a788 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd0635265ab38 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351d553d0 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351d55780 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351d55b30 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd06351996018 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd063519963c8 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351996778 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd063500fe010 stream-ord 0000000 0000000 /var/run/hald/dbus-5Qrha0Wmu3 ffffd063500fe3c0 stream-ord 0000000 ffffd063500ffa80 /var/run/hald/dbus-5Qrha0Wmu3 ffffd063500fe770 stream-ord ffffd063500ffa80 0000000 /var/run/hald/dbus-5Qrha0Wmu3 ffffd063500feb20 stream-ord ffffd063500ffc80 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd0634e4ad008 stream-ord 0000000 0000000 ffffd0634e4ad3b8 stream-ord 0000000 0000000 ffffd0634e4ad768 stream-ord 0000000 0000000 /var/run/dbus/system_bus_socket ffffd0634e4adb18 stream-ord ffffd0634e4acd00 0000000 /var/run/dbus/system_bus_socket A sorted output shows nearly all 64K ports in bound state. On Tue, Jan 2, 2018 at 8:40 AM, Schweiss, Chip wrote: > About once every week or two I'm having NFS connections start to collapse > to one of my servers. Clients will lose thier connections of the the > course of several hours. 
The logs fill with these messages:
>
> Dec 25 16:21:14 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
> Dec 25 16:21:14 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
> Dec 25 16:21:31 mir-zfs03 last message repeated 85 times
> Dec 25 16:21:31 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
> Dec 25 16:21:32 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
> Dec 25 16:21:34 mir-zfs03 last message repeated 19 times
> Dec 25 16:21:37 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 200/transport tcp) TLI error 5
> Dec 25 16:22:17 mir-zfs03 last message repeated 116 times
> Dec 25 16:22:21 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 206/transport tcp) TLI error 5
> Dec 25 16:23:04 mir-zfs03 last message repeated 81 times
>
> This is a fully updated OmniOS CE r151022.
>
> I've tried restarting NFS services, but the only thing that has been
> successful in restoring services has been rebooting.
>
> I'm not finding anything useful via Google except the source code that
> spits out this message.  HP-UX appears to have had the same issue, which
> they patched years ago.  I'm guessing shared NFS/RPC code.
>
> Any clue as to the cause of this and how to fix it?
>
> -Chip

From mir at miras.org Wed Jan 3 16:14:40 2018
From: mir at miras.org (Michael Rasmussen)
Date: Wed, 3 Jan 2018 17:14:40 +0100
Subject: [OmniOS-discuss] rpcbind: t_bind failed
In-Reply-To:
References:
Message-ID: <20180103171440.1dcfda5a@sleipner.datanom.net>

On Wed, 3 Jan 2018 10:02:43 -0600
"Schweiss, Chip" wrote:

> The problem occurred again starting last night.  I have another clue,
> but I still don't know how it is occurring or how to fix it.
>
> It looks like all the TCP ports are in "bound" state, but not being
> released.
>
> How can I isolate the cause of this?
>
lsof should be able to tell you which program is listening on a specific
port:

lsof -i :port

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael rasmussen cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir datanom net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir miras org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
You can never tell which way the train went by looking at the tracks.

From chip at innovates.com Wed Jan 3 18:55:09 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Wed, 3 Jan 2018 12:55:09 -0600
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To: <20180103163208.GD1629@telcontar>
References: <20180103163208.GD1629@telcontar>
Message-ID:

Hopefully the patch Marcel is talking about fixes this.  I've at least
figured out enough to predict when the problem is imminent.

We have been migrating to using the automounter instead of hard mounts,
which could be related to this problem growing over time.

Just an FYI: I've kept the server running in this state, but moved its
storage pool to a sister server.
The port binding problem remains with NO NFS clients connected, but
neither pfiles nor lsof shows rpcbind as the culprit:

# netstat -an|grep BOUND|wc -l
32739

# /opt/ozmt/bin/SunOS/lsof -i:41155

{nothing returned}

# pfiles `pgrep rpcbind`
449:    /usr/sbin/rpcbind
  Current rlimit: 65536 file descriptors
   0: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2
      O_RDWR
      /devices/pseudo/mm@0:null
      offset:0
   1: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2
      O_RDWR
      /devices/pseudo/mm@0:null
      offset:0
   2: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2
      O_RDWR
      /devices/pseudo/mm@0:null
      offset:0
   3: S_IFCHR mode:0000 dev:527,0 ino:61271 uid:0 gid:0 rdev:231,64
      O_RDWR
        sockname: AF_INET6 ::  port: 111
      /devices/pseudo/udp6@0:udp6
      offset:0
   4: S_IFCHR mode:0000 dev:527,0 ino:50998 uid:0 gid:0 rdev:231,59
      O_RDWR
        sockname: AF_INET6 ::  port: 0
      /devices/pseudo/udp6@0:udp6
      offset:0
   5: S_IFCHR mode:0000 dev:527,0 ino:61264 uid:0 gid:0 rdev:231,58
      O_RDWR
        sockname: AF_INET6 ::  port: 60955
      /devices/pseudo/udp6@0:udp6
      offset:0
   6: S_IFCHR mode:0000 dev:527,0 ino:64334 uid:0 gid:0 rdev:224,57
      O_RDWR
        sockname: AF_INET6 ::  port: 111
      /devices/pseudo/tcp6@0:tcp6
      offset:0
   7: S_IFCHR mode:0000 dev:527,0 ino:64333 uid:0 gid:0 rdev:224,56
      O_RDWR
        sockname: AF_INET6 ::  port: 0
      /devices/pseudo/tcp6@0:tcp6
      offset:0
   8: S_IFCHR mode:0000 dev:527,0 ino:64332 uid:0 gid:0 rdev:230,55
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 111
      /devices/pseudo/udp@0:udp
      offset:0
   9: S_IFCHR mode:0000 dev:527,0 ino:64330 uid:0 gid:0 rdev:230,54
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 0
      /devices/pseudo/udp@0:udp
      offset:0
  10: S_IFCHR mode:0000 dev:527,0 ino:64331 uid:0 gid:0 rdev:230,53
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 60994
      /devices/pseudo/udp@0:udp
      offset:0
  11: S_IFCHR mode:0000 dev:527,0 ino:64327 uid:0 gid:0 rdev:223,52
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 111
      /devices/pseudo/tcp@0:tcp
      offset:0
  12: S_IFCHR mode:0000 dev:527,0 ino:64326 uid:0 gid:0 rdev:223,51
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 0
      /devices/pseudo/tcp@0:tcp
      offset:0
  13: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,32
      O_RDWR
      /devices/pseudo/tl@0:ticlts
      offset:0
  14: S_IFCHR mode:0000 dev:527,0 ino:64328 uid:0 gid:0 rdev:226,33
      O_RDWR
      /devices/pseudo/tl@0:ticlts
      offset:0
  15: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,35
      O_RDWR
      /devices/pseudo/tl@0:ticlts
      offset:0
  16: S_IFCHR mode:0000 dev:527,0 ino:64322 uid:0 gid:0 rdev:226,36
      O_RDWR
      /devices/pseudo/tl@0:ticotsord
      offset:0
  17: S_IFCHR mode:0000 dev:527,0 ino:64321 uid:0 gid:0 rdev:226,37
      O_RDWR
      /devices/pseudo/tl@0:ticotsord
      offset:0
  18: S_IFCHR mode:0000 dev:527,0 ino:64030 uid:0 gid:0 rdev:226,39
      O_RDWR
      /devices/pseudo/tl@0:ticots
      offset:0
  19: S_IFCHR mode:0000 dev:527,0 ino:64029 uid:0 gid:0 rdev:226,40
      O_RDWR
      /devices/pseudo/tl@0:ticots
      offset:0
  20: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0
      O_RDWR|O_NONBLOCK
  21: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0
      O_RDWR|O_NONBLOCK
  23: S_IFCHR mode:0000 dev:527,0 ino:33089 uid:0 gid:0 rdev:129,21273
      O_WRONLY FD_CLOEXEC
      /devices/pseudo/log@0:conslog
      offset:0

Restarting rpcbind doesn't affect it either:

# svcadm restart svc:/network/rpc/bind:default

# netstat -an|grep BOUND|wc -l
32739

Until this patch gets integrated, I'll monitor the number of bound ports
so I know when I should fail my pool over again.
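A watcher along the lines Chip describes could look like this (a hedged
sketch, not from the thread; the threshold, interval, and syslog
priority are assumptions):

    #!/bin/sh
    # Warn via syslog when BOUND TCP endpoints -- the leak signature
    # above -- approach the size of the ephemeral port range, leaving
    # time to fail the pool over before t_bind starts failing.
    THRESHOLD=25000
    while :; do
            n=`netstat -an | grep -c BOUND`
            if [ "$n" -gt "$THRESHOLD" ]; then
                    logger -p daemon.warning "rpc port leak: $n TCP endpoints in BOUND state"
            fi
            sleep 300
    done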
On Wed, Jan 3, 2018 at 10:32 AM, Marcel Telka wrote:

> On Wed, Jan 03, 2018 at 10:02:43AM -0600, Schweiss, Chip wrote:
> > The problem occurred again starting last night.  I have another clue,
> > but I still don't know how it is occurring or how to fix it.
> >
> > It looks like all the TCP ports are in "bound" state, but not being
> > released.
> >
> > How can I isolate the cause of this?
>
> This is a bug in rpcmod, very likely related to
> https://www.illumos.org/issues/1616
>
> I discussed this a few weeks back with someone who faced the same issue.
> It looks like he found the cause and has a fix for it.  I thought he
> would post a review request, but that hasn't happened for some reason
> yet.
>
> I'll try to push this forward...
>
> Thanks.
>
> --
> +-------------------------------------------+
> | Marcel Telka   e-mail:   marcel at telka.sk |
> |                homepage: http://telka.sk/ |
> |                jabber:   marcel at jabber.sk |
> +-------------------------------------------+

From paladinemishakal at gmail.com Fri Jan 5 03:58:16 2018
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Fri, 5 Jan 2018 11:58:16 +0800
Subject: [OmniOS-discuss] Problem with BSD Loader and boot mirror
In-Reply-To:
References:
Message-ID:

Hi All,

Seems like no one has hit this issue?  In that case, how do I work with
the BSD loader to check, and also to try to load the rpool?

Thanks & Regards.

On Tue, Dec 19, 2017 at 6:44 PM, Lawrence Giam wrote:

> Hi All,
>
> I have a physical server on which I am taking the time to install and
> test OmniOS CE R151022.
>
> As with the Grub loader and OmniOS R151014, I use the following steps to
> set up a boot mirror:
> 1. Create a partition on the SSD for the boot mirror.
> 2. Run prtvtoc /dev/rdsk/c2t1d0s0 | fmthard -s - /dev/rdsk/c2t0d0s0
> 3. Attach the partition to the rpool: zpool attach -f rpool c2t1d0s0 c2t0d0s0
> 4. Wait for the resilver to finish and then reboot the server; ensure it boots ok.
> 5. After it boots ok, run installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c2t0d0s0
> 6. Reboot and ensure booting is good.
> 7. Shut down and simulate c2t1d0s0 failure by taking out the SSD.
> 8. Power up the server; the system still boots to OmniOS, but with an
>    alert that one of the boot mirror devices is missing.
>
> As with the BSD loader and OmniOS R151022, I use the following steps to
> set up a boot mirror:
> 1. Create a partition on the SSD for the boot mirror.
> 2. Run prtvtoc /dev/rdsk/c2t1d0s0 | fmthard -s - /dev/rdsk/c2t0d0s0
> 3. Attach the partition to the rpool: zpool attach -f rpool c2t1d0s0 c2t0d0s0
> 4. Wait for the resilver to finish and then reboot the server; ensure it boots ok.
> 5. After it boots ok, run bootadm install-bootloader
> 6. Reboot and ensure booting is good.
> 7. Shut down and simulate c2t1d0s0 failure by taking out the SSD.
> 8. Power up the server; the system boots with the following message:
>
> Loading complete
> Consoles: internal video/keyboard
> BIOS drive C: is disk 0
> BIOS drive D: is disk 1
> ZFS: i/o error - all block copies unavailable
> ZFS: can't read MOS of pool rpool
> ZFS: i/o error - all block copies unavailable
> ZFS: pool tankAAA is not supported
> BIOS 608kB/1983056kB available memory
>
> illumos/x86 ZFS enabled bootstrap loader, Revision 1.1
> ZFS: can't find pool by guid
> ZFS: can't find pool by guid
> loading CORE EXT words
> loading SEARCH & SEACH-EXT words
> loading John-Hopkins locals
> loading MARKER
> loading ficl O-O extensions
> loading ficl utility classes
> loading ficl string class
>
> start not found
>
> Type '?' for a list of commands, 'help' for more detailed help.
> ok
> ------------------------------------------------------------------------
>
> I have 2 pools - rpool and tankAAA - configured, but somehow the BSD
> loader is
> 1. unable to recognise the boot mirror
> 2. cannot see the other dataset (e.g. tankAAA)
>
> Next, I shut down the server, put back the disk and powered up the
> server again, and it was able to boot normally as before.
>
> Are my steps to set up the boot mirror wrong, or is there something that
> I am missing?
>
> Thanks & Regards.

From chip at innovates.com Fri Jan 5 14:57:23 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Fri, 5 Jan 2018 08:57:23 -0600
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
Message-ID:

In the previous Solaris-style installer we had the option of only using a
portion of the disk that the rpool went on.  This was very good for SSDs,
which perform better and last longer if they have some additional slack
space that never has data written to it.

Is there a way to achieve this with the new installer?

-Chip

From jcoombs at staff.gwi.net Fri Jan 5 15:08:47 2018
From: jcoombs at staff.gwi.net (Josh Coombs)
Date: Fri, 5 Jan 2018 10:08:47 -0500
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To:
References:
Message-ID:

I've been overprovisioning my SSDs using either manufacturer-supplied
tools, in the case of Samsung or Intel units, or raw SATA commands on old
Vertex drives.  The OS sees a smaller volume and the drive knows that
'slack' space is truly slack.

Joshua Coombs
GWI
office 207-494-2140
www.gwi.net

On Fri, Jan 5, 2018 at 9:57 AM, Schweiss, Chip wrote:

> In the previous Solaris-style installer we had the option of only using
> a portion of the disk that the rpool went on.  This was very good for
> SSDs, which perform better and last longer if they have some additional
> slack space that never has data written to it.
>
> Is there a way to achieve this with the new installer?
>
> -Chip

From vab at bb-c.de Fri Jan 5 15:11:14 2018
From: vab at bb-c.de (Volker A. Brandt)
Date: Fri, 5 Jan 2018 16:11:14 +0100
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To:
References:
Message-ID: <23119.38290.578094.766436@shelob.bb-c.de>

Hi Chip!
> In the previous Solaris-style installer we had the option of only using
> a portion of the disk that the rpool went on.  This was very good for
> SSDs, which perform better and last longer if they have some additional
> slack space that never has data written to it.
>
> Is there a way to achieve this with the new installer?

Yes.  Just drop to the shell from the installation menu and create your
rpool using fdisk, format, and zpool create.  Exit the shell and select
"use existing pool".


Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From chip at innovates.com Fri Jan 5 15:14:34 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Fri, 5 Jan 2018 09:14:34 -0600
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To: <23119.38290.578094.766436@shelob.bb-c.de>
References: <23119.38290.578094.766436@shelob.bb-c.de>
Message-ID:

I didn't think about that.  Thanks!

On Fri, Jan 5, 2018 at 9:11 AM, Volker A. Brandt wrote:

> Hi Chip!
>
> > In the previous Solaris-style installer we had the option of only
> > using a portion of the disk that the rpool went on.  This was very
> > good for SSDs, which perform better and last longer if they have some
> > additional slack space that never has data written to it.
> >
> > Is there a way to achieve this with the new installer?
>
> Yes.  Just drop to the shell from the installation menu and create your
> rpool using fdisk, format, and zpool create.  Exit the shell and select
> "use existing pool".
>
> Regards -- Volker
> --
> ------------------------------------------------------------------------
> Volker A. Brandt               Consulting and Support for Oracle Solaris
> Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
> Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
> Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
> Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt
>
> "When logic and proportion have fallen sloppy dead"

From omnios at citrus-it.net Fri Jan 5 17:07:48 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Fri, 5 Jan 2018 17:07:48 +0000 (UTC)
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To:
References: <23119.38290.578094.766436@shelob.bb-c.de>
Message-ID:

On Fri, 5 Jan 2018, Schweiss, Chip wrote:

; I didn't think about that.  Thanks!
;
; On Fri, Jan 5, 2018 at 9:11 AM, Volker A. Brandt wrote:
;
; > Yes.  Just drop to the shell from the installation menu and create your
; > rpool using fdisk, format, and zpool create.  Exit the shell and select
; > "use existing pool".
Some more hints on this at
https://lists.omniti.com/pipermail/omnios-discuss/2017-November/009402.html

Regards,

Andy

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From natxo.asenjo at gmail.com Fri Jan 5 19:26:04 2018
From: natxo.asenjo at gmail.com (Natxo Asenjo)
Date: Fri, 5 Jan 2018 20:26:04 +0100
Subject: [OmniOS-discuss] omnios not updating
Message-ID:

hi,

my home filer running OmniOS v11 r151024d keeps updating the same set of
packages:

# pkg update -v
WARNING: The boot environment being modified is not the active one.
Changes made in the active BE will not be reflected on the next boot.

            Packages to update:         7
     Estimated space available: 198.39 GB
Estimated space to be consumed: 116.78 MB
       Create boot environment:       Yes
     Activate boot environment:       Yes
Create backup boot environment:        No
          Rebuild boot archive:       Yes

Changed packages:
omnios
  editor/vim
    8.0.586-0.151024:20171030T140745Z -> 8.0.586-0.151024:20171201T220955Z
  library/nspr
    4.17-0.151024:20171030T152143Z -> 4.17-0.151024:20171213T000253Z
  library/security/openssl
    1.0.2.13-0.151024 -> 1.0.2.14-0.151024
  network/rsync
    3.1.2-0.151024:20171030T152409Z -> 3.1.2-0.151024:20171207T202651Z
  release/name
    0.5.11-0.151024:20171129T095705Z -> 0.5.11-0.151024:20171218T121347Z
  system/kernel/dtrace/providers
    0.5.11-0.151024:20171030T151622Z -> 0.5.11-0.151024:20171203T191145Z
  system/library/mozilla-nss
    3.33-0.151024:20171030T152105Z -> 3.33-0.151024:20171213T000211Z

It runs correctly, and I get this message:

A clone of r151024-1 exists and has been updated and activated.
On the next boot the Boot Environment r151024-6 will be
mounted on '/'.  Reboot when ready to switch to this updated BE.

Updating package cache                           1/1

---------------------------------------------------------------------------
NOTE: Please review release notes posted at:

http://www.omniosce.org/releasenotes

But after rebooting, if I retry pkg update -nv, I see the same set of
available patches.  If I run pkg history, the last patch date is last
month:

2017-12-10T14:04:54 refresh-publishers               pkg       Succeeded
2017-12-10T14:04:55 rebuild-image-catalogs           pkg       Succeeded
2017-12-19T20:31:07 refresh-publishers               pkg       Succeeded
2017-12-19T20:31:09 rebuild-image-catalogs           pkg       Succeeded

Any idea as to what is going on? ;-)

Thanks!
--
Groeten,
natxo

From natxo.asenjo at gmail.com Fri Jan 5 19:31:36 2018
From: natxo.asenjo at gmail.com (Natxo Asenjo)
Date: Fri, 5 Jan 2018 20:31:36 +0100
Subject: [OmniOS-discuss] omnios not updating
In-Reply-To:
References:
Message-ID:

ok, rebooting solved this non-issue.  Sorry for the noise.

On Fri, Jan 5, 2018 at 8:26 PM, Natxo Asenjo wrote:

> hi,
>
> my home filer running OmniOS v11 r151024d keeps updating the same set of
> packages:
>
> # pkg update -v
> WARNING: The boot environment being modified is not the active one.
> Changes made in the active BE will not be reflected on the next boot.
>
>             Packages to update:         7
>      Estimated space available: 198.39 GB
> Estimated space to be consumed: 116.78 MB
>        Create boot environment:       Yes
>      Activate boot environment:       Yes
> Create backup boot environment:        No
>           Rebuild boot archive:       Yes
>
> Changed packages:
> omnios
>   editor/vim
>     8.0.586-0.151024:20171030T140745Z -> 8.0.586-0.151024:20171201T220955Z
>   library/nspr
>     4.17-0.151024:20171030T152143Z -> 4.17-0.151024:20171213T000253Z
>   library/security/openssl
>     1.0.2.13-0.151024 -> 1.0.2.14-0.151024
>   network/rsync
>     3.1.2-0.151024:20171030T152409Z -> 3.1.2-0.151024:20171207T202651Z
>   release/name
>     0.5.11-0.151024:20171129T095705Z -> 0.5.11-0.151024:20171218T121347Z
>   system/kernel/dtrace/providers
>     0.5.11-0.151024:20171030T151622Z -> 0.5.11-0.151024:20171203T191145Z
>   system/library/mozilla-nss
>     3.33-0.151024:20171030T152105Z -> 3.33-0.151024:20171213T000211Z
>
> It runs correctly, and I get this message:
>
> A clone of r151024-1 exists and has been updated and activated.
> On the next boot the Boot Environment r151024-6 will be
> mounted on '/'.  Reboot when ready to switch to this updated BE.
>
> Updating package cache                           1/1
>
> ---------------------------------------------------------------------------
> NOTE: Please review release notes posted at:
>
> http://www.omniosce.org/releasenotes
>
> But after rebooting, if I retry pkg update -nv, I see the same set of
> available patches.  If I run pkg history, the last patch date is last
> month:
>
> 2017-12-10T14:04:54 refresh-publishers               pkg       Succeeded
> 2017-12-10T14:04:55 rebuild-image-catalogs           pkg       Succeeded
> 2017-12-19T20:31:07 refresh-publishers               pkg       Succeeded
> 2017-12-19T20:31:09 rebuild-image-catalogs           pkg       Succeeded
>
> Any idea as to what is going on? ;-)
>
> Thanks!
> --
> Groeten,
> natxo

--
Groeten,
natxo

From youzhong at gmail.com Sun Jan 7 20:15:33 2018
From: youzhong at gmail.com (Youzhong Yang)
Date: Sun, 7 Jan 2018 15:15:33 -0500
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To:
References: <20180103163208.GD1629@telcontar>
Message-ID:

Not sure if it's the same issue we reported 3 years ago.  We applied our
patch and haven't seen this issue ever since.

https://illumos.topicbox.com/groups/developer/Te5808458a5a5a14f-M74735db9aeccaa5d8c3a70a4

On Wed, Jan 3, 2018 at 1:55 PM, Schweiss, Chip wrote:

> Hopefully the patch Marcel is talking about fixes this.  I've at least
> figured out enough to predict when the problem is imminent.
>
> We have been migrating to using the automounter instead of hard mounts,
> which could be related to this problem growing over time.
>
> Just an FYI: I've kept the server running in this state, but moved its
> storage pool to a sister server.
The port binding problem remains with NO > NFS clients connected, but neither pfiles or lsof shows rpcbind as the > culprit: > > # netstat -an|grep BOUND|wc -l > 32739 > > # /opt/ozmt/bin/SunOS/lsof -i:41155 > > {nothing returned} > > # pfiles `pgrep rpcbind` > 449: /usr/sbin/rpcbind > Current rlimit: 65536 file descriptors > 0: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2 > O_RDWR > /devices/pseudo/mm at 0:null > offset:0 > 1: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2 > O_RDWR > /devices/pseudo/mm at 0:null > offset:0 > 2: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2 > O_RDWR > /devices/pseudo/mm at 0:null > offset:0 > 3: S_IFCHR mode:0000 dev:527,0 ino:61271 uid:0 gid:0 rdev:231,64 > O_RDWR > sockname: AF_INET6 :: port: 111 > /devices/pseudo/udp6 at 0:udp6 > offset:0 > 4: S_IFCHR mode:0000 dev:527,0 ino:50998 uid:0 gid:0 rdev:231,59 > O_RDWR > sockname: AF_INET6 :: port: 0 > /devices/pseudo/udp6 at 0:udp6 > offset:0 > 5: S_IFCHR mode:0000 dev:527,0 ino:61264 uid:0 gid:0 rdev:231,58 > O_RDWR > sockname: AF_INET6 :: port: 60955 > /devices/pseudo/udp6 at 0:udp6 > offset:0 > 6: S_IFCHR mode:0000 dev:527,0 ino:64334 uid:0 gid:0 rdev:224,57 > O_RDWR > sockname: AF_INET6 :: port: 111 > /devices/pseudo/tcp6 at 0:tcp6 > offset:0 > 7: S_IFCHR mode:0000 dev:527,0 ino:64333 uid:0 gid:0 rdev:224,56 > O_RDWR > sockname: AF_INET6 :: port: 0 > /devices/pseudo/tcp6 at 0:tcp6 > offset:0 > 8: S_IFCHR mode:0000 dev:527,0 ino:64332 uid:0 gid:0 rdev:230,55 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 111 > /devices/pseudo/udp at 0:udp > offset:0 > 9: S_IFCHR mode:0000 dev:527,0 ino:64330 uid:0 gid:0 rdev:230,54 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 0 > /devices/pseudo/udp at 0:udp > offset:0 > 10: S_IFCHR mode:0000 dev:527,0 ino:64331 uid:0 gid:0 rdev:230,53 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 60994 > /devices/pseudo/udp at 0:udp > offset:0 > 11: S_IFCHR mode:0000 dev:527,0 ino:64327 uid:0 gid:0 rdev:223,52 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 111 > /devices/pseudo/tcp at 0:tcp > offset:0 > 12: S_IFCHR mode:0000 dev:527,0 ino:64326 uid:0 gid:0 rdev:223,51 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 0 > /devices/pseudo/tcp at 0:tcp > offset:0 > 13: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,32 > O_RDWR > /devices/pseudo/tl at 0:ticlts > offset:0 > 14: S_IFCHR mode:0000 dev:527,0 ino:64328 uid:0 gid:0 rdev:226,33 > O_RDWR > /devices/pseudo/tl at 0:ticlts > offset:0 > 15: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,35 > O_RDWR > /devices/pseudo/tl at 0:ticlts > offset:0 > 16: S_IFCHR mode:0000 dev:527,0 ino:64322 uid:0 gid:0 rdev:226,36 > O_RDWR > /devices/pseudo/tl at 0:ticotsord > offset:0 > 17: S_IFCHR mode:0000 dev:527,0 ino:64321 uid:0 gid:0 rdev:226,37 > O_RDWR > /devices/pseudo/tl at 0:ticotsord > offset:0 > 18: S_IFCHR mode:0000 dev:527,0 ino:64030 uid:0 gid:0 rdev:226,39 > O_RDWR > /devices/pseudo/tl at 0:ticots > offset:0 > 19: S_IFCHR mode:0000 dev:527,0 ino:64029 uid:0 gid:0 rdev:226,40 > O_RDWR > /devices/pseudo/tl at 0:ticots > offset:0 > 20: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0 > O_RDWR|O_NONBLOCK > 21: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0 > O_RDWR|O_NONBLOCK > 23: S_IFCHR mode:0000 dev:527,0 ino:33089 uid:0 gid:0 rdev:129,21273 > O_WRONLY FD_CLOEXEC > /devices/pseudo/log at 0:conslog > offset:0 > > Restarting rpcbind doesn't affect it either: > > # svcadm restart svc:/network/rpc/bind:default > > # netstat -an|grep BOUND|wc -l > 32739 > > In the 
interim of this patch getting integrated I'll monitor the number of
> bound ports to know when I should fail my pool over again.
>
> On Wed, Jan 3, 2018 at 10:32 AM, Marcel Telka wrote:
>
>> On Wed, Jan 03, 2018 at 10:02:43AM -0600, Schweiss, Chip wrote:
>> > The problem occurred again starting last night.  I have another clue,
>> > but I still don't know how it is occurring or how to fix it.
>> >
>> > It looks like all the TCP ports are in "bound" state, but not being
>> > released.
>> >
>> > How can I isolate the cause of this?
>>
>> This is a bug in rpcmod, very likely related to
>> https://www.illumos.org/issues/1616
>>
>> I discussed this a few weeks back with someone who faced the same
>> issue.  It looks like he found the cause and has a fix for it.  I
>> thought he would post a review request, but that hasn't happened for
>> some reason yet.
>>
>> I'll try to push this forward...
>>
>> Thanks.
>>
>> --
>> +-------------------------------------------+
>> | Marcel Telka   e-mail:   marcel at telka.sk |
>> |                homepage: http://telka.sk/ |
>> |                jabber:   marcel at jabber.sk |
>> +-------------------------------------------+

From youzhong at gmail.com Mon Jan 8 16:43:24 2018
From: youzhong at gmail.com (Youzhong Yang)
Date: Mon, 8 Jan 2018 11:43:24 -0500
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To:
References: <20180103163208.GD1629@telcontar>
Message-ID:

This is our patch.  It was applied 3 years ago, so the line numbers could
be different for the latest version of the file.

diff --git a/usr/src/uts/common/rpc/clnt_cots.c b/usr/src/uts/common/rpc/clnt_cots.c
index 4466e93..0a0951d 100644
--- a/usr/src/uts/common/rpc/clnt_cots.c
+++ b/usr/src/uts/common/rpc/clnt_cots.c
@@ -2285,6 +2285,7 @@ start_retry_loop:
 		if (rpcerr->re_status == RPC_SUCCESS)
 			rpcerr->re_status = RPC_XPRTFAILED;
 		cm_entry->x_connected = FALSE;
+		cm_entry->x_dead = TRUE;
 	} else
 		cm_entry->x_connected = connected;

@@ -2403,6 +2404,7 @@ connmgr_wrapconnect(
 		if (rpcerr->re_status == RPC_SUCCESS)
 			rpcerr->re_status = RPC_XPRTFAILED;
 		cm_entry->x_connected = FALSE;
+		cm_entry->x_dead = TRUE;
 	} else
 		cm_entry->x_connected = connected;

On Mon, Jan 8, 2018 at 11:21 AM, Dan McDonald wrote:

> > On Jan 7, 2018, at 3:15 PM, Youzhong Yang wrote:
> >
> > Not sure if it's the same issue we reported 3 years ago.  We applied
> > our patch and haven't seen this issue ever since.
> >
> > https://illumos.topicbox.com/groups/developer/Te5808458a5a5a14f-M74735db9aeccaa5d8c3a70a4
>
> To quote that e-mail:
>
> > Hi Marcel:
> > It looks like we're getting an "early disconnect".  This is what is
> > leading to the accumulation of bound reserved ports.  The scenario for
> > reproduction is as follows:
> >
> > 1. Linux DEBIAN7.4 client acquires and releases lock on file on
> >    server (via NFS).
> > 2. reboot Linux client (but do so _before_ MIR_CLNT_IDLE_TIMEOUT
> >    interval fires on server side).
> > 3. when Linux client comes back up, repeat step 1.
> >
> > At this point, a cm_entry with only the ORDREL flag set in
> > x_state_word will remain in the cm_entry linked list (cm_hd).
> > It appears that without at least a DEAD flag set in x_state_word,
> > this cm_entry will remain bound to the port... and will never be
> > garbage collected.
> >
> > To experiment, we added
> >     cm_entry->x_dead = TRUE;
> > at lines 2272 and 2390 here:
> > https://github.com/joyent/illumos-joyent/blob/master/usr/src/uts/common/rpc/clnt_cots.c
> >
> > Testing with the above reproduction scenario, we are taking the path
> > of line 2272 -- and with the DEAD flag set in x_state_word, these
> > "zombie" cm_entries are now being cleaned up, and we're no longer
> > accumulating/leaking reserved ports.
> >
> > Is there more to it?  This seems too simple a fix.  Are there
> > unintended consequences we should be looking/testing for?  Does this
> > seem like it might be #1616 as well?
> > Thoughts?
> > Thanks!
> > -Ken & Youzhong
>
> And here's the patch in diff form for easier consumption:
>
> diff --git a/usr/src/uts/common/rpc/clnt_cots.c b/usr/src/uts/common/rpc/clnt_cots.c
> index 2e64ab0..f9b78ff 100644
> --- a/usr/src/uts/common/rpc/clnt_cots.c
> +++ b/usr/src/uts/common/rpc/clnt_cots.c
> @@ -2269,6 +2269,7 @@ start_retry_loop:
>  	cm_entry->x_ordrel = FALSE;
>
>  	cm_entry->x_tidu_size = tidu_size;
> +	cm_entry->x_dead = TRUE;
>
>  	if (cm_entry->x_early_disc) {
>  		/*
> @@ -2387,6 +2388,7 @@ connmgr_wrapconnect(
>
>  	mutex_enter(&connmgr_lock);
> +	cm_entry->x_dead = TRUE;
>
>  	if (cm_entry->x_early_disc) {
>  		/*
>
> Went back and checked my notes - I was traveling when that thread was
> going on, so I likely missed it altogether in the hustle/bustle of that.
>
> It seems at first glance you're being too aggressive in setting X_DEAD
> (note that this code gives you BOTH ways to set the flag, via C bit
> fields OR the macro form... makes for very difficult reading, IMHO), but
> if my concern was valid you'd likely see far more outright failures.
>
> Maybe that patch is all we need?
>
> Dan

From softwareinforjam at gmail.com Tue Jan 9 22:07:47 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Tue, 9 Jan 2018 17:07:47 -0500
Subject: [OmniOS-discuss] Problem with attach
Message-ID:

Hi All

I am not sure how this happened.  I thought I followed the instructions,
but I now have a problem.  My non-global zone is now out of sync with my
global zone.

Global zone version:     entire@11-0.151022:20180108T221634Z
Non-Global zone version: entire@11-0.151020:20161102T012108Z

I tried using zoneadm -z zonename attach -u but that failed.  Is there a
way to sync the non-global zone?

Regards

From hasslerd at gmx.li Tue Jan 9 23:56:48 2018
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 10 Jan 2018 00:56:48 +0100
Subject: [OmniOS-discuss] Problem with attach
In-Reply-To:
References:
Message-ID:

Hi,

I am not aware of an entire@11-0.151022:20180108T221634Z in our IPS
repos, as we only ship those for r151022:

hadfl@r151022-build:~$ pkg list -avf entire
FMRI                                             IFO
pkg://omnios/entire@11-0.151022:20171031T101418Z i--
pkg://omnios/entire@11-0.151022:20170917T145315Z ---
pkg://omnios/entire@11-0.151022:20170511T002513Z ---

Could you please elaborate a bit more on how you did the upgrade (i.e.
the steps you took, setting the publisher, etc.)?

On 01/09/2018 11:07 PM, Software Information wrote:
> Hi All
> I am not sure how this happened.  I thought I followed the instructions,
> but I now have a problem.  My non-global zone is now out of sync with my
> global zone.
>
> Global zone version:     entire@11-0.151022:20180108T221634Z
> Non-Global zone version: entire@11-0.151020:20161102T012108Z
>
> I tried using zoneadm -z zonename attach -u but that failed.  Is there a
> way to sync the non-global zone?
>
> Regards

From groenveld at acm.org Wed Jan 10 00:33:55 2018
From: groenveld at acm.org (John D Groenveld)
Date: Tue, 09 Jan 2018 19:33:55 -0500
Subject: [OmniOS-discuss] Problem with attach
In-Reply-To: Your message of "Wed, 10 Jan 2018 00:56:48 +0100."
References:
Message-ID: <201801100033.w0A0Xth7012438@groenveld.us>

In message , Dominik Hassler writes:
>Could you please elaborate a bit more on how you did the upgrade (i.e.
>the steps you took, setting the publisher, etc.)?

And the attach.log.

John
groenveld at acm.org

From stephan.budach at jvm.de Fri Jan 12 10:17:09 2018
From: stephan.budach at jvm.de (Stephan Budach)
Date: Fri, 12 Jan 2018 11:17:09 +0100 (CET)
Subject: [OmniOS-discuss] NVMe under omniOS CE
In-Reply-To: <478079415.1033.1515752082909.JavaMail.stephan.budach@stephan.budach.jvm.de>
Message-ID: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>

Hi,

I finally got the first of my two Supermicro 2028R-N48M NVMe servers.  I
installed the latest OmniOSce on it and, as it seems, it doesn't
recognize the NVMe drives.  This box is equipped with 24x Intel DC P4500,
PCIe 3.1 NVMe.

Does anybody know why those are not recognized?

Thanks,
Stephan

From mir at miras.org Fri Jan 12 11:37:38 2018
From: mir at miras.org (Michael Rasmussen)
Date: Fri, 12 Jan 2018 12:37:38 +0100
Subject: [OmniOS-discuss] NVMe under omniOS CE
In-Reply-To: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
References: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
Message-ID:

If I recall correctly, Dan once mentioned that only 3.0 is supported.

Sent from BlueMail

On Jan 12, 2018, at 11:21, Stephan Budach wrote:
>Hi,
>
>I finally got the first of my two Supermicro 2028R-N48M NVMe servers.  I
>installed the latest OmniOSce on it and, as it seems, it doesn't
>recognize the NVMe drives.  This box is equipped with 24x Intel DC
>P4500, PCIe 3.1 NVMe.
>
>Does anybody know why those are not recognized?
>
>Thanks,
>Stephan
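As Stephan's follow-up below confirms, the gating factor here is the
illumos nvme driver's NVMe version check, controlled from
/kernel/drv/nvme.conf.  A sketch of the relevant setting (the property
name matches the stock illumos nvme.conf; the comment text and the rest
of the file are our assumptions):

    # /kernel/drv/nvme.conf
    # With strict version checking enabled, the driver only attaches
    # devices reporting NVMe 1.x; relax it for newer spec revisions.
    strict-version=0;

A reboot is needed afterwards so the changed property is re-read.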
From stephan.budach at jvm.de Fri Jan 12 11:39:52 2018
From: stephan.budach at jvm.de (Stephan Budach)
Date: Fri, 12 Jan 2018 12:39:52 +0100 (CET)
Subject: [OmniOS-discuss] NVMe under omniOS CE
In-Reply-To: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
References: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
Message-ID: <1806929444.1090.1515757203871.JavaMail.stephan.budach@stephan.budach.jvm.de>

Shoot - please forgive my ignorance... uncommenting strict-version in
nvme.conf solved that.

Cheers,
Stephan

----- Original Message -----
> From: "Stephan Budach"
> To: "omnios-discuss"
> Sent: Freitag, 12. Januar 2018 11:17:09
> Subject: [OmniOS-discuss] NVMe under omniOS CE

> Hi,
> I finally got the first of my two Supermicro 2028R-N48M NVMe servers.
> I installed the latest OmniOSce on it and, as it seems, it doesn't
> recognize the NVMe drives.  This box is equipped with 24x Intel DC
> P4500, PCIe 3.1 NVMe.
> Does anybody know why those are not recognized?
> Thanks,
> Stephan

From vab at bb-c.de Sat Jan 13 21:38:36 2018
From: vab at bb-c.de (Volker A. Brandt)
Date: Sat, 13 Jan 2018 22:38:36 +0100
Subject: [OmniOS-discuss] Invitation to an OmniOS event near Frankfurt, Germany (Tue Jan 16)
Message-ID: <23130.31836.110356.702888@shelob.bb-c.de>

[Stupid me sent this to the -bounce addr first -- no Reply-To :-(]

Hello all!

Here is an invitation to an OmniOS-related event in Frankfurt, Germany.
This is the regular monthly meeting of the Frankfurt OpenSolaris User
Group (FRAOSUG).  Yes, we still exist. :-)

We will meet next Tuesday (Jan 16th 2018) at 6:30pm in Dreieich.
The meeting is held in German; the invitation follows:

------------------------------------------------------------------------
Next Tuesday, FRAOSUG invites you to its monthly meeting.  Once again we
are close to the "original" OpenSolaris, as the topic is "everything
around OmniOS CE".

After a short introduction to what OmniOS, and OmniOS CE in particular,
actually is, we want to look at why OmniOS is regarded as the "legitimate
successor" of OpenSolaris on servers.

In particular, a complete installation of OmniOS on an HP G8 Microserver
will be demonstrated live, including configuration of the new
FreeBSD-derived boot loader for the serial console of the HP iLO.

If anyone brings along a laptop with VirtualBox, there will also be the
opportunity to copy a prepared OmniOS image and install it yourself.

This time our meeting takes place at Oracle in Dreieich:
https://fraosug.de/

Registration via our survey tool:
https://owncloud-002.qutic.com/index.php/apps/polls/poll/iJtyBtV1eNaCkvdo

The event is free of charge, and as always: using Solaris is not a
prerequisite for attending...
------------------------------------------------------------------------

Basically we are going to do an intro to what OmniOS (and CE) is,
followed by a live Kayak network installation (if I manage to get it to
work :-).  Attendance is free, and we will be at the Oracle office in
Dreieich.  Everyone is welcome.  Make sure to register via our web page.


Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From chip at innovates.com Wed Jan 17 14:36:50 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Wed, 17 Jan 2018 08:36:50 -0600
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To: <8E071B37-24A7-4EC6-B9FD-D1983929CFEC@joyent.com>
References: <20180103163208.GD1629@telcontar> <8E071B37-24A7-4EC6-B9FD-D1983929CFEC@joyent.com>
Message-ID:

I haven't seen this bug filed yet.  Please submit it.  For anyone using
the automounter, this bug is a ticking time bomb.

I've been able to stretch the interval between reboots by about a week
with:

ndd -set /dev/tcp tcp_smallest_anon_port 1024

However, until this is fixed, I'm forced to reboot every couple of weeks.

Thank you,
-Chip

On Mon, Jan 8, 2018 at 10:46 AM, Dan McDonald wrote:

> OH PHEW!
>
> > On Jan 8, 2018, at 11:43 AM, Youzhong Yang wrote:
> >
> > This is our patch.  It was applied 3 years ago, so the line numbers
> > could be different for the latest version of the file.
> >
> > diff --git a/usr/src/uts/common/rpc/clnt_cots.c b/usr/src/uts/common/rpc/clnt_cots.c
> > index 4466e93..0a0951d 100644
> > --- a/usr/src/uts/common/rpc/clnt_cots.c
> > +++ b/usr/src/uts/common/rpc/clnt_cots.c
> > @@ -2285,6 +2285,7 @@ start_retry_loop:
> >  		if (rpcerr->re_status == RPC_SUCCESS)
> >  			rpcerr->re_status = RPC_XPRTFAILED;
> >  		cm_entry->x_connected = FALSE;
> > +		cm_entry->x_dead = TRUE;
> >  	} else
> >  		cm_entry->x_connected = connected;
> >
> > @@ -2403,6 +2404,7 @@ connmgr_wrapconnect(
> >  		if (rpcerr->re_status == RPC_SUCCESS)
> >  			rpcerr->re_status = RPC_XPRTFAILED;
> >  		cm_entry->x_connected = FALSE;
> > +		cm_entry->x_dead = TRUE;
> >  	} else
> >  		cm_entry->x_connected = connected;
>
> This makes TONS more sense, and alleviates/obviates my concerns
> previously.
>
> If there isn't a bug already, please file one.  Once filed or found,
> please add me as a code reviewer for this.
>
> Thanks,
> Dan

From paul.jochum at nokia.com Mon Jan 22 15:24:17 2018
From: paul.jochum at nokia.com (Paul Jochum)
Date: Mon, 22 Jan 2018 09:24:17 -0600
Subject: [OmniOS-discuss] zpool replace command returns internal error: out of memory
Message-ID: <70b93224-944d-485f-0a43-b570a3e563fe@nokia.com>

Hi All:

Last Saturday, I updated my servers to the latest version of OmniOS-CE
(r151024j), and today, while trying to replace a drive, I received the
following error message:

# zpool replace zfs_pool c11t5000C5003A017D5Bd0 c11t5000C5003A39950Bd0
internal error: out of memory

Some information about my system:

# uname -a
SunOS lss-nfsa05 5.11 omnios-r151024-e482f10563 i86pc i386 i86pc

root@lss-nfsa05:~# uptime
09:20:28 up 0:18, 3 users, load average: 0.41, 0.30, 0.23

load averages:  0.38,  0.30,  0.23;    up 0+00:19:09      09:20:45
61 processes: 60 sleeping, 1 on cpu
CPU states: 99.3% idle, 0.0% user, 0.7% kernel, 0.0% iowait, 0.0% swap
Kernel: 1623 ctxsw, 1 trap, 1446 intr, 254 syscall, 1 flt
Memory: 128G phys mem, 118G free mem, 4096M total swap, 4096M free swap

I have tried rebooting my system and also exporting and importing the
zfs_pool pool, but neither step helped.

Any suggestions on what to try next, or what info to collect to help
debug this?

thanks,

Paul

From softwareinforjam at gmail.com Tue Jan 23 02:13:00 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Mon, 22 Jan 2018 21:13:00 -0500
Subject: [OmniOS-discuss] Zone welcome message
Message-ID:

Hi All

I was just wondering.  I recently updated my host machine from r151020 to
r151022.  The welcome message on the host machine is fine, but when I log
on to the non-global zone, I still see the welcome message for r151020
below.

OmniOS 5.11     omnios-r151020-4151d05  March 2017

But when I do a uname -a in the non-global zone, I get:

SunOS test-zone 5.11 omnios-r151022-f9693432c2 i86pc i386 i86pc

Is there any way to make the non-global zone show the correct welcome
message?

Thanks and regards
SI

From danmcd at kebe.com Tue Jan 23 02:23:40 2018
From: danmcd at kebe.com (Dan McDonald)
Date: Mon, 22 Jan 2018 21:23:40 -0500
Subject: [OmniOS-discuss] Zone welcome message
In-Reply-To:
References:
Message-ID: <20180123022340.GB28479@everywhere.local>

On Mon, Jan 22, 2018 at 09:13:00PM -0500, Software Information wrote:
> Hi All
> I was just wondering.  I recently updated my host machine from r151020
> to r151022.  The welcome message on the host machine is fine, but when
> I log on to the non-global zone, I still see the welcome message for
> r151020 below.
>
> OmniOS 5.11     omnios-r151020-4151d05  March 2017
>
> But when I do a uname -a in the non-global zone, I get:
> SunOS test-zone 5.11 omnios-r151022-f9693432c2 i86pc i386 i86pc
>
> Is there any way to make the non-global zone show the correct welcome
> message?

If it's an ipkg zone, you'll have to "pkg update" inside the zone (and
possibly reboot it).

If it's an lipkg zone, make sure you're using the "-r" flag when updating
the global, OR use "pkg update" inside like an ipkg zone.

Dan

From softwareinforjam at gmail.com Thu Jan 25 01:51:05 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Wed, 24 Jan 2018 20:51:05 -0500
Subject: [OmniOS-discuss] NTP Service error
Message-ID:

Hi All

Today I made the switch, updating my r151020 box to OmniOSce in
production, and I am now left with just two issues.

ntp won't start in one zone.  All the dependent services are online.

The log says:

[ Jan 24 19:04:50 Method "start" exited with status 96. ]
[ Jan 24 19:49:39 Enabled. ]
[ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
[ Jan 24 19:49:40 svc.startd could not set context for method: ]
setppriv: Not owner

Not quite sure what this means.  Could anyone give me a pointer please?

Kind Regards

From omnios at citrus-it.net Thu Jan 25 07:28:23 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 07:28:23 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Wed, 24 Jan 2018, Software Information wrote:

; Hi All
; Today I made the switch, updating my r151020 box to OmniOSce in
; production, and I am now left with just two issues.
;
; ntp won't start in one zone.  All the dependent services are online.
From softwareinforjam at gmail.com Thu Jan 25 01:51:05 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Wed, 24 Jan 2018 20:51:05 -0500
Subject: [OmniOS-discuss] NTP Service error
Message-ID:

Hi All

Today I made the switch, updating my r151020 box to OmniOS CE in production,
and I am now left with just two issues.

ntp won't start in one zone. All the dependent services are online.

The log says:

[ Jan 24 19:04:50 Method "start" exited with status 96. ]
[ Jan 24 19:49:39 Enabled. ]
[ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
[ Jan 24 19:49:40 svc.startd could not set context for method: ]
setppriv: Not owner

Not quite sure what this means. Could anyone give me a pointer please?

Kind Regards
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From omnios at citrus-it.net Thu Jan 25 07:28:23 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 07:28:23 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Wed, 24 Jan 2018, Software Information wrote:

; Hi All
; Today I made the switch, updating my r151020 box to OmniOS CE in
; production, and I am now left with just two issues.
;
; ntp won't start in one zone. All the dependent services are online.
;
; The log says:
; [ Jan 24 19:04:50 Method "start" exited with status 96. ]
; [ Jan 24 19:49:39 Enabled. ]
; [ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
; [ Jan 24 19:49:40 svc.startd could not set context for method: ]
; setppriv: Not owner

To run NTP in a zone, the zone needs the sys_time privilege. Is that still
present in the zone config?

  # zonecfg -z ntp0 info | grep limitpriv
  limitpriv: default,proc_priocntl,sys_time

I can't think why it would have gone away during an upgrade, but it's the
first thing to check.

Regards,

Andy

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From jimklimov at cos.ru Thu Jan 25 09:29:28 2018
From: jimklimov at cos.ru (Jim Klimov)
Date: Thu, 25 Jan 2018 09:29:28 +0000
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID: <893C93D0-B6E3-4352-A6F1-836320473AC0@cos.ru>

On January 25, 2018 1:51:05 AM UTC, Software Information wrote:
>Hi All
>Today I made the switch, updating my r151020 box to OmniOS CE in
>production, and I am now left with just two issues.
>
>ntp won't start in one zone. All the dependent services are online.
>
>The log says:
>[ Jan 24 19:04:50 Method "start" exited with status 96. ]
>[ Jan 24 19:49:39 Enabled. ]
>[ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
>[ Jan 24 19:49:40 svc.startd could not set context for method: ]
>setppriv: Not owner
>
>Not quite sure what this means. Could anyone give me a pointer please?
>
>Kind Regards

Local zones normally can not control the host system clock, so you need to
add a privilege entry into the zone's XML descriptor (or do the equivalent
via zonecfg). Not sure it will let you actually set host time from the
local zone (probably you'll need an NTP client in the NGZ to set the
physical clock), but this will allow you to run an NTP server to give out
time to clients.

Note that you might have to fiddle with ntp.conf also, so the server does
not report itself as a useless 'stratum 16' (since it has not confirmed
setting the clock from a source of known reliability). Note it can take
some 15 minutes for ntpd to settle on its own stratum even when it is in
charge of host clock sync; let us know if you succeed in forcing a number
otherwise (fudge did not help me back when...).

Hope this helps,
Jim
--
Typos courtesy of K-9 Mail on my Android
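For the ntp.conf fiddling Jim mentions, a minimal sketch of an in-zone
server config; the pool hostnames are illustrative, and orphan mode is one
way (on reasonably recent ntpd) to keep answering clients at a believable
stratum instead of the useless stratum 16:

    # /etc/inet/ntp.conf -- sketch for an NTP server running in a zone
    driftfile /var/ntp/ntp.drift

    # Upstream sources (illustrative pool hosts)
    server 0.pool.ntp.org iburst
    server 1.pool.ntp.org iburst

    # If all upstreams become unreachable, degrade to stratum 10 in
    # orphan mode rather than advertising stratum 16 to clients.
    tos orphan 10
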
From omnios at citrus-it.net Thu Jan 25 15:23:38 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 15:23:38 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Thu, 25 Jan 2018, Software Information wrote:

; Hi. Thanks for replying.
;
; Running zonecfg -z zone_name info | grep limitpriv results in:
; limitpriv: default,dtrace_proc,dtrace_user,sys_time
;
; So that's still there

Try adding proc_priocntl too. Seems it's also needed by NTP.

Andy

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From omnios at citrus-it.net Thu Jan 25 16:18:13 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 16:18:13 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Thu, 25 Jan 2018, Software Information wrote:

; I ran
; # zonecfg -z zonename set limitpriv="proc_priocntl"
;
; But it resulted in:
; zonename: invalid privilege: sys_mount

You need to add the privilege, not replace what's already there:

# zonecfg -z zonename set limitpriv=default,proc_priocntl,sys_time

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From softwareinforjam at gmail.com Thu Jan 25 21:25:33 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Thu, 25 Jan 2018 16:25:33 -0500
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

The command:

zonecfg -z zonename set limitpriv=default,proc_priocntl,sys_time

actually fixed the problem. I thought it hadn't worked, only to realize I
hadn't rebooted the zone. That was my bad. Thanks so much for the support.

Kind Regards.

On Thu, Jan 25, 2018 at 11:18 AM, Andy Fiddaman wrote:

> On Thu, 25 Jan 2018, Software Information wrote:
>
> ; I ran
> ; # zonecfg -z zonename set limitpriv="proc_priocntl"
> ;
> ; But it resulted in:
> ; zonename: invalid privilege: sys_mount
>
> You need to add the privilege, not replace what's already there:
>
> # zonecfg -z zonename set limitpriv=default,proc_priocntl,sys_time
>
> --
> Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
> Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
> Registered in England and Wales | Company number 4899123
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
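Pulling the thread together, the working sequence, with a hypothetical zone
name myzone:

    # Inspect the current privilege limit of the zone.
    zonecfg -z myzone info limitpriv

    # "set limitpriv" replaces the whole list rather than appending, so
    # include "default" plus every extra privilege the zone still needs.
    zonecfg -z myzone set limitpriv=default,proc_priocntl,sys_time

    # The change only takes effect on the next zone boot.
    zoneadm -z myzone reboot
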
From paladinemishakal at gmail.com Tue Jan 30 07:16:52 2018
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Tue, 30 Jan 2018 15:16:52 +0800
Subject: [OmniOS-discuss] Intel Chipset support
Message-ID:

Hi All,

Is there a place I can go to find out which Intel chipsets are supported by
OmniOS?

Thanks & Regards.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From manuel at oetiker.ch Tue Jan 30 07:58:54 2018
From: manuel at oetiker.ch (Manuel Oetiker)
Date: Tue, 30 Jan 2018 08:58:54 +0100 (CET)
Subject: [OmniOS-discuss] Intel Chipset support
In-Reply-To:
References:
Message-ID: <1848885924.488266.1517299134606.JavaMail.zimbra@oetiker.ch>

Hi

https://illumos.org/hcl/

Cheers,
Manuel

----- Original Message -----
> From: "Lawrence Giam"
> To: "omnios-discuss"
> Sent: Tuesday, January 30, 2018 8:16:52 AM
> Subject: [OmniOS-discuss] Intel Chipset support

> Hi All,
> Is there a place I can go to find out which Intel chipsets are supported
> by OmniOS?
> Thanks & Regards.
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From stephan.budach at jvm.de Wed Jan 31 18:39:06 2018
From: stephan.budach at jvm.de (Stephan Budach)
Date: Wed, 31 Jan 2018 19:39:06 +0100 (CET)
Subject: [OmniOS-discuss] How to safely remove/replace NVMe SSDs
In-Reply-To: <1205068386.4148.1517423581346.JavaMail.stephan.budach@stephanbudach.local>
Message-ID: <1533340439.4158.1517423920653.JavaMail.stephan.budach@stephanbudach.local>

Hi,

I have purchased two of those Supermicro NVMe servers: SSG-2028R-NR48N.
Both of them are equipped with 24x Intel DC P4500 U.2 devices, which are
obviously hot-pluggable; at least they seem to be. ;)

At the moment, I am trying to familiarize myself with the handling of these
devices, and I am having quite a hard time coming up with a method of
safely removing/replacing such a device. I am able to detach an NVMe device
using nvmeadm, but removing it from the system by pulling it out causes the
kernel to retire the PCI device, and I have not yet found a way to get the
re-inserted device online again.

Anybody having some experience with how to handle these NVMe devices?

Thanks,
Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
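Not a confirmed answer, but a sketch of the sequence worth trying, assuming
nvmeadm(1M) and hotplug(1M) behave as documented; the controller instance
nvme3 and the PCIe path/connector names are hypothetical:

    # List NVMe controllers and their namespaces; pick the one to pull.
    nvmeadm list

    # Quiesce the blkdev instance before physically removing the drive.
    nvmeadm detach nvme3

    # Check whether FMA has retired the device after the pull.
    fmadm faulty

    # Inspect the PCIe hotplug connectors and their states.
    hotplug list -lv

    # After re-inserting the drive, try re-enabling the connector and
    # re-attaching the blkdev instance (path and connector hypothetical).
    hotplug enable /pci@0,0 pcie0
    nvmeadm attach nvme3
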