From chip at innovates.com Tue Jan 2 14:40:58 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Tue, 2 Jan 2018 08:40:58 -0600
Subject: [OmniOS-discuss] rpcbind: t_bind failed
Message-ID:

About once every week or two I'm having NFS connections start to collapse
on one of my servers.  Clients will lose their connections over the course
of several hours.

The logs fill with these messages:

Dec 25 16:21:14 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
Dec 25 16:21:14 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
Dec 25 16:21:31 mir-zfs03 last message repeated 85 times
Dec 25 16:21:31 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
Dec 25 16:21:32 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
Dec 25 16:21:34 mir-zfs03 last message repeated 19 times
Dec 25 16:21:37 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 200/transport tcp) TLI error 5
Dec 25 16:22:17 mir-zfs03 last message repeated 116 times
Dec 25 16:22:21 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 206/transport tcp) TLI error 5
Dec 25 16:23:04 mir-zfs03 last message repeated 81 times

This is a fully updated OmniOS CE r151022.

I've tried restarting NFS services, but the only thing that has been
successful in restoring services has been rebooting.

I'm not finding anything useful via Google except the source code that
spits out this message.  HP-UX appears to have had the same issue, which
they patched years ago.  I'm guessing shared NFS/RPC code.

Any clue as to the cause of this and how to fix it?

-Chip

From chip at innovates.com Wed Jan 3 16:02:43 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Wed, 3 Jan 2018 10:02:43 -0600
Subject: [OmniOS-discuss] rpcbind: t_bind failed
In-Reply-To:
References:
Message-ID:

The problem occurred again starting last night.  I have another clue, but
I still don't know how it is occurring or how to fix it.

It looks like all the TCP ports are in "bound" state, but not being
released.

How can I isolate the cause of this?
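One quick way to narrow a problem like this down (a minimal sketch, not
from the original thread; the number of header lines skipped is an
assumption about the netstat layout shown below) is to tally TCP
endpoints by state -- a huge BOUND count means leaked port reservations
rather than a flood of live connections:

    # Count TCP endpoints per state; BOUND dwarfing everything else
    # points at ephemeral-port exhaustion from never-released binds.
    netstat -an -f inet -P tcp | awk 'NR > 4 { state[$NF]++ }
        END { for (s in state) print state[s], s }' | sort -rn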
# netstat -an UDP: IPv4 Local Address Remote Address State -------------------- -------------------- ---------- *.* Unbound *.57921 Idle *.42045 Idle *.* Unbound *.* Unbound *.* Unbound *.* Unbound *.33757 Idle *.37883 Idle *.40744 Idle *.4045 Idle *.111 Idle *.4045 Idle *.* Unbound *.60955 Idle *.42908 Idle *.56487 Idle *.111 Idle *.* Unbound *.60994 Idle *.* Unbound *.2049 Idle *.520 Idle *.2049 Idle *.46876 Idle *.50309 Idle *.123 Idle *.123 Idle 127.0.0.1.123 Idle 10.28.125.29.123 Idle *.57929 Idle *.64351 Idle *.63145 Idle *.39674 Idle *.65280 Idle *.52013 Idle *.47989 Idle *.* Unbound UDP: IPv6 Local Address Remote Address State If --------------------------------- --------------------------------- ---------- ----- *.57921 Idle *.* Unbound *.* Unbound *.40744 Idle *.111 Idle *.4045 Idle *.* Unbound *.60955 Idle *.42908 Idle *.* Unbound *.2049 Idle *.46876 Idle *.123 Idle ::1.123 Idle TCP: IPv4 Local Address Remote Address Swind Send-Q Rwind Recv-Q State -------------------- -------------------- ----- ------ ----- ------ ----------- *.60571 *.* 0 0 1057280 0 BOUND *.46344 *.* 0 0 1057280 0 BOUND *.39729 *.* 0 0 1057280 0 BOUND *.43531 *.* 0 0 1057280 0 BOUND *.49051 *.* 0 0 1057280 0 BOUND *.44876 *.* 0 0 1057280 0 BOUND *.65416 *.* 0 0 1057280 0 BOUND *.47714 *.* 0 0 1057280 0 BOUND *.59055 *.* 0 0 1057280 0 BOUND *.45033 *.* 0 0 1057280 0 BOUND *.63321 *.* 0 0 1057280 0 BOUND *.43896 *.* 0 0 1057280 0 BOUND *.46627 *.* 0 0 1057280 0 BOUND *.35555 *.* 0 0 1057280 0 BOUND *.36115 *.* 0 0 1057280 0 BOUND *.51969 *.* 0 0 1057280 0 BOUND *.63741 *.* 0 0 1057280 0 BOUND *.45747 *.* 0 0 1057280 0 BOUND *.33245 *.* 0 0 1057280 0 BOUND *.49925 *.* 0 0 1057280 0 BOUND *.63503 *.* 0 0 1057280 0 BOUND *.45319 *.* 0 0 1057280 0 BOUND *.39977 *.* 0 0 1057280 0 BOUND ....lots of lines deleted... 
*.37396 *.* 0 0 1057280 0 BOUND *.33735 *.* 0 0 1057280 0 BOUND *.35695 *.* 0 0 1057280 0 BOUND *.36589 *.* 0 0 1057280 0 BOUND *.41484 *.* 0 0 1057280 0 BOUND *.63428 *.* 0 0 1057280 0 BOUND *.54891 *.* 0 0 1057280 0 BOUND *.60222 *.* 0 0 1057280 0 BOUND *.40494 *.* 0 0 1057280 0 BOUND TCP: IPv6 Local Address Remote Address Swind Send-Q Rwind Recv-Q State If --------------------------------- --------------------------------- ----- ------ ----- ------ ----------- ----- *.54749 *.* 0 0 128000 0 LISTEN ::1.5999 *.* 0 0 128000 0 LISTEN *.4045 *.* 0 0 1049200 0 LISTEN *.111 *.* 0 0 128000 0 LISTEN *.* *.* 0 0 128000 0 IDLE *.45543 *.* 0 0 128000 0 LISTEN *.2049 *.* 0 0 1049200 0 LISTEN *.53926 *.* 0 0 128000 0 LISTEN *.50379 *.* 0 0 128000 0 LISTEN Active UNIX domain sockets Address Type Vnode Conn Local Addr Remote Addr ffffd063569210e8 stream-ord 0000000 0000000 ffffd06356921498 stream-ord 0000000 0000000 ffffd06356921bf8 stream-ord 0000000 0000000 ffffd063569260e0 stream-ord 0000000 0000000 ffffd06356926490 stream-ord ffffd0635646d300 0000000 private/scache ffffd06356926840 stream-ord 0000000 0000000 ffffd06356926bf0 stream-ord 0000000 0000000 ffffd063569290d8 stream-ord ffffd0635646d200 0000000 private/anvil ffffd06356929488 stream-ord 0000000 0000000 ffffd06356929838 stream-ord 0000000 0000000 ffffd06356929be8 stream-ord ffffd06356810480 0000000 private/lmtp ffffd0635692d0d0 stream-ord 0000000 0000000 ffffd0635692d480 stream-ord 0000000 0000000 ffffd0635692d830 stream-ord ffffd06356810280 0000000 private/virtual ffffd0635692dbe0 stream-ord 0000000 0000000 ffffd063569320c8 stream-ord 0000000 0000000 ffffd06356932478 stream-ord ffffd0635685aa00 0000000 private/local ffffd06356932828 stream-ord 0000000 0000000 ffffd06356932bd8 stream-ord 0000000 0000000 ffffd063569360c0 stream-ord ffffd0635685ad00 0000000 private/discard ffffd06356936470 stream-ord 0000000 0000000 ffffd06356936820 stream-ord 0000000 0000000 ffffd06356936bd0 stream-ord ffffd0635685ab00 0000000 private/retry ffffd0635693b0b8 stream-ord 0000000 0000000 ffffd0635693b468 stream-ord 0000000 0000000 ffffd0635693b818 stream-ord ffffd0635685ae00 0000000 private/error ffffd0635693bbc8 stream-ord 0000000 0000000 ffffd063568e10b0 stream-ord 0000000 0000000 ffffd063568e1460 stream-ord ffffd0635685a400 0000000 public/showq ffffd063568e1810 stream-ord 0000000 0000000 ffffd063568e1bc0 stream-ord 0000000 0000000 ffffd063568e60a8 stream-ord ffffd0635685a600 0000000 private/relay ffffd063568e6458 stream-ord 0000000 0000000 ffffd063568e6808 stream-ord 0000000 0000000 ffffd063568e6bb8 stream-ord ffffd0635685a900 0000000 private/smtp ffffd063568ea0a0 stream-ord 0000000 0000000 ffffd063568ea450 stream-ord 0000000 0000000 ffffd063568ea800 stream-ord ffffd0635646d100 0000000 private/proxywrite ffffd063568eabb0 stream-ord 0000000 0000000 ffffd063568f0098 stream-ord 0000000 0000000 ffffd063568f0448 stream-ord ffffd0635685ac00 0000000 private/proxymap ffffd063568f07f8 stream-ord 0000000 0000000 ffffd063568f0ba8 stream-ord 0000000 0000000 ffffd063568f2090 stream-ord ffffd0635685a200 0000000 public/flush ffffd063568f2440 stream-ord 0000000 0000000 ffffd063568f27f0 stream-ord 0000000 0000000 ffffd063568f2ba0 stream-ord ffffd0635685a500 0000000 private/verify ffffd063568f9088 stream-ord 0000000 0000000 ffffd063568f9438 stream-ord 0000000 0000000 ffffd063568f97e8 stream-ord ffffd06356810080 0000000 private/trace ffffd063568f9b98 stream-ord 0000000 0000000 ffffd063568fd080 stream-ord 0000000 0000000 ffffd063568fd430 stream-ord ffffd06356810180 0000000 
private/defer ffffd063568fd7e0 stream-ord 0000000 0000000 ffffd063568fdb90 stream-ord 0000000 0000000 ffffd06356840078 stream-ord ffffd0635685a700 0000000 private/bounce ffffd06356840428 stream-ord 0000000 0000000 ffffd063568407d8 stream-ord 0000000 0000000 ffffd06356840b88 stream-ord ffffd0635685a800 0000000 private/rewrite ffffd06356843070 stream-ord ffffd06356810380 0000000 private/tlsmgr ffffd06356843420 stream-ord 0000000 0000000 ffffd063568437d0 stream-ord 0000000 0000000 ffffd06356849068 stream-ord 0000000 0000000 ffffd06356849418 stream-ord 0000000 0000000 ffffd063568497c8 stream-ord ffffd0635685a000 0000000 public/qmgr ffffd06356849b78 stream-ord ffffd0635685a100 0000000 public/cleanup ffffd0635684d060 stream-ord 0000000 0000000 ffffd0635684d410 stream-ord 0000000 0000000 ffffd0635684db70 stream-ord 0000000 0000000 ffffd06355646058 stream-ord 0000000 0000000 ffffd06355646b68 stream-ord ffffd0635685a300 0000000 public/pickup ffffd063551bf3f8 stream-ord ffffd063193fe900 0000000 /var/run/.inetd.uds ffffd063550e7b50 dgram ffffd063550eb380 0000000 /var/run/in.rdisc_mib ffffd06355031798 dgram ffffd063536c8800 0000000 /var/run/in.ndpd_mib ffffd06355031b48 stream-ord ffffd063536c8c00 0000000 /var/run/in.ndpd_ipadm ffffd0635265a028 stream-ord 0000000 ffffd0634e4acd00 /var/run/dbus/system_bus_socket ffffd0635265a788 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd0635265ab38 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351d553d0 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351d55780 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351d55b30 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd06351996018 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd063519963c8 stream-ord 0000000 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd06351996778 stream-ord 0000000 ffffd063500ffc80 /var/run/hald/dbus-y1Me9kLIpf ffffd063500fe010 stream-ord 0000000 0000000 /var/run/hald/dbus-5Qrha0Wmu3 ffffd063500fe3c0 stream-ord 0000000 ffffd063500ffa80 /var/run/hald/dbus-5Qrha0Wmu3 ffffd063500fe770 stream-ord ffffd063500ffa80 0000000 /var/run/hald/dbus-5Qrha0Wmu3 ffffd063500feb20 stream-ord ffffd063500ffc80 0000000 /var/run/hald/dbus-y1Me9kLIpf ffffd0634e4ad008 stream-ord 0000000 0000000 ffffd0634e4ad3b8 stream-ord 0000000 0000000 ffffd0634e4ad768 stream-ord 0000000 0000000 /var/run/dbus/system_bus_socket ffffd0634e4adb18 stream-ord ffffd0634e4acd00 0000000 /var/run/dbus/system_bus_socket A sorted output shows nearly all 64K ports in bound state. On Tue, Jan 2, 2018 at 8:40 AM, Schweiss, Chip wrote: > About once every week or two I'm having NFS connections start to collapse > to one of my servers. Clients will lose thier connections of the the > course of several hours. 
The logs fill with these messages:
>
> Dec 25 16:21:14 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
> Dec 25 16:21:14 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
> Dec 25 16:21:31 mir-zfs03 last message repeated 85 times
> Dec 25 16:21:31 mir-zfs03 rpcbind: [ID 452059 daemon.error] do_accept : t_bind failed : Couldn't allocate address
> Dec 25 16:21:32 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 188/transport tcp) TLI error 5
> Dec 25 16:21:34 mir-zfs03 last message repeated 19 times
> Dec 25 16:21:37 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 200/transport tcp) TLI error 5
> Dec 25 16:22:17 mir-zfs03 last message repeated 116 times
> Dec 25 16:22:21 mir-zfs03 /usr/lib/nfs/nfsd[27689]: [ID 396295 daemon.error] t_bind(file descriptor 206/transport tcp) TLI error 5
> Dec 25 16:23:04 mir-zfs03 last message repeated 81 times
>
> This is a fully updated OmniOS CE r151022.
>
> I've tried restarting NFS services, but the only thing that has been
> successful in restoring services has been rebooting.
>
> I'm not finding anything useful via Google except the source code that
> spits out this message.  HP-UX appears to have had the same issue, which
> they patched years ago.  I'm guessing shared NFS/RPC code.
>
> Any clue as to the cause of this and how to fix it?
>
> -Chip

From mir at miras.org Wed Jan 3 16:14:40 2018
From: mir at miras.org (Michael Rasmussen)
Date: Wed, 3 Jan 2018 17:14:40 +0100
Subject: [OmniOS-discuss] rpcbind: t_bind failed
In-Reply-To:
References:
Message-ID: <20180103171440.1dcfda5a@sleipner.datanom.net>

On Wed, 3 Jan 2018 10:02:43 -0600
"Schweiss, Chip" wrote:

> The problem occurred again starting last night.  I have another clue,
> but I still don't know how it is occurring or how to fix it.
>
> It looks like all the TCP ports are in "bound" state, but not being
> released.
>
> How can I isolate the cause of this?
>
lsof should be able to tell you which program is listening on a specific
port:

lsof -i :port

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael rasmussen cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir datanom net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir miras org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
You can never tell which way the train went by looking at the tracks.

From chip at innovates.com Wed Jan 3 18:55:09 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Wed, 3 Jan 2018 12:55:09 -0600
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To: <20180103163208.GD1629@telcontar>
References: <20180103163208.GD1629@telcontar>
Message-ID:

Hopefully the patch Marcel is talking about fixes this.  I've at least
figured out enough to predict when the problem is imminent.

We have been migrating to using the automounter instead of hard mounts,
which could be related to this problem growing over time.

Just an FYI: I've kept the server running in this state, but moved its
storage pool to a sister server.
The port binding problem remains with NO NFS clients connected, but
neither pfiles nor lsof shows rpcbind as the culprit:

# netstat -an|grep BOUND|wc -l
32739

# /opt/ozmt/bin/SunOS/lsof -i:41155

{nothing returned}

# pfiles `pgrep rpcbind`
449:    /usr/sbin/rpcbind
  Current rlimit: 65536 file descriptors
   0: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2
      O_RDWR
      /devices/pseudo/mm@0:null
      offset:0
   1: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2
      O_RDWR
      /devices/pseudo/mm@0:null
      offset:0
   2: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2
      O_RDWR
      /devices/pseudo/mm@0:null
      offset:0
   3: S_IFCHR mode:0000 dev:527,0 ino:61271 uid:0 gid:0 rdev:231,64
      O_RDWR
        sockname: AF_INET6 ::  port: 111
      /devices/pseudo/udp6@0:udp6
      offset:0
   4: S_IFCHR mode:0000 dev:527,0 ino:50998 uid:0 gid:0 rdev:231,59
      O_RDWR
        sockname: AF_INET6 ::  port: 0
      /devices/pseudo/udp6@0:udp6
      offset:0
   5: S_IFCHR mode:0000 dev:527,0 ino:61264 uid:0 gid:0 rdev:231,58
      O_RDWR
        sockname: AF_INET6 ::  port: 60955
      /devices/pseudo/udp6@0:udp6
      offset:0
   6: S_IFCHR mode:0000 dev:527,0 ino:64334 uid:0 gid:0 rdev:224,57
      O_RDWR
        sockname: AF_INET6 ::  port: 111
      /devices/pseudo/tcp6@0:tcp6
      offset:0
   7: S_IFCHR mode:0000 dev:527,0 ino:64333 uid:0 gid:0 rdev:224,56
      O_RDWR
        sockname: AF_INET6 ::  port: 0
      /devices/pseudo/tcp6@0:tcp6
      offset:0
   8: S_IFCHR mode:0000 dev:527,0 ino:64332 uid:0 gid:0 rdev:230,55
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 111
      /devices/pseudo/udp@0:udp
      offset:0
   9: S_IFCHR mode:0000 dev:527,0 ino:64330 uid:0 gid:0 rdev:230,54
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 0
      /devices/pseudo/udp@0:udp
      offset:0
  10: S_IFCHR mode:0000 dev:527,0 ino:64331 uid:0 gid:0 rdev:230,53
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 60994
      /devices/pseudo/udp@0:udp
      offset:0
  11: S_IFCHR mode:0000 dev:527,0 ino:64327 uid:0 gid:0 rdev:223,52
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 111
      /devices/pseudo/tcp@0:tcp
      offset:0
  12: S_IFCHR mode:0000 dev:527,0 ino:64326 uid:0 gid:0 rdev:223,51
      O_RDWR
        sockname: AF_INET 0.0.0.0  port: 0
      /devices/pseudo/tcp@0:tcp
      offset:0
  13: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,32
      O_RDWR
      /devices/pseudo/tl@0:ticlts
      offset:0
  14: S_IFCHR mode:0000 dev:527,0 ino:64328 uid:0 gid:0 rdev:226,33
      O_RDWR
      /devices/pseudo/tl@0:ticlts
      offset:0
  15: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,35
      O_RDWR
      /devices/pseudo/tl@0:ticlts
      offset:0
  16: S_IFCHR mode:0000 dev:527,0 ino:64322 uid:0 gid:0 rdev:226,36
      O_RDWR
      /devices/pseudo/tl@0:ticotsord
      offset:0
  17: S_IFCHR mode:0000 dev:527,0 ino:64321 uid:0 gid:0 rdev:226,37
      O_RDWR
      /devices/pseudo/tl@0:ticotsord
      offset:0
  18: S_IFCHR mode:0000 dev:527,0 ino:64030 uid:0 gid:0 rdev:226,39
      O_RDWR
      /devices/pseudo/tl@0:ticots
      offset:0
  19: S_IFCHR mode:0000 dev:527,0 ino:64029 uid:0 gid:0 rdev:226,40
      O_RDWR
      /devices/pseudo/tl@0:ticots
      offset:0
  20: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0
      O_RDWR|O_NONBLOCK
  21: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0
      O_RDWR|O_NONBLOCK
  23: S_IFCHR mode:0000 dev:527,0 ino:33089 uid:0 gid:0 rdev:129,21273
      O_WRONLY FD_CLOEXEC
      /devices/pseudo/log@0:conslog
      offset:0

Restarting rpcbind doesn't affect it either:

# svcadm restart svc:/network/rpc/bind:default

# netstat -an|grep BOUND|wc -l
32739

Until this patch gets integrated, I'll monitor the number of bound ports
so I know when I should fail my pool over again.
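A watcher along the lines Chip describes could look like this (a hedged
sketch, not from the thread; the threshold, interval, and syslog
priority are assumptions):

    #!/bin/sh
    # Warn via syslog when BOUND TCP endpoints -- the leak signature
    # above -- approach the size of the ephemeral port range, leaving
    # time to fail the pool over before t_bind starts failing.
    THRESHOLD=25000
    while :; do
            n=`netstat -an | grep -c BOUND`
            if [ "$n" -gt "$THRESHOLD" ]; then
                    logger -p daemon.warning "rpc port leak: $n TCP endpoints in BOUND state"
            fi
            sleep 300
    done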
On Wed, Jan 3, 2018 at 10:32 AM, Marcel Telka wrote:

> On Wed, Jan 03, 2018 at 10:02:43AM -0600, Schweiss, Chip wrote:
> > The problem occurred again starting last night.  I have another clue,
> > but I still don't know how it is occurring or how to fix it.
> >
> > It looks like all the TCP ports are in "bound" state, but not being
> > released.
> >
> > How can I isolate the cause of this?
>
> This is a bug in rpcmod, very likely related to
> https://www.illumos.org/issues/1616
>
> I discussed this a few weeks back with someone who faced the same issue.
> It looks like he found the cause and has a fix for it.  I thought he
> would post a review request, but that hasn't happened for some reason
> yet.
>
> I'll try to push this forward...
>
> Thanks.
>
> --
> +-------------------------------------------+
> | Marcel Telka   e-mail:   marcel at telka.sk |
> |                homepage: http://telka.sk/ |
> |                jabber:   marcel at jabber.sk |
> +-------------------------------------------+

From paladinemishakal at gmail.com Fri Jan 5 03:58:16 2018
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Fri, 5 Jan 2018 11:58:16 +0800
Subject: [OmniOS-discuss] Problem with BSD Loader and boot mirror
In-Reply-To:
References:
Message-ID:

Hi All,

Seems like no one has hit this issue?  In that case, how do I work with
the BSD loader to check, and also to try to load the rpool?

Thanks & Regards.

On Tue, Dec 19, 2017 at 6:44 PM, Lawrence Giam wrote:

> Hi All,
>
> I have a physical server on which I am taking the time to install and
> test OmniOS CE R151022.
>
> As with the Grub loader and OmniOS R151014, I use the following steps to
> set up a boot mirror:
> 1. Create a partition on the SSD for the boot mirror.
> 2. Run prtvtoc /dev/rdsk/c2t1d0s0 | fmthard -s - /dev/rdsk/c2t0d0s0
> 3. Attach the partition to the rpool: zpool attach -f rpool c2t1d0s0 c2t0d0s0
> 4. Wait for the resilver to finish and then reboot the server; ensure it boots ok.
> 5. After it boots ok, run installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c2t0d0s0
> 6. Reboot and ensure booting is good.
> 7. Shut down and simulate c2t1d0s0 failure by taking out the SSD.
> 8. Power up the server; the system still boots to OmniOS, but with an
>    alert that one of the boot mirror devices is missing.
>
> As with the BSD loader and OmniOS R151022, I use the following steps to
> set up a boot mirror:
> 1. Create a partition on the SSD for the boot mirror.
> 2. Run prtvtoc /dev/rdsk/c2t1d0s0 | fmthard -s - /dev/rdsk/c2t0d0s0
> 3. Attach the partition to the rpool: zpool attach -f rpool c2t1d0s0 c2t0d0s0
> 4. Wait for the resilver to finish and then reboot the server; ensure it boots ok.
> 5. After it boots ok, run bootadm install-bootloader
> 6. Reboot and ensure booting is good.
> 7. Shut down and simulate c2t1d0s0 failure by taking out the SSD.
> 8. Power up the server; the system boots with the following message:
>
> Loading complete
> Consoles: internal video/keyboard
> BIOS drive C: is disk 0
> BIOS drive D: is disk 1
> ZFS: i/o error - all block copies unavailable
> ZFS: can't read MOS of pool rpool
> ZFS: i/o error - all block copies unavailable
> ZFS: pool tankAAA is not supported
> BIOS 608kB/1983056kB available memory
>
> illumos/x86 ZFS enabled bootstrap loader, Revision 1.1
> ZFS: can't find pool by guid
> ZFS: can't find pool by guid
> loading CORE EXT words
> loading SEARCH & SEACH-EXT words
> loading John-Hopkins locals
> loading MARKER
> loading ficl O-O extensions
> loading ficl utility classes
> loading ficl string class
>
> start not found
>
> Type '?' for a list of commands, 'help' for more detailed help.
> ok
> ------------------------------------------------------------------------
>
> I have 2 pools - rpool and tankAAA - configured, but somehow the BSD
> loader is
> 1. unable to recognise the boot mirror
> 2. cannot see the other dataset (e.g. tankAAA)
>
> Next, I shut down the server, put back the disk and powered up the
> server again, and it was able to boot normally as before.
>
> Are my steps to set up the boot mirror wrong, or is there something that
> I am missing?
>
> Thanks & Regards.

From chip at innovates.com Fri Jan 5 14:57:23 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Fri, 5 Jan 2018 08:57:23 -0600
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
Message-ID:

In the previous Solaris-style installer we had the option of only using a
portion of the disk that the rpool went on.  This was very good for SSDs,
which perform better and last longer if they have some additional slack
space that never has data written to it.

Is there a way to achieve this with the new installer?

-Chip

From jcoombs at staff.gwi.net Fri Jan 5 15:08:47 2018
From: jcoombs at staff.gwi.net (Josh Coombs)
Date: Fri, 5 Jan 2018 10:08:47 -0500
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To:
References:
Message-ID:

I've been overprovisioning my SSDs using either manufacturer-supplied
tools, in the case of Samsung or Intel units, or raw SATA commands on old
Vertex drives.  The OS sees a smaller volume and the drive knows that
'slack' space is truly slack.

Joshua Coombs
GWI
office 207-494-2140
www.gwi.net

On Fri, Jan 5, 2018 at 9:57 AM, Schweiss, Chip wrote:

> In the previous Solaris-style installer we had the option of only using
> a portion of the disk that the rpool went on.  This was very good for
> SSDs, which perform better and last longer if they have some additional
> slack space that never has data written to it.
>
> Is there a way to achieve this with the new installer?
>
> -Chip

From vab at bb-c.de Fri Jan 5 15:11:14 2018
From: vab at bb-c.de (Volker A. Brandt)
Date: Fri, 5 Jan 2018 16:11:14 +0100
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To:
References:
Message-ID: <23119.38290.578094.766436@shelob.bb-c.de>

Hi Chip!
> In the previous Solaris-style installer we had the option of only using
> a portion of the disk that the rpool went on.  This was very good for
> SSDs, which perform better and last longer if they have some additional
> slack space that never has data written to it.
>
> Is there a way to achieve this with the new installer?

Yes.  Just drop to the shell from the installation menu and create your
rpool using fdisk, format, and zpool create.  Exit the shell and select
"use existing pool".


Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From chip at innovates.com Fri Jan 5 15:14:34 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Fri, 5 Jan 2018 09:14:34 -0600
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To: <23119.38290.578094.766436@shelob.bb-c.de>
References: <23119.38290.578094.766436@shelob.bb-c.de>
Message-ID:

I didn't think about that.  Thanks!

On Fri, Jan 5, 2018 at 9:11 AM, Volker A. Brandt wrote:

> Hi Chip!
>
> > In the previous Solaris-style installer we had the option of only
> > using a portion of the disk that the rpool went on.  This was very
> > good for SSDs, which perform better and last longer if they have some
> > additional slack space that never has data written to it.
> >
> > Is there a way to achieve this with the new installer?
>
> Yes.  Just drop to the shell from the installation menu and create your
> rpool using fdisk, format, and zpool create.  Exit the shell and select
> "use existing pool".
>
> Regards -- Volker
> --
> ------------------------------------------------------------------------
> Volker A. Brandt               Consulting and Support for Oracle Solaris
> Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
> Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
> Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
> Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt
>
> "When logic and proportion have fallen sloppy dead"

From omnios at citrus-it.net Fri Jan 5 17:07:48 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Fri, 5 Jan 2018 17:07:48 +0000 (UTC)
Subject: [OmniOS-discuss] OmniOSce installer rpool slicing
In-Reply-To:
References: <23119.38290.578094.766436@shelob.bb-c.de>
Message-ID:

On Fri, 5 Jan 2018, Schweiss, Chip wrote:

; I didn't think about that.  Thanks!
;
; On Fri, Jan 5, 2018 at 9:11 AM, Volker A. Brandt wrote:
;
; > Yes.  Just drop to the shell from the installation menu and create your
; > rpool using fdisk, format, and zpool create.  Exit the shell and select
; > "use existing pool".
Some more hints on this at
https://lists.omniti.com/pipermail/omnios-discuss/2017-November/009402.html

Regards,

Andy

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From natxo.asenjo at gmail.com Fri Jan 5 19:26:04 2018
From: natxo.asenjo at gmail.com (Natxo Asenjo)
Date: Fri, 5 Jan 2018 20:26:04 +0100
Subject: [OmniOS-discuss] omnios not updating
Message-ID:

hi,

my home filer running OmniOS v11 r151024d keeps updating the same set of
packages:

# pkg update -v
WARNING: The boot environment being modified is not the active one.
Changes made in the active BE will not be reflected on the next boot.

            Packages to update:         7
     Estimated space available: 198.39 GB
Estimated space to be consumed: 116.78 MB
       Create boot environment:       Yes
     Activate boot environment:       Yes
Create backup boot environment:        No
          Rebuild boot archive:       Yes

Changed packages:
omnios
  editor/vim
    8.0.586-0.151024:20171030T140745Z -> 8.0.586-0.151024:20171201T220955Z
  library/nspr
    4.17-0.151024:20171030T152143Z -> 4.17-0.151024:20171213T000253Z
  library/security/openssl
    1.0.2.13-0.151024 -> 1.0.2.14-0.151024
  network/rsync
    3.1.2-0.151024:20171030T152409Z -> 3.1.2-0.151024:20171207T202651Z
  release/name
    0.5.11-0.151024:20171129T095705Z -> 0.5.11-0.151024:20171218T121347Z
  system/kernel/dtrace/providers
    0.5.11-0.151024:20171030T151622Z -> 0.5.11-0.151024:20171203T191145Z
  system/library/mozilla-nss
    3.33-0.151024:20171030T152105Z -> 3.33-0.151024:20171213T000211Z

It runs correctly, and I get this message:

A clone of r151024-1 exists and has been updated and activated.
On the next boot the Boot Environment r151024-6 will be
mounted on '/'.  Reboot when ready to switch to this updated BE.

Updating package cache                           1/1

---------------------------------------------------------------------------
NOTE: Please review release notes posted at:

http://www.omniosce.org/releasenotes

But after rebooting, if I retry pkg update -nv, I see the same set of
available patches.  If I run pkg history, the last patch date is last
month:

2017-12-10T14:04:54 refresh-publishers               pkg       Succeeded
2017-12-10T14:04:55 rebuild-image-catalogs           pkg       Succeeded
2017-12-19T20:31:07 refresh-publishers               pkg       Succeeded
2017-12-19T20:31:09 rebuild-image-catalogs           pkg       Succeeded

Any idea as to what is going on? ;-)

Thanks!
--
Groeten,
natxo

From natxo.asenjo at gmail.com Fri Jan 5 19:31:36 2018
From: natxo.asenjo at gmail.com (Natxo Asenjo)
Date: Fri, 5 Jan 2018 20:31:36 +0100
Subject: [OmniOS-discuss] omnios not updating
In-Reply-To:
References:
Message-ID:

ok, rebooting solved this non-issue.  Sorry for the noise.

On Fri, Jan 5, 2018 at 8:26 PM, Natxo Asenjo wrote:

> hi,
>
> my home filer running OmniOS v11 r151024d keeps updating the same set of
> packages:
>
> # pkg update -v
> WARNING: The boot environment being modified is not the active one.
> Changes made in the active BE will not be reflected on the next boot.
>
>             Packages to update:         7
>      Estimated space available: 198.39 GB
> Estimated space to be consumed: 116.78 MB
>        Create boot environment:       Yes
>      Activate boot environment:       Yes
> Create backup boot environment:        No
>           Rebuild boot archive:       Yes
>
> Changed packages:
> omnios
>   editor/vim
>     8.0.586-0.151024:20171030T140745Z -> 8.0.586-0.151024:20171201T220955Z
>   library/nspr
>     4.17-0.151024:20171030T152143Z -> 4.17-0.151024:20171213T000253Z
>   library/security/openssl
>     1.0.2.13-0.151024 -> 1.0.2.14-0.151024
>   network/rsync
>     3.1.2-0.151024:20171030T152409Z -> 3.1.2-0.151024:20171207T202651Z
>   release/name
>     0.5.11-0.151024:20171129T095705Z -> 0.5.11-0.151024:20171218T121347Z
>   system/kernel/dtrace/providers
>     0.5.11-0.151024:20171030T151622Z -> 0.5.11-0.151024:20171203T191145Z
>   system/library/mozilla-nss
>     3.33-0.151024:20171030T152105Z -> 3.33-0.151024:20171213T000211Z
>
> It runs correctly, and I get this message:
>
> A clone of r151024-1 exists and has been updated and activated.
> On the next boot the Boot Environment r151024-6 will be
> mounted on '/'.  Reboot when ready to switch to this updated BE.
>
> Updating package cache                           1/1
>
> ---------------------------------------------------------------------------
> NOTE: Please review release notes posted at:
>
> http://www.omniosce.org/releasenotes
>
> But after rebooting, if I retry pkg update -nv, I see the same set of
> available patches.  If I run pkg history, the last patch date is last
> month:
>
> 2017-12-10T14:04:54 refresh-publishers               pkg       Succeeded
> 2017-12-10T14:04:55 rebuild-image-catalogs           pkg       Succeeded
> 2017-12-19T20:31:07 refresh-publishers               pkg       Succeeded
> 2017-12-19T20:31:09 rebuild-image-catalogs           pkg       Succeeded
>
> Any idea as to what is going on? ;-)
>
> Thanks!
> --
> Groeten,
> natxo

--
Groeten,
natxo

From youzhong at gmail.com Sun Jan 7 20:15:33 2018
From: youzhong at gmail.com (Youzhong Yang)
Date: Sun, 7 Jan 2018 15:15:33 -0500
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To:
References: <20180103163208.GD1629@telcontar>
Message-ID:

Not sure if it's the same issue we reported 3 years ago.  We applied our
patch and haven't seen this issue ever since.

https://illumos.topicbox.com/groups/developer/Te5808458a5a5a14f-M74735db9aeccaa5d8c3a70a4

On Wed, Jan 3, 2018 at 1:55 PM, Schweiss, Chip wrote:

> Hopefully the patch Marcel is talking about fixes this.  I've at least
> figured out enough to predict when the problem is imminent.
>
> We have been migrating to using the automounter instead of hard mounts,
> which could be related to this problem growing over time.
>
> Just an FYI: I've kept the server running in this state, but moved its
> storage pool to a sister server.
The port binding problem remains with NO > NFS clients connected, but neither pfiles or lsof shows rpcbind as the > culprit: > > # netstat -an|grep BOUND|wc -l > 32739 > > # /opt/ozmt/bin/SunOS/lsof -i:41155 > > {nothing returned} > > # pfiles `pgrep rpcbind` > 449: /usr/sbin/rpcbind > Current rlimit: 65536 file descriptors > 0: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2 > O_RDWR > /devices/pseudo/mm at 0:null > offset:0 > 1: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2 > O_RDWR > /devices/pseudo/mm at 0:null > offset:0 > 2: S_IFCHR mode:0666 dev:527,0 ino:70778888 uid:0 gid:3 rdev:135,2 > O_RDWR > /devices/pseudo/mm at 0:null > offset:0 > 3: S_IFCHR mode:0000 dev:527,0 ino:61271 uid:0 gid:0 rdev:231,64 > O_RDWR > sockname: AF_INET6 :: port: 111 > /devices/pseudo/udp6 at 0:udp6 > offset:0 > 4: S_IFCHR mode:0000 dev:527,0 ino:50998 uid:0 gid:0 rdev:231,59 > O_RDWR > sockname: AF_INET6 :: port: 0 > /devices/pseudo/udp6 at 0:udp6 > offset:0 > 5: S_IFCHR mode:0000 dev:527,0 ino:61264 uid:0 gid:0 rdev:231,58 > O_RDWR > sockname: AF_INET6 :: port: 60955 > /devices/pseudo/udp6 at 0:udp6 > offset:0 > 6: S_IFCHR mode:0000 dev:527,0 ino:64334 uid:0 gid:0 rdev:224,57 > O_RDWR > sockname: AF_INET6 :: port: 111 > /devices/pseudo/tcp6 at 0:tcp6 > offset:0 > 7: S_IFCHR mode:0000 dev:527,0 ino:64333 uid:0 gid:0 rdev:224,56 > O_RDWR > sockname: AF_INET6 :: port: 0 > /devices/pseudo/tcp6 at 0:tcp6 > offset:0 > 8: S_IFCHR mode:0000 dev:527,0 ino:64332 uid:0 gid:0 rdev:230,55 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 111 > /devices/pseudo/udp at 0:udp > offset:0 > 9: S_IFCHR mode:0000 dev:527,0 ino:64330 uid:0 gid:0 rdev:230,54 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 0 > /devices/pseudo/udp at 0:udp > offset:0 > 10: S_IFCHR mode:0000 dev:527,0 ino:64331 uid:0 gid:0 rdev:230,53 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 60994 > /devices/pseudo/udp at 0:udp > offset:0 > 11: S_IFCHR mode:0000 dev:527,0 ino:64327 uid:0 gid:0 rdev:223,52 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 111 > /devices/pseudo/tcp at 0:tcp > offset:0 > 12: S_IFCHR mode:0000 dev:527,0 ino:64326 uid:0 gid:0 rdev:223,51 > O_RDWR > sockname: AF_INET 0.0.0.0 port: 0 > /devices/pseudo/tcp at 0:tcp > offset:0 > 13: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,32 > O_RDWR > /devices/pseudo/tl at 0:ticlts > offset:0 > 14: S_IFCHR mode:0000 dev:527,0 ino:64328 uid:0 gid:0 rdev:226,33 > O_RDWR > /devices/pseudo/tl at 0:ticlts > offset:0 > 15: S_IFCHR mode:0000 dev:527,0 ino:64324 uid:0 gid:0 rdev:226,35 > O_RDWR > /devices/pseudo/tl at 0:ticlts > offset:0 > 16: S_IFCHR mode:0000 dev:527,0 ino:64322 uid:0 gid:0 rdev:226,36 > O_RDWR > /devices/pseudo/tl at 0:ticotsord > offset:0 > 17: S_IFCHR mode:0000 dev:527,0 ino:64321 uid:0 gid:0 rdev:226,37 > O_RDWR > /devices/pseudo/tl at 0:ticotsord > offset:0 > 18: S_IFCHR mode:0000 dev:527,0 ino:64030 uid:0 gid:0 rdev:226,39 > O_RDWR > /devices/pseudo/tl at 0:ticots > offset:0 > 19: S_IFCHR mode:0000 dev:527,0 ino:64029 uid:0 gid:0 rdev:226,40 > O_RDWR > /devices/pseudo/tl at 0:ticots > offset:0 > 20: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0 > O_RDWR|O_NONBLOCK > 21: S_IFIFO mode:0000 dev:525,0 ino:206 uid:1 gid:12 rdev:0,0 > O_RDWR|O_NONBLOCK > 23: S_IFCHR mode:0000 dev:527,0 ino:33089 uid:0 gid:0 rdev:129,21273 > O_WRONLY FD_CLOEXEC > /devices/pseudo/log at 0:conslog > offset:0 > > Restarting rpcbind doesn't affect it either: > > # svcadm restart svc:/network/rpc/bind:default > > # netstat -an|grep BOUND|wc -l > 32739 > > In the 
interim of this patch getting integrated I'll monitor the number of
> bound ports to know when I should fail my pool over again.
>
> On Wed, Jan 3, 2018 at 10:32 AM, Marcel Telka wrote:
>
>> On Wed, Jan 03, 2018 at 10:02:43AM -0600, Schweiss, Chip wrote:
>> > The problem occurred again starting last night.  I have another clue,
>> > but I still don't know how it is occurring or how to fix it.
>> >
>> > It looks like all the TCP ports are in "bound" state, but not being
>> > released.
>> >
>> > How can I isolate the cause of this?
>>
>> This is a bug in rpcmod, very likely related to
>> https://www.illumos.org/issues/1616
>>
>> I discussed this a few weeks back with someone who faced the same
>> issue.  It looks like he found the cause and has a fix for it.  I
>> thought he would post a review request, but that hasn't happened for
>> some reason yet.
>>
>> I'll try to push this forward...
>>
>> Thanks.
>>
>> --
>> +-------------------------------------------+
>> | Marcel Telka   e-mail:   marcel at telka.sk |
>> |                homepage: http://telka.sk/ |
>> |                jabber:   marcel at jabber.sk |
>> +-------------------------------------------+

From youzhong at gmail.com Mon Jan 8 16:43:24 2018
From: youzhong at gmail.com (Youzhong Yang)
Date: Mon, 8 Jan 2018 11:43:24 -0500
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To:
References: <20180103163208.GD1629@telcontar>
Message-ID:

This is our patch.  It was applied 3 years ago, so the line numbers could
be different for the latest version of the file.

diff --git a/usr/src/uts/common/rpc/clnt_cots.c b/usr/src/uts/common/rpc/clnt_cots.c
index 4466e93..0a0951d 100644
--- a/usr/src/uts/common/rpc/clnt_cots.c
+++ b/usr/src/uts/common/rpc/clnt_cots.c
@@ -2285,6 +2285,7 @@ start_retry_loop:
 		if (rpcerr->re_status == RPC_SUCCESS)
 			rpcerr->re_status = RPC_XPRTFAILED;
 		cm_entry->x_connected = FALSE;
+		cm_entry->x_dead = TRUE;
 	} else
 		cm_entry->x_connected = connected;

@@ -2403,6 +2404,7 @@ connmgr_wrapconnect(
 		if (rpcerr->re_status == RPC_SUCCESS)
 			rpcerr->re_status = RPC_XPRTFAILED;
 		cm_entry->x_connected = FALSE;
+		cm_entry->x_dead = TRUE;
 	} else
 		cm_entry->x_connected = connected;

On Mon, Jan 8, 2018 at 11:21 AM, Dan McDonald wrote:

> > On Jan 7, 2018, at 3:15 PM, Youzhong Yang wrote:
> >
> > Not sure if it's the same issue we reported 3 years ago.  We applied
> > our patch and haven't seen this issue ever since.
> >
> > https://illumos.topicbox.com/groups/developer/Te5808458a5a5a14f-M74735db9aeccaa5d8c3a70a4
>
> To quote that e-mail:
>
> > Hi Marcel:
> > It looks like we're getting an "early disconnect".  This is what is
> > leading to the accumulation of bound reserved ports.  The scenario for
> > reproduction is as follows:
> >
> > 1. Linux DEBIAN7.4 client acquires and releases lock on file on
> >    server (via NFS).
> > 2. reboot Linux client (but do so _before_ MIR_CLNT_IDLE_TIMEOUT
> >    interval fires on server side).
> > 3. when Linux client comes back up, repeat step 1.
> >
> > At this point, a cm_entry with only the ORDREL flag set in
> > x_state_word will remain in the cm_entry linked list (cm_hd).
> > It appears that without at least a DEAD flag set in x_state_word,
> > this cm_entry will remain bound to the port... and will never be
> > garbage collected.
> >
> > To experiment, we added
> >     cm_entry->x_dead = TRUE;
> > at lines 2272 and 2390 here:
> > https://github.com/joyent/illumos-joyent/blob/master/usr/src/uts/common/rpc/clnt_cots.c
> >
> > Testing with the above reproduction scenario, we are taking the path
> > of line 2272 -- and with the DEAD flag set in x_state_word, these
> > "zombie" cm_entries are now being cleaned up, and we're no longer
> > accumulating/leaking reserved ports.
> >
> > Is there more to it?  This seems too simple a fix.  Are there
> > unintended consequences we should be looking/testing for?  Does this
> > seem like it might be #1616 as well?
> > Thoughts?
> > Thanks!
> > -Ken & Youzhong
>
> And here's the patch in diff form for easier consumption:
>
> diff --git a/usr/src/uts/common/rpc/clnt_cots.c b/usr/src/uts/common/rpc/clnt_cots.c
> index 2e64ab0..f9b78ff 100644
> --- a/usr/src/uts/common/rpc/clnt_cots.c
> +++ b/usr/src/uts/common/rpc/clnt_cots.c
> @@ -2269,6 +2269,7 @@ start_retry_loop:
>  	cm_entry->x_ordrel = FALSE;
>
>  	cm_entry->x_tidu_size = tidu_size;
> +	cm_entry->x_dead = TRUE;
>
>  	if (cm_entry->x_early_disc) {
>  		/*
> @@ -2387,6 +2388,7 @@ connmgr_wrapconnect(
>
>  	mutex_enter(&connmgr_lock);
> +	cm_entry->x_dead = TRUE;
>
>  	if (cm_entry->x_early_disc) {
>  		/*
>
> Went back and checked my notes - I was traveling when that thread was
> going on, so I likely missed it altogether in the hustle/bustle of that.
>
> It seems at first glance you're being too aggressive in setting X_DEAD
> (note that this code gives you BOTH ways to set the flag, via C bit
> fields OR the macro form... makes for very difficult reading, IMHO), but
> if my concern was valid you'd likely see far more outright failures.
>
> Maybe that patch is all we need?
>
> Dan

From softwareinforjam at gmail.com Tue Jan 9 22:07:47 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Tue, 9 Jan 2018 17:07:47 -0500
Subject: [OmniOS-discuss] Problem with attach
Message-ID:

Hi All

I am not sure how this happened.  I thought I followed the instructions,
but I now have a problem.  My non-global zone is now out of sync with my
global zone.

Global zone version:     entire@11-0.151022:20180108T221634Z
Non-Global zone version: entire@11-0.151020:20161102T012108Z

I tried using zoneadm -z zonename attach -u but that failed.  Is there a
way to sync the non-global zone?

Regards

From hasslerd at gmx.li Tue Jan 9 23:56:48 2018
From: hasslerd at gmx.li (Dominik Hassler)
Date: Wed, 10 Jan 2018 00:56:48 +0100
Subject: [OmniOS-discuss] Problem with attach
In-Reply-To:
References:
Message-ID:

Hi,

I am not aware of an entire@11-0.151022:20180108T221634Z in our IPS
repos, as we only ship those for r151022:

hadfl@r151022-build:~$ pkg list -avf entire
FMRI                                             IFO
pkg://omnios/entire@11-0.151022:20171031T101418Z i--
pkg://omnios/entire@11-0.151022:20170917T145315Z ---
pkg://omnios/entire@11-0.151022:20170511T002513Z ---

Could you please elaborate a bit more on how you did the upgrade (i.e.
the steps you took, setting the publisher, etc.)?

On 01/09/2018 11:07 PM, Software Information wrote:
> Hi All
> I am not sure how this happened.  I thought I followed the instructions,
> but I now have a problem.  My non-global zone is now out of sync with my
> global zone.
>
> Global zone version:     entire@11-0.151022:20180108T221634Z
> Non-Global zone version: entire@11-0.151020:20161102T012108Z
>
> I tried using zoneadm -z zonename attach -u but that failed.  Is there a
> way to sync the non-global zone?
>
> Regards

From groenveld at acm.org Wed Jan 10 00:33:55 2018
From: groenveld at acm.org (John D Groenveld)
Date: Tue, 09 Jan 2018 19:33:55 -0500
Subject: [OmniOS-discuss] Problem with attach
In-Reply-To: Your message of "Wed, 10 Jan 2018 00:56:48 +0100."
References:
Message-ID: <201801100033.w0A0Xth7012438@groenveld.us>

In message , Dominik Hassler writes:
>Could you please elaborate a bit more on how you did the upgrade (i.e.
>the steps you took, setting the publisher, etc.)?

And the attach.log.

John
groenveld at acm.org

From stephan.budach at jvm.de Fri Jan 12 10:17:09 2018
From: stephan.budach at jvm.de (Stephan Budach)
Date: Fri, 12 Jan 2018 11:17:09 +0100 (CET)
Subject: [OmniOS-discuss] NVMe under omniOS CE
In-Reply-To: <478079415.1033.1515752082909.JavaMail.stephan.budach@stephan.budach.jvm.de>
Message-ID: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>

Hi,

I finally got the first of my two Supermicro 2028R-N48M NVMe servers.  I
installed the latest OmniOSce on it and, as it seems, it doesn't
recognize the NVMe drives.  This box is equipped with 24x Intel DC P4500,
PCIe 3.1 NVMe.

Does anybody know why those are not recognized?

Thanks,
Stephan

From mir at miras.org Fri Jan 12 11:37:38 2018
From: mir at miras.org (Michael Rasmussen)
Date: Fri, 12 Jan 2018 12:37:38 +0100
Subject: [OmniOS-discuss] NVMe under omniOS CE
In-Reply-To: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
References: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
Message-ID:

If I recall correctly, Dan once mentioned that only 3.0 is supported.

Sent from BlueMail

On Jan 12, 2018, at 11:21, Stephan Budach wrote:
>Hi,
>
>I finally got the first of my two Supermicro 2028R-N48M NVMe servers.  I
>installed the latest OmniOSce on it and, as it seems, it doesn't
>recognize the NVMe drives.  This box is equipped with 24x Intel DC
>P4500, PCIe 3.1 NVMe.
>
>Does anybody know why those are not recognized?
>
>Thanks,
>Stephan
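As Stephan's follow-up below confirms, the gating factor here is the
illumos nvme driver's NVMe version check, controlled from
/kernel/drv/nvme.conf.  A sketch of the relevant setting (the property
name matches the stock illumos nvme.conf; the comment text and the rest
of the file are our assumptions):

    # /kernel/drv/nvme.conf
    # With strict version checking enabled, the driver only attaches
    # devices reporting NVMe 1.x; relax it for newer spec revisions.
    strict-version=0;

A reboot is needed afterwards so the changed property is re-read.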
From stephan.budach at jvm.de Fri Jan 12 11:39:52 2018
From: stephan.budach at jvm.de (Stephan Budach)
Date: Fri, 12 Jan 2018 12:39:52 +0100 (CET)
Subject: [OmniOS-discuss] NVMe under omniOS CE
In-Reply-To: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
References: <811459154.1043.1515752239683.JavaMail.stephan.budach@stephan.budach.jvm.de>
Message-ID: <1806929444.1090.1515757203871.JavaMail.stephan.budach@stephan.budach.jvm.de>

Shoot - please forgive my ignorance... uncommenting strict-version in
nvme.conf solved that.

Cheers,
Stephan

----- Original Message -----
> From: "Stephan Budach"
> To: "omnios-discuss"
> Sent: Freitag, 12. Januar 2018 11:17:09
> Subject: [OmniOS-discuss] NVMe under omniOS CE

> Hi,
> I finally got the first of my two Supermicro 2028R-N48M NVMe servers.
> I installed the latest OmniOSce on it and, as it seems, it doesn't
> recognize the NVMe drives.  This box is equipped with 24x Intel DC
> P4500, PCIe 3.1 NVMe.
> Does anybody know why those are not recognized?
> Thanks,
> Stephan

From vab at bb-c.de Sat Jan 13 21:38:36 2018
From: vab at bb-c.de (Volker A. Brandt)
Date: Sat, 13 Jan 2018 22:38:36 +0100
Subject: [OmniOS-discuss] Invitation to an OmniOS event near Frankfurt, Germany (Tue Jan 16)
Message-ID: <23130.31836.110356.702888@shelob.bb-c.de>

[Stupid me sent this to the -bounce addr first -- no Reply-To :-(]

Hello all!

Here is an invitation to an OmniOS-related event in Frankfurt, Germany.
This is the regular monthly meeting of the Frankfurt OpenSolaris User
Group (FRAOSUG).  Yes, we still exist. :-)

We will meet next Tuesday (Jan 16th 2018) at 6:30pm in Dreieich.
The meeting is held in German; the invitation follows:

------------------------------------------------------------------------
Next Tuesday, FRAOSUG invites you to its monthly meeting.  Once again we
are close to the "original" OpenSolaris, as the topic is "everything
around OmniOS CE".

After a short introduction to what OmniOS, and OmniOS CE in particular,
actually is, we want to look at why OmniOS is regarded as the "legitimate
successor" of OpenSolaris on servers.

In particular, a complete installation of OmniOS on an HP G8 Microserver
will be demonstrated live, including configuration of the new
FreeBSD-derived boot loader for the serial console of the HP iLO.

If anyone brings along a laptop with VirtualBox, there will also be the
opportunity to copy a prepared OmniOS image and install it yourself.

This time our meeting takes place at Oracle in Dreieich:
https://fraosug.de/

Registration via our survey tool:
https://owncloud-002.qutic.com/index.php/apps/polls/poll/iJtyBtV1eNaCkvdo

The event is free of charge, and as always: using Solaris is not a
prerequisite for attending...
------------------------------------------------------------------------

Basically we are going to do an intro to what OmniOS (and CE) is,
followed by a live Kayak network installation (if I manage to get it to
work :-).  Attendance is free, and we will be at the Oracle office in
Dreieich.  Everyone is welcome.  Make sure to register via our web page.


Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt               Consulting and Support for Oracle Solaris
Brandt & Brandt Computer GmbH                   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY            Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513              Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From chip at innovates.com Wed Jan 17 14:36:50 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Wed, 17 Jan 2018 08:36:50 -0600
Subject: [OmniOS-discuss] [zfs] Re: rpcbind: t_bind failed
In-Reply-To: <8E071B37-24A7-4EC6-B9FD-D1983929CFEC@joyent.com>
References: <20180103163208.GD1629@telcontar> <8E071B37-24A7-4EC6-B9FD-D1983929CFEC@joyent.com>
Message-ID:

I haven't seen this bug filed yet.  Please submit it.  For anyone using
the automounter, this bug is a ticking time bomb.

I've been able to stretch the interval between reboots by about a week
with:

ndd -set /dev/tcp tcp_smallest_anon_port 1024

However, until this is fixed, I'm forced to reboot every couple of weeks.

Thank you,
-Chip

On Mon, Jan 8, 2018 at 10:46 AM, Dan McDonald wrote:

> OH PHEW!
>
> > On Jan 8, 2018, at 11:43 AM, Youzhong Yang wrote:
> >
> > This is our patch.  It was applied 3 years ago, so the line numbers
> > could be different for the latest version of the file.
> >
> > diff --git a/usr/src/uts/common/rpc/clnt_cots.c b/usr/src/uts/common/rpc/clnt_cots.c
> > index 4466e93..0a0951d 100644
> > --- a/usr/src/uts/common/rpc/clnt_cots.c
> > +++ b/usr/src/uts/common/rpc/clnt_cots.c
> > @@ -2285,6 +2285,7 @@ start_retry_loop:
> >  		if (rpcerr->re_status == RPC_SUCCESS)
> >  			rpcerr->re_status = RPC_XPRTFAILED;
> >  		cm_entry->x_connected = FALSE;
> > +		cm_entry->x_dead = TRUE;
> >  	} else
> >  		cm_entry->x_connected = connected;
> >
> > @@ -2403,6 +2404,7 @@ connmgr_wrapconnect(
> >  		if (rpcerr->re_status == RPC_SUCCESS)
> >  			rpcerr->re_status = RPC_XPRTFAILED;
> >  		cm_entry->x_connected = FALSE;
> > +		cm_entry->x_dead = TRUE;
> >  	} else
> >  		cm_entry->x_connected = connected;
>
> This makes TONS more sense, and alleviates/obviates my concerns
> previously.
>
> If there isn't a bug already, please file one.  Once filed or found,
> please add me as a code reviewer for this.
>
> Thanks,
> Dan

From paul.jochum at nokia.com Mon Jan 22 15:24:17 2018
From: paul.jochum at nokia.com (Paul Jochum)
Date: Mon, 22 Jan 2018 09:24:17 -0600
Subject: [OmniOS-discuss] zpool replace command returns internal error: out of memory
Message-ID: <70b93224-944d-485f-0a43-b570a3e563fe@nokia.com>

Hi All:

Last Saturday, I updated my servers to the latest version of OmniOS-CE
(r151024j), and today, while trying to replace a drive, I received the
following error message:

# zpool replace zfs_pool c11t5000C5003A017D5Bd0 c11t5000C5003A39950Bd0
internal error: out of memory

Some information about my system:

# uname -a
SunOS lss-nfsa05 5.11 omnios-r151024-e482f10563 i86pc i386 i86pc

root@lss-nfsa05:~# uptime
09:20:28 up 0:18, 3 users, load average: 0.41, 0.30, 0.23

load averages:  0.38,  0.30,  0.23;    up 0+00:19:09      09:20:45
61 processes: 60 sleeping, 1 on cpu
CPU states: 99.3% idle, 0.0% user, 0.7% kernel, 0.0% iowait, 0.0% swap
Kernel: 1623 ctxsw, 1 trap, 1446 intr, 254 syscall, 1 flt
Memory: 128G phys mem, 118G free mem, 4096M total swap, 4096M free swap

I have tried rebooting my system and also exporting and importing the
zfs_pool pool, but neither step helped.

Any suggestions on what to try next, or what info to collect to help
debug this?

thanks,

Paul

From softwareinforjam at gmail.com Tue Jan 23 02:13:00 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Mon, 22 Jan 2018 21:13:00 -0500
Subject: [OmniOS-discuss] Zone welcome message
Message-ID:

Hi All

I was just wondering.  I recently updated my host machine from r151020 to
r151022.  The welcome message on the host machine is fine, but when I log
on to the non-global zone, I still see the welcome message for r151020
below.

OmniOS 5.11     omnios-r151020-4151d05  March 2017

But when I do a uname -a in the non-global zone, I get:

SunOS test-zone 5.11 omnios-r151022-f9693432c2 i86pc i386 i86pc

Is there any way to make the non-global zone show the correct welcome
message?

Thanks and regards
SI

From danmcd at kebe.com Tue Jan 23 02:23:40 2018
From: danmcd at kebe.com (Dan McDonald)
Date: Mon, 22 Jan 2018 21:23:40 -0500
Subject: [OmniOS-discuss] Zone welcome message
In-Reply-To:
References:
Message-ID: <20180123022340.GB28479@everywhere.local>

On Mon, Jan 22, 2018 at 09:13:00PM -0500, Software Information wrote:
> Hi All
> I was just wondering.  I recently updated my host machine from r151020
> to r151022.  The welcome message on the host machine is fine, but when
> I log on to the non-global zone, I still see the welcome message for
> r151020 below.
>
> OmniOS 5.11     omnios-r151020-4151d05  March 2017
>
> But when I do a uname -a in the non-global zone, I get:
> SunOS test-zone 5.11 omnios-r151022-f9693432c2 i86pc i386 i86pc
>
> Is there any way to make the non-global zone show the correct welcome
> message?

If it's an ipkg zone, you'll have to "pkg update" inside the zone (and
possibly reboot it).

If it's an lipkg zone, make sure you're using the "-r" flag when updating
the global, OR use "pkg update" inside like an ipkg zone.

Dan

From softwareinforjam at gmail.com Thu Jan 25 01:51:05 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Wed, 24 Jan 2018 20:51:05 -0500
Subject: [OmniOS-discuss] NTP Service error
Message-ID:

Hi All

Today I made the switch, updating my r151020 box to OmniOSce in
production, and I am now left with just two issues.

ntp won't start in one zone.  All the dependent services are online.

The log says:

[ Jan 24 19:04:50 Method "start" exited with status 96. ]
[ Jan 24 19:49:39 Enabled. ]
[ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
[ Jan 24 19:49:40 svc.startd could not set context for method: ]
setppriv: Not owner

Not quite sure what this means.  Could anyone give me a pointer please?

Kind Regards

From omnios at citrus-it.net Thu Jan 25 07:28:23 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 07:28:23 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Wed, 24 Jan 2018, Software Information wrote:

; Hi All
; Today I made the switch, updating my r151020 box to OmniOSce in
; production, and I am now left with just two issues.
;
; ntp won't start in one zone.  All the dependent services are online.
From softwareinforjam at gmail.com Thu Jan 25 01:51:05 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Wed, 24 Jan 2018 20:51:05 -0500
Subject: [OmniOS-discuss] NTP Service error
Message-ID:

Hi All

Today I made the switch, updating my r151020 box to OmniOS CE in production,
and I am now left with just two issues.

ntp won't start in one zone. All the dependent services are online.

The log says:

[ Jan 24 19:04:50 Method "start" exited with status 96. ]
[ Jan 24 19:49:39 Enabled. ]
[ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
[ Jan 24 19:49:40 svc.startd could not set context for method: ]
setppriv: Not owner

Not quite sure what this means. Could anyone give me a pointer please?

Kind Regards
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From omnios at citrus-it.net Thu Jan 25 07:28:23 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 07:28:23 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Wed, 24 Jan 2018, Software Information wrote:

; Hi All
; Today I made the switch, updating my r151020 box to OmniOS CE in
; production, and I am now left with just two issues.
;
; ntp won't start in one zone. All the dependent services are online.
;
; The log says:
; [ Jan 24 19:04:50 Method "start" exited with status 96. ]
; [ Jan 24 19:49:39 Enabled. ]
; [ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
; [ Jan 24 19:49:40 svc.startd could not set context for method: ]
; setppriv: Not owner

To run NTP in a zone, the zone needs the sys_time privilege. Is that still
present in the zone config?

  # zonecfg -z ntp0 info | grep limitpriv
  limitpriv: default,proc_priocntl,sys_time

I can't think why it would have gone away during an upgrade, but it's the
first thing to check.

Regards,

Andy

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From jimklimov at cos.ru Thu Jan 25 09:29:28 2018
From: jimklimov at cos.ru (Jim Klimov)
Date: Thu, 25 Jan 2018 09:29:28 +0000
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID: <893C93D0-B6E3-4352-A6F1-836320473AC0@cos.ru>

On January 25, 2018 1:51:05 AM UTC, Software Information wrote:
>Hi All
>Today I made the switch, updating my r151020 box to OmniOS CE in
>production, and I am now left with just two issues.
>
>ntp won't start in one zone. All the dependent services are online.
>
>The log says:
>[ Jan 24 19:04:50 Method "start" exited with status 96. ]
>[ Jan 24 19:49:39 Enabled. ]
>[ Jan 24 19:49:40 Executing start method ("/lib/svc/method/ntp start"). ]
>[ Jan 24 19:49:40 svc.startd could not set context for method: ]
>setppriv: Not owner
>
>Not quite sure what this means. Could anyone give me a pointer please?
>
>Kind Regards

Local zones normally can not control the host system clock, so you need to
add a privilege entry into the zone's XML descriptor (or do the equivalent
via zonecfg). Not sure it will let you actually set host time from the
local zone (probably you'll need an NTP client in the NGZ to set the
physical clock), but this will allow you to run an NTP server to give out
time to clients.

Note that you might have to fiddle with ntp.conf also, so the server does
not report itself as a useless 'stratum 16' (since it has not confirmed
setting the clock from a source of known reliability). Note it can take
some 15 minutes for ntpd to settle on its own stratum even when it is in
charge of host clock sync; let us know if you succeed in forcing a number
otherwise (fudge did not help me back when...).

Hope this helps,
Jim
--
Typos courtesy of K-9 Mail on my Android
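For the ntp.conf fiddling Jim mentions, a minimal sketch of an in-zone
server config; the pool hostnames are illustrative, and orphan mode is one
way (on reasonably recent ntpd) to keep answering clients at a believable
stratum instead of the useless stratum 16:

    # /etc/inet/ntp.conf -- sketch for an NTP server running in a zone
    driftfile /var/ntp/ntp.drift

    # Upstream sources (illustrative pool hosts)
    server 0.pool.ntp.org iburst
    server 1.pool.ntp.org iburst

    # If all upstreams become unreachable, degrade to stratum 10 in
    # orphan mode rather than advertising stratum 16 to clients.
    tos orphan 10
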
From omnios at citrus-it.net Thu Jan 25 15:23:38 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 15:23:38 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Thu, 25 Jan 2018, Software Information wrote:

; Hi. Thanks for replying.
;
; Running zonecfg -z zone_name info | grep limitpriv results in:
; limitpriv: default,dtrace_proc,dtrace_user,sys_time
;
; So that's still there

Try adding proc_priocntl too. Seems it's also needed by NTP.

Andy

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From omnios at citrus-it.net Thu Jan 25 16:18:13 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 25 Jan 2018 16:18:13 +0000 (UTC)
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

On Thu, 25 Jan 2018, Software Information wrote:

; I ran
; # zonecfg -z zonename set limitpriv="proc_priocntl"
;
; But it resulted in:
; zonename: invalid privilege: sys_mount

You need to add the privilege, not replace what's already there:

# zonecfg -z zonename set limitpriv=default,proc_priocntl,sys_time

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From softwareinforjam at gmail.com Thu Jan 25 21:25:33 2018
From: softwareinforjam at gmail.com (Software Information)
Date: Thu, 25 Jan 2018 16:25:33 -0500
Subject: [OmniOS-discuss] NTP Service error
In-Reply-To:
References:
Message-ID:

The command:

zonecfg -z zonename set limitpriv=default,proc_priocntl,sys_time

actually fixed the problem. I thought it hadn't worked, only to realize I
hadn't rebooted the zone. That was my bad. Thanks so much for the support.

Kind Regards.

On Thu, Jan 25, 2018 at 11:18 AM, Andy Fiddaman wrote:

> On Thu, 25 Jan 2018, Software Information wrote:
>
> ; I ran
> ; # zonecfg -z zonename set limitpriv="proc_priocntl"
> ;
> ; But it resulted in:
> ; zonename: invalid privilege: sys_mount
>
> You need to add the privilege, not replace what's already there:
>
> # zonecfg -z zonename set limitpriv=default,proc_priocntl,sys_time
>
> --
> Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
> Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
> Registered in England and Wales | Company number 4899123
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
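Pulling the thread together, the working sequence, with a hypothetical zone
name myzone:

    # Inspect the current privilege limit of the zone.
    zonecfg -z myzone info limitpriv

    # "set limitpriv" replaces the whole list rather than appending, so
    # include "default" plus every extra privilege the zone still needs.
    zonecfg -z myzone set limitpriv=default,proc_priocntl,sys_time

    # The change only takes effect on the next zone boot.
    zoneadm -z myzone reboot
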
From paladinemishakal at gmail.com Tue Jan 30 07:16:52 2018
From: paladinemishakal at gmail.com (Lawrence Giam)
Date: Tue, 30 Jan 2018 15:16:52 +0800
Subject: [OmniOS-discuss] Intel Chipset support
Message-ID:

Hi All,

Is there a place I can go to find out which Intel chipsets are supported by
OmniOS?

Thanks & Regards.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From manuel at oetiker.ch Tue Jan 30 07:58:54 2018
From: manuel at oetiker.ch (Manuel Oetiker)
Date: Tue, 30 Jan 2018 08:58:54 +0100 (CET)
Subject: [OmniOS-discuss] Intel Chipset support
In-Reply-To:
References:
Message-ID: <1848885924.488266.1517299134606.JavaMail.zimbra@oetiker.ch>

Hi

https://illumos.org/hcl/

Cheers,
Manuel

----- Original Message -----
> From: "Lawrence Giam"
> To: "omnios-discuss"
> Sent: Tuesday, January 30, 2018 8:16:52 AM
> Subject: [OmniOS-discuss] Intel Chipset support

> Hi All,
> Is there a place I can go to find out which Intel chipsets are supported
> by OmniOS?
> Thanks & Regards.
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From stephan.budach at jvm.de Wed Jan 31 18:39:06 2018
From: stephan.budach at jvm.de (Stephan Budach)
Date: Wed, 31 Jan 2018 19:39:06 +0100 (CET)
Subject: [OmniOS-discuss] How to safely remove/replace NVMe SSDs
In-Reply-To: <1205068386.4148.1517423581346.JavaMail.stephan.budach@stephanbudach.local>
Message-ID: <1533340439.4158.1517423920653.JavaMail.stephan.budach@stephanbudach.local>

Hi,

I have purchased two of those Supermicro NVMe servers: SSG-2028R-NR48N.
Both of them are equipped with 24x Intel DC P4500 U.2 devices, which are
obviously hot-pluggable; at least they seem to be. ;)

At the moment, I am trying to familiarize myself with the handling of these
devices, and I am having quite a hard time coming up with a method of
safely removing/replacing such a device. I am able to detach an NVMe device
using nvmeadm, but removing it from the system by pulling it out causes the
kernel to retire the PCI device, and I have not yet found a way to get the
re-inserted device online again.

Anybody having some experience with how to handle these NVMe devices?

Thanks,
Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
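Not a confirmed answer, but a sketch of the sequence worth trying, assuming
nvmeadm(1M) and hotplug(1M) behave as documented; the controller instance
nvme3 and the PCIe path/connector names are hypothetical:

    # List NVMe controllers and their namespaces; pick the one to pull.
    nvmeadm list

    # Quiesce the blkdev instance before physically removing the drive.
    nvmeadm detach nvme3

    # Check whether FMA has retired the device after the pull.
    fmadm faulty

    # Inspect the PCIe hotplug connectors and their states.
    hotplug list -lv

    # After re-inserting the drive, try re-enabling the connector and
    # re-attaching the blkdev instance (path and connector hypothetical).
    hotplug enable /pci@0,0 pcie0
    nvmeadm attach nvme3
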