From nomad at ee.washington.edu  Thu Aug 23 15:37:01 2018
From: nomad at ee.washington.edu (Lee Damon)
Date: Thu, 23 Aug 2018 08:37:01 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
Message-ID:

I recently installed a new host. So new I couldn't install LTS on it, so I've installed 151026. This host is strictly for serving ZFS-based NFS & CIFS. Everything else is just default.

Over time it has become fairly obvious to me that NFS writes are ... well, abysmal.

This example is copying a 36GB directory of mixed size/type files. The first copy is strictly on a filesystem on the new server. The second is reading from the new server to an existing one. The third is doing the same read/write activity as test one but on an existing server running 151022.

on new fileserver:

: || nomad at omics1 fs2test ; time cp -rp 004test omics1/004test-1
real    22m27.225s
user    0m0.188s
sys     0m29.880s

reading from new fileserver, writing to existing fileserver:

: || nomad at omics1 hvfs2test ; time cp -rp /misc/fs2test/004test .
real    2m9.770s
user    0m0.180s
sys     0m28.694s

existing fileserver:

: || nomad at omics1 hvfs2test ; time cp -rp 004test omics1/004test-1
real    2m14.158s
user    0m0.242s
sys     0m30.313s

While the user and system times are consistent across all tests, the wall-clock time of the first test is 10x that of the others. I've seen wall-clock time on these tests take as long as 50 minutes. All tests were done on the same CentOS 7 host.

Watching snoop collect packets I see multiple-minutes-long pauses while writing to the new server.

If I'm reading the heat maps right - https://drive.google.com/open?id=1zcX9ryXjrPMH0_uUbfywiTTnJDau4WW0 - it seems to be spending about 81% of its time in _t_cancel, waiting on a thread to cancel. I'm not a dev and haven't looked at the code, so it's quite possible I'm misunderstanding what the map is saying.

The client spends so much time stuck in diskwait that it can take several minutes to respond after a SIGINT, SIGHUP, or SIGKILL to the cp process.

Is anyone else seeing similar problems?

nomad

From bfriesen at simple.dallas.tx.us  Thu Aug 23 16:08:57 2018
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Thu, 23 Aug 2018 11:08:57 -0500 (CDT)
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

What does 'zpool status poolname' (replace poolname with the name of the pool which is NFS exported) say?

What is the output of 'iostat -xnE' on your new server?

What is the native block size for the disks you used, and what is the nature of the disks (SATA, SAS, nearline storage, exceptionally large size, etc.)?

Do you have dedicated ZIL SSDs in your pool?

Have you done a continual ping from the NFS client to the server to see if there are packet drops?

If you use some other TCP-based protocol to transfer a file from the client to the server, do you see any strange hangs during the transfer?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From nomad at ee.washington.edu  Thu Aug 23 16:38:38 2018
From: nomad at ee.washington.edu (Lee Damon)
Date: Thu, 23 Aug 2018 09:38:38 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

These are 12TB SAS drives (Seagate ST12000NM0027) for data & hot spare. ZIL & L2ARC are 480GB INTEL SSDSC2KG48 SSDs. Everything is left at default for sector size, etc.
They were basically prepared for the pool with a simple fdisk -B /dev/rdsk/drive.

Ping never shows loss of connectivity. I ran this for about 5 minutes during a test:

303 packets transmitted, 303 received, 0% packet loss, time 302021ms
rtt min/avg/max/mdev = 0.109/0.281/2.881/0.227 ms

CIFS, scp, and rsync do not exhibit the problem. I forgot to mention that local copies on the file server are also as fast as I would expect (~2 min).

  pool: pool0
 state: ONLINE
  scan: none requested
config:

        NAME                         STATE     READ WRITE CKSUM
        pool0                        ONLINE       0     0     0
          raidz2-0                   ONLINE       0     0     0
            c0t5000C500A612DA93d0    ONLINE       0     0     0
            c0t5000C500957D4A93d0    ONLINE       0     0     0
            c0t5000C500957D4C1Bd0    ONLINE       0     0     0
            c0t5000C500957D25B3d0    ONLINE       0     0     0
            c0t5000C500957D27F3d0    ONLINE       0     0     0
            c0t5000C500957D2553d0    ONLINE       0     0     0
        logs
          mirror-1                   ONLINE       0     0     0
            c0t55CD2E414EC0FF43d0s0  ONLINE       0     0     0
            c3t0d0s0                 ONLINE       0     0     0
        cache
          c0t55CD2E414EC0FF43d0s1    ONLINE       0     0     0
          c3t0d0s1                   ONLINE       0     0     0
        spares
          c0t5000C50095722E27d0      AVAIL

iostat:
                    extended device statistics
    r/s    w/s   kr/s   kw/s  wait actv wsvc_t asvc_t  %w  %b device
   15.8   57.5 1302.5 5372.3 823.0  0.6 11233.2   8.4   9   9 pool0
    0.1   19.9    1.7  163.0   0.0  0.0    1.0    0.1   0   0 rpool
    0.1   10.2    0.9   81.5   0.0  0.0    0.0    0.0   0   0 c1t4d0
    0.0   10.1    0.8   81.5   0.0  0.0    0.0    0.1   0   0 c1t5d0
    2.7   19.5  337.8 1359.4   0.4  0.0   16.1    0.5   8   1 c3t0d0
    1.9    6.0  114.5  442.2   0.0  0.0    0.0    5.0   0   1 c0t5000C500957D27F3d0
    0.0    0.0    0.0    0.0   0.0  0.0    0.0    0.1   0   0 c0t5000C50095722E27d0
    1.5    6.0   86.2  442.9   0.0  0.0    0.0    4.7   0   1 c0t5000C500957D25B3d0
    1.5    6.0   78.2  442.5   0.0  0.0    0.0    4.2   0   1 c0t5000C500957D4C1Bd0
    1.7    6.1  102.4  442.8   0.0  0.0    0.0    4.7   0   1 c0t5000C500A612DA93d0
    1.6    6.0   86.6  442.5   0.0  0.0    0.0    4.3   0   1 c0t5000C500957D2553d0
    2.0    5.9  122.2  442.2   0.0  0.0    0.0    5.0   0   1 c0t5000C500957D4A93d0
    2.9   19.5  374.7 1357.9   0.0  0.0    0.0    1.2   0   1 c0t55CD2E414EC0FF43d0

c1t4d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: INTEL SSDSC2KB24 Revision: 0121 Serial No: BTYS817407RE240
Size: 240.06GB <240057409536 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c1t5d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: INTEL SSDSC2KB24 Revision: 0121 Serial No: BTYS817409YS240
Size: 240.06GB <240057409536 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c3t0d0           Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: INTEL SSDSC2KG48 Revision: 0121 Serial No: BTYM7405027L480
Size: 480.10GB <480103981056 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t5000C500957D27F3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST12000NM0027    Revision: E001 Serial No: ZJV0VFGX0000J74
Size: 12000.14GB <12000138625024 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t5000C50095722E27d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST12000NM0027    Revision: E001 Serial No: ZJV0S42H0000J75
Size: 12000.14GB <12000138625024 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t5000C500957D25B3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST12000NM0027    Revision: E001 Serial No: ZJV0VFJV0000J74
Size: 12000.14GB <12000138625024 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t5000C500957D4C1Bd0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST12000NM0027    Revision: E001 Serial No: ZJV0P6050000J83
Size: 12000.14GB <12000138625024 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t5000C500A612DA93d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST12000NM0027    Revision: E001 Serial No: ZJV0WCCN0000J80
Size: 12000.14GB <12000138625024 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t5000C500957D2553d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST12000NM0027    Revision: E001 Serial No: ZJV0VFK30000J74
Size: 12000.14GB <12000138625024 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t5000C500957D4A93d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: SEAGATE  Product: ST12000NM0027    Revision: E001 Serial No: ZJV0VBQ80000R81
Size: 12000.14GB <12000138625024 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c0t55CD2E414EC0FF43d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: ATA      Product: INTEL SSDSC2KG48 Revision: 0121 Serial No: BTYM740600ZT480
Size: 480.10GB <480103981056 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

nomad

From vab at bb-c.de  Thu Aug 23 16:51:02 2018
From: vab at bb-c.de (Volker A. Brandt)
Date: Thu, 23 Aug 2018 18:51:02 +0200
Subject: [OmniOS-discuss] pkg update broken on r151026 for lipkg branded NGZs
Message-ID: <23422.58870.730172.385552@shelob.bb-c.de>

Hello all!

I have a very strange problem doing a pkg update on an r151026 system. This machine has 11 NGZs, all are lipkg brand. The GZ is running

SunOS radbug 5.11 omnios-r151026-b6848f4455 i86pc i386 i86pc

(before the update).

When I run pkg update with the "-r" flag, it shows some packages it wants to update, then does its thing, and ... stops. No new BE is created:

# pkg update -v -rC0 --be-name=ooce-026-20180823
[...]
Planning linked: 9/11 done; 2 working: zone:kayak zone:omnit3
Linked image 'zone:omnit3' output:
|  Packages to update:  11
|  Services to change:   2
|  Estimated space available: 426.13 GB
|  Estimated space to be consumed: 173.45 MB
|  Rebuild boot archive: No
|
| Changed packages:
| omnios
|   SUNWcs
|     0.5.11-0.151026:20180622T094606Z -> 0.5.11-0.151026:20180814T181134Z
|   developer/debug/mdb
|     0.5.11-0.151026:20180621T235844Z -> 0.5.11-0.151026:20180814T181141Z
|   library/security/openssl
|     1.0.2.15-0.151026 -> 1.0.2.16-0.151026
|   network/dns/bind
|     9.11.3-0.151026 -> 9.11.4-0.151026
|   network/openssh
|     7.6.1-0.151026:20180420T101453Z -> 7.6.1-0.151026:20180818T202827Z
|   network/openssh-server
|     7.6.1-0.151026:20180420T101522Z -> 7.6.1-0.151026:20180818T202943Z
|   release/name
|     0.5.11-0.151026:20180622T100612Z -> 0.5.11-0.151026:20180820T120713Z
|   service/network/ntp
|     4.2.8.11-0.151026 -> 4.2.8.12-0.151026
|   system/kernel
|     0.5.11-0.151026:20180621T235958Z -> 0.5.11-0.151026:20180814T181345Z
|   system/kernel/platform
|     0.5.11-0.151026:20180621T235956Z -> 0.5.11-0.151026:20180814T181344Z
|   web/curl
|     7.60.0-0.151026 -> 7.61.0-0.151026
|
| Services:
|   restart_fmri:
|     svc:/network/ntp:default
|     svc:/network/ssh:default
|
| Editable files to change:
|   Update:
|     etc/motd
[...]
Planning linked: 11/11 done
DOWNLOAD        PKGS         FILES    XFER (MB)   SPEED
Completed      11/11     2263/2263    46.3/46.3    0B/s
Downloading linked: 0/11 done; 11 working: zone:kayak zone:omnib0 zone:omnib1 zone:omnib2 zone:omnib3 zone:omnib4 zone:omnit0 zone:omnit1 zone:omnit2 zone:omnit3 zone:omnit4
Downloading linked: 1/11 done; 10 working: zone:kayak zone:omnib1 zone:omnib2 zone:omnib3 zone:omnib4 zone:omnit0 zone:omnit1 zone:omnit2 zone:omnit3 zone:omnit4
Downloading linked: 2/11 done; 9 working: zone:omnib1 zone:omnib2 zone:omnib3 zone:omnib4 zone:omnit0 zone:omnit1 zone:omnit2 zone:omnit3 zone:omnit4
Linked progress: \||||||-|98.540u 11.950s 0:51.57 214.2% 0+0k 0+0io 0pf+0w
Exit 1

Note that it just returned exit code 1 right in the middle of the "Linked progress" display.

When I omit the "-r", things change:

# zonename
omnib0
# pkg update -v -C0 --be-name=ooce-026-20180823
[...]
Planning linked: 10/11 done; 1 working: zone:omnit4
Linked image 'zone:omnit4' output:
|  Packages to update:  1
|  Estimated space available: 426.01 GB
|  Estimated space to be consumed: 35.03 MB
|  Rebuild boot archive: No
|
| Changed packages:
| omnios
|   SUNWcs
|     0.5.11-0.151026:20180622T094606Z -> 0.5.11-0.151026:20180814T181134Z
|
| Editable files to change:
|   Update:
|     etc/motd

A new BE is created. However, it just updates the SUNWcs package containing the new motd file. When I boot into the new BE and retry "pkg update -rC0", I get the same result: it just stops without a new BE. The GZ is now on:

SunOS radbug 5.11 omnios-r151026-51c7d6fd75 i86pc i386 i86pc

Logging into any one zone, I can update that zone individually. The update will try to apply all 11 packages that are newer in the repository. However, that produces an error because bootadm update-archive is run and subsequently fails:

# pkg update -v --be-name=deleteme
Packages to update: 11
[...]
system/kernel/platform
  0.5.11-0.151026:20180621T235956Z -> 0.5.11-0.151026:20180814T181344Z
web/curl
  7.60.0-0.151026 -> 7.61.0-0.151026

DOWNLOAD        PKGS         FILES    XFER (MB)   SPEED
Completed      11/11     1519/1519    30.0/30.0   3.9M/s

PHASE                                          ITEMS
Removing old actions                           28/28
Installing new actions                         78/78
Updating modified actions                  1520/1520
Updating package state database                 Done
Updating package cache                         11/11
Updating image state                            Done
Creating fast lookup database                   Done
pkg: '/sbin/bootadm update-archive -R /tmp/tmp36Jtli' failed.
with a return code of 1.
Updating package cache                           3/3
pkg: unable to activate deleteme
Updating package cache                           3/3
[...]

# beadm list
BE       Active Mountpoint     Space Policy Created
zbe      xb     -              2.45M static 2018-07-11 23:16
zbe-1    xb     -               204K static 2018-08-23 17:33
zbe-2    NR     /               238K static 2018-08-23 17:59
deleteme -      /tmp/tmp36Jtli 1.05G static 2018-08-23 18:28
# beadm unmount deleteme
Unmounted successfully
# beadm activate deleteme
Unable to activate deleteme.
BE promotion failed.

Before all that, I had to update pkg, which worked fine using -r -C0. I am now running pkg://omnios/package/pkg@0.5.11-0.151026:20180725T094123Z, which is the current version in the repo.

Effectively I cannot pkg update my system including the zones any more. I have previously updated this system without any problems.

Any ideas?


Thanks -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt            Consulting and Support for Solaris-based Systems
Brandt & Brandt Computer GmbH                      WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY         Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513       Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From doug at will.to  Thu Aug 23 16:56:24 2018
From: doug at will.to (Doug Hughes)
Date: Thu, 23 Aug 2018 12:56:24 -0400
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

NFS writes (especially for lots of small files) to OmniOS *really* benefit from having the ZIL on those SSDs.

You could remove the cache from the pool, carve off an 8GB chunk for ZIL on each and the rest for L2ARC if you want that. Then add a mirrored ZIL using the 8GB chunks and the other partition for L2ARC.

An SSD ZIL helps with metadata update absorption and small-file writes that are synchronous over NFS. A lot. (That's my experience.)
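In concrete terms that recipe would look something like the sketch below. This is illustrative only: it assumes the 8GB/remainder split has already been made on each SSD (with format or fdisk), and it borrows the device names from Lee's zpool status.

    # cache (and log) vdevs can be removed from a live pool
    zpool remove pool0 c0t55CD2E414EC0FF43d0s1 c3t0d0s1
    # after repartitioning, add the small slices back as a mirrored log (ZIL)
    zpool add pool0 log mirror c0t55CD2E414EC0FF43d0s0 c3t0d0s0
    # and the large slices as cache (L2ARC); cache vdevs are never mirrored
    zpool add pool0 cache c0t55CD2E414EC0FF43d0s1 c3t0d0s1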
On 8/23/2018 12:38 PM, Lee Damon wrote:
> These are 12TB SAS drives (Seagate ST12000NM0027) for data & hot
> spare. ZIL & L2ARC are 480GB INTEL SSDSC2KG48 SSDs. Everything is left
> at default for sector size, etc. They were basically prepared for the
> pool with a simple fdisk -B /dev/rdsk/drive.
> [...]
> nomad

From vab at bb-c.de  Thu Aug 23 17:00:18 2018
From: vab at bb-c.de (Volker A. Brandt)
Date: Thu, 23 Aug 2018 19:00:18 +0200
Subject: [OmniOS-discuss] pkg update broken on r151026 for lipkg branded NGZs
In-Reply-To: <23422.58870.730172.385552@shelob.bb-c.de>
References: <23422.58870.730172.385552@shelob.bb-c.de>
Message-ID: <23422.59426.339070.444420@shelob.bb-c.de>

> When I omit the "-r", things change:
>
> # zonename
> omnib0

Wrong cut&paste, the problem is in the GZ.


Thanks -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt            Consulting and Support for Solaris-based Systems
Brandt & Brandt Computer GmbH                      WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY         Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513       Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From bfriesen at simple.dallas.tx.us  Thu Aug 23 17:22:32 2018
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Thu, 23 Aug 2018 12:22:32 -0500 (CDT)
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

On Thu, 23 Aug 2018, Lee Damon wrote:

> These are 12TB SAS drives (Seagate ST12000NM0027) for data & hot spare. ZIL
> & L2ARC are 480GB INTEL SSDSC2KG48 SSDs. Everything is left at default for
> sector size, etc. They were basically prepared for the pool with a
> simple fdisk -B /dev/rdsk/drive.
The device c3t0d0 looks like it is overloaded or experiencing issues due to a high read/write load, and wsvc_t is very high.

> logs
>   mirror-1                   ONLINE       0     0     0
>     c0t55CD2E414EC0FF43d0s0  ONLINE       0     0     0
>     c3t0d0s0                 ONLINE       0     0     0
> cache
>   c0t55CD2E414EC0FF43d0s1    ONLINE       0     0     0
>   c3t0d0s1                   ONLINE       0     0     0

I am confused by the above. Does the trailing 's0' and 's1' indicate that partitions were used rather than whole disks for logs and cache, and so each SSD is providing both log and cache via partitions?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From bfriesen at simple.dallas.tx.us  Thu Aug 23 17:27:43 2018
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Thu, 23 Aug 2018 12:27:43 -0500 (CDT)
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

On Thu, 23 Aug 2018, Doug Hughes wrote:

> NFS writes (especially for lots of small files) to OmniOS *really* benefit
> from having the ZIL on those SSDs.
>
> You could remove the cache from the pool, carve off an 8GB chunk for ZIL on
> each and the rest for L2ARC if you want that. Then add a mirrored ZIL using
> the 8GB chunks and the other partition for L2ARC.
>
> An SSD ZIL helps with metadata update absorption and small-file writes that
> are synchronous over NFS. A lot. (That's my experience.)

It looks like that is what he did, but it looks like an error was made in that a spinning disk may have been added as a log drive (c0t55CD2E414EC0FF43d0s0) rather than an SSD as was intended.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From nomad at ee.washington.edu  Thu Aug 23 17:32:32 2018
From: nomad at ee.washington.edu (Lee Damon)
Date: Thu, 23 Aug 2018 10:32:32 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID: <27021750-ae20-bb20-2636-d40c563c1e1f@ee.washington.edu>

On 8/23/18 10:22 , Bob Friesenhahn wrote:
>> logs
>>   mirror-1                   ONLINE       0     0     0
>>     c0t55CD2E414EC0FF43d0s0  ONLINE       0     0     0
>>     c3t0d0s0                 ONLINE       0     0     0
>> cache
>>   c0t55CD2E414EC0FF43d0s1    ONLINE       0     0     0
>>   c3t0d0s1                   ONLINE       0     0     0
>
> I am confused by the above. Does the trailing 's0' and 's1' indicate
> that partitions were used rather than whole disks for logs and cache, and
> so each SSD is providing both log and cache via partitions?

Correct. Two 480GB SSDs split into two partitions. I have other pools configured the same way (on 151022) with no problems. I don't do that with spinning rust, mind you, just with SSDs.

nomad

From bfriesen at simple.dallas.tx.us  Thu Aug 23 17:34:18 2018
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Thu, 23 Aug 2018 12:34:18 -0500 (CDT)
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

Lee,

Just in case you did not see my follow-up post, it looks like there is an error in your pool configuration: a large spinning disk was added as a log device rather than an SSD as was intended. Luckily it should be possible to fix this without restarting the pool from scratch.
Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From bfriesen at simple.dallas.tx.us  Thu Aug 23 17:37:48 2018
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Thu, 23 Aug 2018 12:37:48 -0500 (CDT)
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

On Thu, 23 Aug 2018, Bob Friesenhahn wrote:

> Just in case you did not see my follow-up post, it looks like there is an
> error in your pool configuration: a large spinning disk was added as a
> log device rather than an SSD as was intended. Luckily it should be possible
> to fix this without restarting the pool from scratch.

Alas, it looks like I was wrong about this. It seems that the two SSDs are presented with much different device names.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From nomad at ee.washington.edu  Thu Aug 23 17:39:32 2018
From: nomad at ee.washington.edu (Lee Damon)
Date: Thu, 23 Aug 2018 10:39:32 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

Do you mean c0t55CD2E414EC0FF43d0?

It's an SSD. It just has a long name because it's in a hotswap sled instead of being inside the chassis.

    Hardware properties:
        name='devid' type=string items=1
            value='id1,sd@n55cd2e414ec0ff43'
        name='class' type=string items=1
            value='scsi'
        name='inquiry-revision-id' type=string items=1
            value='0121'
        name='inquiry-product-id' type=string items=1
            value='INTEL SSDSC2KG48'
        name='inquiry-vendor-id' type=string items=1
            value='ATA'
        name='inquiry-device-type' type=int items=1
            value=00000000
        name='pm-capable' type=int items=1
            value=00000001
        name='compatible' type=string items=4
            value='scsiclass,00.vATA.pINTEL_SSDSC2KG48.r0121' + 'scsiclass,00.vATA.pINTEL_SSDSC2KG48' + 'scsiclass,00' + 'scsiclass'
        name='client-guid' type=string items=1
            value='55cd2e414ec0ff43'

nomad

On Thu, Aug 23, 2018 at 10:35 AM Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:

> Lee,
>
> Just in case you did not see my follow-up post, it looks like there is
> an error in your pool configuration: a large spinning disk was
> added as a log device rather than an SSD as was intended. Luckily it
> should be possible to fix this without restarting the pool from
> scratch.
>
> Bob
> [...]

From vab at bb-c.de  Thu Aug 23 17:43:30 2018
From: vab at bb-c.de (Volker A. Brandt)
Date: Thu, 23 Aug 2018 19:43:30 +0200
Subject: [OmniOS-discuss] pkg update broken on r151026 for lipkg branded NGZs
In-Reply-To: <23422.58870.730172.385552@shelob.bb-c.de>
References: <23422.58870.730172.385552@shelob.bb-c.de>
Message-ID: <23422.62018.786614.396822@shelob.bb-c.de>

Hello all!

After some hours of frustration, I wrote:

> I have a very strange problem doing a pkg update on an r151026 system.
> This machine has 11 NGZs, all are lipkg brand.
[...]
> Effectively I cannot pkg update my system including the zones any more.
> I have previously updated this system without any problems.

After the mail, *another* reboot, and *another* test, and it works. With no changes whatsoever.
*sigh*


Regards -- Volker
--
------------------------------------------------------------------------
Volker A. Brandt            Consulting and Support for Solaris-based Systems
Brandt & Brandt Computer GmbH                      WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim, GERMANY         Email: vab at bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513       Schuhgröße: 46
Geschäftsführer: Rainer J.H. Brandt und Volker A. Brandt

"When logic and proportion have fallen sloppy dead"

From doug at will.to  Thu Aug 23 19:19:44 2018
From: doug at will.to (Doug Hughes)
Date: Thu, 23 Aug 2018 15:19:44 -0400
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

Out of curiosity, if you disable the ZIL through the evil ZFS tuning wiki mechanisms (diagnostic purposes only), does it dramatically help? If not, there's something else going on. If yes, it could be that the L2ARC and ZIL are interfering with each other. (I could imagine that the L2ARC is causing a lot of need for erasing of blocks on the SSD, which could be dramatically slowing things down. I haven't been following the implementation of TRIM support and any outstanding issues.)
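On current illumos the per-dataset 'sync' property has replaced the old zil_disable tunable, so the experiment would be roughly the sketch below. Diagnostic use only: while sync=disabled is set, synchronous NFS writes are acknowledged before they reach stable storage, so a server crash during the test can silently lose client data.

    zfs set sync=disabled pool0
    # ... rerun the NFS copy test from the client ...
    zfs set sync=standard pool0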
On 8/23/2018 1:39 PM, Lee Damon wrote:
> Do you mean c0t55CD2E414EC0FF43d0?
>
> It's an SSD. It just has a long name because it's in a hotswap sled
> instead of being inside the chassis.
> [...]

From nomad at ee.washington.edu  Thu Aug 23 23:43:50 2018
From: nomad at ee.washington.edu (Lee Damon)
Date: Thu, 23 Aug 2018 16:43:50 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

(I've just changed from digest to regular subscription as I see there are messages relevant to this that I haven't received yet...)

Doug, I'm not familiar with the evil ZFS tuning wiki mechanism. I'll have to see if Google can help me find it.

As for the ZIL + L2ARC on the same SSD potentially being the problem, clearly I can't say with 100% certainty that it is not a problem; however, I have a second host (running 151022) with _exactly_ the same configuration of hard drives + split SSDs, and NFS writes to that pool are fine.

hvfs2 is ~18 months old but the chrup0 pool is a few months old.

time cp -rp /misc/fs1test/004test /misc/hvfs2chru/omics1
real    3m11.431s
user    0m0.177s
sys     0m28.030s

time cp -rp /misc/fs1test/004test /misc/fs2test/omics1
real    21m13.412s
user    0m0.188s
sys     0m28.678s

nomad

From nomad at ee.washington.edu  Fri Aug 24 00:23:25 2018
From: nomad at ee.washington.edu (Lee Damon)
Date: Thu, 23 Aug 2018 17:23:25 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID: <17f1d58d-4a5d-e4d3-979e-3ef7014d9396@ee.washington.edu>

(This doesn't appear to have gone out so I'm re-sending. Apologies if it's a duplicate.)

On 8/23/18 16:43 , Lee Damon wrote:
> (I've just changed from digest to regular subscription as I see there
> are messages relevant to this that I haven't received yet...)
> [...]

From doug at will.to  Fri Aug 24 00:33:43 2018
From: doug at will.to (Doug Hughes)
Date: Thu, 23 Aug 2018 20:33:43 -0400
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

Evil tuning here: https://www.solaris-cookbook.eu/solaris/solaris-10-zfs-evil-tuning-guide/

It's at the bottom, where it says "Disabling the ZIL (Don't)".

I could see a lack of TRIM/erase support in the background as a strong possibility, caused by continuous use of blocks from the L2ARC over time. Are you getting a high hit rate on your L2ARC? http://blog.harschsystems.com/2010/09/08/arcstat-pl-updated-for-l2arc-statistics/

If not, you might think about just dropping it altogether. This, as old as it is, may not be accurate, but it doesn't give me high confidence that TRIM support was added to illumos. Maybe it was and somebody else can chime in: http://open-zfs.org/wiki/Features

zpool iostat -v may also be interesting for the l2arc/zil devices.
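If arcstat.pl isn't to hand, the same information is available from the raw ARC kstats; hits/(hits+misses) gives the L2ARC hit rate (counter names are from arcstats):

    kstat -p zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses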
On 8/23/2018 7:43 PM, Lee Damon wrote:
> (I've just changed from digest to regular subscription as I see there
> are messages relevant to this that I haven't received yet...)
> [...]

From richard.elling at richardelling.com  Fri Aug 24 03:17:13 2018
From: richard.elling at richardelling.com (Richard Elling)
Date: Thu, 23 Aug 2018 20:17:13 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To: <17f1d58d-4a5d-e4d3-979e-3ef7014d9396@ee.washington.edu>
References: <17f1d58d-4a5d-e4d3-979e-3ef7014d9396@ee.washington.edu>
Message-ID: <8F575E13-0CC8-46BB-8FF7-0E5DEF87210D@richardelling.com>

fwiw, nfssvrstat breaks down the NFS writes by sync, async, and commits: explicitly for determining how the workload will impact the ZIL. For writing many files, the (compound) operations can also include creates and sync-on-close, which also impact performance.
 -- richard

> On Aug 23, 2018, at 5:23 PM, Lee Damon wrote:
>
> (This doesn't appear to have gone out so I'm re-sending. Apologies if it's a duplicate.)
> [...]

From feigin at iis.ee.ethz.ch  Fri Aug 24 08:07:06 2018
From: feigin at iis.ee.ethz.ch (Adam Feigin)
Date: Fri, 24 Aug 2018 10:07:06 +0200
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

Hi Lee:

I've been experiencing something very similar. I recently (several months ago) moved a ~30T pool from an "old" OpenIndiana 151a9 system, where it had been working flawlessly for several years, to a "new" OmniOSce 151022 installation (zpool export on old, zpool import on new).

Now, I have extremely poor NFS write speeds on the new system. I've even swapped the cards (LSI SAS, 10G Ethernet) from the OI system to the OmniOS system, to eliminate some hardware discrepancies, but this had no effect whatsoever. It's not a network problem; I can happily get near line-rate on the 10G network between the server and various 10G-connected hosts. It's not a ZIL/L2ARC problem either; removing them (they're on SSDs, as yours) had minimal effect.

The new hardware is significantly more performant, with nearly 10x more memory (240G vs 32G), more cores and faster CPUs; I never expected performance to get worse.
I'm not convinced it's a "pure" NFS problem either, as I've noticed some other strange performance degradation on the new system. The pool used to take somewhere between 40 - 60 hours to run a scrub on the OI system. Recent scrubs were taking 400+ hours. After a recent pkg update and reboot, the last scrub took ~159 hours. During the scrub, I noticed that the scanning speed, while starting out relatively fast, pretty much monotonically decreased as time went on, going from 50 M/s near the beginning to 17 M/s at the end. I have to see what happens at the next monthly scrub of the pool.

Have you looked at your scrub performance?

What else is different between the 2 machines?

From tobi at oetiker.ch  Fri Aug 24 08:54:29 2018
From: tobi at oetiker.ch (Tobias Oetiker)
Date: Fri, 24 Aug 2018 10:54:29 +0200 (CEST)
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID: <1305271323.35866.1535100869773.JavaMail.zimbra@oetiker.ch>

Hi All,

Lee has opened an issue here: https://github.com/omniosorg/illumos-omnios/issues/256 -- it might be a good place to discuss this.

I have also posted a very simple test script there (not sure if it is enough to reproduce the problem, but it would at least give a common baseline as to what we are talking about).

cheers
tobi

----- On Aug 24, 2018, at 10:07 AM, Adam Feigin feigin at iis.ee.ethz.ch wrote:

> Hi Lee:
>
> I've been experiencing something very similar. I recently (several
> months ago) moved a ~30T pool from an "old" OpenIndiana 151a9 system,
> where it had been working flawlessly for several years, to a "new"
> OmniOSce 151022 installation (zpool export on old, zpool import on new).
> [...]
>
> Have you looked at your scrub performance?
>
> What else is different between the 2 machines?
--
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
www.oetiker.ch tobi at oetiker.ch +41 62 775 9902

From nomad at ee.washington.edu  Fri Aug 24 15:11:17 2018
From: nomad at ee.washington.edu (Lee Damon)
Date: Fri, 24 Aug 2018 08:11:17 -0700
Subject: [OmniOS-discuss] Slow NFS writes in 151026
In-Reply-To:
References:
Message-ID:

Adam, I'm having no problems at all with my 151022 hosts. They're all doing well for NFS reads & writes. I only see the degradation in write speed on the 151026 host I recently installed.

> Have you looked at your scrub performance?

I had bad scrub performance on a host that had a bad drive causing bus contention. That host hasn't scrubbed again since then, so I can't say if the problem is still there.

> What else is different between the 2 machines?

Age of hardware. The 151022 host is ~18 months old while the 151026 host is ~2 months old. The '26 host has never had anything but 151026 installed on it because I couldn't get the '22 installer to boot on it (I don't remember the details now, that was 2 months ago). The '26 host has 98GB RAM while the '22 host has 128GB. Other than that, the pools in question are the same in terms of drives, ZIL, and L2ARC type/config.

nomad

From Ergi.Thanasko at avsquad.com  Fri Aug 24 16:55:01 2018
From: Ergi.Thanasko at avsquad.com (Ergi Thanasko)
Date: Fri, 24 Aug 2018 16:55:01 +0000
Subject: [OmniOS-discuss] ARC or memory performance benchmarks
Message-ID: <0557B620-77A2-4373-A1A8-1888D7AC73A3@avsquad.com>

We are building a new box with a Skylake CPU at 3.6GHz and DDR4 RDIMMs at 2666MHz. We have been using iozone for multithreaded random-IO zpool testing and getting some awesome speed tests. What I really want to test is the RAM speed. Supermicro gave me some benchmarks for RAM at around 200GB/sec sustained bandwidth for 768G of RAM. I want to see how it compares with my other DDR3 boxes. I am having trouble finding utilities that will test the RAM speed on Solaris, OmniOS or OpenIndiana. Any help is appreciated.

From bfriesen at simple.dallas.tx.us  Fri Aug 24 20:40:34 2018
From: bfriesen at simple.dallas.tx.us (Bob Friesenhahn)
Date: Fri, 24 Aug 2018 15:40:34 -0500 (CDT)
Subject: [OmniOS-discuss] ARC or memory performance benchmarks
In-Reply-To: <0557B620-77A2-4373-A1A8-1888D7AC73A3@avsquad.com>
References: <0557B620-77A2-4373-A1A8-1888D7AC73A3@avsquad.com>
Message-ID:

On Fri, 24 Aug 2018, Ergi Thanasko wrote:

> We are building a new box with a Skylake CPU at 3.6GHz and DDR4 RDIMMs
> at 2666MHz. [...] I am having trouble finding utilities that will test
> the RAM speed on Solaris, OmniOS or OpenIndiana. Any help is appreciated.

The classic RAM speed benchmark is the 'stream' benchmark, which you can obtain from https://www.cs.virginia.edu/stream/
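Building it is a one-liner; something like the sketch below, assuming gcc is installed (on OmniOS, e.g. 'pkg install developer/gcc7'; the exact package name is release-dependent). STREAM_ARRAY_SIZE just needs to be several times larger than the combined caches; 80M doubles puts the three working arrays at about 1.9GB total.

    curl -O https://www.cs.virginia.edu/stream/FTP/Code/stream.c
    gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE=80000000 -DNTIMES=20 stream.c -o stream
    # run with one thread per core; the Triad figure is the usual headline number
    OMP_NUM_THREADS=16 ./stream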
Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

From Ergi.Thanasko at avsquad.com  Fri Aug 24 20:52:21 2018
From: Ergi.Thanasko at avsquad.com (Ergi Thanasko)
Date: Fri, 24 Aug 2018 20:52:21 +0000
Subject: [OmniOS-discuss] ARC or memory performance benchmarks
In-Reply-To:
References: <0557B620-77A2-4373-A1A8-1888D7AC73A3@avsquad.com>
Message-ID:

Thnx Bob. That is what SM used on Red Hat 7.3. Does anyone have a compiled version around that they feel like sharing?

On 8/24/18, 1:40 PM, "Bob Friesenhahn" wrote:

> The classic RAM speed benchmark is the 'stream' benchmark, which you
> can obtain from https://www.cs.virginia.edu/stream/
> [...]

From pkam at bloom.pl  Sat Aug 25 21:37:30 2018
From: pkam at bloom.pl (Piotr Kaminski)
Date: Sat, 25 Aug 2018 23:37:30 +0200
Subject: [OmniOS-discuss] CIFS access denied to some users from AD - again
Message-ID: <5759dce4-7fec-227a-2fb4-177503d7673a@bloom.pl>

Hi Everybody,

I would like to refresh my post sent around 3 months ago. The issue still persists...

What I've got is

* Ubuntu 16.04 with Samba 4 as AD DC
* OmniOSce CIFS server joined to the AD domain
* Windows 10 Pro joined to the AD domain
* and some more client computers joined

I do AD administration from Win10 with RSAT. I've created a lot of accounts for employees.

PROBLEM: Some users are denied access to OmniOSce shares while other users can connect without problems. I would like to stress: the issue is present only with OmniOS shares. Users ARE authorised thru the AD DC.

* There is an ACL rule for an "employees" AD group allowing access for the members,
* there are about 20 members and only a few of them have the problem,
* problematic accounts CAN connect to another Windows machine via RDP and are authorized by the AD DC (I even changed passwords to check and can still connect with the new passwords),
* problematic accounts cannot access the CIFS share from the OmniOSce server.

When I try to access the server from the Ubuntu machine I get the following with "good_user":

$ smbclient -U test26 -L //omnios
Enter test26's password:
Domain=[DOMAIN_NAME] OS=[SunOS 5.11 omnios-r151026-51c7d] Server=[Native SMB service]

        Sharename       Type      Comment
        ---------       ----      -------
        public          Disk
        c$              Disk      Default Share
        test1           Disk
        test2           Disk
        ipc$            IPC       Remote IPC
        test            Disk
Domain=[DOMAIN_NAME] OS=[SunOS 5.11 omnios-r151026-51c7d] Server=[Native SMB service]

        Server               Comment
        ---------            -------

        Workgroup            Master
        ---------            -------

and with "bad_user" I get

# smbclient -U bad_user -L //omnios
Enter bad_user's password:
session setup failed: NT_STATUS_ACCESS_DENIED

The same results are obtained from a Windows machine with the
"net view \\omnios" ? command * When I log in to Windows machine with "bad user" I can log in properly but "net view" command produces error 53. * When I log in to the same Windows machine with "good user", I can list shares with "net view" command. I cannot see any difference between the users. They are members of the same AD groups. They were created one by one. As a workaround I can disable problematic accounts, create new accounts and they work as a charm. But that is just a temporary? workaround. Can the issue be related to SID numbers? Maybe OmniOS does not like some of them? I have the following ID mappings on OmniOS: # idmap list add???? winuser:administrator at local.domain_name.net? unixuser:root add???? wingroup:administrators at local.domain_name.net??????? unixgroup:root add -d? winuser:*@local.domain_name.net????? unixuser:domain_name The issue drives me crazy. Any help or thoughts appreciated. Regards, -- Piotr -------------- next part -------------- An HTML attachment was scrubbed... URL: From nomad at ee.washington.edu Wed Aug 29 21:12:50 2018 From: nomad at ee.washington.edu (Lee Damon) Date: Wed, 29 Aug 2018 14:12:50 -0700 Subject: [OmniOS-discuss] Question about ndpd.conf in 151026 Message-ID: <12aed56e-4a66-787d-ccb5-3fed658b7ce1@ee.washington.edu> I have an /etc/inet/ndpd.conf file that has exactly two lines: ifdefault StatelessAddrConf false ifdefault StatefulAddrConf false On my test host running 151022 when I sudo ipadm create-addr -T addrconf aggr0/v6 ipadm show-if shows the interface with an fe80:: address and nothing else. However, when I do it on my 151026 host it gives both the fe80:: address and a fully routeable address based on the host's MAC address. This is not what I expect to see. I've tried with 'if aggr0' instead of 'ifdefault', same result. I've tried with duplicated lines for both ifdefault and if aggr0 and that just breaks things (so I know it's reading the file). I've also tried with just a StatelessAddrConf or StatefullAddrConf line, no change. I don't see any references to ndpd in the release notes for 151024 or 151026 so I'm presuming no changes were made that should have impacted this. Any suggestions of what I'm missing? thanks, nomad From chip at innovates.com Thu Aug 30 14:29:53 2018 From: chip at innovates.com (Schweiss, Chip) Date: Thu, 30 Aug 2018 09:29:53 -0500 Subject: [OmniOS-discuss] Panic on OmniOS CE r151022ay Message-ID: I've seen this panic twice now in the past couple weeks. Does anyone know if there is a patch already that fixes this? Looks like another xattr problem. 
thanks,
nomad

From chip at innovates.com  Thu Aug 30 14:29:53 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Thu, 30 Aug 2018 09:29:53 -0500
Subject: [OmniOS-discuss] Panic on OmniOS CE r151022ay
Message-ID:

I've seen this panic twice now in the past couple of weeks. Does anyone know if there is a patch already that fixes this? It looks like another xattr problem.

# fmdump -Vp -u b7c9840b-8bb1-cbbc-e165-a5b6fa34078b
TIME                           UUID                                 SUNW-MSG-ID
Aug 30 2018 08:29:32.089419000 b7c9840b-8bb1-cbbc-e165-a5b6fa34078b SUNOS-8000-KL

  TIME                 CLASS                                         ENA
  Aug 30 08:27:50.8299 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
        version = 0x0
        class = list.suspect
        uuid = b7c9840b-8bb1-cbbc-e165-a5b6fa34078b
        code = SUNOS-8000-KL
        diag-time = 1535635766 223254
        de = fmd:///module/software-diagnosis
        fault-list-sz = 0x1
        fault-list = (array of embedded nvlists)
        (start fault-list[0])
        nvlist version: 0
                version = 0x0
                class = defect.sunos.kernel.panic
                certainty = 0x64
                asru = sw:///:path=/var/crash//.b7c9840b-8bb1-cbbc-e165-a5b6fa34078b
                resource = sw:///:path=/var/crash//.b7c9840b-8bb1-cbbc-e165-a5b6fa34078b
                savecore-succcess = 0
                os-instance-uuid = b7c9840b-8bb1-cbbc-e165-a5b6fa34078b
                panicstr = BAD TRAP: type=d (#gp General protection) rp=ffffd001e9855360 addr=ffffd063784ee8d0
                panicstack = unix:real_mode_stop_cpu_stage2_end+b203 () | unix:trap+a70 () | unix:cmntrap+e6 () | zfs:zfs_getattr+1a0 () | genunix:fop_getattr+a8 () | genunix:xattr_dir_getattr+16c () | genunix:fop_getattr+a8 () | nfssrv:rfs4_delegated_getattr+20 () | nfssrv:acl3_getxattrdir+102 () | nfssrv:common_dispatch+5ab () | nfssrv:acl_dispatch+2d () | rpcmod:svc_getreq+1c1 () | rpcmod:svc_run+e0 () | rpcmod:svc_do_run+8e () | nfs:nfssys+111 () | unix:brand_sys_sysenter+1d3 () |
                crashtime = 1535633923
                panic-time = Thu Aug 30 07:58:43 2018 CDT
        (end fault-list[0])

        fault-status = 0x1
        severity = Major
        __ttl = 0x1
        __tod = 0x5b87f13c 0x5546cf8

Let me know what other information I can provide here.

-Chip

From chip at innovates.com  Thu Aug 30 14:42:15 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Thu, 30 Aug 2018 09:42:15 -0500
Subject: [OmniOS-discuss] Panic on OmniOS CE r151022ay
In-Reply-To:
References:
Message-ID:

Here's the dump from the panic: ftp://ftp.nrg.wustl.edu/pub/zfs/mirpool03-xattr-20180830-vmdump.1

On Thu, Aug 30, 2018 at 9:29 AM, Schweiss, Chip wrote:

> I've seen this panic twice now in the past couple of weeks. Does anyone
> know if there is a patch already that fixes this? It looks like another
> xattr problem.
> [...]
From omnios at citrus-it.net  Thu Aug 30 22:08:56 2018
From: omnios at citrus-it.net (Andy Fiddaman)
Date: Thu, 30 Aug 2018 22:08:56 +0000 (UTC)
Subject: [OmniOS-discuss] Panic on OmniOS CE r151022ay
In-Reply-To:
References:
Message-ID:

On Thu, 30 Aug 2018, Schweiss, Chip wrote:

; > panicstack = unix:real_mode_stop_cpu_stage2_end+b203 () |
; > unix:trap+a70 () | unix:cmntrap+e6 () | zfs:zfs_getattr+1a0 () |
; > genunix:fop_getattr+a8 () | genunix:xattr_dir_getattr+16c () |
; > genunix:fop_getattr+a8 () | nfssrv:rfs4_delegated_getattr+20 () |
; > nfssrv:acl3_getxattrdir+102 () | nfssrv:common_dispatch+5ab () |
; > nfssrv:acl_dispatch+2d () | rpcmod:svc_getreq+1c1 () | rpcmod:svc_run+e0 ()
; > | rpcmod:svc_do_run+8e () | nfs:nfssys+111 () | unix:brand_sys_sysenter+1d3

That does look quite similar to issue 8806, which was fixed earlier in the year. Can you check that the fix is in place on your box, since you're running a version of OmniOS from May?

If this produces any output, then the fix is missing; otherwise it's something else:

mdb -ke xattr_dir_inactive::dis | grep mutex

Please can you open an issue for this at https://github.com/omniosorg/illumos-omnios/issues/new in the first instance, as it may be OmniOS-specific?
Andy

--
Citrus IT Limited | +44 (0)333 0124 007 | enquiries at citrus-it.co.uk
Rock House Farm | Green Moor | Wortley | Sheffield | S35 7DQ
Registered in England and Wales | Company number 4899123

From chip at innovates.com  Fri Aug 31 13:09:09 2018
From: chip at innovates.com (Schweiss, Chip)
Date: Fri, 31 Aug 2018 08:09:09 -0500
Subject: [OmniOS-discuss] Panic on OmniOS CE r151022ay
In-Reply-To:
References:
Message-ID:

Looks like the fix is missing:

# mdb -ke xattr_dir_inactive::dis | grep mutex
xattr_dir_inactive+0x1f:        call   -0x304cf4
xattr_dir_inactive+0x3c:        call   -0x304bf1
xattr_dir_inactive+0x73:        call   -0x304c28

Looking closer: I thought I had updated this system after the first crash, but did not. However, I had explicitly put that patch in place back in January, and it may not have made it into later OmniOS CE releases that the system was upgraded to.

I just ran the test on an r151022bk system and it passes.

I'll get this system updated ASAP.

Thanks!
-Chip

On Thu, Aug 30, 2018 at 5:08 PM, Andy Fiddaman wrote:

> On Thu, 30 Aug 2018, Schweiss, Chip wrote:
>
> ; > panicstack = unix:real_mode_stop_cpu_stage2_end+b203 () |
> [...]
>
> That does look quite similar to issue 8806, which was fixed earlier in
> the year. Can you check that the fix is in place on your box, since
> you're running a version of OmniOS from May?
> [...]