From steve at linuxsuite.org Tue Apr 1 14:17:40 2014
From: steve at linuxsuite.org (steve at linuxsuite.org)
Date: Tue, 1 Apr 2014 10:17:40 -0400
Subject: [OmniOS-discuss] How to disable ata module / driver at boot
In-Reply-To: <201403312331.s2VNVOIW011926@elvis.arl.psu.edu>
References: <7409d33d8efc08eccda1cecdc31bd7ea.squirrel@emailmg.netfirms.com> <201403312331.s2VNVOIW011926@elvis.arl.psu.edu>
Message-ID: <6cda07987dc35bb6735ccd08af13f165.squirrel@emailmg.netfirms.com>

> In message
> <7409d33d8efc08eccda1cecdc31bd7ea.squirrel at emailmg.netfirms.com>,
> steve at linuxsuite.org writes:
>> May not be related, but I would like to reboot so that OmniOS does not
>> see the device by not loading the driver / module. I do not need the
>> device after system install.
>
> disable-ata=true
>

Thanks. Is there an entry that can be put into /etc/system that
will prevent the module from loading also?

-steve

> John
> groenveld at acm.org
>

From jdg117 at elvis.arl.psu.edu Tue Apr 1 15:18:07 2014
From: jdg117 at elvis.arl.psu.edu (John D Groenveld)
Date: Tue, 01 Apr 2014 11:18:07 -0400
Subject: [OmniOS-discuss] How to disable ata module / driver at boot
In-Reply-To: Your message of "Tue, 01 Apr 2014 10:17:40 EDT." <6cda07987dc35bb6735ccd08af13f165.squirrel@emailmg.netfirms.com>
References: <7409d33d8efc08eccda1cecdc31bd7ea.squirrel@emailmg.netfirms.com> <201403312331.s2VNVOIW011926@elvis.arl.psu.edu> <6cda07987dc35bb6735ccd08af13f165.squirrel@emailmg.netfirms.com>
Message-ID: <201404011518.s31FI7LJ021915@elvis.arl.psu.edu>

In message <6cda07987dc35bb6735ccd08af13f165.squirrel at emailmg.netfirms.com>,
steve at linuxsuite.org writes:
> Thanks. Is there an entry that can be put into /etc/system that
> will prevent the module from loading also?

exclude: ata

John
groenveld at acm.org

From groups at tierarzt-mueller.de Wed Apr 2 11:47:52 2014
From: groups at tierarzt-mueller.de (Alexander Lesle)
Date: Wed, 2 Apr 2014 13:47:52 +0200
Subject: [OmniOS-discuss] Bad kernel fault at addr=0xffffff04f8e08b88
Message-ID: <8210579099.20140402134752@tierarzt-mueller.de>

Hello All

I have had a kernel panic and don't know what happened.

Message on console:
Apr 2 12:19:42 aio fmd: [ID 377184 daemon.error] SUNW-MSG-ID: SUNOS-8000-KL, TYPE: Defect, VER: 1, SEVERITY: Major
Apr 2 12:19:42 aio EVENT-TIME: Mi. Apr 2 12:19:42 CEST 2014
Apr 2 12:19:42 aio PLATFORM: VMware-Virtual-Platform, CSN: VMware-56-4d-8a-b3-c5-36-3b-b8-27-ef-49-0b-c8-94-81-50, HOSTNAME: aio
Apr 2 12:19:42 aio SOURCE: software-diagnosis, REV: 0.1
Apr 2 12:19:42 aio EVENT-ID: 1630fc26-9694-e811-803c-956e16302b39
Apr 2 12:19:42 aio DESC: The system has rebooted after a kernel panic. Refer to http://illumos.org/msg/SUNOS-8000-KL for more information.
Apr 2 12:19:42 aio AUTO-RESPONSE: The failed system image was dumped to the dump device. If savecore is enabled (see dumpadm(1M)) a copy of the dump will be written to the savecore directory /var/crash/unknown.
Apr 2 12:19:42 aio IMPACT: There may be some performance impact while the panic is copied to the savecore directory. Disk space usage by panics can be substantial.
Apr 2 12:19:42 aio REC-ACTION: If savecore is not enabled then please take steps to preserve the crash image.
Apr 2 12:19:42 aio Use 'fmdump -Vp -u 1630fc26-9694-e811-803c-956e16302b39' to view more panic detail. Please refer to the knowledge article for addi

But what is the defect?

Apr 2 12:19:44 aio ^Mpanic[cpu1]/thread=ffffff04ebd5b840:
Apr 2 12:19:44 aio genunix: [ID 683410 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=ffffff001f591630 addr=ffffff04f8e08b88
Apr 2 12:19:44 aio unix: [ID 100000 kern.notice]
Apr 2 12:19:44 aio unix: [ID 839527 kern.notice] nc:
Apr 2 12:19:44 aio unix: [ID 753105 kern.notice] #pf Page fault
Apr 2 12:19:44 aio unix: [ID 532287 kern.notice] Bad kernel fault at addr=0xffffff04f8e08b88
Apr 2 12:19:44 aio unix: [ID 243837 kern.notice] pid=10842, pc=0xfffffffffbb34880, sp=0xffffff001f591720, eflags=0x10282
Apr 2 12:19:44 aio unix: [ID 211416 kern.notice] cr0: 8005003b cr4: 406b8
Apr 2 12:19:44 aio unix: [ID 624947 kern.notice] cr2: ffffff04f8e08b88
Apr 2 12:19:44 aio unix: [ID 625075 kern.notice] cr3: 436a09000
Apr 2 12:19:44 aio unix: [ID 625715 kern.notice] cr8: c
Apr 2 12:19:44 aio unix: [ID 100000 kern.notice]
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] rdi: ffffff001f5917b8 rsi: ffffff04f8e08b88 rdx: 80bd000
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] rcx: 0 r8: 0 r9: 2
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] rax: ffffffff rbx: ffffff04edba0408 rbp: ffffff001f591730
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] r10: fffffffffbcf3500 r11: 0 r12: ffffff001f5917b8
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] r13: ffffff04f8e08ba8 r14: ffffff04f8e08b88 r15: 20
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] fsb: 0 gsb: ffffff04ea0e1580 ds: 4b
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] es: 4b fs: 0 gs: 1c3
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] trp: e err: 0 rip: fffffffffbb34880
Apr 2 12:19:44 aio unix: [ID 592667 kern.notice] cs: 30 rfl: 10282 rsp: ffffff001f591720
Apr 2 12:19:44 aio unix: [ID 266532 kern.notice] ss: 38
Apr 2 12:19:44 aio unix: [ID 100000 kern.notice]

What was I doing? I was running a replicate job (napp-it) to back up
a FS from aio_server to backup_server.

Any hints?

--
Best Regards
Alexander
April, 02 2014
-------------- next part --------------
A non-text attachment was scrubbed...
Name: kernelcrash.PNG
Type: image/png
Size: 141570 bytes
Desc: not available
URL:
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: messages.txt
URL:

From ben at fluffy.co.uk Wed Apr 2 12:01:04 2014
From: ben at fluffy.co.uk (Ben Summers)
Date: Wed, 2 Apr 2014 13:01:04 +0100
Subject: [OmniOS-discuss] Bad kernel fault at addr=0xffffff04f8e08b88
In-Reply-To: <8210579099.20140402134752@tierarzt-mueller.de>
References: <8210579099.20140402134752@tierarzt-mueller.de>
Message-ID: <204F8923-878C-4520-869D-07951FAFE6EB@fluffy.co.uk>

Alexander

I note this is a VMware VM. If you install VMware tools, you will get crashes when you power off the VM in some versions of VMware.

Which VMware are you using? And can you try it without the "VMware Host-Guest Filesystem" and "vmblock" features?

Ben

--
http://bens.me.uk

From dswartz at druber.com Wed Apr 2 12:13:30 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Wed, 02 Apr 2014 08:13:30 -0400
Subject: [OmniOS-discuss] Bad kernel fault at addr=0xffffff04f8e08b88
Message-ID:

I have had this with current omnios and esxi 5.1 but not guest vs. Haven't tried with esxi 5.5 yet.

From groups at tierarzt-mueller.de Wed Apr 2 14:46:50 2014
From: groups at tierarzt-mueller.de (Alexander Lesle)
Date: Wed, 2 Apr 2014 16:46:50 +0200
Subject: [OmniOS-discuss] Bad kernel fault at addr=0xffffff04f8e08b88
In-Reply-To: <204F8923-878C-4520-869D-07951FAFE6EB@fluffy.co.uk>
References: <8210579099.20140402134752@tierarzt-mueller.de> <204F8923-878C-4520-869D-07951FAFE6EB@fluffy.co.uk>
Message-ID: <1116464220.20140402164650@tierarzt-mueller.de>

Hello Ben Summers and List,

On April, 02 2014, 14:01 wrote in [1]:

> I note this is a VMware VM. If you install VMware tools, you will
> get crashes when you power off the VM in some versions of VMware.

> Which VMware are you using?

Yes, OmniOS is on ESXi 5.5 with 2 vmxnet3 vNICs.
But I didn't power off. I was copying a snapshot from the VM to my
standalone OmniOS backup server (napp-it/Job/replicate).

> And can you try it without the "VMware
> Host-Guest Filesystem" and "vmblock" features?

I don't understand what I have to do. Can you explain it for me, please?

--
Best Regards
Alexander
April, 02 2014
........
[1] mid:204F8923-878C-4520-869D-07951FAFE6EB at fluffy.co.uk
........

From groups at tierarzt-mueller.de Wed Apr 2 14:54:36 2014
From: groups at tierarzt-mueller.de (Alexander Lesle)
Date: Wed, 2 Apr 2014 16:54:36 +0200
Subject: [OmniOS-discuss] Bad kernel fault at addr=0xffffff04f8e08b88
In-Reply-To:
References:
Message-ID: <1877857159.20140402165436@tierarzt-mueller.de>

Hello Ben, Dan and List,

I forgot to send the output from fmdump:

root@aio:~# fmdump -Vp -u 1630fc26-9694-e811-803c-956e16302b39
TIME                           UUID                                 SUNW-MSG-ID
Apr 02 2014 12:19:42.085291000 1630fc26-9694-e811-803c-956e16302b39 SUNOS-8000-KL

  TIME                 CLASS                                 ENA
  Apr 02 12:19:42.0809 ireport.os.sunos.panic.dump_available 0x0000000000000000
  Apr 02 12:20:24.9371 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000

nvlist version: 0
        version = 0x0
        class = list.suspect
        uuid = 1630fc26-9694-e811-803c-956e16302b39
        code = SUNOS-8000-KL
        diag-time = 1396433982 81733
        de = fmd:///module/software-diagnosis
        fault-list-sz = 0x1
        fault-list = (array of embedded nvlists)
        (start fault-list[0])
        nvlist version: 0
                version = 0x0
                class = defect.sunos.kernel.panic
                certainty = 0x64
                asru = sw:///:path=/var/crash/unknown/.1630fc26-9694-e811-803c-956e16302b39
                resource = sw:///:path=/var/crash/unknown/.1630fc26-9694-e811-803c-956e16302b39
                savecore-succcess = 1
                dump-dir = /var/crash/unknown
                dump-files = vmdump.1
                os-instance-uuid = 1630fc26-9694-e811-803c-956e16302b39
                panicstr = BAD TRAP: type=e (#pf Page fault) rp=ffffff001f591630 addr=ffffff04f8e08b88
                panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+db3 () | unix:cmntrap+e6 () | genunix:as_segcompar+10 () | genunix:avl_find+72 () | genunix:as_segat+3d
() | genunix:as_fault+27a () | unix:pagefault+96 () | unix:trap+d23 () | unix:cmntrap+e6 () | unix:bcopy_altentry+55a () | genunix:uiomove+f8 () | fifofs:fifo_read+192 () | genunix:fop_read+5b () | genunix:read+2a7 () | genunix:read32+1e () | unix:brand_sys_sysenter+1c9 () |
                crashtime = 1396433985
                panic-time = 2. April 2014 12:19:45 CEST CEST
        (end fault-list[0])

        fault-status = 0x1
        severity = Major
        __ttl = 0x1
        __tod = 0x533be43e 0x5156ff8

root@aio:~#

I hope it helps you to help me. :-)

--
Best Regards
Alexander
April, 02 2014
........
[1] mid:hl2pasmm82socvki2ad8vcs0.1396440810905 at email.android.com
........

From dswartz at druber.com Fri Apr 4 19:58:05 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Fri, 4 Apr 2014 15:58:05 -0400
Subject: [OmniOS-discuss] Installing local packages?
In-Reply-To: <1393535433.707.6.camel@exilis.si-consulting.us>
References: <1393535433.707.6.camel@exilis.si-consulting.us>
Message-ID: <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com>

Okay, this has got to be something incredibly obvious and stupid, but I've
been looking at it for an hour and am stumped. I wanted to try a
pacemaker active/passive cluster using omnios. I found saso's guide on
zfs-create.blogspot.com. So I downloaded, bunzipped and untarred his
archive. He says to install the prebuilt packages in the
prebuilt_packages subdir using the pkgadd command. No matter what I do, I
can't get this to work. The four packages are all gzipped, but after
copying them to /var/spool/pkg, whether I gunzip them or not, I get:

pkgadd: ERROR: no packages were found in

Yet:

root@vsa3:/var/spool/pkg# ls -l
total 65461
-rw-r--r-- 1 root root 17152000 Apr 4 15:56 CNCclusterglue.pkg
-rw-r--r-- 1 root root 13154816 Apr 4 15:56 CNCheartbeat.pkg
-rw-r--r-- 1 root root 39056896 Apr 4 15:56 CNCpacemaker.pkg
-rw-r--r-- 1 root root 1488896 Apr 4 15:56 CNCrsrcagents.pkg

FWIW, if I try 'pkgadd -d .' while in the prebuild_packages subdir, I get
the same message, only referring to that subdir. Google has been 1000%
useless for resolving this. Any tips would be much appreciated. Thanks!

From danmcd at omniti.com Fri Apr 4 20:07:03 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Fri, 4 Apr 2014 16:07:03 -0400
Subject: [OmniOS-discuss] Installing local packages?
In-Reply-To: <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com>
References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com>
Message-ID: <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com>

On Apr 4, 2014, at 3:58 PM, Dan Swartzendruber wrote:

>
> Okay, this has got to be something incredibly obvious and stupid, but I've
> been looking at it for an hour and am stumped. I wanted to try a
> pacemaker active/passive cluster using omnios. I found saso's guide on
> zfs-create.blogspot.com. So I downloaded, bunzipped and untarred his
> archive. He says to install the prebuild packages in the
> prebuilt_packages subdir using the pkgadd command. No matter what I do, I
> can't get this to work. The four packages are all gzipped, but after
> copying them to /var/spool/pkg, whether I gunzip them or not, I get:
>
> pkgadd: ERROR: no packages were found in
>
> Yet:
>
> root@vsa3:/var/spool/pkg# ls -l
> total 65461
> -rw-r--r-- 1 root root 17152000 Apr 4 15:56 CNCclusterglue.pkg
> -rw-r--r-- 1 root root 13154816 Apr 4 15:56 CNCheartbeat.pkg
> -rw-r--r-- 1 root root 39056896 Apr 4 15:56 CNCpacemaker.pkg
> -rw-r--r-- 1 root root 1488896 Apr 4 15:56 CNCrsrcagents.pkg
>
> FWIW, if I try 'pkgadd -d .' while in the prebuild_packages subdir, I get
> the same message, only referring to that subdir.

pkgadd -d CNCclusterglue.pkg

That should give you whatever's in that .pkg file. Repeat with the other .pkg files.

Dan

From dswartz at druber.com Fri Apr 4 20:14:19 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Fri, 4 Apr 2014 16:14:19 -0400
Subject: [OmniOS-discuss] Installing local packages?
In-Reply-To: <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com>
References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com>
Message-ID:

> the same message, only referring to that subdir.
>
> pkgadd -d CNCclusterglue.pkg
>
> That should give you whatever's in that .pkg file. Repeat with the other
> .pkg files.

Ah, okay, that's got it, thanks! Kinda puzzled by the manpage, which seems
to be telling me if I do 'pkgadd' with no arguments, it will serve up any
packages in /var/spool/pkg and if I give '-d SOMEDIR', it will do so for
'SOMEDIR'. Hmmm...

From esproul at omniti.com Fri Apr 4 20:58:15 2014
From: esproul at omniti.com (Eric Sproul)
Date: Fri, 4 Apr 2014 16:58:15 -0400
Subject: [OmniOS-discuss] Installing local packages?
In-Reply-To:
References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com>
Message-ID:

On Fri, Apr 4, 2014 at 4:14 PM, Dan Swartzendruber wrote:
> Ah, okay, that's got it, thanks! Kinda puzzle at the manpage which seems
> to be telling me if I do 'pkgadd' with no arguments, it will serve up any
> packages in /var/spool/pkg and if I give '-d SOMEDIR', it will do so for
> 'SOMEDIR'. Hmmm...

These are SVR4 packages, which can exist either as a "datastream"
(single-file archive) or as a "file system", which is a directory
layout. In SVR4 parlance, -d means "device", which could be a file,
directory or any other block or character device. The man page of
pkgtrans(1) has gory details if you're morbidly curious.

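As a rough illustration of the datastream versus directory distinction described above (a sketch, not something posted in the thread): a datastream package file normally begins with the marker "# PaCkAgE DaTaStReAm", while the directory form holds one subdirectory per package, each with a pkginfo file. The function name pkg_format is made up for this example, and the commented commands at the end assume standard SVR4 pkgadd/pkgtrans usage.

```shell
#!/bin/sh
# pkg_format PATH  (hypothetical helper, not part of any SVR4 tool)
# Prints "datastream" for a single-file package archive, "directory" for a
# spool directory holding at least one unpacked package, else "unknown".
pkg_format() {
    if [ -d "$1" ]; then
        # Directory format: one subdirectory per package, each with a pkginfo file.
        for f in "$1"/*/pkginfo; do
            if [ -f "$f" ]; then
                echo directory
                return 0
            fi
        done
        echo unknown
    elif [ -f "$1" ] && \
         [ "$(dd if="$1" bs=20 count=1 2>/dev/null)" = "# PaCkAgE DaTaStReAm" ]; then
        # Datastream format: the file starts with this 20-byte marker.
        echo datastream
    else
        echo unknown
    fi
}

# Report the format of every path given on the command line.
for p in "$@"; do
    printf '%s: %s\n' "$p" "$(pkg_format "$p")"
done

# Assumed standard usage, shown as comments because it needs root and real packages:
#   pkgadd -d ./CNCclusterglue.pkg                      # install from a datastream file
#   pkgtrans ./CNCclusterglue.pkg /var/spool/pkg all    # unpack it into directory format
#   pkgadd -d /var/spool/pkg                            # the spool now holds a package
```

If that reading is right, a directory that only contains .pkg datastream files would report "unknown" here, which would match the "no packages were found" message from 'pkgadd -d .'.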
Eric

From dswartz at druber.com Fri Apr 4 21:05:46 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Fri, 4 Apr 2014 17:05:46 -0400
Subject: [OmniOS-discuss] Installing local packages?
In-Reply-To:
References: <1393535433.707.6.camel@exilis.si-consulting.us> <3b67171d9cc159088bcb71bc9862f73a.squirrel@webmail.druber.com> <27167B29-1E7A-4DE2-B4A8-BBA3A3FF09AF@omniti.com>
Message-ID:

> On Fri, Apr 4, 2014 at 4:14 PM, Dan Swartzendruber wrote:
>> Ah, okay, that's got it, thanks! Kinda puzzle at the manpage which
>> seems to be telling me if I do 'pkgadd' with no arguments, it will
>> serve up any packages in /var/spool/pkg and if I give '-d SOMEDIR',
>> it will do so for 'SOMEDIR'. Hmmm...
>
> These are SVR4 packages, which can exist either as a "datastream"
> (single-file archive) or as a "file system", which is a directory
> layout. In SVR4 parlance, -d means "device" which could be a file,
> directory or any other block or character device.

Yeah, I get that. What I don't get is the manpage telling me 'pkgadd -d
/foo' will install any packages in the directory '/foo', but it doesn't :(
Either I am stupid or the wording in the manpage is confusing. At any
rate, it worked finally :)

From johan.kragsterman at capvert.se Mon Apr 7 09:19:05 2014
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Mon, 7 Apr 2014 11:19:05 +0200
Subject: [OmniOS-discuss] crash
Message-ID:

Hej!

Got a crash here that I would like someone to have a look at.

Hardware is a Dell T5500 workstation with dual Xeon L5520 and 36 GB RAM,
OS/rpool on an Intel SLC SSD, "mainpool" on mirrored Seagate
ST4000VN000 (new) with a Samsung 840 EVO SSD (new) as L2ARC. bge0 on the
mo'bo' is disabled, and a quad Intel gigabit NIC provides the working
interfaces.

I run a single KVM VM, edubuntu 13.10, on the machine.

The crash came when I built a new chroot environment for the ltsp thin client system.

I give you the info about the crash and what I've done to get it visible here:

OmniOS 5.11 omnios-6de5e81 2013.11.27
OmniOS v11 r151008

root@omni:/var/crash/unknown# ls
bounds unix.0 vmcore.0 vmdump.0

root@omni:/var/crash/unknown# mdb -k unix.0 vmcore.0
Loading modules: [ unix genunix specfs dtrace mac cpu.generic uppc pcplusmp scsi_vhci zfs sata sd ip hook neti sockfs arp usba uhci stmf stmf_sbd md lofs random idm nfs crypto ptm kvm cpc smbsrv ufs logindmux nsmb ]
> ::status
debugging crash dump vmcore.0 (64-bit) from omni
operating system: 5.11 omnios-6de5e81 (i86pc)
image uuid: a5e10116-5ed1-68ce-eba1-86f6ade3d5f5
panic message: I/O to pool 'mainpool' appears to be hung.
dump content: kernel pages only
> ::stack
vpanic()
vdev_deadman+0x10b(ffffff0a277f0540)
vdev_deadman+0x4a(ffffff0a1eea6040)
vdev_deadman+0x4a(ffffff0a1dfea580)
spa_deadman+0xad(ffffff0a1cd8a580)
cyclic_softint+0xf3(fffffffffbc30d20, 0)
cbe_low_level+0x14()
av_dispatch_softvect+0x78(2)
dispatch_softint+0x39(0, 0)
switch_sp_and_call+0x13()
dosoftint+0x44(ffffff0045805a50)
do_interrupt+0xba(ffffff0045805a50, 1)
_interrupt+0xba()
acpi_cpu_cstate+0x11b(ffffff0a1ce9e670)
cpu_acpi_idle+0x8d()
cpu_idle_adaptive+0x13()
idle+0xa7()
thread_start+8()
> ::msgbuf
MESSAGE
NOTICE: vnic1001 link down
NOTICE: e1000g3 link up, 1000 Mbps, full duplex
NOTICE: vnic1001 link up, 1000 Mbps, unknown duplex
unhandled rdmsr: 0xff31ca8c
unhandled wrmsr: 0x526849 data 8
unhandled rdmsr: 0xff31ca8c
unhandled wrmsr: 0x526849 data 8
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
unhandled wrmsr: 0x0 data 0
vcpu 1 received sipi with vector # 10
kvm_lapic_reset: vcpu=ffffff0a36c8e000, id=1, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 2 received sipi with vector # 10
kvm_lapic_reset: vcpu=ffffff0a36c86000, id=2, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 3 received sipi with vector # 10
kvm_lapic_reset: vcpu=ffffff0a36c7e000, id=3, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 4 received sipi with vector # 10
vcpu 7 received sipi with vector # 10
vcpu 6 received sipi with vector # 10
kvm_lapic_reset: vcpu=ffffff0a36cbe000, id=7, base_msr= fee00800 PRIx64 base_address=fee00000
kvm_lapic_reset: vcpu=ffffff0a36cc6000, id=6, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 5 received sipi with vector # 10
kvm_lapic_reset: vcpu=ffffff0a36c76000, id=4, base_msr= fee00800 PRIx64 base_address=fee00000
kvm_lapic_reset: vcpu=ffffff0a36cce000, id=5, base_msr= fee00800 PRIx64 base_address=fee00000
unhandled wrmsr: 0x0 data 0
vcpu 1 received sipi with vector # 98
kvm_lapic_reset: vcpu=ffffff0a36c8e000, id=1, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 2 received sipi with vector # 98
kvm_lapic_reset: vcpu=ffffff0a36c86000, id=2, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 3 received sipi with vector # 98
kvm_lapic_reset: vcpu=ffffff0a36c7e000, id=3, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 4 received sipi with vector # 98
kvm_lapic_reset: vcpu=ffffff0a36c76000, id=4, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 5 received sipi with vector # 98
kvm_lapic_reset: vcpu=ffffff0a36cce000, id=5, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 6 received sipi with vector # 98
kvm_lapic_reset: vcpu=ffffff0a36cc6000, id=6, base_msr= fee00800 PRIx64 base_address=fee00000
vcpu 7 received sipi with vector # 98
kvm_lapic_reset: vcpu=ffffff0a36cbe000, id=7, base_msr= fee00800 PRIx64 base_address=fee00000
unhandled rdmsr: 0xfe89f030
unhandled wrmsr: 0x525f43 data 2000000001
unhandled rdmsr: 0xfe89f030
unhandled wrmsr: 0x525f43 data 2000000001
unhandled rdmsr: 0xfe89f030 unhandled wrmsr: 0x525f43 data 2000000001 unhandled rdmsr: 0xfe89f030 unhandled wrmsr: 0x525f43 data 2000000001 unhandled rdmsr: 0xfe89f030 unhandled wrmsr: 0x525f43 data 2000000001 unhandled rdmsr: 0xfe89f030 unhandled wrmsr: 0x525f43 data 2000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 unhandled rdmsr: 0xff31ca8c unhandled wrmsr: 0x525f43 data 10000000001 NOTICE: e1000g3 link down NOTICE: vnic1001 link down NOTICE: e1000g3 link up, 100 Mbps, full duplex NOTICE: vnic1001 link up, 100 Mbps, unknown duplex WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5a545088 timed out WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5dc38160 timed out WARNING: ahci0: watchdog port 0 satapkt 0xffffff0a5dc642e0 timed out WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a57020388 timed out WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a57020388 timed out WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a57020388 timed out WARNING: ahci0: watchdog port 0 satapkt 0xffffff0a5fe32b90 timed out WARNING: ahci0: watchdog port 0 satapkt 0xffffff0a5fe32b90 timed out WARNING: ahci0: watchdog port 
0 satapkt 0xffffff0a5fe32b90 timed out
WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5fe32b90 timed out
WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5fe32b90 timed out
WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5fe32b90 timed out
WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5fe32b90 timed out
WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5fe32b90 timed out
WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5fe32b90 timed out
NOTICE: SUNW-MSG-ID: SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major

panic[cpu0]/thread=ffffff00458cbc40:
I/O to pool 'mainpool' appears to be hung.

ffffff00458cba20 zfs:vdev_deadman+10b ()
ffffff00458cba70 zfs:vdev_deadman+4a ()
ffffff00458cbac0 zfs:vdev_deadman+4a ()
ffffff00458cbaf0 zfs:spa_deadman+ad ()
ffffff00458cbb90 genunix:cyclic_softint+f3 ()
ffffff00458cbba0 unix:cbe_low_level+14 ()
ffffff00458cbbf0 unix:av_dispatch_softvect+78 ()
ffffff00458cbc20 unix:dispatch_softint+39 ()
ffffff00458059a0 unix:switch_sp_and_call+13 ()
ffffff00458059e0 unix:dosoftint+44 ()
ffffff0045805a40 unix:do_interrupt+ba ()
ffffff0045805a50 unix:cmnint+ba ()
ffffff0045805bc0 unix:acpi_cpu_cstate+11b ()
ffffff0045805bf0 unix:cpu_acpi_idle+8d ()
ffffff0045805c00 unix:cpu_idle_adaptive+13 ()
ffffff0045805c20 unix:idle+a7 ()
ffffff0045805c30 unix:thread_start+8 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
NOTICE: ahci0: ahci_tran_reset_dport port 0 reset port

Would be nice to get some info about this from someone that got some
more clues than I got...

Best regards from/Med vänliga hälsningar från

Johan Kragsterman

Capvert

From skiselkov.ml at gmail.com Mon Apr 7 09:37:50 2014
From: skiselkov.ml at gmail.com (Saso Kiselkov)
Date: Mon, 07 Apr 2014 11:37:50 +0200
Subject: [OmniOS-discuss] crash
In-Reply-To:
References:
Message-ID: <534271EE.70903@gmail.com>

On 4/7/14, 11:19 AM, Johan Kragsterman wrote:
>
> Hej!
>
>
> Got a crash here, that I would like someone have a look at.
> > [..snip..] > >> ::stack > vpanic() > vdev_deadman+0x10b(ffffff0a277f0540) > vdev_deadman+0x4a(ffffff0a1eea6040) > vdev_deadman+0x4a(ffffff0a1dfea580) > spa_deadman+0xad(ffffff0a1cd8a580) > cyclic_softint+0xf3(fffffffffbc30d20, 0) > cbe_low_level+0x14() > av_dispatch_softvect+0x78(2) > dispatch_softint+0x39(0, 0) > switch_sp_and_call+0x13() > dosoftint+0x44(ffffff0045805a50) > do_interrupt+0xba(ffffff0045805a50, 1) > _interrupt+0xba() > acpi_cpu_cstate+0x11b(ffffff0a1ce9e670) > cpu_acpi_idle+0x8d() > cpu_idle_adaptive+0x13() > idle+0xa7() > thread_start+8() > [..snip..] > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5a545088 timed out > > WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5dc38160 timed out > > WARNING: ahci0: watchdog port 0 satapkt 0xffffff0a5dc642e0 timed out > > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out > > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out > > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out > > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a57020388 timed out > > WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a57020388 timed out > > WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a57020388 timed out > > WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a57020388 timed out > > WARNING: ahci0: watchdog port 0 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 0 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 0 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 1 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5fe32b90 timed out > > WARNING: ahci0: watchdog port 2 satapkt 0xffffff0a5fe32b90 timed out > > NOTICE: SUNW-MSG-ID: 
SUNOS-8000-0G, TYPE: Error, VER: 1, SEVERITY: Major
>
>
> panic[cpu0]/thread=ffffff00458cbc40:
> I/O to pool 'mainpool' appears to be hung.
>
>
> ffffff00458cba20 zfs:vdev_deadman+10b ()
> ffffff00458cba70 zfs:vdev_deadman+4a ()
> ffffff00458cbac0 zfs:vdev_deadman+4a ()
> ffffff00458cbaf0 zfs:spa_deadman+ad ()
> ffffff00458cbb90 genunix:cyclic_softint+f3 ()
> ffffff00458cbba0 unix:cbe_low_level+14 ()
> ffffff00458cbbf0 unix:av_dispatch_softvect+78 ()
> ffffff00458cbc20 unix:dispatch_softint+39 ()
> ffffff00458059a0 unix:switch_sp_and_call+13 ()
> ffffff00458059e0 unix:dosoftint+44 ()
> ffffff0045805a40 unix:do_interrupt+ba ()
> ffffff0045805a50 unix:cmnint+ba ()
> ffffff0045805bc0 unix:acpi_cpu_cstate+11b ()
> ffffff0045805bf0 unix:cpu_acpi_idle+8d ()
> ffffff0045805c00 unix:cpu_idle_adaptive+13 ()
> ffffff0045805c20 unix:idle+a7 ()
> ffffff0045805c30 unix:thread_start+8 ()
>
> syncing file systems...
> done
> dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
> NOTICE: ahci0: ahci_tran_reset_dport port 0 reset port
>
> Would be nice to get some info about this from someone that got some more clues than I got...

Essentially, this says that your SATA controller hung in a bad state
that isn't recoverable:
https://github.com/illumos/illumos-gate/blob/master/usr/src/uts/common/fs/zfs/spa_misc.c#L256-L261

I'd suspect the SATA controller. If this panic comes with any
regularity, try working around the SATA controller by using a
substitute HBA and disabling the old one to see if it goes away.

Cheers,
--
Saso

From johan.kragsterman at capvert.se Mon Apr 7 09:57:19 2014
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Mon, 7 Apr 2014 11:57:19 +0200
Subject: [OmniOS-discuss] crash
In-Reply-To: <534271EE.70903@gmail.com>
References: <534271EE.70903@gmail.com>,
Message-ID:

An HTML attachment was scrubbed...
URL:

From jesus at omniti.com Tue Apr 8 01:51:46 2014
From: jesus at omniti.com (Theo Schlossnagle)
Date: Mon, 7 Apr 2014 21:51:46 -0400
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
Message-ID:

Today was an unfortunate day for the Internet, as a particularly
devastating and quite longstanding bug was revealed in OpenSSL 1.0.1.
OmniOS uses OpenSSL 1.0.1 and, like all other distributions (regardless
of operating system) that use OpenSSL 1.0.1, is vulnerable.

While I'd normally link to the CVE directly, there is a particularly
well organized site dedicated to this bug with many reference documents
linked from it. If you are interested in the details of the bug (and if
you care about security, you should be interested), please visit
http://heartbleed.com/

Earlier today we updated our builds to use OpenSSL 1.0.1g, which
addresses this particular bug (CVE-2014-0160). We've rerolled and
published packages for all supported OmniOS releases: bloody, r151008
and r151006 LTS.

The package FMRIs are as follows:

For r151006 LTS:
pkg://omnios/library/security/openssl@1.0.1.7,5.11-0.151006:20140407T211430Z

For r151008:
pkg://omnios/library/security/openssl@1.0.1.7,5.11-0.151008:20140407T220403Z

For bloody:
pkg://omnios/library/security/openssl@1.0.1.7,5.11-0.151009:20140407T211119Z

These packages do not require a new BE or a reboot. You can perform
this upgrade with minimal service interruption. Please update your
systems now and restart any services that link against OpenSSL
libraries to arrive at a safe state.

On a side note, April 7th is National Beer Day and an OmniTI corporate
holiday. We considered this security issue critical enough to stop
drinking beer and dive into providing updates. If we thought this
security issue warranted interruption of our celebration of National
Beer Day, you too should take it very seriously.

Best regards,

Theo
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From Kevin.Swab at ColoState.EDU Tue Apr 8 03:32:22 2014 From: Kevin.Swab at ColoState.EDU (Kevin Swab) Date: Mon, 07 Apr 2014 21:32:22 -0600 Subject: [OmniOS-discuss] kernel panic Message-ID: <53436DC6.6080208@ColoState.EDU> I've got OmniOS 151008j running on a home file server, and the other day it went into a reboot loop, displaying a kernel panic on the console just after the kernel banner was printed. The panic message on screen showed some zfs function calls so following that lead, I booted off the install media, mounted my root pool and removed /etc/zpool.cache. The system was able to boot after that but when I attempt to import the pool containing my data, it panics again. FMD shows that a reboot occurred after a kernel panic, and says more info is available from fmdump. Here's the stack trace from 'fmdump': # fmdump -Vp -u 38f6aa49-6c97-4675-b526-e455b1ae215b TIME UUID SUNW-MSG-ID Apr 07 2014 21:03:45.097921000 38f6aa49-6c97-4675-b526-e455b1ae215b SUNOS-8000-KL TIME CLASS ENA Apr 07 21:03:45.0237 ireport.os.sunos.panic.dump_available 0x0000000000000000 Apr 07 21:03:03.8496 ireport.os.sunos.panic.dump_pending_on_device 0x0000000000000000 nvlist version: 0 version = 0x0 class = list.suspect uuid = 38f6aa49-6c97-4675-b526-e455b1ae215b code = SUNOS-8000-KL diag-time = 1396926225 62791 de = fmd:///module/software-diagnosis fault-list-sz = 0x1 fault-list = (array of embedded nvlists) (start fault-list[0]) nvlist version: 0 version = 0x0 class = defect.sunos.kernel.panic certainty = 0x64 asru = sw:///:path=/var/crash/unknown/.38f6aa49-6c97-4675-b526-e455b1ae215b resource = sw:///:path=/var/crash/unknown/.38f6aa49-6c97-4675-b526-e455b1ae215b savecore-succcess = 1 dump-dir = /var/crash/unknown dump-files = vmdump.1 os-instance-uuid = 38f6aa49-6c97-4675-b526-e455b1ae215b panicstr = BAD TRAP: type=e (#pf Page fault) rp=ffffff000fadafc0 addr=2b8 occurred in module "unix" due to a NULL pointer dereference panicstack = unix:die+df () | unix:trap+db3 () | 
unix:cmntrap+e6 () | unix:mutex_enter+b () | zfs:zio_buf_alloc+25 () | zfs:arc_get_data_buf+2b8 () | zfs:arc_buf_alloc+b5 () | zfs:arc_read+42b () | zfs:dsl_scan_prefetch+a7 () | zfs:dsl_scan_recurse+16f () | zfs:dsl_scan_visitbp+eb () | zfs:dsl_scan_visitdnode+bd () | zfs:dsl_scan_recurse+439 () | zfs:dsl_scan_visitbp+eb () | zfs:dsl_scan_visit_rootbp+61 () | zfs:dsl_scan_visit+26b () | zfs:dsl_scan_sync+12f () | zfs:spa_sync+334 () | zfs:txg_sync_thread+227 () | unix:thread_start+8 () | crashtime = 1396801998 panic-time = Sun Apr 6 10:33:18 2014 MDT (end fault-list[0]) fault-status = 0x1 severity = Major __ttl = 0x1 __tod = 0x53436711 0x5d627e8 I'd really like to recover the data on that pool if possible, any suggestions on what I can try next? Thanks, Kevin From jimklimov at cos.ru Tue Apr 8 13:35:27 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Tue, 08 Apr 2014 15:35:27 +0200 Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160 In-Reply-To: References: Message-ID: <5343FB1F.4040300@cos.ru> On 2014-04-08 03:51, Theo Schlossnagle wrote: > Today was an unfortunate day for the Internet as a particularly > devastating and quite longstanding bug was reveal in OpenSSL 1.0.1. Thanks for the heads-up! Can anyone please elaborate on this question, though: some of the legacy systems (i.e. Solaris 10 based) out in the field have not, in fact, seen or used OpenSSL past 0.9.8-something; and ran some SSL-protected email, openvpn, web or ldap services (though the latter is probably using some java security layer). It is however not known what SSL implementations and versions were used by the users of these systems. Are such setups vulnerable (given that the server side had no heartbeat handshake code with the bug) to the extent that everything should be urgently upgraded or not? 
Thanks,
//Jim

From skiselkov.ml at gmail.com Tue Apr 8 13:44:23 2014
From: skiselkov.ml at gmail.com (Saso Kiselkov)
Date: Tue, 08 Apr 2014 15:44:23 +0200
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To: <5343FB1F.4040300@cos.ru>
References: <5343FB1F.4040300@cos.ru>
Message-ID: <5343FD37.9000605@gmail.com>

On 4/8/14, 3:35 PM, Jim Klimov wrote:
> On 2014-04-08 03:51, Theo Schlossnagle wrote:
>> Today was an unfortunate day for the Internet as a particularly
>> devastating and quite longstanding bug was reveal in OpenSSL 1.0.1.
>
> Thanks for the heads-up!
>
> Can anyone please elaborate on this question, though: some of the
> legacy systems (i.e. Solaris 10 based) out in the field have not,
> in fact, seen or used OpenSSL past 0.9.8-something; and ran some
> SSL-protected email, openvpn, web or ldap services (though the
> latter is probably using some java security layer). It is however
> not known what SSL implementations and versions were used by the
> users of these systems. Are such setups vulnerable (given that
> the server side had no heartbeat handshake code with the bug) to
> the extent that everything should be urgently upgraded or not?

Anything below OpenSSL 1.0.0 (inclusive) isn't vulnerable to this.
(Most legacy systems, including OI, still run on the OpenSSL 0.9.8
release train)

Cheers,
--
Saso

From dswartz at druber.com Tue Apr 8 13:45:30 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Tue, 8 Apr 2014 09:45:30 -0400
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To:
References:
Message-ID: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>

@1.0.1.7,5.11-0.151009:20140407T211119Z
>
> These packages do not require a new BE or a reboot. You can perform
> this upgrade with minimal service interruption. Please update your
> systems now and restart any services that link against OpenSSL
> libraries to arrive at a safe state.

Theo, I am puzzled. I updated my box, and it did create a boot
environment with the fix in it, so I can't get it until I reboot...
Maybe I updated the wrong way? I did 'pkg image-update', which is how I
usually do things...

From johan.kragsterman at capvert.se Tue Apr 8 13:49:34 2014
From: johan.kragsterman at capvert.se (Johan Kragsterman)
Date: Tue, 8 Apr 2014 15:49:34 +0200
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>
References: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>,
Message-ID:

An HTML attachment was scrubbed...
URL:

From gmason at msu.edu Tue Apr 8 14:18:44 2014
From: gmason at msu.edu (Greg Mason)
Date: Tue, 8 Apr 2014 10:18:44 -0400
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>
References: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>
Message-ID:

>
> Theo, I am puzzled. I updated my box, and it did create a boot
> environment with the fix in it, so I can't get it until I reboot...
> Maybe I updated the wrong way? I did 'pkg image-update' which is how
> I usually do things...

Dan,

If you simply do a 'pkg install' or 'pkg update', it will install the
new OpenSSL package in the current BE.

-Greg

>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss

From dswartz at druber.com Tue Apr 8 14:21:35 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Tue, 8 Apr 2014 10:21:35 -0400
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To:
References: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>,
Message-ID: <3941ee54fe1eca513abb21bec1d52634.squirrel@webmail.druber.com>

>
> Hi, Dan!
>
> I just did a pkg install, and it worked like a charm, no new BE...you
> can probably remove the one you just did, and do a new pkg install

Worked great, thanks!

From jimklimov at cos.ru Tue Apr 8 14:24:00 2014
From: jimklimov at cos.ru (Jim Klimov)
Date: Tue, 08 Apr 2014 16:24:00 +0200
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To: <5343FD37.9000605@gmail.com>
References: <5343FB1F.4040300@cos.ru> <5343FD37.9000605@gmail.com>
Message-ID: <53440680.6020407@cos.ru>

On 2014-04-08 15:44, Saso Kiselkov wrote:
> Anything below OpenSSL 1.0.0 (inclusive) isn't vulnerable to this. (Most
> legacy systems, including OI, still run on the OpenSSL 0.9.8
> release train)

Thanks, I've read that statement ;)

I just wanted to make sure: if we have an OpenSSL 0.9.8 enabled server
and an OpenSSL 1.0.1* (vulnerable) client, and someone has sniffed and
saved the traffic, does that disclose the sensitive data or not?

For instance, I can't yet figure out whether this heartbeat handshake
is something new introduced in the 1.0.1 series, so that the whole
procedure is skipped when a new OpenSSL connects with an old OpenSSL.
Or not?..

Thanks,
//Jim

From dswartz at druber.com Tue Apr 8 15:00:38 2014
From: dswartz at druber.com (Dan Swartzendruber)
Date: Tue, 8 Apr 2014 11:00:38 -0400
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To:
References: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>
Message-ID:

>>
>> Theo, I am puzzled. I updated my box, and it did create a boot
>> environment with the fix in it, so I can't get it until I reboot...
>> Maybe I updated the wrong way? I did 'pkg image-update' which is how
>> I usually do things
>
> Dan,
>
> If you simply do a 'pkg install' or 'pkg update' it will install the
> new OpenSSL package in the current BE.

Yes, that worked, thanks. I've just been used to 'pkg image-update'...
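
As a rough aid for the version question raised in this thread, the
sketch below classifies an upstream OpenSSL version string using only
what is generally documented for CVE-2014-0160: 1.0.1 through 1.0.1f
are affected, 1.0.1g and later letter releases are fixed, and the 0.9.8
and 1.0.0 trains never carried the heartbeat code. The function name
heartbleed_status is made up for illustration, and the check is an
assumption-laden shortcut: it cannot see distribution backports, and
anything it does not recognise is reported as unknown rather than
guessed.

```shell
#!/bin/sh
# Hypothetical helper (not part of OmniOS or OpenSSL): classify an
# upstream OpenSSL version string with respect to CVE-2014-0160.
# Assumption: plain upstream numbering. A vendor may ship a patched
# build under an older version string, which this cannot detect.
heartbleed_status() {
    case "$1" in
        1.0.1|1.0.1[a-f]) echo "vulnerable" ;;    # 1.0.1 through 1.0.1f
        1.0.1[g-z]*)      echo "fixed" ;;         # 1.0.1g and later letters
        0.9.*|1.0.0*)     echo "not affected" ;;  # no heartbeat support
        *)                echo "unknown" ;;       # betas, other trains
    esac
}
```

To check the local library, one could pass it the second field of
`openssl version`, for example
heartbleed_status "$(openssl version | awk '{print $2}')". Even with a
fixed library installed, services that were started before the update
still need a restart, as noted in the announcement.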
From cks at cs.toronto.edu Tue Apr 8 15:13:42 2014
From: cks at cs.toronto.edu (Chris Siebenmann)
Date: Tue, 08 Apr 2014 11:13:42 -0400
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To: Your message of Tue, 08 Apr 2014 16:24:00 +0200. <53440680.6020407@cos.ru>
Message-ID: <20140408151342.2BDB41A0463@apps0.cs.toronto.edu>

| On 2014-04-08 15:44, Saso Kiselkov wrote:
| > Anything below OpenSSL 1.0.0 (inclusive) isn't vulnerable to this. (Most
| > legacy systems, including OI, still run on the OpenSSL 0.9.8
| > release train)
|
| Thanks, I've read that statement ;)
|
| I just wanted to make sure that if we have an OpenSSL 0.9.8 enabled
| server and an OpenSSL 1.0.1* (vulnerable) client, and someone has
| sniffed and saved the traffic, does indeed or does not that disclose
| the sensitive data?
|
| For instance, I can't yet figure out if this heartbeat handshake is
| something new introduced in 1.0.1 series and so the whole procedure is
| skipped when a new OpenSSL connects with an old OpenSSL? Or not?..

My understanding of the bug is that it requires active exploitation by
one end of the connection (either the client against the server or the
server against the client, if the client holds any sensitive material).
It's not a passive bug that can be exploited by a third party that is
just listening in because it involves introducing a deliberate protocol
violation[*].

The bug is only present in OpenSSL versions that support heartbeats.
This was apparently introduced in 1.0.1, which dates from early 2012
(and is closed in 1.0.1g or patched versions of earlier 1.0.1 releases).

	- cks
[*: very crudely summarized, the bug is that you send the other end a
    heartbeat request that says 'echo back these 64K bytes' but don't
    actually supply anywhere near that many bytes to echo back. The
    other end then overruns your input buffer and sends you back
    whatever memory was beyond it.
]

From mir at miras.org Tue Apr 8 15:15:23 2014
From: mir at miras.org (Michael Rasmussen)
Date: Tue, 8 Apr 2014 17:15:23 +0200
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To:
References: <46dd41948454127d1c756bdb1caf34bf.squirrel@webmail.druber.com>
Message-ID: <20140408171523.51b11654@sleipner.datanom.net>

On Tue, 8 Apr 2014 11:00:38 -0400
"Dan Swartzendruber" wrote:

>
> Yes, that worked, thanks. I've just been used to 'pkg image-install'...
>
The reason why some people, like me, are seeing a new BE being created
is that a pkg update will pull in driver/storage/mpt_sas. This is
strange, since this is an upgrade after r151008j but no formal release
mentions driver/storage/mpt_sas, so I wonder where this is coming from?!

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael rasmussen cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir datanom net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir miras org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
How's it going in those MODULAR LOVE UNITS??
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL:

From esproul at omniti.com Tue Apr 8 15:15:49 2014
From: esproul at omniti.com (Eric Sproul)
Date: Tue, 8 Apr 2014 11:15:49 -0400
Subject: [OmniOS-discuss] OmniOS OpenSSL 1.0.1g and CVE-2014-0160
In-Reply-To:
References:
Message-ID:

On Mon, Apr 7, 2014 at 9:51 PM, Theo Schlossnagle wrote:
> For r151008:
> pkg://omnios/library/security/openssl@1.0.1.7,5.11-0.151008:20140407T220403Z
>

FYI, I just re-spun the r151008 package to clear up an issue where the
unsigned manifest appeared in the repo catalog alongside the signed
version.
It's a quirk^Wfeature of how pkg(5) does signing that it does not alter
the version of the package, so effectively we had two different hashes
for the same "version" of the openssl manifest. This caused confusion
for some pkg* tools and sub-commands but not others. For instance,
update/install was *not* affected, but pkgrecv(1) was.

The new spin is
pkg://omnios/library/security/openssl@1.0.1.7,5.11-0.151008:20140408T142844Z

Sorry for the inconvenience. We've clarified our package signing
process to ensure this does not recur.

Eric

From groups at tierarzt-mueller.de Tue Apr 8 18:13:15 2014
From: groups at tierarzt-mueller.de (Alexander Lesle)
Date: Tue, 8 Apr 2014 20:13:15 +0200
Subject: [OmniOS-discuss] Pool degraded
Message-ID: <1456071949.20140408201315@tierarzt-mueller.de>

Hello All,

I have a pool with mirrors and one spare.
Now my pool is degraded, and I thought that OmniOS/ZFS would activate
the spare itself and start resilvering.

# zpool status -x
  pool: pool_ripley
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: resilvered 84K in 0h0m with 0 errors on Sun Mar 23 15:09:08 2014
config:

        NAME                       STATE     READ WRITE CKSUM
        pool_ripley                DEGRADED     0     0     0
          mirror-0                 DEGRADED     0     0     0
            c1t5000CCA22BC16BC5d0  ONLINE       0     0     0
            c1t5000CCA22BEEF6A3d0  UNAVAIL      0     0     0  cannot open
          mirror-1                 ONLINE       0     0     0
            c1t5000CCA22BC8D31Ad0  ONLINE       0     0     0
            c1t5000CCA22BF612C4d0  ONLINE       0     0     0
        .
        .
        .

        spares
          c1t5000CCA22BF5B9DEd0    AVAIL

But nothing happened.
OK, so I did it myself.
# zpool replace -f pool_ripley c1t5000CCA22BEEF6A3d0 c1t5000CCA22BF5B9DEd0
Resilvering started immediately.

# zpool status -x
  pool: pool_ripley
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://illumos.org/msg/ZFS-8000-2Q
  scan: resilvered 1.53T in 3h12m with 0 errors on Sun Apr 6 17:48:51 2014
config:

        NAME                         STATE     READ WRITE CKSUM
        pool_ripley                  DEGRADED     0     0     0
          mirror-0                   DEGRADED     0     0     0
            c1t5000CCA22BC16BC5d0    ONLINE       0     0     0
            spare-1                  DEGRADED     0     0     0
              c1t5000CCA22BEEF6A3d0  UNAVAIL      0     0     0  cannot open
              c1t5000CCA22BF5B9DEd0  ONLINE       0     0     0
          mirror-1                   ONLINE       0     0     0
            c1t5000CCA22BC8D31Ad0    ONLINE       0     0     0
            c1t5000CCA22BF612C4d0    ONLINE       0     0     0
        .
        .
        .
        spares
          c1t5000CCA22BF5B9DEd0      INUSE     currently in use

After resilvering I powered off, unplugged the broken HDD from case
slot 1, and moved the spare from slot 21 to slot 1.
The pool is still degraded, and I can't remove the broken HDD.

# zpool remove pool_ripley c1t5000CCA22BEEF6A3d0
cannot remove c1t5000CCA22BEEF6A3d0: only inactive hot spares,
cache, top-level, or log devices can be removed

What can I do to throw out the broken HDD, tell ZFS that the spare is
now a member of mirror-0, and remove it from the spare list?
Why didn't the spare device jump in automatically and resilver the
pool?

Thanks.

--
Best Regards
Alexander
April, 08 2014

From groups at tierarzt-mueller.de Tue Apr 8 19:09:28 2014
From: groups at tierarzt-mueller.de (Alexander Lesle)
Date: Tue, 8 Apr 2014 21:09:28 +0200
Subject: [OmniOS-discuss] Pool degraded
In-Reply-To: <53443D37.7020307@ColoState.EDU>
References: <1456071949.20140408201315@tierarzt-mueller.de> <53443D37.7020307@ColoState.EDU>
Message-ID: <1697492574.20140408210928@tierarzt-mueller.de>

Hello Kevin Swab and List,

On April, 08 2014, 20:17 wrote in [1]:

> Instead of a 'zpool remove ...', you want to do a 'zpool detach ...' to
> get rid of the old device.

That's it.
zpool detach ... "removes" the broken device from the pool.

> If you turn the 'autoreplace' property on
> for the pool, the spare will automatically kick in the next time a drive
> fails...

Are you sure?
Because man zpool tells me otherwise:

,-----[ man zpool ]-----
|
| autoreplace=on | off
|
|     Controls automatic device replacement. If set to "off",
|     device replacement must be initiated by the administrator
|     by using the "zpool replace" command. If set to "on", any
|     new device, found in the same physical location as a
|     device that previously belonged to the pool, is
|     automatically formatted and replaced. The default
|     behavior is "off". This property can also be referred to
|     by its shortened column name, "replace".
|
`-------------------

As I understand it, when I pull out a device and put a new device in
the _same_ case slot, ZFS resilvers onto it and drops the old one
automatically.
When the property is off, I have to use the command zpool replace ... ...,
which is what I did.

But in my case, the spare device was already in the case and
_assigned to_ this pool.
So the 'Hot Spares' section says:
,-----[ man zpool ]-----
|
| ZFS allows devices to be associated with pools as "hot
| spares". These devices are not actively used in the pool,
| but when an active device fails, it is automatically
| replaced by a hot spare.
|
`-------------------

Or have I misunderstood?

> On 04/08/2014 12:13 PM, Alexander Lesle wrote:
>> Hello All,
>>
>> I have a pool with mirrors and one spare.
>> Now my pool is degraded and I though that Omnios/ZFS activate the
>> spare itself and make a resilvering.
>>
>> # zpool status -x
>> pool: pool_ripley
>> state: DEGRADED
>> status: One or more devices could not be opened. Sufficient replicas exist for
>> the pool to continue functioning in a degraded state.
>> action: Attach the missing device and online it using 'zpool online'.
>> see: http://illumos.org/msg/ZFS-8000-2Q >> scan: resilvered 84K in 0h0m with 0 errors on Sun Mar 23 15:09:08 2014 >> config: >> >> NAME STATE READ WRITE CKSUM >> pool_ripley DEGRADED 0 0 0 >> mirror-0 DEGRADED 0 0 0 >> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0 >> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open >> mirror-1 ONLINE 0 0 0 >> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0 >> c1t5000CCA22BF612C4d0 ONLINE 0 0 0 >> . >> . >> . >> >> spares >> c1t5000CCA22BF5B9DEd0 AVAIL >> >> But nothing done. >> OK, then I do it myself. >> # zpool replace -f pool_ripley c1t5000CCA22BEEF6A3d0 c1t5000CCA22BF5B9DEd0 >> Resilvering is starting immediately. >> >> # zpool status -x >> pool: pool_ripley >> state: DEGRADED >> status: One or more devices could not be opened. Sufficient replicas exist for >> the pool to continue functioning in a degraded state. >> action: Attach the missing device and online it using 'zpool online'. >> see: http://illumos.org/msg/ZFS-8000-2Q >> scan: resilvered 1.53T in 3h12m with 0 errors on Sun Apr 6 17:48:51 2014 >> config: >> >> NAME STATE READ WRITE CKSUM >> pool_ripley DEGRADED 0 0 0 >> mirror-0 DEGRADED 0 0 0 >> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0 >> spare-1 DEGRADED 0 0 0 >> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open >> c1t5000CCA22BF5B9DEd0 ONLINE 0 0 0 >> mirror-1 ONLINE 0 0 0 >> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0 >> c1t5000CCA22BF612C4d0 ONLINE 0 0 0 >> . >> . >> . >> spares >> c1t5000CCA22BF5B9DEd0 INUSE currently in use >> >> After resilvering I made power-off, unplugged the broken HDD from >> Case-slot 1 and switched the Spare from Slot 21 to Slot 1. >> The pool is still degraded. The broken HDD I cant remove it. >> >> # zpool remove pool_ripley c1t5000CCA22BEEF6A3d0 >> cannot remove c1t5000CCA22BEEF6A3d0: only inactive hot spares, >> cache, top-level, or log devices can be removed >> >> What can I do to through out the broken HDD and tell ZFS that the >> spare is now member of mirror-0 and remove it from the spare list? 
>> Why does not automatically jump in the Spare device and resilver the
>> pool?
>>
>> Thanks.
>>

--
Best Regards
Alexander
April, 08 2014
........
[1] mid:53443D37.7020307@ColoState.EDU
........

From Kevin.Swab at ColoState.EDU Tue Apr 8 20:22:46 2014
From: Kevin.Swab at ColoState.EDU (Kevin Swab)
Date: Tue, 08 Apr 2014 14:22:46 -0600
Subject: [OmniOS-discuss] Pool degraded
In-Reply-To: <1697492574.20140408210928@tierarzt-mueller.de>
References: <1456071949.20140408201315@tierarzt-mueller.de> <53443D37.7020307@ColoState.EDU> <1697492574.20140408210928@tierarzt-mueller.de>
Message-ID: <53445A96.8090902@ColoState.EDU>

Hello, and sorry for accidentally failing to "reply-all" on your first
message...

The man page seems misleading or incomplete on the subject of
"autoreplace" and spares. Setting 'autoreplace=on' should cause your
hot spare to kick in during a drive failure - with over 1100 spindles
running ZFS here, we've had the "opportunity" to test it many times! ;-)

I couldn't find any authoritative references for this, but here are a
few unauthoritative ones:

http://my.safaribooksonline.com/book/operating-systems-and-server-administration/solaris/9780137049639/managing-storage-pools/ch02lev1sec7
http://stanley-huang.blogspot.com/2009/09/how-to-set-autoreplace-in-zfs-pool.html
http://www.datadisk.co.uk/html_docs/sun/sun_zfs_cs.htm

Hope this helps,
Kevin

On 04/08/2014 01:09 PM, Alexander Lesle wrote:
> Hello Kevin Swab and List,
>
> On April, 08 2014, 20:17 wrote in [1]:
>
>> Instead of a 'zpool remove ...', you want to do a 'zpool detach ...' to
>> get rid of the old device.
>
> thats it.
> zpool detach ... "removes" the broken device from the pool.
>
>> If you turn the 'autoreplace' property on
>> for the pool, the spare will automatically kick in the next time a drive
>> fails...
>
> Are you sure? Because the man zpool tell me other:
>
> ,-----[ man zpool ]-----
> |
> | autoreplace=on | off
> |
> | Controls automatic device replacement.
If set to "off", > | device replacement must be initiated by the administra- > | tor by using the "zpool replace" command. If set to > | "on", any new device, found in the same physical loca- > | tion as a device that previously belonged to the pool, > | is automatically formatted and replaced. The default > | behavior is "off". This property can also be referred to > | by its shortened column name, "replace". > | > `------------------- > > I understand it that when I pull out a device and put a new device in > the _same_ Case-Slot ZFS make a resilver and ZFS pull out the old one > automatically. > When the property if off I have use the command zpool replace ... ... > what I have done. > > But in my case, the spare device was in the Case and _named for_ this > pool > So the 'Hot Spares-Section' tells > ,-----[ man zpool ]----- > | > | ZFS allows devices to be associated with pools as "hot > | spares". These devices are not actively used in the pool, > | but when an active device fails, it is automatically > | replaced by a hot spare. > | > `------------------- > > Or I have misunderstood. > >> On 04/08/2014 12:13 PM, Alexander Lesle wrote: >>> Hello All, >>> >>> I have a pool with mirrors and one spare. >>> Now my pool is degraded and I though that Omnios/ZFS activate the >>> spare itself and make a resilvering. >>> >>> # zpool status -x >>> pool: pool_ripley >>> state: DEGRADED >>> status: One or more devices could not be opened. Sufficient replicas exist for >>> the pool to continue functioning in a degraded state. >>> action: Attach the missing device and online it using 'zpool online'. 
>>> see: http://illumos.org/msg/ZFS-8000-2Q >>> scan: resilvered 84K in 0h0m with 0 errors on Sun Mar 23 15:09:08 2014 >>> config: >>> >>> NAME STATE READ WRITE CKSUM >>> pool_ripley DEGRADED 0 0 0 >>> mirror-0 DEGRADED 0 0 0 >>> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0 >>> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open >>> mirror-1 ONLINE 0 0 0 >>> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0 >>> c1t5000CCA22BF612C4d0 ONLINE 0 0 0 >>> . >>> . >>> . >>> >>> spares >>> c1t5000CCA22BF5B9DEd0 AVAIL >>> >>> But nothing done. >>> OK, then I do it myself. >>> # zpool replace -f pool_ripley c1t5000CCA22BEEF6A3d0 c1t5000CCA22BF5B9DEd0 >>> Resilvering is starting immediately. >>> >>> # zpool status -x >>> pool: pool_ripley >>> state: DEGRADED >>> status: One or more devices could not be opened. Sufficient replicas exist for >>> the pool to continue functioning in a degraded state. >>> action: Attach the missing device and online it using 'zpool online'. >>> see: http://illumos.org/msg/ZFS-8000-2Q >>> scan: resilvered 1.53T in 3h12m with 0 errors on Sun Apr 6 17:48:51 2014 >>> config: >>> >>> NAME STATE READ WRITE CKSUM >>> pool_ripley DEGRADED 0 0 0 >>> mirror-0 DEGRADED 0 0 0 >>> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0 >>> spare-1 DEGRADED 0 0 0 >>> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open >>> c1t5000CCA22BF5B9DEd0 ONLINE 0 0 0 >>> mirror-1 ONLINE 0 0 0 >>> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0 >>> c1t5000CCA22BF612C4d0 ONLINE 0 0 0 >>> . >>> . >>> . >>> spares >>> c1t5000CCA22BF5B9DEd0 INUSE currently in use >>> >>> After resilvering I made power-off, unplugged the broken HDD from >>> Case-slot 1 and switched the Spare from Slot 21 to Slot 1. >>> The pool is still degraded. The broken HDD I cant remove it. 
>>>
>>> # zpool remove pool_ripley c1t5000CCA22BEEF6A3d0
>>> cannot remove c1t5000CCA22BEEF6A3d0: only inactive hot spares,
>>> cache, top-level, or log devices can be removed
>>>
>>> What can I do to through out the broken HDD and tell ZFS that the
>>> spare is now member of mirror-0 and remove it from the spare list?
>>> Why does not automatically jump in the Spare device and resilver the
>>> pool?
>>>
>>> Thanks.
>>>
>
>

--
-------------------------------------------------------------------
Kevin Swab UNIX Systems Administrator
ACNS Colorado State University
Phone: (970)491-6572 Email: Kevin.Swab at ColoState.EDU
GPG Fingerprint: 7026 3F66 A970 67BD 6F17 8EB8 8A7D 142F 2392 791C

From daleg at omniti.com Wed Apr 9 19:02:38 2014
From: daleg at omniti.com (Dale Ghent)
Date: Wed, 9 Apr 2014 15:02:38 -0400
Subject: [OmniOS-discuss] [ANN] OmniOS releases 151006_049 and 151008t
Message-ID: <8EB23259-C87C-4C9D-8EB7-D8F2E9899445@omniti.com>

Hello, this week brings a new release for both 151006 and 151008. These releases include OpenSSL 1.0.1g, which corrects the "Heartbleed" vulnerability, as well as fixes which are relevant to users of the mpt_sas and ipmi BMC drivers.

* 151008t release notes: http://omnios.omniti.com/wiki.php/ReleaseNotes#r151008t
* 151006_049 release notes: http://omnios.omniti.com/wiki.php/ReleaseNotes#r151006_049

ISO, USB, and Kayak images which reflect these versions are available at http://omnios.omniti.com/wiki.php/Installation

Of course, all updated packages are available from the pkg.omniti.com IPS repository.

/dale

From henson at acm.org Thu Apr 10 20:45:31 2014
From: henson at acm.org (Paul B. Henson)
Date: Thu, 10 Apr 2014 13:45:31 -0700
Subject: [OmniOS-discuss] update from r151008f -> r151008t -- two reboots?
Message-ID: <20140410204531.GC1367@bender.unx.csupomona.edu> I'm currently running r151008f, and was looking to update to r151008t to get the openssl fix. However, it looks like the pkg update from r151008j has to be installed first, by itself? Necessitating a reboot into that new BE, before installing the rest of the updates into another new BE, and then rebooting into that to be done? # pkg update -n WARNING: pkg(5) appears to be out of date, and should be updated before running update. Please update pkg(5) by executing 'pkg install pkg:/package/pkg' as a privileged user and then retry the update. # pkg install -n pkg:/package/pkg Packages to update: 2 Create boot environment: Yes Create backup boot environment: No It would be nice to be able to install all updates with one reboot, it's not like I'm running Windows ;). Thanks... From groups at tierarzt-mueller.de Fri Apr 11 08:39:05 2014 From: groups at tierarzt-mueller.de (Alexander Lesle) Date: Fri, 11 Apr 2014 10:39:05 +0200 Subject: [OmniOS-discuss] Pool degraded In-Reply-To: <53445A96.8090902@ColoState.EDU> References: <1456071949.20140408201315@tierarzt-mueller.de> <53443D37.7020307@ColoState.EDU> <1697492574.20140408210928@tierarzt-mueller.de> <53445A96.8090902@ColoState.EDU> Message-ID: <227786046.20140411103905@tierarzt-mueller.de> Hello Kevin Swab and List, thanks Kevin your contribution helps me. It would be nice if an official of Illumos or Omnios would confirm it and would change the man page of zpool(1m). On April, 08 2014, 22:22 wrote in [1]: > The man page seems misleading or incomplete on the subject of > "autoreplace" and spares. Setting 'autoreplace=on' should cause your > hot spare to kick in during a drive failure - with over 1100 spindles > running ZFS here, we've had the "opportunity" to test it many times! 
;-) > I couldn't find any authoratative references for this, but here's a few > unautoratative ones: > http://my.safaribooksonline.com/book/operating-systems-and-server-administration/solaris/9780137049639/managing-storage-pools/ch02lev1sec7 > http://stanley-huang.blogspot.com/2009/09/how-to-set-autoreplace-in-zfs-pool.html > http://www.datadisk.co.uk/html_docs/sun/sun_zfs_cs.htm > Hope this helps, > Kevin > On 04/08/2014 01:09 PM, Alexander Lesle wrote: >> Hello Kevin Swab and List, >> >> On April, 08 2014, 20:17 wrote in [1]: >> >>> Instead of a 'zpool remove ...', you want to do a 'zpool detach ...' to >>> get rid of the old device. >> >> thats it. >> zpool detach ... "removes" the broken device from the pool. >> >>> If you turn the 'autoreplace' property on >>> for the pool, the spare will automatically kick in the next time a drive >>> fails... >> >> Are you sure? Because the man zpool tell me other: >> >> ,-----[ man zpool ]----- >> | >> | autoreplace=on | off >> | >> | Controls automatic device replacement. If set to "off", >> | device replacement must be initiated by the administra- >> | tor by using the "zpool replace" command. If set to >> | "on", any new device, found in the same physical loca- >> | tion as a device that previously belonged to the pool, >> | is automatically formatted and replaced. The default >> | behavior is "off". This property can also be referred to >> | by its shortened column name, "replace". >> | >> `------------------- >> >> I understand it that when I pull out a device and put a new device in >> the _same_ Case-Slot ZFS make a resilver and ZFS pull out the old one >> automatically. >> When the property if off I have use the command zpool replace ... ... >> what I have done. >> >> But in my case, the spare device was in the Case and _named for_ this >> pool >> So the 'Hot Spares-Section' tells >> ,-----[ man zpool ]----- >> | >> | ZFS allows devices to be associated with pools as "hot >> | spares". 
These devices are not actively used in the pool, >> | but when an active device fails, it is automatically >> | replaced by a hot spare. >> | >> `------------------- >> >> Or I have misunderstood. >> >>> On 04/08/2014 12:13 PM, Alexander Lesle wrote: >>>> Hello All, >>>> >>>> I have a pool with mirrors and one spare. >>>> Now my pool is degraded and I though that Omnios/ZFS activate the >>>> spare itself and make a resilvering. >>>> >>>> # zpool status -x >>>> pool: pool_ripley >>>> state: DEGRADED >>>> status: One or more devices could not be opened. Sufficient replicas exist for >>>> the pool to continue functioning in a degraded state. >>>> action: Attach the missing device and online it using 'zpool online'. >>>> see: http://illumos.org/msg/ZFS-8000-2Q >>>> scan: resilvered 84K in 0h0m with 0 errors on Sun Mar 23 15:09:08 2014 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> pool_ripley DEGRADED 0 0 0 >>>> mirror-0 DEGRADED 0 0 0 >>>> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0 >>>> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open >>>> mirror-1 ONLINE 0 0 0 >>>> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0 >>>> c1t5000CCA22BF612C4d0 ONLINE 0 0 0 >>>> . >>>> . >>>> . >>>> >>>> spares >>>> c1t5000CCA22BF5B9DEd0 AVAIL >>>> >>>> But nothing done. >>>> OK, then I do it myself. >>>> # zpool replace -f pool_ripley c1t5000CCA22BEEF6A3d0 c1t5000CCA22BF5B9DEd0 >>>> Resilvering is starting immediately. >>>> >>>> # zpool status -x >>>> pool: pool_ripley >>>> state: DEGRADED >>>> status: One or more devices could not be opened. Sufficient replicas exist for >>>> the pool to continue functioning in a degraded state. >>>> action: Attach the missing device and online it using 'zpool online'. 
>>>> see: http://illumos.org/msg/ZFS-8000-2Q >>>> scan: resilvered 1.53T in 3h12m with 0 errors on Sun Apr 6 17:48:51 2014 >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> pool_ripley DEGRADED 0 0 0 >>>> mirror-0 DEGRADED 0 0 0 >>>> c1t5000CCA22BC16BC5d0 ONLINE 0 0 0 >>>> spare-1 DEGRADED 0 0 0 >>>> c1t5000CCA22BEEF6A3d0 UNAVAIL 0 0 0 cannot open >>>> c1t5000CCA22BF5B9DEd0 ONLINE 0 0 0 >>>> mirror-1 ONLINE 0 0 0 >>>> c1t5000CCA22BC8D31Ad0 ONLINE 0 0 0 >>>> c1t5000CCA22BF612C4d0 ONLINE 0 0 0 >>>> . >>>> . >>>> . >>>> spares >>>> c1t5000CCA22BF5B9DEd0 INUSE currently in use >>>> >>>> After resilvering I made power-off, unplugged the broken HDD from >>>> Case-slot 1 and switched the Spare from Slot 21 to Slot 1. >>>> The pool is still degraded. The broken HDD I cant remove it. >>>> >>>> # zpool remove pool_ripley c1t5000CCA22BEEF6A3d0 >>>> cannot remove c1t5000CCA22BEEF6A3d0: only inactive hot spares, >>>> cache, top-level, or log devices can be removed >>>> >>>> What can I do to through out the broken HDD and tell ZFS that the >>>> spare is now member of mirror-0 and remove it from the spare list? >>>> Why does not automatically jump in the Spare device and resilver the >>>> pool? >>>> >>>> Thanks. >>>> >> >> -- Best Regards Alexander April, 11 2014 ........ [1] mid:53445A96.8090902 at ColoState.EDU ........ From kjf at taylorbritt.com Fri Apr 11 19:12:14 2014 From: kjf at taylorbritt.com (Ken F) Date: Fri, 11 Apr 2014 19:12:14 +0000 (UTC) Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> Message-ID: Can you provide a link to the dl_sea_fw software for download as I can not find it anywhere? Thanks. 
Ken

From daleg at omniti.com Tue Apr 15 18:04:48 2014
From: daleg at omniti.com (Dale Ghent)
Date: Tue, 15 Apr 2014 14:04:48 -0400
Subject: [OmniOS-discuss] [ANN] web/curl 7.36.0 package available
Message-ID:

web/curl 7.36.0 has been released for 151006 and 151008 to address two security issues:

Info on the security issues addressed with this version:
http://curl.haxx.se/docs/adv_20140326A.html
http://curl.haxx.se/docs/adv_20140326B.html

Package FMRIs:
pkg://omnios/web/curl@7.36.0,5.11-0.151006:20140414T214024Z
pkg://omnios/web/curl@7.36.0,5.11-0.151008:20140414T215242Z

/dale

From skiselkov.ml at gmail.com Tue Apr 15 22:30:20 2014
From: skiselkov.ml at gmail.com (Saso Kiselkov)
Date: Wed, 16 Apr 2014 00:30:20 +0200
Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023
In-Reply-To: <52FC9C12.9090900@smartjog.com>
References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com>
Message-ID: <534DB2FC.9090404@gmail.com>

Hi,

I've hit this exact same issue on my recent SEAGATE ST2000NM0023 drives. Can you please direct me to where I can get the firmware package? Perhaps we could also post the link publicly, so that people can find it through google or some such method.

Thanks!

Best wishes,
--
Saso

On 2/13/14, 11:18 AM, Thibault VINCENT wrote:
> On 02/12/2014 09:59 PM, Steamer wrote:
>> Did you ever find a solution to the overheating faults with the
>> ST4000NM0023?
>>
>> I'm currently having the exact same issue with ST1000NM0023 drives,
>> seems like seagate has the user temp probe set at 40'C. The manual
>> states that the temperature settings are programmable via smart, but I
>> haven't found a way to do that.
>
> Hello Emile,
>
> I've found a workaround but the definitive fix should be handled by
> Illumos I guess.
There is no open ticket, first I was waiting for
> something to happen with #4051 before going back to using that distro
> and kernel.
>
> Here's the story:
> The SCSI specification defines two registers to store the temperature
> thresholds in SMART data. One contains the recommended maximum operation
> temperature for best MTBF, and the other register is for the absolute
> maximum rating. Usually the industry has always put the same value in
> both, and that is the absolute maximum. That's why we always see
> something like 60/65°C from SMART. But recently Seagate has changed that
> because it was asked by a large OS company to comply with the
> specification for better hardware monitoring integration. The change did
> not only occur in newer products but in a firmware update for existing
> disks and that was applied to the production line which explains some
> disks may or may not expose this problem although they are the same
> model. Our disks are of the Megalodon series and all share the same
> firmware basecode.
>
> So any Seagate disk will now trigger faults in FMA if they have a
> firmware with the newer policy. Also I think other brands will follow
> the same path.
>
> Like other members suggested in that thread, maybe nothing should change
> in FMA but let's face it, you can't maintain a temperature steadily
> under 40°C in a JBOD of hundreds of busy disks. Especially in
> eco-friendly datacenters. IMHO we should not trigger a fault on the
> lower threshold, and certainly not a drive retirement. It breaks storage
> servers on reboot or before a pool import, also spare disks could
> disappear with the retirement triggered.
>
> The workaround is to downgrade firmware to the last version before the
> change, and to reset the register with an SCSI command. It is not
> possible to set the register to a user specified value like the
> documentation suggests, they confirmed it.
>
> I'm sending a working firmware to you in a private mail.
I'm not aware
> of any issue working with that older version and hopefully it should
> upload to 1TB drives as well.
> I'm applying it like this but from Linux not OmniOS:
> # ./dl_sea_fw-0.2.3_32 -f Megalodon_StdOEM_SAS_0002+C84C.lod -m ST4000NM0023
> # ./dl_sea_fw-0.2.3_32 -i
>
> Then you should reset the drives so they reload the firmware.
> Here's our example for 4TB drives:
> -------------
> for i in $(lsscsi | grep 'ST4000NM0023' | awk '{print $6}') ; do
>   sg_reset -d $i
> done
> -------------
>
> And reset the register that contains value from the previous firmware.
> It doesn't work well so we've got this script to run a few times until
> all disks got it. Again it matches 4TB Megalodon.
> -------------
> for i in $(lsscsi | grep 'ST4000NM0023' | awk '{print $6}') ; do
>   echo -n "$i "
>   if sg_logs $i --page=0x0d | grep 'Reference temperature = 68 C' >> /dev/null ; then
>     echo 'ok'
>   else
>     sg_logs $i --page=0x0d --reset
>     echo 'reset'
>   fi
> done
> -------------
>
>
> Cheers
>

From matthias-omn-discuss at mteege.de Wed Apr 16 09:29:41 2014
From: matthias-omn-discuss at mteege.de (Matthias Teege)
Date: Wed, 16 Apr 2014 11:29:41 +0200
Subject: [OmniOS-discuss] No updates available for zone but an old openssl?
Message-ID: <1849611f-16fe-420d-8135-c2a544d6d836@mteege.de>

Hallo,

I've upgraded my omnios system with pkg update. After that the new openssl is installed.

root@tst:~# openssl version
OpenSSL 1.0.1g 7 Apr 2014

But there are no updates for the zone:

root@tst:~# zlogin t1
[Connected to zone 't1' pts/2]
Last login: Wed Apr 16 05:57:01 on pts/2
OmniOS 5.11 omnios-6de5e81 2013.11.27
root@t1:~# openssl version
OpenSSL 1.0.1f 6 Jan 2014
root@t1:~# pkg update -vn
No updates available for this image.

How do I update the zone?
Many thanks Matthias From mailinglists at qutic.com Wed Apr 16 10:47:51 2014 From: mailinglists at qutic.com (qutic development) Date: Wed, 16 Apr 2014 12:47:51 +0200 Subject: [OmniOS-discuss] No updates available for zone but an old openssl? In-Reply-To: <1849611f-16fe-420d-8135-c2a544d6d836@mteege.de> References: <1849611f-16fe-420d-8135-c2a544d6d836@mteege.de> Message-ID: > How do I update the zone? http://omnios.omniti.com/wiki.php/GeneralAdministration#UpgradingWithNon-GlobalZones From groups at tierarzt-mueller.de Wed Apr 16 11:00:39 2014 From: groups at tierarzt-mueller.de (Alexander Lesle) Date: Wed, 16 Apr 2014 13:00:39 +0200 Subject: [OmniOS-discuss] Granular control of fma modules In-Reply-To: References: <94925D68-A787-4BF4-9A26-AD9CF80D7268@cooperi.net> Message-ID: <1008203753.20140416130039@tierarzt-mueller.de> Hello List, sorry for highjacking this thread. I am searching for informations and helps about disk-transport.conf and changing properties for a fma modul. But I cant found the right manual. Background is that I have insert at the end of the file /usr/lib/fm/fmd/plugins/disk-transport.conf setprop interval 6h Restart fmd and the fmd get faulty. Where I can get some helps about fma and setting other properties in the various modules? Links? man? Thanks. -- Best Regards Alexander April, 16 2014 ........ [1] mid:etPan.5306518a.507ed7ab.12a0 at abp.local ........ From Kevin.Swab at ColoState.EDU Wed Apr 16 16:39:21 2014 From: Kevin.Swab at ColoState.EDU (Kevin Swab) Date: Wed, 16 Apr 2014 10:39:21 -0600 Subject: [OmniOS-discuss] kernel panic In-Reply-To: <53436DC6.6080208@ColoState.EDU> References: <53436DC6.6080208@ColoState.EDU> Message-ID: <534EB239.6000609@ColoState.EDU> Any thoughts on this one? I can provide some more info if that helps. The system is all desktop-grade hardware, with a core-i3 540 CPU and 8gigs of (non-ecc) ram. 
The pool in question is a 3-disk raidz built on Toshiba DT01ACA3 3T SATA drives attached to the motherboard SATA ports. The pool was working fine for about 12 months prior to the panic. The pool originally had dedup running, but the stack trace from an isolated panic about 2 months ago indicated dedup problems, so I turned it off. In an attempt to eliminate hardware problems, I've tried the following: - Ran memtest86+ for about 30 hours, no errors found - Ran SMART long tests on all the drives, no errors - Read the entire drive with 'dd' to /dev/null (all 3 drives), no errors reported by dd or iostat - Put the drives in another machine w/ an LSI SAS controller, same result. - dd'ed the contents of the drives to 3 borrowed SAS drives, and attmpted to import the pool from there, same results. I found this page with steps that solved a similar problem for someone else: http://sigtar.com/2009/10/19/opensolaris-zfs-recovery-after-kernel-panic/ Importing the pool read-only as suggested still results in a kernel panic. The 'zdb' command mentioned dumps core before completing: # zpool import pool: data1 id: 17144127232233481271 state: ONLINE action: The pool can be imported using its name or numeric identifier. config: data1 ONLINE raidz1-0 ONLINE c2t3d0 ONLINE c2t2d0 ONLINE c2t4d0 ONLINE # zdb -e -bcsvL data1 Traversing all blocks to verify checksums ... 
assertion failed for thread 0xfffffd7fff162a40, thread-id 1: c <
SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT, file
../../../uts/common/fs/zfs/zio.c, line 226
Abort (core dumped)
#

# zpool import -F -f -o readonly=on -R /mnt data1
plankton console login:
panic[cpu1]/thread=ffffff000ef07c40: BAD TRAP: type=e (#pf Page fault)
rp=ffffff000ef07530 addr=278 occurred in module "unix" due to a NULL
pointer dereference

sched: #pf Page fault
Bad kernel fault at addr=0x278
pid=0, pc=0xfffffffffb85ed1b, sp=0xffffff000ef07628, eflags=0x10246
cr0: 8005003b cr4: 26f8
cr2: 278cr3: bc00000cr8: c

rdi: 278 rsi: 4 rdx: ffffff000ef07c40
rcx: 0 r8: ffffff02d9168840 r9: 2
rax: 0 rbx: 278 rbp: ffffff000ef07680
r10: fffffffffb8540bc r11: ffffff02d91b7000 r12: 0
r13: 1 r14: 4 r15: 0
fsb: 0 gsb: ffffff02cbb4dac0 ds: 4b
es: 4b fs: 0 gs: 1c3
trp: e err: 2 rip: fffffffffb85ed1b
cs: 30 rfl: 10246 rsp: ffffff000ef07628
ss: 38

ffffff000ef07410 unix:die+df ()
ffffff000ef07520 unix:trap+db3 ()
ffffff000ef07530 unix:cmntrap+e6 ()
ffffff000ef07680 unix:mutex_enter+b ()
ffffff000ef076a0 zfs:zio_buf_alloc+25 ()
ffffff000ef076e0 zfs:arc_get_data_buf+1d0 ()
ffffff000ef07730 zfs:arc_buf_alloc+b5 ()
ffffff000ef07820 zfs:arc_read+42b ()
ffffff000ef07880 zfs:traverse_prefetch_metadata+9d ()
ffffff000ef07970 zfs:traverse_visitbp+38b ()
ffffff000ef07a00 zfs:traverse_dnode+8b ()
ffffff000ef07af0 zfs:traverse_visitbp+5fd ()
ffffff000ef07b90 zfs:traverse_prefetch_thread+79 ()
ffffff000ef07c20 genunix:taskq_d_thread+b7 ()
ffffff000ef07c30 unix:thread_start+8 ()

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
0:44 100% done
100% done: 146470 pages dumped, dump succeeded
rebooting...

Most other 'zdb' commands I've tried also dump core.

I really want to recover the data on this pool if at all possible. I
can provide crash dumps if needed. Barring recovery, I would at least
like to understand what went wrong so I can avoid doing it again in the
future.

Please, can anyone help?
Thanks - Kevin On 04/07/2014 09:32 PM, Kevin Swab wrote: > I've got OmniOS 151008j running on a home file server, and the other day > it went into a reboot loop, displaying a kernel panic on the console > just after the kernel banner was printed. > > The panic message on screen showed some zfs function calls so following > that lead, I booted off the install media, mounted my root pool and > removed /etc/zpool.cache. The system was able to boot after that but > when I attempt to import the pool containing my data, it panics again. > > FMD shows that a reboot occurred after a kernel panic, and says more > info is available from fmdump. Here's the stack trace from 'fmdump': > > # fmdump -Vp -u 38f6aa49-6c97-4675-b526-e455b1ae215b > TIME UUID > SUNW-MSG-ID > Apr 07 2014 21:03:45.097921000 38f6aa49-6c97-4675-b526-e455b1ae215b > SUNOS-8000-KL > > TIME CLASS ENA > Apr 07 21:03:45.0237 ireport.os.sunos.panic.dump_available > 0x0000000000000000 > Apr 07 21:03:03.8496 ireport.os.sunos.panic.dump_pending_on_device > 0x0000000000000000 > > nvlist version: 0 > version = 0x0 > class = list.suspect > uuid = 38f6aa49-6c97-4675-b526-e455b1ae215b > code = SUNOS-8000-KL > diag-time = 1396926225 62791 > de = fmd:///module/software-diagnosis > fault-list-sz = 0x1 > fault-list = (array of embedded nvlists) > (start fault-list[0]) > nvlist version: 0 > version = 0x0 > class = defect.sunos.kernel.panic > certainty = 0x64 > asru = > sw:///:path=/var/crash/unknown/.38f6aa49-6c97-4675-b526-e455b1ae215b > resource = > sw:///:path=/var/crash/unknown/.38f6aa49-6c97-4675-b526-e455b1ae215b > savecore-succcess = 1 > dump-dir = /var/crash/unknown > dump-files = vmdump.1 > os-instance-uuid = 38f6aa49-6c97-4675-b526-e455b1ae215b > panicstr = BAD TRAP: type=e (#pf Page fault) > rp=ffffff000fadafc0 addr=2b8 occurred in module "unix" due to a NULL > pointer dereference > panicstack = unix:die+df () | unix:trap+db3 () | > unix:cmntrap+e6 () | unix:mutex_enter+b () | zfs:zio_buf_alloc+25 () | > 
zfs:arc_get_data_buf+2b8 () | zfs:arc_buf_alloc+b5 () | zfs:arc_read+42b > () | zfs:dsl_scan_prefetch+a7 () | zfs:dsl_scan_recurse+16f () | > zfs:dsl_scan_visitbp+eb () | zfs:dsl_scan_visitdnode+bd () | > zfs:dsl_scan_recurse+439 () | zfs:dsl_scan_visitbp+eb () | > zfs:dsl_scan_visit_rootbp+61 () | zfs:dsl_scan_visit+26b () | > zfs:dsl_scan_sync+12f () | zfs:spa_sync+334 () | zfs:txg_sync_thread+227 > () | unix:thread_start+8 () | > crashtime = 1396801998 > panic-time = Sun Apr 6 10:33:18 2014 MDT > (end fault-list[0]) > > fault-status = 0x1 > severity = Major > __ttl = 0x1 > __tod = 0x53436711 0x5d627e8 > > > > I'd really like to recover the data on that pool if possible, any > suggestions on what I can try next? > > Thanks, > Kevin > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -- ------------------------------------------------------------------- Kevin Swab UNIX Systems Administrator ACNS Colorado State University Phone: (970)491-6572 Email: Kevin.Swab at ColoState.EDU GPG Fingerprint: 7026 3F66 A970 67BD 6F17 8EB8 8A7D 142F 2392 791C From danmcd at omniti.com Wed Apr 16 17:44:46 2014 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 16 Apr 2014 13:44:46 -0400 Subject: [OmniOS-discuss] kernel panic In-Reply-To: <534EB239.6000609@ColoState.EDU> References: <53436DC6.6080208@ColoState.EDU> <534EB239.6000609@ColoState.EDU> Message-ID: On Apr 16, 2014, at 12:39 PM, Kevin Swab wrote: > > Traversing all blocks to verify checksums ... 
> > assertion failed for thread 0xfffffd7fff162a40, thread-id 1: c < > SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT, file > ../../../uts/common/fs/zfs/zio.c, line 226 > Abort (core dumped) > # > > # zpool import -F -f -o readonly=on -R /mnt data1 > plankton console login: > panic[cpu1]/thread=ffffff000ef07c40: BAD TRAP: type=e (#pf Page fault) > rp=ffffff000ef07530 addr=278 occurred in module "unix" due to a NULL > pointer dereference Interesting. We've seen one (just one) panic just like this in-house. In our case, some very strange corruption was written to disk, and ZFS couldn't cope. I have a request out to the ZFS community to improve the coping mechanisms. :) I've some dumb questions: 1.) Earlier in the thread, you mention these are SATA drives. When the panic occurred, were they attached via AHCI? Or to a controller of some sort? You mention you tried attaching these disks to an mpt_sas controller to try and recover them. Our machine was using plain SATA drives attached via AHCI. 2.) Is the kernel coredump available? If this is what we were seeing, I'd VERY much like to see what your corruption actually looks like. Knowing might help us root-cause the corruption in the first place. The corruption is of the blkptr_t, in particular its size, which ZFS now assumes is sane. zdb indicates this via an assertion failure, a non-debug kernel will just panic when it goes dereferencing a pointer in hyperspace. The coping mechanism involved would throw an IO error if an insane size is read off disk. The biggest question, of course, is how the corruption was introduced. THAT's why I want to see your coredump. If your corruption is close to ours - ours has a disk name of all things scribbled there - we share a common source of corruption. > I really want to recover the data on this pool if at all possible. I > can provide crash dumps if needed. Barring recovery, I would at least > like to understand what went wrong so I can avoid doing it again in the > future. 
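
[Illustrative sketch, not from the original message] The coping mechanism described above (return an I/O error when an insane block size comes off disk, instead of using it to select a buffer cache) could look roughly like the following. This is only an illustration under stated assumptions, not the actual illumos change: checked_buf_index() is a hypothetical helper, the constants assume the 512-byte minimum and 128K maximum block sizes of ZFS at the time, and the index is assumed to be computed as (size - 1) >> SPA_MINBLOCKSHIFT, which is the value the zdb assertion compares against SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: 512-byte minimum block, 128K maximum block. */
#define	SPA_MINBLOCKSHIFT	9
#define	SPA_MAXBLOCKSHIFT	17
#define	SPA_MAXBLOCKSIZE	(1ULL << SPA_MAXBLOCKSHIFT)

/*
 * Hypothetical helper: map a block size read from disk to a buffer
 * cache index. Returns 0 and stores the index in *idx when the size
 * is sane, or EIO when the index would fall outside the cache array.
 */
static int
checked_buf_index(uint64_t size, size_t *idx)
{
	/* Unsigned arithmetic: size == 0 wraps to a huge value and is rejected. */
	uint64_t c = (size - 1) >> SPA_MINBLOCKSHIFT;

	/* Same bound as the assertion, but reported as an error. */
	if (c >= (SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT))
		return (EIO);

	*idx = (size_t)c;
	return (0);
}
```

With these assumed values a 512-byte block maps to index 0 and a 128K block to index 255; a size of 0, or anything above 128K, is rejected with EIO rather than being used to index past the end of the cache array.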
If we can get a version of ZFS that can cope with corrupted blkptrs, that may help in recovery. I know *IN THIS PARTICULAR CODEPATH* how to cope, but I'm concerned it would expose other errors, and even read-only, I don't want to perform such experiments on a customer's data. :) It does seem, however, that our box is in the same state, so I will try it there. If I have success, I can share the modified "zfs" module. Dan From Kevin.Swab at ColoState.EDU Wed Apr 16 19:32:23 2014 From: Kevin.Swab at ColoState.EDU (Kevin Swab) Date: Wed, 16 Apr 2014 13:32:23 -0600 Subject: [OmniOS-discuss] kernel panic In-Reply-To: References: <53436DC6.6080208@ColoState.EDU> <534EB239.6000609@ColoState.EDU> Message-ID: <534EDAC7.5060009@ColoState.EDU> Hello Dan - Thanks for your help, I really appreciate it! Answers to your questions are inline below.... On 04/16/2014 11:44 AM, Dan McDonald wrote: > > On Apr 16, 2014, at 12:39 PM, Kevin Swab wrote: >> > > >> Traversing all blocks to verify checksums ... >> >> assertion failed for thread 0xfffffd7fff162a40, thread-id 1: c < >> SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT, file >> ../../../uts/common/fs/zfs/zio.c, line 226 >> Abort (core dumped) >> # >> >> # zpool import -F -f -o readonly=on -R /mnt data1 >> plankton console login: >> panic[cpu1]/thread=ffffff000ef07c40: BAD TRAP: type=e (#pf Page fault) >> rp=ffffff000ef07530 addr=278 occurred in module "unix" due to a NULL >> pointer dereference > > Interesting. > > We've seen one (just one) panic just like this in-house. In our case, some very strange corruption was written to disk, and ZFS couldn't cope. I have a request out to the ZFS community to improve the coping mechanisms. :) > > I've some dumb questions: > > 1.) Earlier in the thread, you mention these are SATA drives. When the panic occurred, were they attached via AHCI? Or to a controller of some sort? You mention you tried attaching these disks to an mpt_sas controller to try and recover them. 
Our machine was using plain SATA drives attached via AHCI. Yes, at the time of the initial panic, the drives were attached to motherboard SATA ports that are conigured to run in AHCI mode. At the current time, they are in a test machine at work attached via mpt_sas. > > 2.) Is the kernel coredump available? If this is what we were seeing, I'd VERY much like to see what your corruption actually looks like. Knowing might help us root-cause the corruption in the first place. I believe the original crash dump files are available on my home fileserver, I'll check tonight. I can reproduce the crash at will in my test system at work and have those crash dump files available now. Which would you like to see? > The corruption is of the blkptr_t, in particular its size, which ZFS now assumes is sane. zdb indicates this via an assertion failure, a non-debug kernel will just panic when it goes dereferencing a pointer in hyperspace. The coping mechanism involved would throw an IO error if an insane size is read off disk. > > The biggest question, of course, is how the corruption was introduced. THAT's why I want to see your coredump. If your corruption is close to ours - ours has a disk name of all things scribbled there - we share a common source of corruption. > >> I really want to recover the data on this pool if at all possible. I >> can provide crash dumps if needed. Barring recovery, I would at least >> like to understand what went wrong so I can avoid doing it again in the >> future. > > If we can get a version of ZFS that can cope with corrupted blkptrs, that may help in recovery. > > I know *IN THIS PARTICULAR CODEPATH* how to cope, but I'm concerned it would expose other errors, and even read-only, I don't want to perform such experiments on a customer's data. :) > I appreciate your caution, but without a fix of some kind, my data's gone anyway so I'm willing to experiment... > It does seem, however, that our box is in the same state, so I will try it there. 
If I have success, I can share the modified "zfs" module. > > Dan > Thanks! that would be great. Let me know what I can do to help... -- ------------------------------------------------------------------- Kevin Swab UNIX Systems Administrator ACNS Colorado State University Phone: (970)491-6572 Email: Kevin.Swab at ColoState.EDU GPG Fingerprint: 7026 3F66 A970 67BD 6F17 8EB8 8A7D 142F 2392 791C From danmcd at omniti.com Wed Apr 16 19:35:07 2014 From: danmcd at omniti.com (Dan McDonald) Date: Wed, 16 Apr 2014 15:35:07 -0400 Subject: [OmniOS-discuss] kernel panic In-Reply-To: <534EDAC7.5060009@ColoState.EDU> References: <53436DC6.6080208@ColoState.EDU> <534EB239.6000609@ColoState.EDU> <534EDAC7.5060009@ColoState.EDU> Message-ID: <800EF407-523E-4612-9C9D-AE4B158C68E8@omniti.com> Doesn't matter where the panic is from --> it's caused by a corrupt block on the disk. A vmdump.N would be nice. You're running 008, I see, so I can use an 008 box to examine the dump. Dan From Kevin.Swab at ColoState.EDU Wed Apr 16 21:04:21 2014 From: Kevin.Swab at ColoState.EDU (Kevin Swab) Date: Wed, 16 Apr 2014 15:04:21 -0600 Subject: [OmniOS-discuss] kernel panic In-Reply-To: <800EF407-523E-4612-9C9D-AE4B158C68E8@omniti.com> References: <53436DC6.6080208@ColoState.EDU> <534EB239.6000609@ColoState.EDU> <534EDAC7.5060009@ColoState.EDU> <800EF407-523E-4612-9C9D-AE4B158C68E8@omniti.com> Message-ID: <534EF055.3060904@ColoState.EDU> Thanks again Dan - sending "vmdump.2" in a separate message... On 04/16/2014 01:35 PM, Dan McDonald wrote: > Doesn't matter where the panic is from --> it's caused by a corrupt block on the disk. > > A vmdump.N would be nice. You're running 008, I see, so I can use an 008 box to examine the dump. 
> > Dan > -- ------------------------------------------------------------------- Kevin Swab UNIX Systems Administrator ACNS Colorado State University Phone: (970)491-6572 Email: Kevin.Swab at ColoState.EDU GPG Fingerprint: 7026 3F66 A970 67BD 6F17 8EB8 8A7D 142F 2392 791C From chip at innovates.com Thu Apr 17 15:40:02 2014 From: chip at innovates.com (Schweiss, Chip) Date: Thu, 17 Apr 2014 10:40:02 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <534DB2FC.9090404@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> Message-ID: You can get the Seagate firmwares from this link: https://apps1.seagate.com/downloads/request.html Seems they don't link to this on their site any more I found it in an old email from their site. -Chip On Tue, Apr 15, 2014 at 5:30 PM, Saso Kiselkov wrote: > Hi, > > I've hit this exact same issue on my recent SEAGATE ST2000NM0023 drives. > Can you please direct me to where I can get the firmware package? > Perhaps we could also post the link publicly, so that people can find it > through google or some such method. > > Thanks! > > Best wishes, > -- > Saso > > On 2/13/14, 11:18 AM, Thibault VINCENT wrote: > > On 02/12/2014 09:59 PM, Steamer wrote: > >> Did you ever find a solution to the overheating faults with the > >> ST4000NM0023? > >> > >> I'm currently having the exact same issue with ST1000NM0023 drives, > >> seems like seagate has the user temp probe set at 40'C. The manual > >> states that the temperature settings are programmable via smart, but I > >> haven't found a way to do that. > > > > Hello Emile, > > > > I've found a workaround but the definitive fix should be handled by > > Illumos I guess. There is no open ticket, first I was waiting for > > something to happen with #4051 before going back to using that distro > > and kernel. 
> > > > Here's the story: > > The SCSI specification defines two registers to store the temperature > > thresholds in SMART data. One contains the recommended maximum operation > > temperature for best MTBF, and the other register is for the absolute > > maximum rating. Usually the industry has always put the same value in > > both, and that is the absolute maximum. That's why we always see > > something like 60/65°C from SMART. But recently Seagate has changed that > > because it was asked by a large OS company to comply with the > > specification for better hardware monitoring integration. The change did > > not only occur in newer products but in a firmware update for existing > > disks and that was applied to the production line which explains some > > disks may or may not expose this problem although they are the same > > model. Our disks are of the Megalodon series and all share the same > > firmware basecode. > > > > So any Seagate disk will now trigger faults in FMA if they have a > > firmware with the newer policy. Also I think other brands will follow > > the same path. > > > > Like other members suggested in that thread, maybe nothing should change > > in FMA but let's face it, you can't maintain a temperature steadily > > under 40°C in a JBOD of hundreds of busy disks. Especially in > > eco-friendly datacenters. IMHO we should not trigger a fault on the > > lower threshold, and certainly not a drive retirement. It breaks storage > > servers on reboot or before a pool import, also spare disks could > > disappear with the retirement triggered. > > > > The workaround is to downgrade firmware to the last version before the > > change, and to reset the register with an SCSI command. It is not > > possible to set the register to a user specified value like the > > documentation suggests, they confirmed it. > > > > I'm sending a working firmware to you in a private mail.
I'm not aware > > of any issue working with that older version and hopefully it should > > upload to 1TB drives as well. > > I'm applying it like this but from Linux not OmniOS: > > # ./dl_sea_fw-0.2.3_32 -f Megalodon_StdOEM_SAS_0002+C84C.lod -m > ST4000NM0023 > > # ./dl_sea_fw-0.2.3_32 -i > > > > Then you should reset the drives so they reload the firmware. > > Here's our example for 4TB drives: > > ------------- > > for i in $(lsscsi | grep 'ST4000NM0023' | awk '{print $6}') ; do > > sg_reset -d $i > > done > > ------------- > > > > And reset the register that contains value from the previous firmware. > > It doesn't work well so we've got this script to run a few times until > > all disks got it. Again it matches 4TB Megalodon. > > ------------- > > for i in $(lsscsi | grep 'ST4000NM0023' | awk '{print $6}') ; do > > echo -n "$i " > > if sg_logs $i --page=0x0d | grep 'Reference temperature = 68 C' > >> /dev/null ; then > > echo 'ok' > > else > > sg_logs $i --page=0x0d --reset > > echo 'reset' > > fi > > done > > ------------- > > > > > > Cheers > > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skiselkov.ml at gmail.com Thu Apr 17 16:15:23 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Thu, 17 Apr 2014 18:15:23 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> Message-ID: <534FFE1B.2060201@gmail.com> On 4/17/14, 5:40 PM, Schweiss, Chip wrote: > You can get the Seagate firmwares from this link: > > https://apps1.seagate.com/downloads/request.html > > Seems they don't link to this on their site any more I found it in an > old email from their site. 
I found the same form, but the damn thing can't find my drive by S/N (Z1Y18H7V0000C4196NRF). Cheers, -- Saso From chip at innovates.com Thu Apr 17 16:27:05 2014 From: chip at innovates.com (Schweiss, Chip) Date: Thu, 17 Apr 2014 11:27:05 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <534FFE1B.2060201@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> Message-ID: On Thu, Apr 17, 2014 at 11:15 AM, Saso Kiselkov wrote: > > I found the same form, but the damn thing can't find my drive by S/N > (Z1Y18H7V0000C4196NRF). > > Use the short form of the S/N: Z1Y18H7V -Chip > Cheers, > -- > Saso > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mweiss at cimlbr.com Thu Apr 17 16:50:26 2014 From: mweiss at cimlbr.com (Matt Weiss) Date: Thu, 17 Apr 2014 11:50:26 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> Message-ID: <53500652.5020803@cimlbr.com> https://apps1.seagate.com/downloads/certificate.html?action=performDownload&key=393947625083 Don't know how long the link will work On 4/17/2014 11:27 AM, Schweiss, Chip wrote: > > > > On Thu, Apr 17, 2014 at 11:15 AM, Saso Kiselkov > > wrote: > > > I found the same form, but the damn thing can't find my drive by S/N > (Z1Y18H7V0000C4196NRF). > > > Use the short form of the S/N: Z1Y18H7V > > -Chip > > Cheers, > -- > Saso > > > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss -------------- next part -------------- An HTML attachment was scrubbed... 
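Archive note on the serial-number lookup discussed above: the one example in this thread (Z1Y18H7V0000C4196NRF, accepted by the download form as Z1Y18H7V) suggests the short form is simply the first eight characters of the long serial. A minimal sketch under that assumption follows; `short_sn` is a made-up helper name, and the eight-character rule is inferred from this single example, not from any Seagate documentation.

```shell
#!/bin/sh
# short_sn: hypothetical helper, not part of any Seagate or OmniOS tool.
# ASSUMPTION: the "short" serial is the first 8 characters of the long
# serial that FMA prints, as in the single example from this thread.
short_sn() {
    printf '%s\n' "$1" | cut -c1-8
}

# Example using the serial quoted in the thread.
short_sn Z1Y18H7V0000C4196NRF
```

If a drive family uses a different serial layout, the cut range would need adjusting; check the result against the label on the drive before relying on it.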
URL: From cscoman at gmail.com Thu Apr 17 17:37:17 2014 From: cscoman at gmail.com (Jason Cox) Date: Thu, 17 Apr 2014 10:37:17 -0700 Subject: [OmniOS-discuss] Granular control of fma modules In-Reply-To: <989404F2-E09D-4681-9CD4-9614CD795518@RichardElling.com> References: <989404F2-E09D-4681-9CD4-9614CD795518@RichardElling.com> Message-ID: So I am running into this for a server I built for production. I unloaded the module, but I am guessing that this means I will not be able to tell when a drive is failed now outside of the normal way you can tell when a drive is having issues. Thinking long term here, what other options do I have? Just look at running smartmontools to monitor the drive I guess? Also is Seagate going to provide a way to update the firmware once it is available or do we have to try and RMA the drives or just swap them as they fail... I love how the spec says they are good from 5-60c, but the firmware says 40c as the threshold temp. Thanks On Sat, Feb 22, 2014 at 9:50 PM, Richard Elling < richard.elling at richardelling.com> wrote: > On Feb 18, 2014, at 4:17 PM, Anh Quach wrote: > > > Is it possible to tell the disk-transport FMA module to ignore > over-temperature on only a certain set of disks? > > In Solaris 11, yes this is possible. However, the open source community > has not implemented it > yet, AFAIK. > > > > > I'm doing testing with some Seagate Constellation.3's that seem to run > hotter even at idle than the rest of my disks (39-44 C) and they are > continually getting flagged for over temp. I know I can disable the temp > alert for that module but I don't want to disable it for all disks, just > these new Seagates. > > You can unload disk-transport altogether as a workaround. The root cause > is a bug in > the Seagate firmware introduced in version 3 of their firmware. A fix is > in the works > for version 4, available RSN. > --
richard > > -- > > ZFS storage and performance consulting at http://www.RichardElling.com > > > > > > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss > -- Jason Cox -------------- next part -------------- An HTML attachment was scrubbed... URL: From skiselkov.ml at gmail.com Thu Apr 17 17:49:20 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Thu, 17 Apr 2014 19:49:20 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> Message-ID: <53501420.9090307@gmail.com> On 4/17/14, 6:27 PM, Schweiss, Chip wrote: > > Use the short form of the S/N: Z1Y18H7V Ok, thanks, didn't know there two forms... (FMA only prints one). -- Saso From cscoman at gmail.com Thu Apr 17 17:58:03 2014 From: cscoman at gmail.com (Jason Cox) Date: Thu, 17 Apr 2014 10:58:03 -0700 Subject: [OmniOS-discuss] Granular control of fma modules In-Reply-To: References: <989404F2-E09D-4681-9CD4-9614CD795518@RichardElling.com> Message-ID: Sorry, I guess I jumped on the sent button a little to soon. I found another thread that mentions the firmware update for the drives and how to get it. I guess I will be updating my drives and re-enabling the module. On Thu, Apr 17, 2014 at 10:37 AM, Jason Cox wrote: > So I am running into this for a server I built for production. I unloaded > the module, but I am guessing that this means I will not be able to tell > when a drive is failed now outside of the normal way you can tell when a > drive is having issues. Thinking long term here, what other options do I > have? Just look at running smartmontool to monitor the drive I guess? 
> > Also is Seagate going to provide a way to update the firmware once it is > available or do we have to try and RMA the drives or just swap them as they > fail... I love how the spec says they are good from 5-60c, but the > firmware says 40c as the threshold temp. > > Thanks > > > On Sat, Feb 22, 2014 at 9:50 PM, Richard Elling < > richard.elling at richardelling.com> wrote: > >> On Feb 18, 2014, at 4:17 PM, Anh Quach wrote: >> >> > Is it possible to tell the disk-transport FMA module to ignore >> over-temperature on only a certain set of disks? >> >> In Solaris 11, yes this is possible. However, the open source community >> has not implemented it >> yet, AFAIK. >> >> > >> > I'm doing testing with some Seagate Constellation.3's that seem to run >> hotter even at idle than the rest of my disks (39-44 C) and they are >> continually getting flagged for over temp. I know I can disable to the temp >> alert for that module but I don't want to disable it for all disks, just >> these new Seagates. >> >> You can unload disk-transport altogether as a workaround. The root cause >> is a bug in >> the Seagate firmware introduced in version 3 of their firmware. A fix is >> in the works >> for version 4, available RSN. >> -- richard >> >> -- >> >> ZFS storage and performance consulting at http://www.RichardElling.com >> >> >> >> >> >> >> _______________________________________________ >> OmniOS-discuss mailing list >> OmniOS-discuss at lists.omniti.com >> http://lists.omniti.com/mailman/listinfo/omnios-discuss >> > > > > -- > Jason Cox > -- Jason Cox -------------- next part -------------- An HTML attachment was scrubbed...
URL: From chip at innovates.com Fri Apr 18 19:23:27 2014 From: chip at innovates.com (Schweiss, Chip) Date: Fri, 18 Apr 2014 14:23:27 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <53501420.9090307@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> Message-ID: I've flashed 0004 to some of my Constellations so far. The drives are now set at a reference temperature of 60C which is much better than 40C. I had to disable multipathing to get these disks to flash. I'm not sure if this is an issue with the drive or the Supermicro JBOD. I disabled multipathing and I'm getting them to flash. -Chip On Thu, Apr 17, 2014 at 12:49 PM, Saso Kiselkov wrote: > On 4/17/14, 6:27 PM, Schweiss, Chip wrote: > > > > Use the short form of the S/N: Z1Y18H7V > > Ok, thanks, didn't know there were two forms... (FMA only prints one). > > -- > Saso > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skiselkov.ml at gmail.com Fri Apr 18 20:35:09 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Fri, 18 Apr 2014 22:35:09 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> Message-ID: <53518C7D.7060703@gmail.com> On 4/18/14, 9:23 PM, Schweiss, Chip wrote: > I've flashed 0004 to some of my Constellations so far. The drives are > now set at a reference temperature of 60C which is much better than 40C. > > I had to disable multipathing to get these disks to flash. I'm not > sure if this is an issue with the drive or the Supermicro JBOD. > > I disabled multipathing and I'm getting them to flash.
I'm still trying to figure out how to flash them, as the flashing tools only seem to be available for Linux :( Guess I'm gonna have to ask the customer to take the machine offline for a while. Cheers, -- Saso From chip at innovates.com Fri Apr 18 20:49:48 2014 From: chip at innovates.com (Schweiss, Chip) Date: Fri, 18 Apr 2014 15:49:48 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <53518C7D.7060703@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> Message-ID: I used Santools, which is a licensed product. From what I understand lsiutil and sg_write_buffer from sg3-utils can do it too. The mode for sg_write_buffer may need to be set to 7 instead of 5 as stated in the firmware docs. -Chip On Fri, Apr 18, 2014 at 3:35 PM, Saso Kiselkov wrote: > On 4/18/14, 9:23 PM, Schweiss, Chip wrote: > > I've flashed 0004 to some of my Constellations so far. The drives are > > now set at a reference temperature of 60C which is much better than 40C. > > > > I had to disable multipathing to get these disks to flash. I'm not > > sure if this is an issue with the drive or the Supermicro JBOD. > > > > I disabled multipathing and I'm getting them to flash. > > I'm still trying to figure out how to flash them, as the flashing tools > only seem to be available for Linux :( > > Guess I'm gonna have to ask the customer to take the machine offline for > a while. > > Cheers, > -- > Saso > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From skiselkov.ml at gmail.com Fri Apr 18 21:23:16 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Fri, 18 Apr 2014 23:23:16 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> Message-ID: <535197C4.9010507@gmail.com> On 4/18/14, 10:49 PM, Schweiss, Chip wrote: > I used Santools, which is a licensed product. > > From what I understand lsiutil and sg_buffer_write from sg3-utils can do > it too. The mode for sg_buffer_write may need to be set to 7 instead of > 5 as stated in the firmware docs. Hey cool, didn't know sg3_utils was compilable on non-Linux systems. Will try it out, thanks! -- Saso From matthias-omn-discuss at mteege.de Sun Apr 20 11:47:25 2014 From: matthias-omn-discuss at mteege.de (Matthias Teege) Date: Sun, 20 Apr 2014 13:47:25 +0200 Subject: [OmniOS-discuss] No updates available for zone but an old openssl? In-Reply-To: References: <1849611f-16fe-420d-8135-c2a544d6d836@mteege.de> Message-ID: <6d498b05-34ac-470e-8ed0-72b799f99cff@mteege.de> On Wed, Apr 16, 2014 at 12:47:51PM +0200, qutic development wrote: Hi, > > How do I update the zone? > > http://omnios.omniti.com/wiki.php/GeneralAdministration#UpgradingWithNon-GlobalZones the Omnios publisher was missing. After root at tst:~# pkg -R /export/t1/root/ set-publisher -g http://pkg.omniti.com/omnios/release/ omnios root at tst:~# pkg -R /export/t1/root publisher PUBLISHER TYPE STATUS URI cs.umd.edu origin online http://pkg.cs.umd.edu/ omnios origin online http://pkg.omniti.com/omnios/release/ root at tst:~# pkg -R /export/t1/root update it works. 
Thanks Matthias From chip at innovates.com Mon Apr 21 13:12:56 2014 From: chip at innovates.com (Schweiss, Chip) Date: Mon, 21 Apr 2014 08:12:56 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <535197C4.9010507@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <535197C4.9010507@gmail.com> Message-ID: I have 20 disks that went offline because they reached 40C before I applied the firmware update. I tried marking them as repaired in fmadm, but that didn't make any difference. Does anyone know the trick to bring these back online to OmniOS? -Chip On Fri, Apr 18, 2014 at 4:23 PM, Saso Kiselkov wrote: > On 4/18/14, 10:49 PM, Schweiss, Chip wrote: > > I used Santools, which is a licensed product. > > > > From what I understand lsiutil and sg_buffer_write from sg3-utils can do > > it too. The mode for sg_buffer_write may need to be set to 7 instead of > > 5 as stated in the firmware docs. > > Hey cool, didn't know sg3_utils was compilable on non-Linux systems. > Will try it out, thanks! > > -- > Saso > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chip at innovates.com Mon Apr 21 16:19:44 2014 From: chip at innovates.com (Schweiss, Chip) Date: Mon, 21 Apr 2014 11:19:44 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <535197C4.9010507@gmail.com> Message-ID: I suspecting these drives have self-destructed. Can anyone confirm this firmware issue causes the drives to permanently go offline? 
-Chip On Mon, Apr 21, 2014 at 8:12 AM, Schweiss, Chip wrote: > I have 20 disks that went offline because they reached 40C before I > applied the firmware update. > > I tried marking them as repaired in fmadm, but that didn't make any > difference. > > Does anyone know the trick to bring these back online to OmniOS? > > -Chip > > > On Fri, Apr 18, 2014 at 4:23 PM, Saso Kiselkov wrote: > >> On 4/18/14, 10:49 PM, Schweiss, Chip wrote: >> > I used Santools, which is a licensed product. >> > >> > From what I understand lsiutil and sg_buffer_write from sg3-utils can do >> > it too. The mode for sg_buffer_write may need to be set to 7 instead of >> > 5 as stated in the firmware docs. >> >> Hey cool, didn't know sg3_utils was compilable on non-Linux systems. >> Will try it out, thanks! >> >> -- >> Saso >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jimklimov at cos.ru Mon Apr 21 16:28:55 2014 From: jimklimov at cos.ru (Jim Klimov) Date: Mon, 21 Apr 2014 18:28:55 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <535197C4.9010507@gmail.com> Message-ID: <53554747.1070204@cos.ru> Is it possible to boot into another OS (perhaps the LiveCD) and/or connect these disks to another host, just to check if they respond to read requests? Essentially this would help confirm or reject the theory of self-destruction. HTH, //Jim On 2014-04-21 18:19, Schweiss, Chip wrote: > I suspecting these drives have self-destructed. > > Can anyone confirm this firmware issue causes the drives to permanently > go offline? > > -Chip > > > On Mon, Apr 21, 2014 at 8:12 AM, Schweiss, Chip > wrote: > > I have 20 disks that went offline because they reached 40C before I > applied the firmware update. 
> > I tried marking them as repaired in fmadm, but that didn't make any > difference. > > Does anyone know the trick to bring these back online to OmniOS? From skiselkov.ml at gmail.com Tue Apr 22 09:36:03 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Tue, 22 Apr 2014 11:36:03 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> Message-ID: <53563803.5030905@gmail.com> On 4/18/14, 10:49 PM, Schweiss, Chip wrote: > I used Santools, which is a licensed product. > > From what I understand lsiutil and sg_buffer_write from sg3-utils can do > it too. The mode for sg_buffer_write may need to be set to 7 instead of > 5 as stated in the firmware docs. > Sadly, I had no luck with either lsiutil or sg_write_buffer from sg3-utils. lsiutil is only for older MPT HBAs (I have an MPT 2.0 one) and sg_write_buffer fails with the following error: # sg_write_buffer -v --in=MegalodonES3-SAS-STD-0004.LOD --length=1625600 --mode=5 /dev/rdsk/c9t5000C500578F774Bd0 Write buffer cmd: 3b 05 00 00 00 00 18 ce 00 00 ioctl(USCSICMD) failed with os_err (errno) = 22 write buffer: pass through os error: Invalid argument Write buffer failed res=-1 I also tried the following device names: /dev/rdsk/c9t5000C500578F774Bd0p0 /dev/dsk/c9t5000C500578F774Bd0 /dev/dsk/c9t5000C500578F774Bd0p0 The OS also printed the following error: WARNING: mpt_sas: coding error detected, the driver is using ddi_dma_attr(9S) incorrectly. There is a small risk of data corruption in particular with large I/Os. The driver should be replaced with a corrected version for proper system operation. To disable this warning, add 'set rootnex:rootnex_bind_warn=0' to /etc/system(4). 
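Archive note on the verbose sg_write_buffer output quoted above: the ten bytes after "Write buffer cmd:" are the SCSI WRITE BUFFER command block, and decoding them shows that the whole firmware image was sent in a single transfer. The sketch below assumes the usual 10-byte WRITE BUFFER layout (opcode, mode, buffer ID, 3-byte buffer offset, 3-byte parameter list length, control); `decode_write_buffer_cdb` is a hypothetical helper written for this note, not part of sg3_utils.

```shell
#!/bin/sh
# decode_write_buffer_cdb: hypothetical helper for reading the CDB that
# "sg_write_buffer -v" prints. ASSUMED layout (10-byte WRITE BUFFER):
#   byte 0    opcode (0x3b)
#   byte 1    mode in the low 5 bits
#   byte 2    buffer ID
#   bytes 3-5 buffer offset (big-endian)
#   bytes 6-8 parameter list length (big-endian)
#   byte 9    control
decode_write_buffer_cdb() {
    # Expect the ten CDB bytes as two-digit hex strings.
    [ "$#" -eq 10 ] || { echo "need 10 CDB bytes" >&2; return 1; }
    wb_mode=$(( 0x$2 & 0x1f ))
    wb_bufid=$(( 0x$3 ))
    wb_offset=$(( (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
    wb_length=$(( (0x$7 << 16) | (0x$8 << 8) | 0x$9 ))
    printf 'opcode=0x%s mode=%d buffer_id=%d offset=%d length=%d\n' \
        "$1" "$wb_mode" "$wb_bufid" "$wb_offset" "$wb_length"
}

# The CDB from the failed attempt quoted in the thread.
decode_write_buffer_cdb 3b 05 00 00 00 00 18 ce 00 00
```

The length field 18 ce 00 works out to 1625600 bytes, the same value passed as --length, so the command asked for the full image in one request with mode 5. That is consistent with the single large transfer the errors in this message point at, though it does not by itself say which layer rejected it.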
Staring at the code near usr/src/uts/i86pc/io/rootnex.c:3305, this means that the driver can't submit a DMA job this large, which means that I can't really fix this at all (this is really way outside of my field). Any ideas on what to do next? Cheers, -- Saso From mir at miras.org Tue Apr 22 09:53:41 2014 From: mir at miras.org (Michael Rasmussen) Date: Tue, 22 Apr 2014 11:53:41 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <53563803.5030905@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> Message-ID: <20140422115341.4c0a26a9@sleipner.datanom.net> On Tue, 22 Apr 2014 11:36:03 +0200 Saso Kiselkov wrote: > > Any ideas on what to do next? > Could you boot the system from a live linux distro and run the tools from it? Maybe support for linux is better. I can recommend systemrescuecd (based on gentoo). -- Hilsen/Regards Michael Rasmussen Get my public GnuPG keys: michael rasmussen cc http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E mir datanom net http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C mir miras org http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917 -------------------------------------------------------------- /usr/games/fortune -es says: All bad precedents began as justifiable measures. -- Gaius Julius Caesar, quoted in "The Conspiracy of Catiline", by Sallust -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: not available URL: From skiselkov.ml at gmail.com Tue Apr 22 09:58:35 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Tue, 22 Apr 2014 11:58:35 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <20140422115341.4c0a26a9@sleipner.datanom.net> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <20140422115341.4c0a26a9@sleipner.datanom.net> Message-ID: <53563D4B.7020105@gmail.com> On 4/22/14, 11:53 AM, Michael Rasmussen wrote: > On Tue, 22 Apr 2014 11:36:03 +0200 > Saso Kiselkov wrote: > >> >> Any ideas on what to do next? >> > Could you boot the system from a live linux distro and run the tools > from it? Maybe support for linux is better. > > I can recommend systemrescuecd (based on gentoo). I can't, the system is in production. -- Saso From chip at innovates.com Tue Apr 22 15:03:38 2014 From: chip at innovates.com (Schweiss, Chip) Date: Tue, 22 Apr 2014 10:03:38 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <53563803.5030905@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> Message-ID: Are you sure you have SAS multipath disabled on the disk you are trying to flash? I couldn't get these to flash at all with MP enabled. I too kept getting OS related errors. For one system I did an stmsboot -d, for another I just pulled one of the SAS cables to each JBOD. -Chip On Tue, Apr 22, 2014 at 4:36 AM, Saso Kiselkov wrote: > On 4/18/14, 10:49 PM, Schweiss, Chip wrote: > > I used Santools, which is a licensed product. 
> > > > From what I understand lsiutil and sg_buffer_write from sg3-utils can do > > it too. The mode for sg_buffer_write may need to be set to 7 instead of > > 5 as stated in the firmware docs. > > > > Sadly, I had no luck with either lsiutil or sg_write_buffer from > sg3-utils. lsiutil is only for older MPT HBAs (I have an MPT 2.0 one) > and sg_write_buffer fails with the following error: > > # sg_write_buffer -v --in=MegalodonES3-SAS-STD-0004.LOD --length=1625600 > --mode=5 /dev/rdsk/c9t5000C500578F774Bd0 > Write buffer cmd: 3b 05 00 00 00 00 18 ce 00 00 > ioctl(USCSICMD) failed with os_err (errno) = 22 > write buffer: pass through os error: Invalid argument > Write buffer failed res=-1 > > I also tried the following device names: > /dev/rdsk/c9t5000C500578F774Bd0p0 > /dev/dsk/c9t5000C500578F774Bd0 > /dev/dsk/c9t5000C500578F774Bd0p0 > > The OS also printed the following error: > > WARNING: mpt_sas: coding error detected, the driver is using > ddi_dma_attr(9S) incorrectly. There is a small risk of data corruption > in particular with large I/Os. The driver should be replaced with a > corrected version for proper system operation. To disable this warning, > add 'set rootnex:rootnex_bind_warn=0' to /etc/system(4). > > Staring at the code near usr/src/uts/i86pc/io/rootnex.c:3305, this means > that the driver can't submit a DMA job this large, which means that I > can't really fix this at all (this is really way outside of my field). > > Any ideas on what to do next? > > Cheers, > -- > Saso > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From richard.elling at richardelling.com Tue Apr 22 16:15:32 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Tue, 22 Apr 2014 09:15:32 -0700 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <535197C4.9010507@gmail.com> Message-ID: <0DFDE814-0EC4-4ABB-9752-6C5F910F7F8B@RichardElling.com> On Apr 21, 2014, at 9:19 AM, Schweiss, Chip wrote: > I suspecting these drives have self-destructed. > > Can anyone confirm this firmware issue causes the drives to permanently go offline? They are fine. FMA retires them, so you have to coerce the OS to reinstantiate them. In my case, they were in the lab, and we reinstall OSes continuously, so it wasn't a problem for us :-) You might have a look at cfgadm -al, and see if it is in a state that can be coerced... the docs are poor in this area :-( and this is not a frequent operation :-) -- richard -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chip at innovates.com Tue Apr 22 16:36:24 2014 From: chip at innovates.com (Schweiss, Chip) Date: Tue, 22 Apr 2014 11:36:24 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <0DFDE814-0EC4-4ABB-9752-6C5F910F7F8B@RichardElling.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <535197C4.9010507@gmail.com> <0DFDE814-0EC4-4ABB-9752-6C5F910F7F8B@RichardElling.com> Message-ID: On Tue, Apr 22, 2014 at 11:15 AM, Richard Elling < richard.elling at richardelling.com> wrote: > > On Apr 21, 2014, at 9:19 AM, Schweiss, Chip wrote: > > I suspecting these drives have self-destructed. > > Can anyone confirm this firmware issue causes the drives to permanently go > offline? > > > They are fine. FMA retires them, so you have to coerce the OS to > reinstantiate them. > In my case, they were in the lab, and we reinstall OSes continuously, so > it wasn't a > problem for us :-) You might have a look at cfgadm -al, and see if it is > in a state that > can be coerced... the docs are poor in this area :-( and this is not a > frequent operation :-) > -- richard > After running devfsadm -C the device stubs aren't there anymore. They don't show up in 'cfgadm -al'. I can see them from the HBA BIOS. So I'm still leaning towards the disks being okay, but OmniOS refuses to talk to them. So once 'retired', even marking the device repaired will not allow it to be mounted? -Chip From hakansom at ohsu.edu Tue Apr 22 17:15:48 2014 From: hakansom at ohsu.edu (Marion Hakanson) Date: Tue, 22 Apr 2014 10:15:48 -0700 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: Message from Saso Kiselkov of "Tue, 22 Apr 2014 11:36:03 +0200."
<53563803.5030905@gmail.com> Message-ID: <201404221715.s3MHFm4B000071@kyklops.ohsu.edu> skiselkov.ml at gmail.com said: > Sadly, I had no luck with either lsiutil or sg_write_buffer from sg3-utils. > lsiutil is only for older MPT HBAs (I have an MPT 2.0 one) and > sg_write_buffer fails with the following error: > . . . > Staring at the code near usr/src/uts/i86pc/io/rootnex.c:3305, this means that > the driver can't submit a DMA job this large, which means that I can't really > fix this at all (this is really way outside of my field). > > Any ideas on what to do next? Have any of you tried the "fwflash" utility (comes with OmniOS, oi151a7, etc.)? When I do "fwflash -l" it does list out the Seagate 2TB and 4TB drives on a couple of our systems here (multipath enabled). I don't have any drives needing firmware updates, so haven't tested out that functionality yet. Regards, Marion From skiselkov.ml at gmail.com Tue Apr 22 17:58:58 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Tue, 22 Apr 2014 19:58:58 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> Message-ID: <5356ADE2.5070605@gmail.com> On 4/22/14, 5:03 PM, Schweiss, Chip wrote: > Are you sure you have SAS multipath disabled on the disk you are trying > to flash? > > I couldn't get these to flash at all with MP enabled. I too kept > getting OS related errors. > > For one system I did an stmsboot -d, for another I just pulled one of > the SAS cables to each JBOD. Oh, you're right, hadn't considered that. I'll have to try this out, even though it means downtime. 
Cheers, -- Saso From richard.elling at richardelling.com Tue Apr 22 20:08:57 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Tue, 22 Apr 2014 13:08:57 -0700 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <5356ADE2.5070605@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <5356ADE2.5070605@gmail.com> Message-ID: On Apr 22, 2014, at 10:58 AM, Saso Kiselkov wrote: > On 4/22/14, 5:03 PM, Schweiss, Chip wrote: >> Are you sure you have SAS multipath disabled on the disk you are trying >> to flash? >> >> I couldn't get these to flash at all with MP enabled. I too kept >> getting OS related errors. >> >> For one system I did an stmsboot -d, for another I just pulled one of >> the SAS cables to each JBOD. > > Oh, you're right, hadn't considered that. I'll have to try this out, > even though it means downtime. mpathadm(1m) allows you to enable/disable paths on the fly, without pulling cables. -- richard -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chip at innovates.com Tue Apr 22 20:10:35 2014 From: chip at innovates.com (Schweiss, Chip) Date: Tue, 22 Apr 2014 15:10:35 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <5356ADE2.5070605@gmail.com> Message-ID: mpathadm also panics the kernel on OmniOS if there are any offline disks. Proceed with caution. 
On Tue, Apr 22, 2014 at 3:08 PM, Richard Elling < richard.elling at richardelling.com> wrote: > On Apr 22, 2014, at 10:58 AM, Saso Kiselkov > wrote: > > On 4/22/14, 5:03 PM, Schweiss, Chip wrote: > > Are you sure you have SAS multipath disabled on the disk you are trying > to flash? > > I couldn't get these to flash at all with MP enabled. I too kept > getting OS related errors. > > For one system I did an stmsboot -d, for another I just pulled one of > the SAS cables to each JBOD. > > > Oh, you're right, hadn't considered that. I'll have to try this out, > even though it means downtime. > > > mpathadm(1m) allows you to enable/disable paths on the fly, without > pulling cables. > -- richard > > -- > > Richard.Elling at RichardElling.com > +1-760-896-4422 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From skiselkov.ml at gmail.com Tue Apr 22 20:17:15 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Tue, 22 Apr 2014 22:17:15 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <5356ADE2.5070605@gmail.com> Message-ID: <5356CE4B.50606@gmail.com> On 4/22/14, 10:08 PM, Richard Elling wrote: > On Apr 22, 2014, at 10:58 AM, Saso Kiselkov > wrote: > >> On 4/22/14, 5:03 PM, Schweiss, Chip wrote: >>> Are you sure you have SAS multipath disabled on the disk you are trying >>> to flash? >>> >>> I couldn't get these to flash at all with MP enabled. I too kept >>> getting OS related errors. >>> >>> For one system I did an stmsboot -d, for another I just pulled one of >>> the SAS cables to each JBOD. >> >> Oh, you're right, hadn't considered that. I'll have to try this out, >> even though it means downtime. 
> > mpathadm(1m) allows you to enable/disable paths on the fly, without > pulling cables. I know, but if I understand it correctly, I need to not only disable a particular path, I need to disable mpath support entirely to get sg_write_buffer to talk to mpt_sas directly, instead of going through the scsi_vhci glob in the middle (which, presumably, is what's causing this problem). If I'm misunderstanding this, please do set me straight. Cheers, -- Saso From chip at innovates.com Tue Apr 22 20:31:59 2014 From: chip at innovates.com (Schweiss, Chip) Date: Tue, 22 Apr 2014 15:31:59 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <5356CE4B.50606@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <5356ADE2.5070605@gmail.com> <5356CE4B.50606@gmail.com> Message-ID: On Tue, Apr 22, 2014 at 3:17 PM, Saso Kiselkov wrote: > > I know, but if I understand it correctly, I need to not only disable a > particular path, I need to disable mpath support entirely to get > sg_write_buffer to talk to mpt_sas directly, instead of going through > the scsi_vhci glob in the middle (which, presumably, is what's causing > this problem). If I'm misunderstanding this, please do set me straight. > > Cheers, > -- > Saso > Actually no. Disabling a physical path works too. That is how I stumbled upon the MP issue. I plugged one of my paths into a second server to attempt using Linux to flash the firmware. When the flash started working from the primary server, I never loaded Linux in the second server. I think the problem is actually in the disk accepting firmware via multipath not so much the OS. The OS throws the error when a message down a second path gets rejected by the drive. -Chip -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From skiselkov.ml at gmail.com Tue Apr 22 21:02:39 2014 From: skiselkov.ml at gmail.com (Saso Kiselkov) Date: Tue, 22 Apr 2014 23:02:39 +0200 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <5356ADE2.5070605@gmail.com> <5356CE4B.50606@gmail.com> Message-ID: <5356D8EF.7020604@gmail.com> On 4/22/14, 10:31 PM, Schweiss, Chip wrote: > > On Tue, Apr 22, 2014 at 3:17 PM, Saso Kiselkov > wrote: > > > I know, but if I understand it correctly, I need to not only disable a > particular path, I need to disable mpath support entirely to get > sg_write_buffer to talk to mpt_sas directly, instead of going through > the scsi_vhci glob in the middle (which, presumably, is what's causing > this problem). If I'm misunderstanding this, please do set me straight. > > Cheers, > -- > Saso > > > Actually no. Disabling a physical path works too. That is how I > stumbled upon the MP issue. I plugged one of my paths into a second > server to attempt using Linux to flash the firmware. When the flash > started working from the primary server, I never loaded Linux in the > second server. > > I think the problem is actually in the disk accepting firmware via > multipath not so much the OS. The OS throws the error when a message > down a second path gets rejected by the drive. 
Still no luck, though it's possible I'm doing it wrong: # mpathadm disable path -l /dev/rdsk/c9t5000C500578F774Bd0s2 \ -i w5b8ca3a0e5029c00 -t w5000c500578f774a # mpathadm show lu /dev/rdsk/c9t5000C500578F774Bd0s2 Logical Unit: /dev/rdsk/c9t5000C500578F774Bd0s2 mpath-support: libmpscsi_vhci.so Vendor: SEAGATE Product: ST2000NM0023 Revision: 0003 Name Type: unknown type Name: 5000c500578f774b Asymmetric: no Current Load Balance: round-robin Logical Unit Group ID: NA Auto Failback: on Auto Probing: NA Paths: Initiator Port Name: w5b8ca3a0e5029c00 Target Port Name: w5000c500578f774a Override Path: NA Path State: OK Disabled: yes Initiator Port Name: w5b8ca3a0e5029c00 Target Port Name: w5000c500578f7749 Override Path: NA Path State: OK Disabled: no Target Ports: Name: w5000c500578f774a Relative ID: 0 Name: w5000c500578f7749 Relative ID: 0 # sg_write_buffer -v --in=MegalodonES3-SAS-STD-0004.LOD \ --length=1625600 --mode=5 /dev/rdsk/c9t5000C500578F774Bd0 Write buffer cmd: 3b 05 00 00 00 00 18 ce 00 00 ioctl(USCSICMD) failed with os_err (errno) = 22 write buffer: pass through os error: Invalid argument Write buffer failed res=-1 The situation is the same regardless of which path I disable. At the point of the sg_write_buffer, I also get a single SCSI error logged by "iostat -E", so it's clear there's something wrong going on on the SCSI bus. I suspect it might have something to do with what you mentioned, but I'm just no SCSI guru to figure this out. 
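A side note on the -v output above: the "Write buffer cmd" bytes are the WRITE BUFFER(10) CDB as it was handed to the pass-through layer, and decoding them at least confirms what was requested. The following is a minimal sketch, assuming the standard SPC layout (byte 1 = mode, byte 2 = buffer ID, bytes 3-5 = buffer offset, bytes 6-8 = parameter list length); the variable names are purely illustrative and are not part of sg3_utils.

```shell
#!/bin/sh
# Decode the WRITE BUFFER(10) CDB printed by "sg_write_buffer -v".
# Assumed layout (SPC): opcode, mode, buffer ID, 3-byte offset,
# 3-byte parameter list length, control byte.
cdb="3b 05 00 00 00 00 18 ce 00 00"

# Split the hex bytes into positional parameters $1..$10.
set -- $cdb

opcode=$1
mode=$((0x$2 & 0x1f))             # the low 5 bits carry the mode field
buffer_id=$((0x$3))
offset=$(printf '%d' "0x$4$5$6")  # big-endian 24-bit buffer offset
length=$(printf '%d' "0x$7$8$9")  # big-endian 24-bit parameter list length

echo "opcode=0x$opcode mode=$mode buffer_id=$buffer_id offset=$offset length=$length"
```

The decoded length is the same value that was passed via --length, so the whole firmware image is being submitted as one single transfer in mode 5. That does not by itself explain the EINVAL from the USCSICMD ioctl; it only shows that the request was built as asked.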
Cheers, -- Saso From richard.elling at richardelling.com Tue Apr 22 22:42:51 2014 From: richard.elling at richardelling.com (Richard Elling) Date: Tue, 22 Apr 2014 15:42:51 -0700 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <5356D8EF.7020604@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <5356ADE2.5070605@gmail.com> <5356CE4B.50606@gmail.com> <5356D8EF.7020604@gmail.com> Message-ID: going out on a limb... On Apr 22, 2014, at 2:02 PM, Saso Kiselkov wrote: > On 4/22/14, 10:31 PM, Schweiss, Chip wrote: >> >> On Tue, Apr 22, 2014 at 3:17 PM, Saso Kiselkov > > wrote: >> >> >> I know, but if I understand it correctly, I need to not only disable a >> particular path, I need to disable mpath support entirely to get >> sg_write_buffer to talk to mpt_sas directly, instead of going through >> the scsi_vhci glob in the middle (which, presumably, is what's causing >> this problem). If I'm misunderstanding this, please do set me straight. >> >> Cheers, >> -- >> Saso >> >> >> Actually no. Disabling a physical path works too. That is how I >> stumbled upon the MP issue. I plugged one of my paths into a second >> server to attempt using Linux to flash the firmware. When the flash >> started working from the primary server, I never loaded Linux in the >> second server. >> >> I think the problem is actually in the disk accepting firmware via >> multipath not so much the OS. The OS throws the error when a message >> down a second path gets rejected by the drive. This is plausible. The default multipath policy of round-robin means that it will chop up such big transfers across both ports. One would think that the drives would treat this as one server, multiple queues, but my recent experience with drive firmware bugs reaffirms the old adage: never assume anything. 
> > Still no luck, though it's possible I'm doing it wrong: > > # mpathadm disable path -l /dev/rdsk/c9t5000C500578F774Bd0s2 \ > -i w5b8ca3a0e5029c00 -t w5000c500578f774a > > # mpathadm show lu /dev/rdsk/c9t5000C500578F774Bd0s2 > Logical Unit: /dev/rdsk/c9t5000C500578F774Bd0s2 > mpath-support: libmpscsi_vhci.so > Vendor: SEAGATE > Product: ST2000NM0023 > Revision: 0003 > Name Type: unknown type > Name: 5000c500578f774b > Asymmetric: no > Current Load Balance: round-robin > Logical Unit Group ID: NA > Auto Failback: on > Auto Probing: NA > > Paths: > Initiator Port Name: w5b8ca3a0e5029c00 > Target Port Name: w5000c500578f774a > Override Path: NA > Path State: OK > Disabled: yes > > Initiator Port Name: w5b8ca3a0e5029c00 > Target Port Name: w5000c500578f7749 > Override Path: NA > Path State: OK > Disabled: no The other lesson I've learned recently is that some drive firmware is keyed to look at one port over the other for certain operations :-( While I have no knowledge or suspicion of it in this specific case, you might try switching ports. > > Target Ports: > Name: w5000c500578f774a > Relative ID: 0 > > Name: w5000c500578f7749 > Relative ID: 0 > > # sg_write_buffer -v --in=MegalodonES3-SAS-STD-0004.LOD \ > --length=1625600 --mode=5 /dev/rdsk/c9t5000C500578F774Bd0 > Write buffer cmd: 3b 05 00 00 00 00 18 ce 00 00 > ioctl(USCSICMD) failed with os_err (errno) = 22 > write buffer: pass through os error: Invalid argument > Write buffer failed res=-1 > > The situation is the same regardless of which path I disable. At the > point of the sg_write_buffer, I also get a single SCSI error logged by > "iostat -E", so it's clear there's something wrong going on on the SCSI > bus. I suspect it might have something to do with what you mentioned, > but I'm just no SCSI guru to figure this out. fmdump -eV shows SCSI error reports in detail. -- richard -- Richard.Elling at RichardElling.com +1-760-896-4422 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From chip at innovates.com Wed Apr 23 18:06:54 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 23 Apr 2014 13:06:54 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: <5356D8EF.7020604@gmail.com> References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <53563803.5030905@gmail.com> <5356ADE2.5070605@gmail.com> <5356CE4B.50606@gmail.com> <5356D8EF.7020604@gmail.com> Message-ID: On Tue, Apr 22, 2014 at 4:02 PM, Saso Kiselkov wrote: > > # sg_write_buffer -v --in=MegalodonES3-SAS-STD-0004.LOD \ > --length=1625600 --mode=5 /dev/rdsk/c9t5000C500578F774Bd0 > Write buffer cmd: 3b 05 00 00 00 00 18 ce 00 00 > ioctl(USCSICMD) failed with os_err (errno) = 22 > write buffer: pass through os error: Invalid argument > Write buffer failed res=-1 > > The situation is the same regardless of which path I disable. At the > point of the sg_write_buffer, I also get a single SCSI error logged by > "iostat -E", so it's clear there's something wrong going on on the SCSI > bus. I suspect it might have something to do with what you mentioned, > but I'm just no SCSI guru to figure this out. > > Cheers, > -- > Saso > Like I said, I use Santools. However, David Lethe, the author of Santools, who was a great help to me in working through this, informed me that on Solaris sg_write_buffer should be run with --mode=7 and possibly even with --length set to 16384. I have not tested this. For me Santools has been well worth its investment on every ZFS server I've deployed. -Chip -------------- next part -------------- An HTML attachment was scrubbed...
URL: From chip at innovates.com Wed Apr 23 18:25:45 2014 From: chip at innovates.com (Schweiss, Chip) Date: Wed, 23 Apr 2014 13:25:45 -0500 Subject: [OmniOS-discuss] Overheating faults with ST4000NM0023 In-Reply-To: References: <6B4199959CFA4E1CB067BBF9270654CB@486dx4> <52FC9C12.9090900@smartjog.com> <534DB2FC.9090404@gmail.com> <534FFE1B.2060201@gmail.com> <53501420.9090307@gmail.com> <53518C7D.7060703@gmail.com> <535197C4.9010507@gmail.com> <0DFDE814-0EC4-4ABB-9752-6C5F910F7F8B@RichardElling.com> Message-ID: I can confirm the disks are fine. Getting around FMA is darn near impossible from the information I've collected. I attached the disks to another server still running OmniOS, but disabled the FMA service before doing so. I then flashed the 0004 firmware to these disks. After a reboot, the original server now sees the disks just fine. There has to be a way to "un-retire" disks so they can be flashed, but I have not found such a way. -Chip On Tue, Apr 22, 2014 at 11:36 AM, Schweiss, Chip wrote: > > On Tue, Apr 22, 2014 at 11:15 AM, Richard Elling < > richard.elling at richardelling.com> wrote: > >> >> On Apr 21, 2014, at 9:19 AM, Schweiss, Chip wrote: >> >> I suspecting these drives have self-destructed. >> >> Can anyone confirm this firmware issue causes the drives to permanently >> go offline? >> >> >> They are fine. FMA retires them, so you have to coerce the OS to >> reinstantiate them. >> In my case, they were in the lab, and we reinstall OSes continuously, so >> it wasn't a >> problem for us :-) You might have a look at cfgadm -al, and see if it is >> in a state that >> can be coerced... the docs are poor in this area :-( and this is not a >> frequent operation :-) >> -- richard >> > > After running devfsadm -C the device stubs aren't there anymore. They > don't show up in 'cfgadm -al'. > > I can see them from the HBA BIOS. So I'm still leaning towards the disks > being okay, but OmniOS refuses to talk to them.
> > So once 'retired', even marking the device repaired will not allow it to > be mounted? > > -Chip > From kai at meder.info Sun Apr 27 16:24:00 2014 From: kai at meder.info (Kai Meder) Date: Sun, 27 Apr 2014 18:24:00 +0200 Subject: [OmniOS-discuss] Install on (not from) USB-Stick Message-ID: Hello, is it possible to install OmniOS to a USB2 stick for production home use as a ZFS NAS without trashing the stick to death? Any recommendations, advice, proven USB sticks? Thanks a lot From lists at marzocchi.net Sun Apr 27 16:41:42 2014 From: lists at marzocchi.net (Olaf Marzocchi) Date: Sun, 27 Apr 2014 18:41:42 +0200 Subject: [OmniOS-discuss] Install on (not from) USB-Stick In-Reply-To: References: Message-ID: <4D0B3F97-E47D-41D3-A9CB-384B66A2E054@marzocchi.net> Copy on write reduces strain on the flash units, but you can also take a branded SD card, which should have wear leveling. If the card is considerably larger than what you need, the wear leveling will be effective. http://electronics.stackexchange.com/questions/27619/is-it-true-that-a-sd-mmc-card-does-wear-levelling-with-its-own-controller Olaf On 27 Apr 2014, at 18:24, Kai Meder wrote: > > Hello, > > is it possible to install OmniOS to a USB2 stick for production home use as a ZFS NAS without trashing the stick to death? > > Any recommendations, advice, proven USB sticks? > Thanks a lot > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > http://lists.omniti.com/mailman/listinfo/omnios-discuss From kai at meder.info Sun Apr 27 20:57:57 2014 From: kai at meder.info (Kai Meder) Date: Sun, 27 Apr 2014 22:57:57 +0200 Subject: [OmniOS-discuss] OmniOS and Intel Atom Avoton vs. Rangeley Message-ID: Hello, I am about to buy either the new Intel Atom C2750 "Avoton" or C2758 "Rangeley", difference being higher Turbo Clocks (Avoton) vs.
Intel QuickAssist support (Rangeley). Does OmniOS take any advantage of Rangeley's QuickAssist feature set, or is any support foreseeable in the future? Avoton is about 30 EUR more expensive than Rangeley, so if there is even the slightest support for Rangeley I would buy it and save some money. However, if there is absolutely no point in buying Rangeley's QuickAssist-thingy, I will choose Avoton's higher clock speeds and its 30 EUR premium... Thanks From kai at meder.info Sun Apr 27 21:17:56 2014 From: kai at meder.info (Kai Meder) Date: Sun, 27 Apr 2014 23:17:56 +0200 Subject: [OmniOS-discuss] Install on (not from) USB-Stick In-Reply-To: <4D0B3F97-E47D-41D3-A9CB-384B66A2E054@marzocchi.net> References: <4D0B3F97-E47D-41D3-A9CB-384B66A2E054@marzocchi.net> Message-ID: <535D7404.3080908@meder.info> Thank you, do you think it's feasible to install OmniOS to a normal SanDisk MicroSDHC "Ultra" 16GB Class10 card, via a normal MicroSDHC-USB adapter? My current installation takes only 6GB. I am currently investigating whether their "Ultra" series also supports any wear leveling or if it is a feature only available in their top "Extreme" lines. Is a modern Lexar/SanDisk USB3 stick >16GB OK as well, or is SD to be preferred in principle? Thanks Olaf Marzocchi wrote: > Copy on write reduces strain on the flash units, but you can also take a branded SD card, which should have wear leveling. If the card is considerably larger than what you need, the wear leveling will be effective. > > http://electronics.stackexchange.com/questions/27619/is-it-true-that-a-sd-mmc-card-does-wear-levelling-with-its-own-controller > > Olaf > > > > On 27 Apr 2014, at 18:24, Kai Meder wrote: > >> Hello, >> >> is it possible to install OmniOS to a USB2 stick for production home use as a ZFS NAS without trashing the stick to death? >> >> Any recommendations, advice, proven USB sticks?
>> Thanks a lot >> >> _______________________________________________ >> OmniOS-discuss mailing list >> OmniOS-discuss at lists.omniti.com >> http://lists.omniti.com/mailman/listinfo/omnios-discuss > From mmabis at vmware.com Mon Apr 28 05:15:52 2014 From: mmabis at vmware.com (Matthew Mabis) Date: Sun, 27 Apr 2014 22:15:52 -0700 (PDT) Subject: [OmniOS-discuss] Install on (not from) USB-Stick In-Reply-To: <4D0B3F97-E47D-41D3-A9CB-384B66A2E054@marzocchi.net> References: <4D0B3F97-E47D-41D3-A9CB-384B66A2E054@marzocchi.net> Message-ID: <2002310730.192526.1398662152058.JavaMail.root@vmware.com> Kai, Be careful who you buy your board from. When I bought the ASRock C2750D4I, support gave me a lot of crap because it wasn't on their support matrix. I had to install other OSes just to prove the issue wasn't Solaris related. I sold off that board because I was having a lot of issues with it freezing. Just some friendly advice! Matt Mabis Sr. Consultant PSO (End User Computing) VCA-DCV/WM,VCP-DCV/DT,VCAP-DCA/DCD/DTD mmabis at vmware.com 3401 Hillview Avenue, Palo Alto, CA 94304 530.481.5405 Mobile ----- Original Message ----- From: "Olaf Marzocchi" To: "Kai Meder" Cc: omnios-discuss at lists.omniti.com Sent: Sunday, April 27, 2014 4:41:42 PM Subject: Re: [OmniOS-discuss] Install on (not from) USB-Stick Copy on write reduces strain on the flash units, but you can also take a branded SD card, which should have wear leveling. If the card is considerably larger than what you need, the wear leveling will be effective.
https://urldefense.proofpoint.com/v1/url?u=http://electronics.stackexchange.com/questions/27619/is-it-true-that-a-sd-mmc-card-does-wear-levelling-with-its-own-controller&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=yqgQ6LhGnfWMd79QvLrmWsnr%2FlpWj5c0oy4MpT8%2Bgik%3D%0A&m=se%2BvXDB3CI3L3%2FQPMz4fmFFsrRvOrUEIDCt0Ku4x9Pg%3D%0A&s=b2d780ad73f7d9792c9d9ec4355a28f09cf5903fc803b5c91af11f21d26ae0ac Olaf Il giorno 27/apr/2014, alle ore 18:24, Kai Meder ha scritto: > > Hello, > > is it possible to install an OmniOS to an USB2-Stick for production home-use of a ZFS-NAS without trashing the stick to death? > > Any recommendations, advice, proven USB-Sticks? > Thanks alot > > _______________________________________________ > OmniOS-discuss mailing list > OmniOS-discuss at lists.omniti.com > https://urldefense.proofpoint.com/v1/url?u=http://lists.omniti.com/mailman/listinfo/omnios-discuss&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=yqgQ6LhGnfWMd79QvLrmWsnr%2FlpWj5c0oy4MpT8%2Bgik%3D%0A&m=se%2BvXDB3CI3L3%2FQPMz4fmFFsrRvOrUEIDCt0Ku4x9Pg%3D%0A&s=64e517ff0bca2a5e2cfa37932123dd5385db5eda81da229c500718942a55194d _______________________________________________ OmniOS-discuss mailing list OmniOS-discuss at lists.omniti.com https://urldefense.proofpoint.com/v1/url?u=http://lists.omniti.com/mailman/listinfo/omnios-discuss&k=oIvRg1%2BdGAgOoM1BIlLLqw%3D%3D%0A&r=yqgQ6LhGnfWMd79QvLrmWsnr%2FlpWj5c0oy4MpT8%2Bgik%3D%0A&m=se%2BvXDB3CI3L3%2FQPMz4fmFFsrRvOrUEIDCt0Ku4x9Pg%3D%0A&s=64e517ff0bca2a5e2cfa37932123dd5385db5eda81da229c500718942a55194d From alex.ranskis at gmail.com Mon Apr 28 13:17:44 2014 From: alex.ranskis at gmail.com (Alex) Date: Mon, 28 Apr 2014 15:17:44 +0200 Subject: [OmniOS-discuss] "zpool import" triggers deadlock in somes cases ? (metaslab_group_taskqs) Message-ID: Hello, I'm trying to understand this behavior, which I see on servers connected to an external disk enclosure. 
(I cannot reproduce it on a simple 1 disk VM) # kstat -c taskq | grep metaslab_group_tasksq| wc -l 1112 # zpool import >/dev/null # kstat -c taskq | grep metaslab_group_tasksq| wc -l 1160 we are accumulating 'metaslab_group_taskqs' module: unix instance: 513 name: metaslab_group_tasksq class: taskq crtime 842173.739164514 executed 0 maxtasks 0 nactive 0 nalloc 0 pid 0 priority 60 snaptime 842774.7092530ok 06 tasks 0 threads 3 totaltime 0 The "zpool import" command itself runs fine. I get the same behavior whether there are pools to import or not. but kernel threads are piling up, for each CV there are 3 threads : > ffffff05844fe080::wchaninfo -v ADDR TYPE NWAITERS THREAD PROC ffffff05844fe080 cond 3: ffffff0021c58c40 sched ffffff0021c5ec40 sched ffffff0021c64c40 sched and they're all blocking, with a similar stack : > ffffff0021c58c40::findstack -v stack pointer for thread ffffff0021c58c40: ffffff0021c58a80 [ ffffff0021c58a80 _resume_from_idle+0xf4() ] ffffff0021c58ab0 swtch+0x141() ffffff0021c58af0 cv_wait+0x70(ffffff05844fe080, ffffff05844fe070) ffffff0021c58b60 taskq_thread_wait+0xbe(ffffff05844fe050, ffffff05844fe070, ffffff05844fe080, ffffff0021c58bc0, ffffffffffffffff) ffffff0021c58c20 taskq_thread+0x37c(ffffff05844fe050) ffffff0021c58c30 thread_start+8() the taskq seems to be created by a call to metaslab_group_create(), here : zfs`vdev_alloc+0x54a zfs`spa_config_parse+0x48 zfs`spa_config_parse+0xda zfs`spa_config_valid+0x78 zfs`spa_load_impl+0xa81 zfs`spa_load+0x14e zfs`spa_tryimport+0xaa zfs`zfs_ioc_pool_tryimport+0x51 zfs`zfsdev_ioctl+0x4a7 genunix`cdev_ioctl+0x39 specfs`spec_ioctl+0x60 genunix`fop_ioctl+0x55 genunix`ioctl+0x9b unix`sys_syscall32+0xff I'm out of my depth here, any pointer to investigate further would be much appreciated ! cheers, alex -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From esproul at omniti.com Mon Apr 28 14:01:04 2014 From: esproul at omniti.com (Eric Sproul) Date: Mon, 28 Apr 2014 10:01:04 -0400 Subject: [OmniOS-discuss] OmniOS and Intel Atom Avoton vs. Rangeley In-Reply-To: References: Message-ID: On Sun, Apr 27, 2014 at 4:57 PM, Kai Meder wrote: > Hello, > > I am about to buy either the new Intel Atom C2750 "Avoton" or C2758 > "Rangeley", difference being higher Turbo Clocks (Avoton) vs. Intel > QuickAssist support (Rangeley). > > Does OmniOS take any advantage of Rangeleys QuickAssist featureset or is any > support forseeable in the future? My guess is no. I can't make sense of the marketing buzzword-laden press releases that I see. It seems to be aimed more at single-purpose embedded use cases, and not likely to be found on general-purpose deployments. > > Avoton is about 30 EUR more expensive than Rangeley, so if there is only the > slightest support for Rangeley I would buy it and save some money. However, > if there is absolutely no point in buying Rangeleys QuickAssist-thingy, I > will choose Avoton's higher clock speeds and its 30 eur premium... TurboBoost at least stands a chance of helping a typical OS workload. :) Eric From youzhong at gmail.com Mon Apr 28 14:22:18 2014 From: youzhong at gmail.com (Youzhong Yang) Date: Mon, 28 Apr 2014 10:22:18 -0400 Subject: [OmniOS-discuss] "zpool import" triggers deadlock in somes cases ? (metaslab_group_taskqs) In-Reply-To: References: Message-ID: This could be the following issue: https://www.illumos.org/issues/4730 On Mon, Apr 28, 2014 at 9:17 AM, Alex wrote: > Hello, > > I'm trying to understand this behavior, which I see on servers connected > to an external disk enclosure. 
(I cannot reproduce it on a simple 1 disk VM) > > # kstat -c taskq | grep metaslab_group_tasksq| wc -l > 1112 > > # zpool import >/dev/null > > # kstat -c taskq | grep metaslab_group_tasksq| wc -l > 1160 > > > we are accumulating 'metaslab_group_taskqs' > > module: unix instance: 513 > name: metaslab_group_tasksq class: taskq > crtime 842173.739164514 > executed 0 > maxtasks 0 > nactive 0 > nalloc 0 > pid 0 > priority 60 > snaptime 842774.7092530ok 06 > tasks 0 > threads 3 > totaltime 0 > > > The "zpool import" command itself runs fine. I get the same behavior > whether there are pools to import or not. > > but kernel threads are piling up, for each CV there are 3 threads : > > ffffff05844fe080::wchaninfo -v > ADDR TYPE NWAITERS THREAD PROC > ffffff05844fe080 cond 3: ffffff0021c58c40 sched > ffffff0021c5ec40 sched > ffffff0021c64c40 sched > > and they're all blocking, with a similar stack : > > ffffff0021c58c40::findstack -v > stack pointer for thread ffffff0021c58c40: ffffff0021c58a80 > [ ffffff0021c58a80 _resume_from_idle+0xf4() ] > ffffff0021c58ab0 swtch+0x141() > ffffff0021c58af0 cv_wait+0x70(ffffff05844fe080, ffffff05844fe070) > ffffff0021c58b60 taskq_thread_wait+0xbe(ffffff05844fe050, > ffffff05844fe070, ffffff05844fe080, ffffff0021c58bc0, ffffffffffffffff) > ffffff0021c58c20 taskq_thread+0x37c(ffffff05844fe050) > ffffff0021c58c30 thread_start+8() > > > the taskq seems to be created by a call to metaslab_group_create(), here : > zfs`vdev_alloc+0x54a > zfs`spa_config_parse+0x48 > zfs`spa_config_parse+0xda > zfs`spa_config_valid+0x78 > zfs`spa_load_impl+0xa81 > zfs`spa_load+0x14e > zfs`spa_tryimport+0xaa > zfs`zfs_ioc_pool_tryimport+0x51 > zfs`zfsdev_ioctl+0x4a7 > genunix`cdev_ioctl+0x39 > specfs`spec_ioctl+0x60 > genunix`fop_ioctl+0x55 > genunix`ioctl+0x9b > unix`sys_syscall32+0xff > > > I'm out of my depth here, any pointer to investigate further would be much > appreciated ! 
>
> cheers,
> alex
>
> _______________________________________________
> OmniOS-discuss mailing list
> OmniOS-discuss at lists.omniti.com
> http://lists.omniti.com/mailman/listinfo/omnios-discuss
>
>

From alex.ranskis at gmail.com Mon Apr 28 15:05:36 2014
From: alex.ranskis at gmail.com (Alex)
Date: Mon, 28 Apr 2014 17:05:36 +0200
Subject: [OmniOS-discuss] "zpool import" triggers deadlock in somes cases ? (metaslab_group_taskqs)
In-Reply-To:
References:
Message-ID:

Hi,

Thanks for your feedback !
It does not hang in my case, but maybe it is related anyway.


On 28 April 2014 16:22, Youzhong Yang wrote:

> This could be the following issue:
>
> https://www.illumos.org/issues/4730
>
>
>
> On Mon, Apr 28, 2014 at 9:17 AM, Alex wrote:
>
>> Hello,
>>
>> I'm trying to understand this behavior, which I see on servers connected
>> to an external disk enclosure. (I cannot reproduce it on a simple 1 disk VM)
>>
>> # kstat -c taskq | grep metaslab_group_tasksq | wc -l
>> 1112
>>
>> # zpool import >/dev/null
>>
>> # kstat -c taskq | grep metaslab_group_tasksq | wc -l
>> 1160
>>
>>
>> we are accumulating 'metaslab_group_taskqs'
>>
>> module: unix                            instance: 513
>> name:   metaslab_group_tasksq           class:    taskq
>>         crtime                          842173.739164514
>>         executed                        0
>>         maxtasks                        0
>>         nactive                         0
>>         nalloc                          0
>>         pid                             0
>>         priority                        60
>>         snaptime                        842774.709253006
>>         tasks                           0
>>         threads                         3
>>         totaltime                       0
>>
>>
>> The "zpool import" command itself runs fine. I get the same behavior
>> whether there are pools to import or not.
>>
>> but kernel threads are piling up, for each CV there are 3 threads :
>> > ffffff05844fe080::wchaninfo -v
>> ADDR             TYPE NWAITERS   THREAD           PROC
>> ffffff05844fe080 cond        3:  ffffff0021c58c40 sched
>>                                  ffffff0021c5ec40 sched
>>                                  ffffff0021c64c40 sched
>>
>> and they're all blocking, with a similar stack :
>> > ffffff0021c58c40::findstack -v
>> stack pointer for thread ffffff0021c58c40: ffffff0021c58a80
>> [ ffffff0021c58a80 _resume_from_idle+0xf4() ]
>>   ffffff0021c58ab0 swtch+0x141()
>>   ffffff0021c58af0 cv_wait+0x70(ffffff05844fe080, ffffff05844fe070)
>>   ffffff0021c58b60 taskq_thread_wait+0xbe(ffffff05844fe050,
>> ffffff05844fe070, ffffff05844fe080, ffffff0021c58bc0, ffffffffffffffff)
>>   ffffff0021c58c20 taskq_thread+0x37c(ffffff05844fe050)
>>   ffffff0021c58c30 thread_start+8()
>>
>>
>> the taskq seems to be created by a call to metaslab_group_create(), here :
>>               zfs`vdev_alloc+0x54a
>>               zfs`spa_config_parse+0x48
>>               zfs`spa_config_parse+0xda
>>               zfs`spa_config_valid+0x78
>>               zfs`spa_load_impl+0xa81
>>               zfs`spa_load+0x14e
>>               zfs`spa_tryimport+0xaa
>>               zfs`zfs_ioc_pool_tryimport+0x51
>>               zfs`zfsdev_ioctl+0x4a7
>>               genunix`cdev_ioctl+0x39
>>               specfs`spec_ioctl+0x60
>>               genunix`fop_ioctl+0x55
>>               genunix`ioctl+0x9b
>>               unix`sys_syscall32+0xff
>>
>>
>> I'm out of my depth here, any pointer to investigate further would be
>> much appreciated !
>>
>> cheers,
>> alex
>>
>>
>

From steve at linuxsuite.org Mon Apr 28 15:27:47 2014
From: steve at linuxsuite.org (steve at linuxsuite.org)
Date: Mon, 28 Apr 2014 11:27:47 -0400
Subject: [OmniOS-discuss] Hang on Dell R710 with r151004?
Message-ID: <8ed132750a7c6a7b91ba26904d99aea3.squirrel@emailmg.netfirms.com>

Hi,

I have 2 Dell R710's as a ZFS storage that are running r151004.
About every 2 or 3 weeks they will hang with all services unresponsive
and must be power cycled. I do not suspect hardware as it happens on
both machines and hardware worked fine with other OS's.

Both systems use the mpt_sas driver. I noticed that there have been
many updates to mpt_sas since r151004. Not knowing the specifics of the
driver issue, is it possible that there is a bug that is causing system
hangs?

Or could this be some kind of resource starvation?

I disabled ata driver as it was logging some errors around "hang
time" and is not required after install, but now system hangs
without any logged errors.

Ideas?

Will upgrading to r151006 or r151008 fix this?

thanx - steve

From steve at linuxsuite.org Mon Apr 28 17:21:31 2014
From: steve at linuxsuite.org (steve at linuxsuite.org)
Date: Mon, 28 Apr 2014 13:21:31 -0400
Subject: [OmniOS-discuss] Hang on Dell R710 with r151004?
In-Reply-To: <8ed132750a7c6a7b91ba26904d99aea3.squirrel@emailmg.netfirms.com>
References: <8ed132750a7c6a7b91ba26904d99aea3.squirrel@emailmg.netfirms.com>
Message-ID:

> Hi,
>
> I have 2 Dell R710's as a ZFS storage that are running r151004.
> About every 2 or 3 weeks
> they will hang with all services unresponsive and must be power cycled. I
> do not

Hmm... read something about cstates on dell r710's....

root at blahblah:~# kstat |grep current_cstate; kstat |grep supported_max_cstates
        current_cstate                  3
        current_cstate                  0
        current_cstate                  3
        current_cstate                  3
        current_cstate                  3
        current_cstate                  3
        current_cstate                  3
        current_cstate                  3
        supported_max_cstates           2
        supported_max_cstates           2
        supported_max_cstates           2
        supported_max_cstates           2
        supported_max_cstates           2
        supported_max_cstates           2
        supported_max_cstates           2
        supported_max_cstates           2

Is this an issue? Do cstates need to be disabled in BIOS??

thanx - steve

> suspect hardware as it happens on both machines and hardware worked fine
> with other OS's.
>
> Both systems use the mpt_sas driver. I noticed that there have
> been
> many updates to mpt_sas since r151004.
> Not knowing the specifics of the
> driver issue, is it possible that there is a bug that is causing system
> hangs?
>
> Or could this be some kind of resource starvation?
>
> I disabled ata driver as it was logging some errors around "hang
> time" and is not required after install, but now system hangs
> without any logged errors.
>
> Ideas?
>
> Will upgrading to r151006 or r151008 fix this?
>
> thanx - steve
>
>
>
>
>

From danmcd at omniti.com Mon Apr 28 17:40:01 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Mon, 28 Apr 2014 13:40:01 -0400
Subject: [OmniOS-discuss] Hang on Dell R710 with r151004?
In-Reply-To:
References: <8ed132750a7c6a7b91ba26904d99aea3.squirrel@emailmg.netfirms.com>
Message-ID: <326D1AB0-20FD-496C-B93A-5C70E1BFF5C7@omniti.com>

On Apr 28, 2014, at 1:21 PM, steve at linuxsuite.org wrote:

>> Hi,
>>
>> I have 2 Dell R710's as a ZFS storage that are running r151004.
>> About every 2 or 3 weeks
>> they will hang with all services unresponsive and must be power cycled. I
>> do not
>
> Hmm... read something about cstates on dell r710's....

You should disable C-states.

Also, upgrading to r151006 or r151008 will get you mpt_sas improvements as well.

Dan

From mir at miras.org Mon Apr 28 17:50:03 2014
From: mir at miras.org (Michael Rasmussen)
Date: Mon, 28 Apr 2014 19:50:03 +0200
Subject: [OmniOS-discuss] Hang on Dell R710 with r151004?
In-Reply-To: <326D1AB0-20FD-496C-B93A-5C70E1BFF5C7@omniti.com>
References: <8ed132750a7c6a7b91ba26904d99aea3.squirrel@emailmg.netfirms.com> <326D1AB0-20FD-496C-B93A-5C70E1BFF5C7@omniti.com>
Message-ID: <20140428195003.2ef0c535@sleipner.datanom.net>

On Mon, 28 Apr 2014 13:40:01 -0400 Dan McDonald wrote:

>
> You should disable C-states.
>

Are the C-states mentioned the ones added with the Haswell chipset?
--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael rasmussen cc
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xD3C9A00E
mir datanom net
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE501F51C
mir miras org
http://pgp.mit.edu:11371/pks/lookup?op=get&search=0xE3E80917
--------------------------------------------------------------
/usr/games/fortune -es says:
Causes moderate eye irritation.

From takashiary at gmail.com Wed Apr 30 15:01:44 2014
From: takashiary at gmail.com (takashi ary)
Date: Thu, 1 May 2014 00:01:44 +0900
Subject: [OmniOS-discuss] zfs diff UTF-8 probrem
Message-ID:

Hello,

When will OmniOS fix illumos bug #4448?
https://www.illumos.org/issues/4448

OmniOS r151008 behavior:

root@omnios1:~# uname -v
omnios-6de5e81
root@omnios1:~#
root@omnios1:~# zfs diff -HF tank@test
M       /       /tank/
+       F       /tank/abcd\37777777703\37777777651fg
root@omnios1:~#

I tried the patch from zfsonlinux.
https://github.com/zfsonlinux/zfs/issues/1172

root@omnios1:~# ls -l /root/zfsdiff/lib
total 201
lrwxrwxrwx 1 root root     11 Apr 30 16:17 libzfs.so -> libzfs.so.1
-rwxr-xr-x 1 root bin  324932 Apr 28 20:29 libzfs.so.1
root@omnios1:~#
root@omnios1:~# LD_LIBRARY_PATH=/root/zfsdiff/lib zfs diff -HF tank@test
M       /       /tank/
+       F       /tank/abcd\303\251fg
root@omnios1:~#

I created a wrapper script.
root@omnios1:~# cat /root/zfsdiff/zfsdiff.sh
#!/bin/bash

LIBZFS_DIR=/root/zfsdiff/lib

LD_LIBRARY_PATH=$LIBZFS_DIR zfs diff $* | awk '{cmd = "printf \"a" $0 "\""; cmd | getline line; close(cmd); sub(/^a/,"",line); print line}'
root@omnios1:~#
root@omnios1:~# /root/zfsdiff/zfsdiff.sh -HF tank@test
M       /       /tank/
+       F       /tank/abcdéfg
root@omnios1:~#

Thanks

From danmcd at omniti.com Wed Apr 30 15:12:28 2014
From: danmcd at omniti.com (Dan McDonald)
Date: Wed, 30 Apr 2014 11:12:28 -0400
Subject: [OmniOS-discuss] zfs diff UTF-8 probrem
In-Reply-To:
References:
Message-ID: <2F9CED63-403A-4FF9-A6C5-76884A2BF60B@omniti.com>

On Apr 30, 2014, at 11:01 AM, takashi ary wrote:

> Hello,
>
> When will OmniOS fix illumos bug #4448?
> https://www.illumos.org/issues/4448
>

Your best bet is to raise this issue on the Illumos ZFS list: zfs at lists.illumos.org.

Dan
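
Archive note on the wrapper script in the zfs diff thread: the awk pipeline hands every output line to a shell printf, which can misbehave on file names containing quotes, `$` or `%`. As a rough alternative sketch (not part of the original posts), the three-digit octal escapes printed by the patched libzfs, such as \303\251, can be decoded without invoking a shell. The function names below are hypothetical, and the sketch assumes that every escaped byte, including a literal backslash, appears as exactly three octal digits.

```python
import re

# Matches one escaped byte: a backslash followed by three octal digits
# in the range 000-377 (assumed to be the only escape form in the output).
_OCTAL_ESCAPE = re.compile(rb"\\([0-3][0-7]{2})")


def decode_octal_escapes(line: bytes) -> str:
    """Turn \\ooo escapes back into raw bytes, then decode as UTF-8."""
    raw = _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8)]), line)
    # Bytes that do not form valid UTF-8 become U+FFFD instead of raising.
    return raw.decode("utf-8", errors="replace")


def decode_stream(src, dst) -> None:
    """Decode each line of a binary stream and write text to dst."""
    for line in src:
        dst.write(decode_octal_escapes(line))


# Sample line taken from the patched "zfs diff -HF" output in the thread.
sample = b"+\tF\t/tank/abcd\\303\\251fg\n"
print(decode_octal_escapes(sample), end="")
```

To use it as a filter, one could save this under a name of one's choosing (say decode_zfs_diff.py, a hypothetical file name), call decode_stream(sys.stdin.buffer, sys.stdout) at the end after importing sys, and pipe the patched zfs diff output into it. The unpatched output (\37777777703) is not handled, since those sign-extended values are not three-digit escapes.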