[OmniOS-discuss] "zpool import" triggers deadlock in some cases? (metaslab_group_taskqs)

Alex alex.ranskis at gmail.com
Mon Apr 28 15:05:36 UTC 2014


Hi,
Thanks for your feedback! It does not hang in my case, but maybe it is
related anyway.


On 28 April 2014 16:22, Youzhong Yang <youzhong at gmail.com> wrote:

> This could be the following issue:
>
> https://www.illumos.org/issues/4730
>
>
>
> On Mon, Apr 28, 2014 at 9:17 AM, Alex <alex.ranskis at gmail.com> wrote:
>
>> Hello,
>>
>> I'm trying to understand this behavior, which I see on servers connected
>> to an external disk enclosure. (I cannot reproduce it on a simple single-disk VM.)
>>
>> # kstat -c taskq | grep metaslab_group_tasksq| wc -l
>> 1112
>>
>> # zpool import >/dev/null
>>
>> # kstat -c taskq | grep metaslab_group_tasksq| wc -l
>> 1160
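>>
>> (A quick way to confirm the growth is the loop below; just a sketch, run as
>> root on the affected box, repeating the scan a few times and printing the
>> taskq count after each pass:)
>>
>> # repeat the import scan and count metaslab group taskqs after each pass
>> for i in 1 2 3; do
>>     zpool import >/dev/null 2>&1
>>     kstat -c taskq | grep -c metaslab_group_tasksq
>> done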
>>
>>
>> We are accumulating 'metaslab_group_tasksq' taskqs:
>>
>> module: unix                            instance: 513
>> name:   metaslab_group_tasksq           class:    taskq
>>         crtime                          842173.739164514
>>         executed                        0
>>         maxtasks                        0
>>         nactive                         0
>>         nalloc                          0
>>         pid                             0
>>         priority                        60
>>         snaptime                        842774.709253006
>>         tasks                           0
>>         threads                         3
>>         totaltime                       0
>>
>>
>> The "zpool import" command itself runs fine. I get the same behavior
>> whether there are pools to import or not.
>>
>> But kernel threads are piling up; for each CV there are 3 threads:
>> > ffffff05844fe080::wchaninfo -v
>> ADDR             TYPE NWAITERS   THREAD           PROC
>> ffffff05844fe080 cond        3:  ffffff0021c58c40 sched
>>                                  ffffff0021c5ec40 sched
>>                                  ffffff0021c64c40 sched
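>>
>> (To list the leaked queues directly from mdb, something along these lines
>> should work; a sketch, assuming the genunix ::taskq dcmd is available in
>> this build, printing one line per metaslab group taskq with its thread
>> counts:)
>>
>> > ::taskq ! grep metaslab_group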
>>
>> and they're all blocked, with a similar stack:
>> > ffffff0021c58c40::findstack -v
>> stack pointer for thread ffffff0021c58c40: ffffff0021c58a80
>> [ ffffff0021c58a80 _resume_from_idle+0xf4() ]
>>   ffffff0021c58ab0 swtch+0x141()
>>   ffffff0021c58af0 cv_wait+0x70(ffffff05844fe080, ffffff05844fe070)
>>   ffffff0021c58b60 taskq_thread_wait+0xbe(ffffff05844fe050,
>> ffffff05844fe070, ffffff05844fe080, ffffff0021c58bc0, ffffffffffffffff)
>>   ffffff0021c58c20 taskq_thread+0x37c(ffffff05844fe050)
>>   ffffff0021c58c30 thread_start+8()
>>
>>
>> The taskq seems to be created by a call to metaslab_group_create(), here:
>>               zfs`vdev_alloc+0x54a
>>               zfs`spa_config_parse+0x48
>>               zfs`spa_config_parse+0xda
>>               zfs`spa_config_valid+0x78
>>               zfs`spa_load_impl+0xa81
>>               zfs`spa_load+0x14e
>>               zfs`spa_tryimport+0xaa
>>               zfs`zfs_ioc_pool_tryimport+0x51
>>               zfs`zfsdev_ioctl+0x4a7
>>               genunix`cdev_ioctl+0x39
>>               specfs`spec_ioctl+0x60
>>               genunix`fop_ioctl+0x55
>>               genunix`ioctl+0x9b
>>               unix`sys_syscall32+0xff
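>>
>> (For reference, a creation stack like the one above can be captured live
>> during the import scan with a small DTrace sketch along these lines,
>> assuming the fbt provider is usable and that the taskq name matches the
>> kstat output above:)
>>
>> # print the kernel stack each time a metaslab group taskq is created
>> # while "zpool import" scans for pools
>> dtrace -n '
>> fbt::taskq_create:entry
>> /stringof(arg0) == "metaslab_group_tasksq"/
>> {
>>         stack();
>> }' -c 'zpool import'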
>>
>>
>> I'm out of my depth here; any pointers on how to investigate further would be
>> much appreciated!
>>
>> cheers,
>> alex
>>