看板 DFBSD_bugs 關於我們 聯絡資訊
On Thu, Feb 24, 2011 at 09:29:16AM +0900, YONETANI Tomokazu wrote: > On Mon, Feb 21, 2011 at 01:41:26PM +0900, YONETANI Tomokazu wrote: > > Apparently a few hours of pbulk test can trigger this panic (this is > > on x86_64 and the kernel is built from source as of 5347900e6). > > As opposed to what KKASSERT claims, p->p_lock doesn't hold the non-zero > > value in the frame 5: > > > : > > (kgdb) fr 5 > > #5 0xffffffff802989a1 in kern_wait (pid=<value optimized out>, > > status=0xffffffe05e997a74, options=1528637672, rusage=0x0, > > res=0xffffffe05e997b58) at /usr/src/sys/kern/kern_exit.c:901 > > 901 KKASSERT(p->p_lock == 0); > > (kgdb) p p->p_lock > > $1 = 0 > > After poking here and the, I think this KKASSERT() can simply go away > as proc_remove_zombie() will wait for p->p_lock to drop to zero anyway. > > /* > * Unlink the proc from its process group so that > * the following operations won't lead to an > * inconsistent state for processes running down > * the zombie list. > */ > KKASSERT(p->p_lock == 0); > proc_remove_zombie(p); > leavepgrp(p); The following is what I have in my tree. What it does is to hold proc_token while waiting for p->p_lock to drop, just as done in proc_remove_zombie(). If I don't hold the proc_token as in the first chunk, I see the proc_remove_zombie: waiting for %p->p_lock to drop message a few times every hour on the console. I guess it may also imply that the race is between a code which holds proc_token for a long time but not p->p_token, like all*_scan(). diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c index 1e5a110..30fc0af 100644 --- a/sys/kern/kern_exit.c +++ b/sys/kern/kern_exit.c @@ -867,8 +867,10 @@ loop: * put a hold on the process for short periods of * time. */ + lwkt_gettoken(&proc_token); while (p->p_lock) tsleep(p, 0, "reap3", hz); + lwkt_reltoken(&proc_token); /* Take care of our return values. */ *res = p->p_pid; @@ -898,7 +900,6 @@ loop: * inconsistent state for processes running down * the zombie list. */ - KKASSERT(p->p_lock == 0); proc_remove_zombie(p); leavepgrp(p); diff --git a/sys/kern/kern_proc.c b/sys/kern/kern_proc.c index cb067f0..048a619 100644 --- a/sys/kern/kern_proc.c +++ b/sys/kern/kern_proc.c @@ -661,6 +661,7 @@ proc_remove_zombie(struct proc *p) { lwkt_gettoken(&proc_token); while (p->p_lock) { + kprintf("%s: waiting for %p->p_lock to drop\n", __func__, p); tsleep(p, 0, "reap1", hz / 10); } LIST_REMOVE(p, p_list); /* off zombproc */ -- 1.7.3.2