Date:      Tue, 23 Sep 2008 20:25:05 +0200
From:      "Paul B. Mahol" <onemda@gmail.com>
To:        "Mikhail Teterin" <mi+mill@aldan.algebra.com>
Cc:        stable@freebsd.org
Subject:   Re: 7.0-stable: a hung process - scheduler bug?
Message-ID:  <3a142e750809231125o579445baufb5f2676e4d9a2ca@mail.gmail.com>
In-Reply-To: <48D92589.8000200@aldan.algebra.com>
References:  <48D92589.8000200@aldan.algebra.com>

On 9/23/08, Mikhail Teterin <mi+mill@aldan.algebra.com> wrote:
> Hello!
>
> I was trying to build OpenOffice using all of my 4 CPUs. To be able to
> do other work on the machine comfortably, I ran the build under nice,
> and assigned real-time priority to the two Xorg processes.
> The build started at about 23:10 last night, and hung at 23:46. The
> procstat output for the make's process group is:
>
>       PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM
>      8371  2425  8371  2425  2425   1 mi       wait      FreeBSD ELF64 make
>     12254  8371  8371  2425  2425   1 mi       wait      FreeBSD ELF64 sh
>     12255 12254  8371  2425  2425   1 mi       pause     FreeBSD ELF64 tcsh
>     12262 12255  8371  2425  2425   1 mi       wait      FreeBSD ELF64 perl5.8.8
>     33010 12262  8371  2425  2425   1 mi       wait      FreeBSD ELF64 perl5.8.8
>     33011 33010  8371  2425  2425   1 mi       wait      FreeBSD ELF64 sh
>     33012 33011  8371  2425  2425   1 mi       wait      FreeBSD ELF64 dmake
>     37126 33012  8371  2425  2425   1 mi       -         FreeBSD ELF64 dmake
>
> The last line worries me greatly... According to "procstat -t", there is
> only one thread there:
>
>       PID    TID COMM             TDNAME           CPU  PRI STATE   WCHAN
>     37126 100724 dmake            -                  1  193 sleep   -
>
> And trying to "ktrace -p 37126" returns (even to root, even in /tmp):
>
>     ktrace: ktrace.out: Operation not permitted
>
> There are no problems ktrace-ing 33012, but nothing comes from there, as
> that process simply waits for its child. I guess, the child -- 37126 was
> (v)forked to launch a compiler or some such and remains stuck in between
> (v)fork and exec somewhere...
>
> The OS is: FreeBSD 7.0-STABLE/amd64 from Sat Jul 26, 2008 and the box is
> otherwise perfectly functional. The scheduling-related options are set
> as such:
>
>     options         SCHED_4BSD              # 4BSD scheduler
>     options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>
> Let me know, what else I can do to help fix this bug -- I'm going to
> reboot the machine tonight... Should I switch to SCHED_ULE as a
> work-around?

SCHED_4BSD is suboptimal for 4 CPUs; it has been replaced by SCHED_ULE
on 7-STABLE.
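
A minimal sketch of the switch, assuming a custom kernel configuration
file that currently carries the SCHED_4BSD line quoted above (the
comment text is illustrative). Only one scheduler option can be
enabled at a time, so the old line has to go:

    #options        SCHED_4BSD              # 4BSD scheduler (disabled)
    options         SCHED_ULE               # ULE scheduler

The kernel then needs to be rebuilt and installed, and the machine
rebooted, before the new scheduler is in use.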


