Date:      Tue, 29 Jan 2008 19:00:42 -0500
From:      John Baldwin <jhb@freebsd.org>
To:        freebsd-amd64@freebsd.org
Subject:   Re: Multi processor locking problem under 7.0
Message-ID:  <200801291900.42989.jhb@freebsd.org>
In-Reply-To: <20080129202643.6BF568DE@fep1.cogeco.net>
References:  <1201388299.84900.12.camel@Sylvester.dco.penx.com> <20080129202643.6BF568DE@fep1.cogeco.net>

On Tuesday 29 January 2008 03:26:44 pm Paul wrote:
> 
> >I have several systems of two different types running 7.0. One is an IBM
> >3550 and the other a Dell 2950. The IBMs more than the Dells
> >consistently seem to have a kernel locking problem during dump.
> >Specifically, if I execute this command:
> >
> >         dump 0uaLCf 64 /dev/null /usr
> >
> >Dump consistently stops in Phase IV. However, if I set
> >machdep.hlt_logical_cpus=1, dump does not stop. At the end of this
> >message is my boot information.
> >
> >When logical_cpus=0, the following is typical of what is displayed by
> >top when dump stops:
> >
> >   PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> >   926 root        1   4    0 75476K 71744K sbwait 0   0:04  0.00% dump
> >   928 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >   929 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >   927 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >   919 root        1   8    0 75348K 67144K wait   0   0:00  0.00% dump
> >
> >Fooling around a bit I have found that if I truss dump, the dump
> >continues. On the Dells, if I force disk activity during the dump, such
> >as executing a ls -lR /usr > /dev/null, the dump finishes.
> >
> >I am unsure how to proceed in debugging this problem. It has been around
> >for a while but I am now installing the IBMs and the dump problem is a
> >non-starter. Please contact me directly on how to proceed.
> 
> I have noticed something similar on my Intel test box.
> 
> When compiling many ports in the tree that is updated on 7.0RC1 with 
> a S5000pal with 2 Quadcore Xeons the process just STOPS. I am using 
> the install disk and have not updated to the latest cvsup release yet 
> (I am trying to make the world now with fingers crossed :)  ) I tried 
> it with just one quadcore and the same problem happens.
> 
> There are no errors on the screen but it no longer proceeds with the 
> port build. When I suspend the process and restart the make in the 
> same session it has no problem getting past this impasse and with a 
> few suspends the make finishes without error. It does not happen 
> every time which is very odd.
> 
> Based on your description above it seems like it may be the same problem.
> 
> What do you think?

If you have threads blocked on "vmo_de" then upgrade to the latest RELENG_7 or 
RELENG_7_0 (specifically the sys/kern/subr_sleepqueue.c file) and try again.
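
One way to check for such threads is to list each thread's wait channel with
ps(1) and filter on it. The following is only a sketch: blocked_on is a
hypothetical helper name, and it assumes the wait channel is the third column
of the ps output, as it is with the -o list shown in the comment.

```shell
# Hypothetical helper: read ps-style output on stdin and print the header
# line plus every row whose third column (the wait channel) starts with
# the given name. A prefix match is used because ps may truncate names.
blocked_on() {
    awk -v chan="$1" 'NR == 1 || index($3, chan) == 1'
}

# Assumed usage on FreeBSD (-H lists every thread of each process):
#   ps -axH -o pid,state,wchan,command | blocked_on vmo_de
```

If this prints nothing beyond the header, no thread was sleeping on that
channel at the moment ps ran, so it may be worth repeating while the
process is stalled.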

-- 
John Baldwin


