Date: Tue, 29 Jan 2008 19:00:42 -0500
From: John Baldwin <jhb@freebsd.org>
To: freebsd-amd64@freebsd.org
Subject: Re: Multi processor locking problem under 7.0
Message-ID: <200801291900.42989.jhb@freebsd.org>
In-Reply-To: <20080129202643.6BF568DE@fep1.cogeco.net>
References: <1201388299.84900.12.camel@Sylvester.dco.penx.com> <20080129202643.6BF568DE@fep1.cogeco.net>

On Tuesday 29 January 2008 03:26:44 pm Paul wrote:
>
> >I have several systems of two different types running 7.0. One is an IBM
> >3550 and the other a Dell 2950. The IBMs more than the Dells
> >consistently seem to have a kernel locking problem during dump.
> >Specifically, if I execute this command:
> >
> >       dump 0uaLCf 64 /dev/null /usr
> >
> >Dump consistently stops in Phase IV. However, if I set
> >machdep.hlt_logical_cpus=1, dump does not stop. At the end of this
> >message is my boot information.
> >
> >When logical_cpus=0, the following is typical of what is displayed by
> >top when dump stops:
> >
> >  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> >  926 root        1   4    0 75476K 71744K sbwait 0   0:04  0.00% dump
> >  928 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >  929 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >  927 root        1  20    0 75348K 67740K pause  1   0:02  0.00% dump
> >  919 root        1   8    0 75348K 67144K wait   0   0:00  0.00% dump
> >
> >Fooling around a bit I have found that if I truss dump, the dump
> >continues. On the Dells, if I force disk activity during the dump, such
> >as executing a ls -lR /usr > /dev/null, the dump finishes.
> >
> >I am unsure how to proceed in debugging this problem. It has been around
> >for a while but I am now installing the IBMs and the dump problem is a
> >no-starter. Please contact me directly on how to proceed.
>
> I have noticed something similar on my Intel test box.
>
> When compiling many ports in the tree that is updated on 7.0RC1 with
> a S5000pal with 2 Quadcore Xeons the process just STOPS. I am using
> the install disk and have not updated to the latest cvsup release yet
> (I am trying to make the world now with fingers crossed :) ) I tried
> it with just one quadcore and the same problem happens.
>
> There are no errors on the screen but it no longer proceeds with the
> port build. When I suspend the process and restart the make in the
> same session it has no problem getting past this impasse and with a
> few suspends the make finishes without error. It does not happen
> every time which is very odd.
>
> Based on your description above it seems like it may be the same problem.
>
> What do you think?

If you have threads blocked on "vmo_de" then upgrade to the latest RELENG_7 or
RELENG_7_0 (specifically the sys/kern/subr_sleepqueue.c file) and try again.

-- 
John Baldwin