Date: Sat, 20 Mar 2010 07:27:43 -0400
From: jhell <jhell@DataIX.net>
To: Anton Shterenlikht <mexas@bristol.ac.uk>
Cc: FreeBSD Current <freebsd-current@freebsd.org>, freebsd-ia64@freebsd.org
Subject: Re: ldd leaves the machine unresponsive
Message-ID: <alpine.BSF.2.00.1003200726250.71443@pragry.qngnvk.ybpny>
In-Reply-To: <20100319211535.GA76683@mech-cluster241.men.bris.ac.uk>
References: <20100317163230.GJ87732@mech-cluster241.men.bris.ac.uk> <alpine.BSF.2.00.1003181013370.91777@pragry.qngnvk.ybpny> <20100319211535.GA76683@mech-cluster241.men.bris.ac.uk>
On Fri, 19 Mar 2010 17:15, Anton Shterenlikht wrote:
In Message-Id: <20100319211535.GA76683@mech-cluster241.men.bris.ac.uk>

> On Thu, Mar 18, 2010 at 11:29:36AM -0400, jhell wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>>
>>
>> On Wed, 17 Mar 2010 12:32, Anton Shterenlikht wrote:
>> In Message-Id: <20100317163230.GJ87732@mech-cluster241.men.bris.ac.uk>
>>
>>> Just updated to ia64 r205248
>>>
>>> If my problem is due to my mis-configuration,
>>> I apologise in advance.
>>>
>>> I run this shell script after each upgrade
>>> and 'make delete-old-libs' to check
>>> if any shared objects need to be rebuilt:
>>>
>>> <start script>
>>>
>>> #!/bin/sh
>>>
>>> for file in `find /bin /sbin /usr/bin /usr/sbin /usr/lib /usr/libexec /usr/local -name "*"`
>>> do
>>>     echo $file
>>>     ldd $file >> /root/ldd_results 2> /dev/zero
>>> done
>>>
>>> <end script>
>>>
>>
>> This will probably come closer to what you actually want to look for.
>>
>> Writing to /dev/zero ... I don't know, I have never tried it, since
>> /dev/null is usually the standard place to throw trash.
>>
>> #!/bin/sh
>> for file in `find /*bin /usr/*bin /usr/lib* /usr/local/*bin -type f`; do
>>     echo $file
>>     ldd $file >>/root/ldd_results 2>/dev/null
>> done
>>
>> The problem with your script is that most of the files it finds are ones
>> that ldd either cannot process or is not useful to run on, so it leaves
>> you junk in return.
>>
>> It might be more useful if you searched for dynamically linked ELF
>> binaries to run ldd against, like the following.
>>
>> === Script starts here ===
>> #!/bin/sh
>>
>> SEARCHPATH="/*bin /usr/*bin /usr/lib* /usr/local/*bin"
>>
>> trap 'exit 1' 2
>>
>> check_libs() {
>>     for spath in $SEARCHPATH; do
>>         for ifelf in `find $spath -type f`; do
>>             ldd `file $ifelf | grep dynamically | cut -f1 -d:`
>>         done
>>     done
>> }
>>
>> check_libs 2>/dev/null
>> === Script ends here ===
>>
>> The above will find all dynamically linked ELF files under the paths in
>> the SEARCHPATH variable, run ldd on them, and print the results to stdout.
>>
>> Obviously, since you are going to have thousands of files being questioned,
>> stdout is not going to be useful.
>>
>> So with the above stated:
>> save the script to: checklibs.sh
>> run with: "sh checklibs.sh >/root/checklibs_output"
>> or: "script /root/checklibs_output checklibs.sh"
>>
>>> After the upgrade to r205248, the script
>>> freezes at seemingly random points.
>>>
>>
>> Unneeded disk usage & execution.
>>
>>> I can still ssh to the machine (using keys), i.e.
>>> I see the welcome message, but cannot get to the console prompt.
>>
>> Of course... too many open files or processes in wait. SSH already has the
>> information it needs loaded into memory, that's why you can get sort-of-in.
>>
>> ZFS file-system perhaps ?
>
> I've no ZFS.
>
> I'm seeing very similar behaviour now with csup:
>
> ( I do csup -L2 /root/ports-supfile, where
>
> # cat /root/ports-supfile
> *default host=cvsup.uk.FreeBSD.org
> *default base=/var/db
> *default prefix=/usr
> *default release=cvs tag=. delete use-rel-suffix compress
>
> ports-all
> # )
>
> top(1) shows:
>
> last pid:  1160;  load averages:  0.00,  0.06,  0.07   up 0+00:10:53  15:05:52
> 81 processes:  3 running, 61 sleeping, 17 waiting
> CPU 0:  0.0% user,  0.0% nice,  0.2% system,  0.0% interrupt, 99.8% idle
> CPU 1:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
> Mem: 23M Active, 19M Inact, 75M Wired, 136K Cache, 34M Buf, 5900M Free
> Swap: 2780M Total, 2780M Free
>
>   PID    UID  THR PRI NICE    SIZE    RES STATE   C   TIME    WCPU COMMAND
>    10      0    2 171 ki31      0K    64K RUN     0  20:18 198.00% idle
>    11      0   17 -48    -      0K   544K WAIT    0   0:01   0.00% intr
>  1118   1001    1  96    0  12800K  3920K CPU0    0   0:00   0.00% top
>     4      0    1  -8    -      0K    32K -       1   0:00   0.00% g_down
>  1158      0    4  -8    0  43672K  6296K biowr   0   0:00   0.00% csup
>
>
> which stays in biowr state indefinitely.
>
> I can issue kill -9 or kill -HUP from top(1),
> which makes csup change state to STOP, but
> nothing else happens.
>
> As before, I can't log in from other terminals
> and have to do a cold reset. I've reinstalled
> on another disk, so not sure what's going on.
>
> I think rm(1) is also extremely slow, but
> maybe I'm imagining things.
>
> many thanks
> anton
>
>

I would post the contents of your make.conf, your kernel config and your
dmesg somewhere so they can be evaluated.

Regards,

--
 jhell
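
For reference, a hedged variation on the checklibs.sh idea quoted above.
It is only a sketch, not what either poster ran: it skips files that
file(1) does not report as dynamically linked (so ldd is never called
with an empty argument) and prints only the libraries ldd cannot resolve.
The function names is_dynamic_elf and check_libs, and the assumption that
unresolved libraries show up as "not found" in ldd output, are mine.

```shell
#!/bin/sh
# Sketch only: list shared libraries that ldd cannot resolve.
# is_dynamic_elf and check_libs are illustrative names, not from the thread.

# Succeed only if file(1) describes the argument as dynamically linked.
is_dynamic_elf() {
    file "$1" 2>/dev/null | grep -q 'dynamically linked'
}

# For every regular file under the given paths, run ldd on the dynamically
# linked ones and print "<binary>: <ldd line>" for each unresolved library.
# Assumption: ldd marks unresolved libraries with the text "not found".
check_libs() {
    for spath in "$@"; do
        [ -e "$spath" ] || continue          # skip globs that matched nothing
        find "$spath" -type f 2>/dev/null | while IFS= read -r f; do
            is_dynamic_elf "$f" || continue  # static, script, data: skip
            ldd "$f" 2>/dev/null | grep 'not found' |
            while IFS= read -r line; do
                printf '%s: %s\n' "$f" "$line"
            done
        done
    done
}
```

Usage, assuming the functions are saved in checklibs.sh with the call
appended as its last line:

  check_libs /*bin /usr/*bin /usr/lib* /usr/local/*bin

then run "sh checklibs.sh >/root/checklibs_output" as before. Reading the
find output line by line also avoids building one huge backtick word list.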
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?alpine.BSF.2.00.1003200726250.71443>