FreeBSD Mail Archives

Date:      Mon, 13 Nov 1995 21:34:16 -0800 (PST)
From:      Julian Elischer <julian@ref.tfs.com>
To:        uhclem%nemesis@fw.ast.com (Frank Durda IV)
Cc:        simonm@dcs.gla.ac.uk, davidg@Root.COM, current@freebsd.org
Subject:   Re: Disk I/O that binds
Message-ID:  <199511140534.VAA26450@ref.tfs.com>
In-Reply-To: <m0tFD4o-000J7hC@nemesis.lonestar.org> from "Frank Durda IV" at Nov 13, 95 10:34:00 pm

Last I checked it was a double-queue elevator sort...
any block requested that is behind you gets put in the "next pass"
queue, and anything in front of you gets put in front of you in the
"present pass" queue.. direction of travel never changes..
(I just checked again, it hasn't changed)
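
A minimal sketch of that two-queue, one-direction scheme, for illustration only. This is not the kernel's disksort code; the class name, the method names, and the choice to treat a request for the current block as "in front" are all assumptions.

```python
import bisect


class OneWayElevator:
    """Illustrative two-queue elevator: the head only ever moves upward."""

    def __init__(self, head=0):
        self.head = head      # block number of the last request served
        self.present = []     # sorted requests at or ahead of the head
        self.next_pass = []   # sorted requests behind the head

    def add(self, block):
        # Assumption: a request equal to the head position counts as "ahead".
        queue = self.present if block >= self.head else self.next_pass
        bisect.insort(queue, block)

    def pop(self):
        # When the present pass is exhausted, the next-pass queue becomes
        # the present pass and the sweep restarts from the lowest block.
        if not self.present:
            self.present, self.next_pass = self.next_pass, []
        if not self.present:
            return None
        self.head = self.present.pop(0)
        return self.head


q = OneWayElevator(head=50)
for block in (70, 10, 60, 30):
    q.add(block)
order = [q.pop() for _ in range(4)]
print(order)
```

With the head at block 50, requests 60 and 70 are served on the present pass, and 10 and 30 wait for the next one.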

It's possible that if one process has started reading a large file that was
written onto the disk sequentially, then other processes may starve..
it goes like this:

process 1 reads block x, and the readahead puts in a request for block x+1
process 1 reads block x+1, readahead inserts block x+2

etc.


as the disk is walking slowly forwards along the disk, any process looking for a block BEFORE x is missing out..
this should only be a problem for systems where the cpu is fast enough
to deal with the present block and put its request for the next one in before the disk has completed transfer of the readahead block.
In effect the faster the system, the more problem you're going to have...
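
The starvation can be seen in a toy simulation. This is a sketch under stated assumptions, not a model of the real driver: block numbers are plain integers, the sequential reader (the "hog") always gets its read-ahead queued before the current transfer finishes, and the function name and parameters are made up for illustration.

```python
import bisect


def served_before_victim(hog_start, hog_len, victim_block):
    """Count requests served before a request behind the head gets its turn.

    Assumes victim_block < hog_start, so the victim lands in the
    next-pass queue of a one-direction elevator sort.
    """
    if victim_block >= hog_start:
        raise ValueError("victim must be behind the head for this scenario")
    present = [hog_start]       # the hog's first read
    next_pass = [victim_block]  # behind the head, so it waits
    served = 0
    while True:
        if not present:
            # Present pass exhausted: the next-pass queue takes over.
            present, next_pass = next_pass, []
        head = present.pop(0)
        if head == victim_block:
            return served
        served += 1
        # Fast CPU: the read-ahead for the following block is queued
        # before the disk goes idle, and it is always ahead of the head.
        following = head + 1
        if following < hog_start + hog_len:
            bisect.insort(present, following)


print(served_before_victim(hog_start=1000, hog_len=500, victim_block=10))
```

In this model the victim is served only after every remaining block of the hog's file has been read, because the present-pass queue never empties while read-ahead keeps feeding it.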

however, a 5 second wait would imply reading about 25MB of data!
(though I guess a process that needs to do 10 reads needs to get past this hog
10 times before being able to proceed; it might be getting little bits
of disk at a time.)
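
The arithmetic behind that estimate, as a small sketch. The 5MB per second rate is an assumption inferred from the two figures above (25MB in 5 seconds), not a measured number, and the function name is made up.

```python
def data_streamed_mb(wait_seconds, mb_per_sec=5.0):
    """Data the sequential reader moves while another process waits.

    mb_per_sec is an assumed sustained transfer rate, inferred from
    "5 seconds" and "about 25MB"; real drives of the period varied.
    """
    return wait_seconds * mb_per_sec


# One uninterrupted 5 second stall.
print(data_streamed_mb(5))
# The same 5 seconds split evenly over 10 separate waits.
print(data_streamed_mb(5 / 10))
```

Under that assumed rate, ten waits of half a second each need only about 2.5MB of read-ahead apiece, which is much easier to hit than one 25MB run.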

Possibly the only way to fix this would be a 'fair share' stricture,
in which a system voluntarily puts the lookahead block in the second
queue after a certain number of successful read-aheads..
Certainly the disk clustering code must have affected this..
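
A rough sketch of what such a cap could look like, using a toy one-direction elevator with a sequential reader and one request waiting behind the head. Everything here is hypothetical: the function name, the max_readaheads parameter, and the policy of resetting the count at the start of each pass are assumptions, not anything taken from the kernel.

```python
import bisect


def served_before_victim_capped(hog_start, hog_len, victim_block,
                                max_readaheads):
    """Count requests served before the waiting request gets its turn.

    After max_readaheads consecutive read-aheads in one pass, the next
    read-ahead is placed in the next-pass queue instead of the present one.
    Assumes victim_block < hog_start.
    """
    present = [hog_start]
    next_pass = [victim_block]
    served = 0
    streak = 0  # read-aheads granted during the current pass
    while True:
        if not present:
            present, next_pass = next_pass, []
            streak = 0  # assumption: the allowance resets every pass
        head = present.pop(0)
        if head == victim_block:
            return served
        served += 1
        following = head + 1
        if following < hog_start + hog_len:
            if streak < max_readaheads:
                bisect.insort(present, following)
                streak += 1
            else:
                # Fair share: defer the read-ahead so the pass can end.
                bisect.insort(next_pass, following)


print(served_before_victim_capped(1000, 500, 10, max_readaheads=8))
```

With a cap of 8 the waiting request is reached after 9 of the hog's blocks rather than all 500; a cap at least as large as the file reproduces the uncapped behaviour.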

Anyone with a Linux machine care to check what they have as a disk-sort?
I bet they have something similar...

> 
> I have just run a few tests and have found a way to get a bind to occur
> in just a few tries.  All I/O was on SCSI drives (NO IDE).  Hard disk
> was 2GB Seagate Barracuda and SCSI was 1540B Adaptec.  1104 stock kernel
> (also done on a 2.0.5 with driver deletions kernel), 8MB RAM.
> 
> 1.	Kill any processes that might be doing a lot of writes in
> 	background, such as tind, kick off UUCP, the users, etc.  It seems
> 	OK to leave update, sendmail and other intermittent items running,
> 	although it may fail faster with them eliminated too.
> 
> 2.	cd /usr/spool/news (assuming you have news) on one multiscreen.
> 	Type  cp history /dev/null
> 	(or you can use some other extremely large file.  My copy of history
> 	was 29Meg.  History is usually a bit fragmented, although I don't
> 	know if that is a factor.)
> 
> 3.	On a different multiscreen, do a ls -alR of *ANY* filesystem
> 	located on the same drive.  (It can be a different slice).
> 
> Now watch the ls progress.  It will probably run fast for 40 to 80 seconds
> and then it will slow and stop.  Each time it pauses, start counting
> and note where you are path-wise.  Then when it resumes, note how
> many files were in the directory it took a long time on.
> 
> When you hit a directory that has less than ten files in it and it
> takes 20 seconds or more to display it, you are seeing the problem.   If
> you can't get it to happen right away, do a few "!!"s on the screen
> with the cp so that it won't run out of things to do and give you false
> results. 
> 
> You may note when the ls pauses, the hard disk seems to go quiet also
> (less seeking), although the SCSI controller light remains on solid.
> 
> I did a ps -alx on a third screen while the ls was stuck.  In my case,
> the directory it was stuck on had six files, one subdirectory, and took
> 27 seconds to resume and display.  (The subdirectory contained one file.)
> It also paused on the next three or four directories for excessive
> amounts of time vs the number of files present in the directories.
> The ps shows that the cp was in "getblk D+" while the ls was in "biowai D+".
> 
> Note that there is no disk writing going on here in the test commands.
> I was able to get it to fail with disk writing, such as changing the
> cp history /dev/null to cp history xyzzy, but it seemed to take a lot longer
> to fail.  I also didn't kill update or any of the basic services, so there
> was someone doing a write once in a while, even during the first example.
> 
> This smells like a disksort implementation flaw that resets direction each
> time an item is added to the queue, rather than completely exhausting the
> queue in one direction before reversing direction.  Something like an
> elevator sort should be done.
> 
> Frank Durda IV <uhclem@nemesis.lonestar.org>|"The Knights who say "LETNi"
> or uhclem%nemesis@fw.ast.com (Fastest Route)| demand...  A SEGMENT REGISTER!!!"
> ...letni!rwsys!nemesis!uhclem               |"A what?"
> ...decvax!fw.ast.com!nemesis!uhclem         |"LETNi! LETNi! LETNi!"  - 1983
> 
>

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199511140534.VAA26450>
