Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 16 Dec 1998 14:17:57 -0700 (MST)
From:      Alex Rousskov <rousskov@nlanr.net>
To:        FreeBSD-FS@FreeBSD.ORG
Subject:   Fast/slow close and open
Message-ID:  <Pine.BSF.3.96.981216115141.29208E-100000@pail.scd.ucar.edu>

next in thread | raw e-mail | index | archive | help
Hi there,

	I have a strange problem that is probably FS related. Any help is
appreciated. 

Background:
----------

	- A small program ("player") replays a trace from Squid Web proxy
	  running under Web Polygraph benchmark
	- The trace consists of ~ 300K of open(2)/write(2)/close(2) file
	  system calls
	- The calls may interleave
	  (e.g. open(#5), write(#3), write(#5), close(#3), write(#5))
	  but all FS calls are blocking (no threads and such)
	- No artificial delays between the calls are introduced
	- Most files are "small": 11K mean, 7.5K median
	- No other activity on the system

	- FreeBSD 2.2.7-RELEASE
	- 256 RAM; Pentium II (267.27-MHz 686-class CPU)
	- kern.update is set to 43200  (12 hours)
	  (for these tests I do not care about FS consistency)
	- 9GB disk(s) 
	  ["IBM DDRS-39130W S92A" type 0 fixed SCSI 2; Direct-Access 8715MB] 
	  with one partition per disk
	- newfs -o time
	  mount options: rw,noauto
	- each disk has /cache?/??/???/ directories pre-created 
	  (1x16x128 directories); no other data on the disk
	- Squid (and hence player) fill leaf directories with files 
	  one leaf directory at a time 
	  (until there are 128 files in the directory)
	- Disk space utilization starts with 0% and is 20%
	  at the end of the end of each experiment


The problem:
-----------

	The player measures open/write/close delays using gettimeofday() 
calls wrapped around file system calls. I monitor sudden peaks and dives in
close and open calls throughput:

	http://ircache.nlanr.net/Polygraph/tmp/

	During peaks, close(2) response time DEcreases from a mean of 17 msec
to tens of usec(!) and open(2) response time INcreases from 20 to 27 msec or
so. During dives, both response times increase 50-400%. Write(2) response
time is very fast (e.g., 300 usec mean) and is virtually not affected by the
peaks and dives. 

	There are no peaks for 3- and 4- disks experiments. Dives are present
on 1-, 2-, 3-, and 4-disk runs. 

	There are up to three bursts within a ~2-3 hour experiment so it is
hard to say if they occur at "close-to-regular" intervals. Usually the bursts
are 40-80 minutes apart. Each burst lasts 5-10 minutes (9K-18K open/close
calls) so it is not a "random" thing.

	The same behavior was measured on Squid. The player program is very
small and simple and confirms that those oddities are not caused by something
in Squid or in the network.


Question:
--------

	What kind of system activity can suddenly significantly speedup or
slowdown open/close calls? Since it happens once in 40-80 minutes and lasts
for several minutes (with 30-50 open_calls/sec rate) it is probably something
major.. 

I am especially confused by close calls being almost zero-overhead in the
middle of a big experiment. Looks like some write-behind buffers suddenly
appear out of nowhere, get used (speeding up close calls), and then
disappear. Dives are also very disturbing as they hurt overall performance...

Any clues?

Thank you,

Alex.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-fs" in the body of the message



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.BSF.3.96.981216115141.29208E-100000>