Date:      Tue, 28 Mar 2006 21:27:21 +1100
From:      Peter Jeremy <peterjeremy@optushome.com.au>
To:        Mikhail Teterin <mi+mx@aldan.algebra.com>
Cc:        stable@freebsd.org
Subject:   Re: Reading via mmap stinks (Re: weird bugs with mmap-ing via NFS)
Message-ID:  <20060328102721.GA2352@turion.vk2pj.dyndns.org>
In-Reply-To: <20060325103927.GE703@turion.vk2pj.dyndns.org>
References:  <200603232352.k2NNqPS8018729@gate.bitblocks.com> <200603241518.01027.mi%2Bmx@aldan.algebra.com> <20060325103927.GE703@turion.vk2pj.dyndns.org>

On Sat, 2006-Mar-25 21:39:27 +1100, Peter Jeremy wrote:
>What happens if you simulate read-ahead yourself?  Have your main
>program fork and the child access pages slightly ahead of the parent
>but do nothing else.

I suspect something like this may be the best approach for your application.

My suggestion would be to split the backup into 3 processes that share
memory.  I wrote a program that is designed to buffer data in what looks
like a big FIFO and "dump | myfifo | gzip > file.gz" is significantly
faster than "dump | gzip > file.gz" so I suspect it will help you as well.

Process 1 reads the input file into mmap A.
Process 2 {b,gz}ips mmap A into mmap B.
Process 3 writes mmap B into the output file.

Process 3 and mmap B may be optional, depending on your target's write
performance.

mmap A could be the real file with process 1 just accessing pages to
force them into RAM.

I'd suggest that each mmap be capable of storing several hundred msec of
data as a minimum (maybe 10MB input and 5MB output, preferably more).
Synchronisation can be done by writing tokens into pipes shared alongside
the mmaps, optimised by sharing read/write pointers (so you only really
need the tokens when the shared buffer is full or empty).

-- 
Peter Jeremy


