Date:      Sat, 21 Feb 2009 14:46:44 -0500
From:      Junsuk Shin <junsukshin@gmail.com>
To:        freebsd-questions@freebsd.org
Subject:   read two files simultaneously
Message-ID:  <7873ac110902211146k6a8ee7d0pd67edc559ed14b15@mail.gmail.com>

Hello,

I need to read two files simultaneously, and I simply interleave
read(2) calls to do this. The problem is that the performance varies
dramatically depending on the file size. I'm wondering what the
problem is in this case.

The test application does the following:

open 2 files
  - the two files are the same size
  - since I read each file only once, bypass the cache with O_DIRECT
read 16K of file1, then read 16K of file2, and so on

The simplified code looks like this:

fd1 = open(file1, O_RDONLY | O_DIRECT);
fd2 = open(file2, O_RDONLY | O_DIRECT);

for(...) {
    /* read 16K of file1 */
    while(...) {
        count = read(fd1,...);
        ....
    }
    /* read 16K of file2 */
    while(...) {
        count = read(fd2,...);
        ....
    }
}
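
For reference, a fuller sketch of the loop above might look like the
following. This is only an illustration under stated assumptions, not
the actual test program: the names read_chunk and interleaved_read, the
extra_flags parameter, and the 4096-byte buffer alignment are all
hypothetical. The open flags are passed in so that O_DIRECT (as used in
the post) can be supplied by the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `want` bytes from fd, looping over short reads.
 * Stops early at EOF. Returns bytes read, or -1 on error. */
ssize_t read_chunk(int fd, char *buf, size_t want)
{
    size_t got = 0;
    while (got < want) {
        ssize_t count = read(fd, buf + got, want - got);
        if (count < 0)
            return -1;
        if (count == 0)          /* EOF */
            break;
        got += (size_t)count;
    }
    return (ssize_t)got;
}

/* Alternate chunk-sized reads between two files until both hit EOF.
 * extra_flags is OR-ed into O_RDONLY (e.g. O_DIRECT, as in the post).
 * Returns total bytes read from both files, or -1 on error. */
long long interleaved_read(const char *file1, const char *file2,
                           size_t chunk, int extra_flags)
{
    long long total = -1;
    void *buf = NULL;
    int fd1, fd2;

    assert(chunk > 0);
    fd1 = open(file1, O_RDONLY | extra_flags);
    fd2 = open(file2, O_RDONLY | extra_flags);

    /* O_DIRECT usually wants an aligned buffer; 4096 is an assumption. */
    if (fd1 >= 0 && fd2 >= 0 && posix_memalign(&buf, 4096, chunk) == 0) {
        ssize_t n1, n2;
        total = 0;
        do {
            n1 = read_chunk(fd1, buf, chunk);   /* one chunk of file1 */
            n2 = read_chunk(fd2, buf, chunk);   /* then one of file2 */
            if (n1 < 0 || n2 < 0) {
                total = -1;
                break;
            }
            total += n1 + n2;
        } while (n1 > 0 || n2 > 0);
    }

    free(buf);
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return total;
}
```

A call matching the test described above would be something like
interleaved_read(file1, file2, 16 * 1024, O_DIRECT).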

When I test with two 100M files, it takes 3.17 seconds (about 31MB/s
per file, 62MB/s in total).
However, when I test with two 700M files, it takes 162 seconds (about
4.5MB/s per file, 9MB/s in total).

I'm just guessing that the inode structure or the physical location of
the files on the HDD might be related to this. But if I read only one
file, the size doesn't matter: reading a single file (10M, 100M, or
700M) consistently gives about 70MB/s. The weird behavior only shows up
when I read two big files.

Seek time might be related to this, but the difference looks too
large for that. What is going on here?
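
One rough way to put the seek-time guess in numbers is to divide each
elapsed time by the number of 16K reads issued. The helper below is a
hypothetical illustration (the name ms_per_chunk is not from the test
program), and it assumes "M" and "K" mean binary units (1M = 1024 *
1024 bytes, 16K = 16384 bytes).

```c
#include <assert.h>

/* Average wall-clock time per chunk-sized read, in milliseconds.
 * file_bytes is the size of ONE file; nfiles files of that size are
 * read in chunk_bytes pieces over `seconds` of elapsed time. */
double ms_per_chunk(double seconds, double file_bytes,
                    double chunk_bytes, int nfiles)
{
    double chunks;

    assert(chunk_bytes > 0 && nfiles > 0);
    chunks = nfiles * (file_bytes / chunk_bytes);
    return seconds * 1000.0 / chunks;
}
```

Under those assumptions, the two 100M files give 12800 reads in 3.17
seconds, or about 0.25 ms per 16K read, while the two 700M files give
89600 reads in 162 seconds, or about 1.8 ms per 16K read.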

Thanks.

-- 
Junsuk


