Date:      Fri, 22 Apr 2005 08:33:20 -0700
From:      "Kevin Oberman" <>
To:        "Daniel Eriksson" <>
Cc:        'FreeBSD Current' <>
Subject:   Re: Serious I/O problems (bad performance and live-lock) 
Message-ID:  <>
In-Reply-To: Your message of "Fri, 22 Apr 2005 15:13:43 +0200." <!~!> 

> From: "Daniel Eriksson" <>
> Date: Fri, 22 Apr 2005 15:13:43 +0200
> With recent CURRENT (at least for the last 2 days, but probably longer), two
> of my systems can be brought to their knees (live-lock) with a simple "dd
> if=/dev/zero of=test bs=128k" command. I have not tested any other systems.
> I keep both servers synced running 6-CURRENT:
> Server #1: dual AthlonMP 2600+, Compaq SmartArray 5302/64 hardware raid card
> (ciss). The card hosts two arrays, one RAID-5 built from 4 discs that holds
> the system and one RAID-0 built from 14 discs. All the discs are 36GB 10krpm
> and I have one array on each channel on the card.
> Server #2: AthlonXP 2500+ with an old Maxtor 27GB UDMA66 disc for the
> system.
> What made me take notice was that server #2 ran through a "make
> installkernel; make installworld" faster than server #1 during a recent
> upgrade. This makes no sense given the superior I/O performance of the
> hardware scsi raid array on server #1, and I know that in the past server #1
> has finished the process ahead of server #2.
> After the upgrade was done I ran some simple tests with 'dd', and it only
> took ~1 minute for the system to live-lock. Breaking into DDB and killing
> the 'dd' process brought the machine back to life. I assumed the problem was
> ciss-related, CAM-related or SMP-related, but I just tried doing the same
> thing on the UP machine (server #2), and it too live-locked within a minute.
> Both systems use pretty much the same config, with the only major difference
> being SMP or not:
> * debug.mpsafenet="1", debug.mpsafevfs="1"
> The problem manifests itself like this:
> Shortly after 'dd' is started, the machine starts to swap.
> The swapping makes the machine very unresponsive.
> After about a minute or so the machine enters some sort of live-lock where
> the IP-stack replies to icmp echos, but nothing else can be done.
> The last test I did was on a system compiled from sources dated
> 2005. (earlier today). The oldest system I've tested is from
> 2005. (but I did notice the system being slightly sluggish
> earlier in the week too, so I think the problem is older than that).
> This is a serious regression! I don't know when I last did any testing with
> 'dd', but I'm pretty sure it was less than 3 months ago (and back then
> neither system live-locked).

I had been seeing similar problems (very painful to get anything done)
for several days, but the kernel I built from CURRENT as of
2005. seems to have fixed the problem for me.

R. Kevin Oberman, Network Engineer
Energy Sciences Network (ESnet)
Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
Phone: +1 510 486-8634
