Date:      Wed, 21 Nov 2007 21:18:56 +0100
From:      Kris Kennaway <kris@FreeBSD.org>
To:        Kris Kennaway <kris@FreeBSD.org>
Cc:        freebsd-hackers@freebsd.org, Panagiotis Christias <christias@gmail.com>, freebsd-stable@freebsd.org, Alexey Popov <lol@chistydom.ru>
Subject:   Re: amrd disk performance drop after running under high load
Message-ID:  <474492B0.1010108@FreeBSD.org>
In-Reply-To: <4741EE9E.9050406@FreeBSD.org>
References:  <47137D36.1020305@chistydom.ru> <47149E6E.9000500@chistydom.ru>	 <4715035D.2090802@FreeBSD.org> <4715C297.1020905@chistydom.ru>	 <4715C5D7.7060806@FreeBSD.org> <471EE4D9.5080307@chistydom.ru>	 <4723BF87.20302@FreeBSD.org> <47344E47.9050908@chistydom.ru>	 <47349A17.3080806@FreeBSD.org> <47373B43.9060406@chistydom.ru> <e4b0ecef0711111531k449f78fbnf7f3241b768498ad@mail.gmail.com> <4739557A.6090209@chistydom.ru> <4741EE9E.9050406@FreeBSD.org>

Kris Kennaway wrote:
> Alexey Popov wrote:
>> Hi.
>>
>> Panagiotis Christias wrote:
>>>>>>> In the "good" case you are getting a much higher interrupt rate but
>>>>>>> with the data you provided I can't tell where from.  You need to run
>>>>>>> vmstat -i at regular intervals (e.g. every 10 seconds for a minute)
>>>>>>> during the "good" and "bad" times, since it only provides counters
>>>>>>> and an average rate over the uptime of the system.
>>>>>> Now I'm running a 10-process lighttpd and the problem became less severe.
>>>>>>
>>>>>> I collected interrupt statistics and they show no correlation between
>>>>>> interrupts and the slowdowns. Here they are:
>>>>>> http://83.167.98.162/gprof/intr-graph/
>>>>>>
>>>>>> I also have similar statistics from mutex profiling, and they show
>>>>>> no problem in the mutexes.
>>>>>> http://83.167.98.162/gprof/mtx-graph/mtxgifnew/
>>>>>>
>>>>>> I have no idea what else to check.
>>>>> I don't know what this graph is showing me :)  When precisely is the
>>>>> system behaving poorly?
>>> what is your RAID controller configuration (read ahead/cache/write
>>> policy)? I have seen weird/bogus numbers (~100% busy) reported by
>>> systat -v when read ahead was enabled on LSI/amr controllers.
>>
>>
>> **********************************************************************
>>               Existing Logical Drive Information
>>               By LSI Logic Corp.,USA
>>
>> **********************************************************************
>>           [Note: For SATA-2, 4 and 6 channel controllers, please specify
>>           Ch=0 Id=0..15 for specifying physical drive(Ch=channel,
>> Id=Target)]
>>
>>
>>           Logical Drive : 0( Adapter: 0 ):  Status: OPTIMAL
>>         ---------------------------------------------------
>>         SpanDepth :01     RaidLevel: 5  RdAhead : Adaptive  Cache: 
>> DirectIo
>>         StripSz   :064KB   Stripes  : 6  WrPolicy: WriteBack
>>
>>         Logical Drive 0 : SpanLevel_0 Disks
>>         Chnl  Target  StartBlock   Blocks      Physical Target Status
>>         ----  ------  ----------   ------      ----------------------
>>         0      00    0x00000000   0x22ec0000   ONLINE
>>         0      01    0x00000000   0x22ec0000   ONLINE
>>         0      02    0x00000000   0x22ec0000   ONLINE
>>         0      03    0x00000000   0x22ec0000   ONLINE
>>         0      04    0x00000000   0x22ec0000   ONLINE
>>         0      05    0x00000000   0x22ec0000   ONLINE
>>
>> I tried running with read-ahead disabled, but it didn't help.
> 
> I just ran into this myself, and apparently it can be caused by "Patrol 
> Reads" where the adapter periodically scans the disks to look for media 
> errors.  You can turn this off using -stopPR with the megarc port.
> 
> Kris
> 

Oops, -disPR is the correct command to disable Patrol Reads; -stopPR just 
halts a PR event already in progress.
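For reference, a sketch of the megarc invocations discussed above. The -disPR and -stopPR flag names come from this thread; the -a0 adapter selector is an assumption (megarc addresses controllers by adapter number), so check megarc's own help output on your system before running these.

```shell
# Disable Patrol Read permanently on adapter 0 (assumed adapter number)
megarc -disPR -a0

# Halt a Patrol Read event that is currently in progress on adapter 0
megarc -stopPR -a0
```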

Kris
