From owner-freebsd-performance@FreeBSD.ORG  Sun Feb 20 17:36:49 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 293EE16A4CE
	for <freebsd-performance@freebsd.org>;
	Sun, 20 Feb 2005 17:36:49 +0000 (GMT)
Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.194])
	by mx1.FreeBSD.org (Postfix) with ESMTP id ABFE843D46
	for <freebsd-performance@freebsd.org>;
	Sun, 20 Feb 2005 17:36:48 +0000 (GMT)
	(envelope-from joseph.koshy@gmail.com)
Received: by rproxy.gmail.com with SMTP id j1so165882rnf
	for <freebsd-performance@freebsd.org>;
	Sun, 20 Feb 2005 09:36:48 -0800 (PST)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws;
	s=beta; d=gmail.com;
	h=received:message-id:date:from:reply-to:to:subject:mime-version:content-type:content-transfer-encoding;
	b=QDf6g5cYdjtJ33lAMGpKyUVAZVqTDfCfIfzrWE99KK0eo1iyE50D8yUXLaiBOcvPxxiummIHzMVdv/WrZSsSON2/pL3vEriE4RtKiANJ91tW7XZrDQoeIrKbsMaK+lYeIkhxs9W3RL3PIN0X6zykBTZ8CrCYBTucFQBGbFBVAfA=
Received: by 10.38.97.18 with SMTP id u18mr73526rnb;
        Sun, 20 Feb 2005 09:36:48 -0800 (PST)
Received: by 10.38.209.12 with HTTP; Sun, 20 Feb 2005 09:36:47 -0800 (PST)
Message-ID: <84dead72050220093654a8bc4c@mail.gmail.com>
Date: Sun, 20 Feb 2005 17:36:47 +0000
From: Joseph Koshy <joseph.koshy@gmail.com>
To: FreeBSD Current <freebsd-current@freebsd.org>,
	freebsd-performance@freebsd.org
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Subject: [hwpmc support] new code snapshot
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: Joseph Koshy <joseph.koshy@gmail.com>
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 20 Feb 2005 17:36:49 -0000

A snapshot of the HWPMC based performance monitoring code is
available:

http://people.freebsd.org/~jkoshy/projects/perf-measurement/snapshot-5.html

Please take it for a spin.

-- 
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy

From owner-freebsd-performance@FreeBSD.ORG  Mon Feb 21 20:34:53 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id AB84516A4CF; Mon, 21 Feb 2005 20:34:53 +0000 (GMT)
Received: from corp.globat.com (corp.globat.com [216.193.201.6])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 27B7943D48; Mon, 21 Feb 2005 20:34:53 +0000 (GMT)
	(envelope-from drice@globat.com)
Received: from globat.com (globat [66.159.202.156])
	by corp.globat.com (8.12.11/8.12.9) with ESMTP id j1LKYnuP080837;
	Mon, 21 Feb 2005 12:34:49 -0800 (PST)
	(envelope-from drice@globat.com)
From: David Rice <drice@globat.com>
Organization: Globat
To: Robert Watson <rwatson@FreeBSD.org>
Date: Mon, 21 Feb 2005 12:34:51 -0800
User-Agent: KMail/1.5.4
References: <Pine.NEB.3.96L.1050219121641.67347L-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1050219121641.67347L-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200502211234.51976.drice@globat.com>
cc: freebsd-performance@FreeBSD.org
Subject: Re: High traffic NFS performance and availability problems
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 21 Feb 2005 20:34:53 -0000


Here are the snapshots of the output you requested. These are from the NFS 
server. We have just upgraded them to 5.3-RELEASE as so many have recomended.
Hope that makes them more stable. The performance still needs some attention.

Thank You

--------------------------------------------------------------------------------------------------
D USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
    4 users    Load  5.28 19.37 28.00                  Feb 21 12:18

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act   19404    2056    90696     3344   45216 count
All 1020204    4280  4015204     7424         pages
                                                          zfod   Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow    7226 total
           5128  5  60861    3  14021584    9      152732 wire        4: sio0
                                                    23228 act         6: fdc0
30.2%Sys  11.8%Intr  0.0%User  0.0%Nice 58.0%Idl   803616 inact   128 8: rtc
|    |    |    |    |    |    |    |    |    |      43556 cache       13: npx
===============++++++                                1660 free        15: ata
                                                          daefr  6358 16: bge
Namei         Name-cache    Dir-cache                     prcfr     1 17: bge
    Calls     hits    %     hits    %                     react       18: mpt
     1704      971   57       11    1                     pdwak       19: mpt
                                                     5342 pdpgs   639 24: amr
Disks amrd0   da0 pass0 pass1 pass2                       intrn   100 0: clk
KB/t  22.41  0.00  0.00  0.00  0.00                114288 buf
tps     602     0     0     0     0                   510 dirtybuf
MB/s  13.16  0.00  0.00  0.00  0.00                 70235 desiredvnodes
% busy  100     0     0     0     0                 20543 numvnodes
                                                     7883 freevnodes
-----------------------------------------------------------------------------------------
last pid: 10330;  load averages: 14.69, 11.81, 18.62                                                                                           
up 0+09:01:13  12:32:57
226 processes: 5 running, 153 sleeping, 57 waiting, 11 lock
CPU states:  0.1% user,  0.0% nice, 66.0% system, 24.3% interrupt,  9.6% idle
Mem: 23M Active, 774M Inact, 150M Wired, 52M Cache, 112M Buf, 1660K Free
Swap: 1024M Total, 124K Used, 1024M Free

  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
   63 root     -44 -163     0K    12K WAIT   0 147:05 45.07% 45.07% swi1: net
   30 root     -68 -187     0K    12K WAIT   0 101:39 32.32% 32.32% irq16: 
bge0
   12 root     117    0     0K    12K CPU2   2 329:09 19.58% 19.58% idle: cpu2
   11 root     116    0     0K    12K CPU3   3 327:29 19.24% 19.24% idle: cpu3
   13 root     114    0     0K    12K RUN    1 263:39 16.89% 16.89% idle: cpu1
   14 root     109    0     0K    12K CPU0   0 228:50 12.06% 12.06% idle: cpu0
  368 root       4    0  1220K   740K *Giant 3  45:27  7.52%  7.52% nfsd
  366 root       4    0  1220K   740K *Giant 0  48:52  7.28%  7.28% nfsd
  364 root       4    0  1220K   740K *Giant 3  53:01  7.13%  7.13% nfsd
  367 root      -8    0  1220K   740K biord  3  41:22  7.08%  7.08% nfsd
  372 root       4    0  1220K   740K *Giant 0  28:54  7.08%  7.08% nfsd
  365 root      -1    0  1220K   740K *Giant 3  51:53  6.93%  6.93% nfsd
  370 root      -1    0  1220K   740K nfsslp 0  32:49  6.84%  6.84% nfsd
  369 root      -8    0  1220K   740K biord  1  36:40  6.49%  6.49% nfsd
  371 root       4    0  1220K   740K *Giant 0  25:14  6.45%  6.45% nfsd
  374 root      -1    0  1220K   740K nfsslp 2  22:31  6.45%  6.45% nfsd
  377 root       4    0  1220K   740K *Giant 2  17:21  5.52%  5.52% nfsd
  376 root      -4    0  1220K   740K *Giant 2  15:45  5.37%  5.37% nfsd
  373 root      -4    0  1220K   740K ufs    3  19:38  5.18%  5.18% nfsd
  378 root       4    0  1220K   740K *Giant 2  13:55  4.54%  4.54% nfsd
  379 root      -8    0  1220K   740K biord  3  12:41  4.49%  4.49% nfsd
  380 root       4    0  1220K   740K -      2  11:26  4.20%  4.20% nfsd
    3 root      -8    0     0K    12K -      1  21:21  4.05%  4.05% g_up
    4 root      -8    0     0K    12K -      0  20:05  3.96%  3.96% g_down
  381 root       4    0  1220K   740K -      3   9:28  3.66%  3.66% nfsd
  382 root       4    0  1220K   740K -      1  10:13  3.47%  3.47% nfsd
  385 root      -1    0  1220K   740K nfsslp 3   7:21  3.17%  3.17% nfsd
   38 root     -64 -183     0K    12K *Giant 0  14:45  3.12%  3.12% irq24: 
amr0
  384 root       4    0  1220K   740K -      3   8:40  3.12%  3.12% nfsd
   72 root     -24 -143     0K    12K WAIT   2  16:50  2.98%  2.98% swi6:+
  383 root      -8    0  1220K   740K biord  2   7:57  2.93%  2.93% nfsd
  389 root       4    0  1220K   740K -      2   5:31  2.64%  2.64% nfsd
  390 root      -8    0  1220K   740K biord  3   5:54  2.59%  2.59% nfsd
  387 root      -8    0  1220K   740K biord  0   6:40  2.54%  2.54% nfsd
  386 root      -8    0  1220K   740K biord  1   6:22  2.44%  2.44% nfsd
  392 root       4    0  1220K   740K -      3   4:27  2.10%  2.10% nfsd
  388 root      -4    0  1220K   740K *Giant 2   4:45  2.05%  2.05% nfsd
  395 root       4    0  1220K   740K -      0   3:59  2.05%  2.05% nfsd
  391 root       4    0  1220K   740K -      2   5:10  1.95%  1.95% nfsd
  393 root       4    0  1220K   740K sbwait 1   4:13  1.56%  1.56% nfsd
  398 root       4    0  1220K   740K -      2   3:31  1.56%  1.56% nfsd
  399 root       4    0  1220K   740K -      3   3:12  1.56%  1.56% nfsd
  401 root       4    0  1220K   740K -      1   2:57  1.51%  1.51% nfsd
  403 root       4    0  1220K   740K -      0   3:04  1.42%  1.42% nfsd
  406 root       4    0  1220K   740K -      1   2:27  1.37%  1.37% nfsd
  397 root       4    0  1220K   740K -      3   3:16  1.27%  1.27% nfsd
  396 root       4    0  1220K   740K -      2   3:42  1.22%  1.22% nfsd


On Saturday 19 February 2005 04:23 am, Robert Watson wrote:
> On Thu, 17 Feb 2005, David Rice wrote:
> > Typicly we have 7 client boxes mounting storage from a single file
> > server.  Each client box servers 1000 web sites and associate email. We
> > have done the basic NFS tuning (ie: Read write size optimization and
> > kernel tuning)
>
> How many nfsd's are you running with?
>
> If you run systat -vmstat 1 on your server under high load, could you send
> us the output?  In particular, I'm interested in knowing how the system is
> spending its time, the paging level, I/O throughput on devices, and the
> systat -vmstat summary screen provides a good summary of this and more.  A
> few snapshots of "gstat" output would also be very helpful.  As would a
> snapshot or two of "top -S" output.  This will give us a picture of how
> the system is spending its time.
>
> > 2. Client boxes have high load averages and sometimes crashes due to
> > slow NFS performance.
>
> Could you be more specific about the crash failure mode?
>
> > 3. File servers that randomly crash with "Fatal trap 12: page fault
> > while in kernel mode"
>
> Could you make sure you're running with at least the latest 5.3 patch
> level on the server, which includes some NFS server stability fixes, and
> also look at sliding to the head of 5-STABLE?  There are a number of
> performance and stability improvements that may be relevant there.
>
> Could you provide serial console output of the full panic message, trap
> details, compile the kernel with KDB+DDB, and include a full stack trace?
> I'm happy to try to help debug these problems.
>
> > 4. With soft updates enabled during FSCK the fileserver will freeze with
> > all NFS processs in the "snaplck" state. We disabled soft updates
> > because of this.
>
> If it's possible to do get some more information, it would be quite
> helpful.  In particular, could you compile the server box with
> DDB+KDB+BREAK_TO_DEBUGGER, breka into the serial debugger when it appears
> wedged, and put the contents of "show lockedvnods", "ps", and "trace
> <pid>" of any processes listed in "show lockedvnods" output, that would be
> great.  A crash dump would also be very helpful.  For some hints on the
> information that is necessary here, take a look at the handbook chapter on
> kernel debugging and reporting kernel bugs, and my recent post to current@
> diagnosing a similar bug.
>
> If you e-enable soft updates but leave bgfsck disabled, does that correct
> this stability problem?
>
> In any case, I'm happy to help try to figure out what's going on -- some
> of the above information for stability and performance problems would be
> quite helpful in tracking it down.
>
> Robert N M Watson

From owner-freebsd-performance@FreeBSD.ORG  Mon Feb 21 20:38:43 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id E613D16A4CE; Mon, 21 Feb 2005 20:38:43 +0000 (GMT)
Received: from mh1.centtech.com (moat3.centtech.com [207.200.51.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id EC3DA43D66; Mon, 21 Feb 2005 20:38:42 +0000 (GMT)
	(envelope-from anderson@centtech.com)
Received: from [10.177.171.220] (neutrino.centtech.com [10.177.171.220])
	by mh1.centtech.com (8.13.1/8.13.1) with ESMTP id j1LKcfxs051438;
	Mon, 21 Feb 2005 14:38:42 -0600 (CST)
	(envelope-from anderson@centtech.com)
Message-ID: <421A46D1.1010003@centtech.com>
Date: Mon, 21 Feb 2005 14:38:41 -0600
From: Eric Anderson <anderson@centtech.com>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.5) Gecko/20050210
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: David Rice <drice@globat.com>
References: <Pine.NEB.3.96L.1050219121641.67347L-100000@fledge.watson.org>
	<200502211234.51976.drice@globat.com>
In-Reply-To: <200502211234.51976.drice@globat.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.82/718/Mon Feb 21 04:38:57 2005 on mh1.centtech.com
X-Virus-Status: Clean
cc: freebsd-performance@freebsd.org
cc: Robert Watson <rwatson@freebsd.org>
Subject: Re: High traffic NFS performance and availability problems
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 21 Feb 2005 20:38:44 -0000

David Rice wrote:
> Here are the snapshots of the output you requested. These are from the NFS 
> server. We have just upgraded them to 5.3-RELEASE as so many have recomended.
> Hope that makes them more stable. The performance still needs some attention.
[..snip..]
> Disks amrd0   da0 pass0 pass1 pass2                       intrn   100 0: clk
> KB/t  22.41  0.00  0.00  0.00  0.00                114288 buf
> tps     602     0     0     0     0                   510 dirtybuf
> MB/s  13.16  0.00  0.00  0.00  0.00                 70235 desiredvnodes
> % busy  100     0     0     0     0 


I think you are spindle bound - looks like the disk is maxed (heavy writes?).  What kind of disk subsystem do you have?

Eric


-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
I have seen the future and it is just like the present, only longer.
------------------------------------------------------------------------

From owner-freebsd-performance@FreeBSD.ORG  Mon Feb 21 21:27:52 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 72FD616A4CE
	for <freebsd-performance@FreeBSD.org>;
	Mon, 21 Feb 2005 21:27:52 +0000 (GMT)
Received: from cyrus.watson.org (cyrus.watson.org [204.156.12.53])
	by mx1.FreeBSD.org (Postfix) with ESMTP id BFF1343D46
	for <freebsd-performance@FreeBSD.org>;
	Mon, 21 Feb 2005 21:27:51 +0000 (GMT)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by cyrus.watson.org (Postfix) with SMTP id F257446B0A;
	Mon, 21 Feb 2005 16:27:50 -0500 (EST)
Date: Mon, 21 Feb 2005 21:26:14 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: David Rice <drice@globat.com>
In-Reply-To: <200502211234.51976.drice@globat.com>
Message-ID: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-performance@FreeBSD.org
Subject: Re: High traffic NFS performance and availability problems
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Mon, 21 Feb 2005 21:27:52 -0000

On Mon, 21 Feb 2005, David Rice wrote:

> Here are the snapshots of the output you requested. These are from the
> NFS server. We have just upgraded them to 5.3-RELEASE as so many have
> recomended.  Hope that makes them more stable. The performance still
> needs some attention. 

In the top output below, it looks like there's a lot of contention on
Giant.  In 5.3-RELEASE and before, the amr driver is not MPSAFE, but my
understanding is that in 5-STABLE, it has been made MPSAFE, which may make
quite a difference in performance.  I pinged Scott Long, who did the work
on the driver, and he indicated that backporting the patch to run on
-RELEASE would be quite difficult, so an upgrade to 5-STABLE is the best
way to get the changes.  I believe that you can build a 5-STABLE kernel
and run with a 5.3-RELEASE user space to avoid having to commit to a full
upgrade to see if that helps or not.

Two other observations:

- It looks like the amr storage array is pretty busy, which may be part of
  the issue.

- It looks like you have four processors, suggesting a two-processor Xeon
  with hyper-threading turned on. For many workloads, hyper-threading does
  not improve performance, so you may want to try turning that off in the
  BIOS to see if that helps.

Robert N M Watson


> 
> Thank You
> 
> --------------------------------------------------------------------------------------------------
> D USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
>     4 users    Load  5.28 19.37 28.00                  Feb 21 12:18
> 
> Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
>         Tot   Share      Tot    Share    Free         in  out     in  out
> Act   19404    2056    90696     3344   45216 count
> All 1020204    4280  4015204     7424         pages
>                                                           zfod   Interrupts
> Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow    7226 total
>            5128  5  60861    3  14021584    9      152732 wire        4: sio0
>                                                     23228 act         6: fdc0
> 30.2%Sys  11.8%Intr  0.0%User  0.0%Nice 58.0%Idl   803616 inact   128 8: rtc
> |    |    |    |    |    |    |    |    |    |      43556 cache       13: npx
> ===============++++++                                1660 free        15: ata
>                                                           daefr  6358 16: bge
> Namei         Name-cache    Dir-cache                     prcfr     1 17: bge
>     Calls     hits    %     hits    %                     react       18: mpt
>      1704      971   57       11    1                     pdwak       19: mpt
>                                                      5342 pdpgs   639 24: amr
> Disks amrd0   da0 pass0 pass1 pass2                       intrn   100 0: clk
> KB/t  22.41  0.00  0.00  0.00  0.00                114288 buf
> tps     602     0     0     0     0                   510 dirtybuf
> MB/s  13.16  0.00  0.00  0.00  0.00                 70235 desiredvnodes
> % busy  100     0     0     0     0                 20543 numvnodes
>                                                      7883 freevnodes
> -----------------------------------------------------------------------------------------
> last pid: 10330;  load averages: 14.69, 11.81, 18.62                                                                                           
> up 0+09:01:13  12:32:57
> 226 processes: 5 running, 153 sleeping, 57 waiting, 11 lock
> CPU states:  0.1% user,  0.0% nice, 66.0% system, 24.3% interrupt,  9.6% idle
> Mem: 23M Active, 774M Inact, 150M Wired, 52M Cache, 112M Buf, 1660K Free
> Swap: 1024M Total, 124K Used, 1024M Free
> 
>   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU COMMAND
>    63 root     -44 -163     0K    12K WAIT   0 147:05 45.07% 45.07% swi1: net
>    30 root     -68 -187     0K    12K WAIT   0 101:39 32.32% 32.32% irq16: 
> bge0
>    12 root     117    0     0K    12K CPU2   2 329:09 19.58% 19.58% idle: cpu2
>    11 root     116    0     0K    12K CPU3   3 327:29 19.24% 19.24% idle: cpu3
>    13 root     114    0     0K    12K RUN    1 263:39 16.89% 16.89% idle: cpu1
>    14 root     109    0     0K    12K CPU0   0 228:50 12.06% 12.06% idle: cpu0
>   368 root       4    0  1220K   740K *Giant 3  45:27  7.52%  7.52% nfsd
>   366 root       4    0  1220K   740K *Giant 0  48:52  7.28%  7.28% nfsd
>   364 root       4    0  1220K   740K *Giant 3  53:01  7.13%  7.13% nfsd
>   367 root      -8    0  1220K   740K biord  3  41:22  7.08%  7.08% nfsd
>   372 root       4    0  1220K   740K *Giant 0  28:54  7.08%  7.08% nfsd
>   365 root      -1    0  1220K   740K *Giant 3  51:53  6.93%  6.93% nfsd
>   370 root      -1    0  1220K   740K nfsslp 0  32:49  6.84%  6.84% nfsd
>   369 root      -8    0  1220K   740K biord  1  36:40  6.49%  6.49% nfsd
>   371 root       4    0  1220K   740K *Giant 0  25:14  6.45%  6.45% nfsd
>   374 root      -1    0  1220K   740K nfsslp 2  22:31  6.45%  6.45% nfsd
>   377 root       4    0  1220K   740K *Giant 2  17:21  5.52%  5.52% nfsd
>   376 root      -4    0  1220K   740K *Giant 2  15:45  5.37%  5.37% nfsd
>   373 root      -4    0  1220K   740K ufs    3  19:38  5.18%  5.18% nfsd
>   378 root       4    0  1220K   740K *Giant 2  13:55  4.54%  4.54% nfsd
>   379 root      -8    0  1220K   740K biord  3  12:41  4.49%  4.49% nfsd
>   380 root       4    0  1220K   740K -      2  11:26  4.20%  4.20% nfsd
>     3 root      -8    0     0K    12K -      1  21:21  4.05%  4.05% g_up
>     4 root      -8    0     0K    12K -      0  20:05  3.96%  3.96% g_down
>   381 root       4    0  1220K   740K -      3   9:28  3.66%  3.66% nfsd
>   382 root       4    0  1220K   740K -      1  10:13  3.47%  3.47% nfsd
>   385 root      -1    0  1220K   740K nfsslp 3   7:21  3.17%  3.17% nfsd
>    38 root     -64 -183     0K    12K *Giant 0  14:45  3.12%  3.12% irq24: 
> amr0
>   384 root       4    0  1220K   740K -      3   8:40  3.12%  3.12% nfsd
>    72 root     -24 -143     0K    12K WAIT   2  16:50  2.98%  2.98% swi6:+
>   383 root      -8    0  1220K   740K biord  2   7:57  2.93%  2.93% nfsd
>   389 root       4    0  1220K   740K -      2   5:31  2.64%  2.64% nfsd
>   390 root      -8    0  1220K   740K biord  3   5:54  2.59%  2.59% nfsd
>   387 root      -8    0  1220K   740K biord  0   6:40  2.54%  2.54% nfsd
>   386 root      -8    0  1220K   740K biord  1   6:22  2.44%  2.44% nfsd
>   392 root       4    0  1220K   740K -      3   4:27  2.10%  2.10% nfsd
>   388 root      -4    0  1220K   740K *Giant 2   4:45  2.05%  2.05% nfsd
>   395 root       4    0  1220K   740K -      0   3:59  2.05%  2.05% nfsd
>   391 root       4    0  1220K   740K -      2   5:10  1.95%  1.95% nfsd
>   393 root       4    0  1220K   740K sbwait 1   4:13  1.56%  1.56% nfsd
>   398 root       4    0  1220K   740K -      2   3:31  1.56%  1.56% nfsd
>   399 root       4    0  1220K   740K -      3   3:12  1.56%  1.56% nfsd
>   401 root       4    0  1220K   740K -      1   2:57  1.51%  1.51% nfsd
>   403 root       4    0  1220K   740K -      0   3:04  1.42%  1.42% nfsd
>   406 root       4    0  1220K   740K -      1   2:27  1.37%  1.37% nfsd
>   397 root       4    0  1220K   740K -      3   3:16  1.27%  1.27% nfsd
>   396 root       4    0  1220K   740K -      2   3:42  1.22%  1.22% nfsd
> 
> 
> 
> 
> 
> 
> On Saturday 19 February 2005 04:23 am, Robert Watson wrote:
> > On Thu, 17 Feb 2005, David Rice wrote:
> > > Typicly we have 7 client boxes mounting storage from a single file
> > > server.  Each client box servers 1000 web sites and associate email. We
> > > have done the basic NFS tuning (ie: Read write size optimization and
> > > kernel tuning)
> >
> > How many nfsd's are you running with?
> >
> > If you run systat -vmstat 1 on your server under high load, could you send
> > us the output?  In particular, I'm interested in knowing how the system is
> > spending its time, the paging level, I/O throughput on devices, and the
> > systat -vmstat summary screen provides a good summary of this and more.  A
> > few snapshots of "gstat" output would also be very helpful.  As would a
> > snapshot or two of "top -S" output.  This will give us a picture of how
> > the system is spending its time.
> >
> > > 2. Client boxes have high load averages and sometimes crashes due to
> > > slow NFS performance.
> >
> > Could you be more specific about the crash failure mode?
> >
> > > 3. File servers that randomly crash with "Fatal trap 12: page fault
> > > while in kernel mode"
> >
> > Could you make sure you're running with at least the latest 5.3 patch
> > level on the server, which includes some NFS server stability fixes, and
> > also look at sliding to the head of 5-STABLE?  There are a number of
> > performance and stability improvements that may be relevant there.
> >
> > Could you provide serial console output of the full panic message, trap
> > details, compile the kernel with KDB+DDB, and include a full stack trace?
> > I'm happy to try to help debug these problems.
> >
> > > 4. With soft updates enabled during FSCK the fileserver will freeze with
> > > all NFS processs in the "snaplck" state. We disabled soft updates
> > > because of this.
> >
> > If it's possible to do get some more information, it would be quite
> > helpful.  In particular, could you compile the server box with
> > DDB+KDB+BREAK_TO_DEBUGGER, breka into the serial debugger when it appears
> > wedged, and put the contents of "show lockedvnods", "ps", and "trace
> > <pid>" of any processes listed in "show lockedvnods" output, that would be
> > great.  A crash dump would also be very helpful.  For some hints on the
> > information that is necessary here, take a look at the handbook chapter on
> > kernel debugging and reporting kernel bugs, and my recent post to current@
> > diagnosing a similar bug.
> >
> > If you e-enable soft updates but leave bgfsck disabled, does that correct
> > this stability problem?
> >
> > In any case, I'm happy to help try to figure out what's going on -- some
> > of the above information for stability and performance problems would be
> > quite helpful in tracking it down.
> >
> > Robert N M Watson
> 
> 

From owner-freebsd-performance@FreeBSD.ORG  Tue Feb 22 09:45:36 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id D29DD16A4CE; Tue, 22 Feb 2005 09:45:36 +0000 (GMT)
Received: from av8-2-sn3.vrr.skanova.net (av8-2-sn3.vrr.skanova.net
	[81.228.9.184])	by mx1.FreeBSD.org (Postfix) with ESMTP
	id B54EC43D58; Tue, 22 Feb 2005 09:45:35 +0000 (GMT)
	(envelope-from martin@gneto.com)
Received: by av8-2-sn3.vrr.skanova.net (Postfix, from userid 502)
	id 669F137F1A; Tue, 22 Feb 2005 10:45:34 +0100 (CET)
Received: from smtp3-2-sn3.vrr.skanova.net (smtp3-2-sn3.vrr.skanova.net
	[81.228.9.102])	by av8-2-sn3.vrr.skanova.net (Postfix) with ESMTP
	id 4FD7B37EA9; Tue, 22 Feb 2005 10:45:34 +0100 (CET)
Received: from [192.168.2.30] (h118n1fls31o985.telia.com [213.65.16.118])
	by smtp3-2-sn3.vrr.skanova.net (Postfix) with ESMTP id 093FE37E47;
	Tue, 22 Feb 2005 10:45:33 +0100 (CET)
Message-ID: <421AFF3C.8090608@gneto.com>
Date: Tue, 22 Feb 2005 10:45:32 +0100
From: Martin Nilsson <martin@gneto.com>
User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206)
X-Accept-Language: sv, en-us, en
MIME-Version: 1.0
To: Robert Watson <rwatson@FreeBSD.org>
References: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
cc: David Rice <drice@globat.com>
cc: freebsd-performance@FreeBSD.org
Subject: Re: High traffic NFS performance and availability problems
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 22 Feb 2005 09:45:36 -0000

Robert Watson wrote:
> In the top output below, it looks like there's a lot of contention on
> Giant.  In 5.3-RELEASE and before, the amr driver is not MPSAFE, but my
> understanding is that in 5-STABLE, it has been made MPSAFE, which may make
> quite a difference in performance.  I pinged Scott Long, who did the work

The locking of amr have not been MFC:d to RELENG_5 yet, I've just 
checked with webcvs. From my brief and unconclusive tests the locked 
driver was much faster (about 2x on sequential transfers).

/Martin

From owner-freebsd-performance@FreeBSD.ORG  Wed Feb 23 18:44:57 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id DFDDE16A4CE; Wed, 23 Feb 2005 18:44:56 +0000 (GMT)
Received: from corp.globat.com (corp.globat.com [216.193.201.6])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 27AEE43D41; Wed, 23 Feb 2005 18:44:56 +0000 (GMT)
	(envelope-from drice@globat.com)
Received: from globat.com (globat [66.159.202.156])
	by corp.globat.com (8.12.11/8.12.9) with ESMTP id j1NIiqHZ087019;
	Wed, 23 Feb 2005 10:44:52 -0800 (PST)
	(envelope-from drice@globat.com)
From: David Rice <drice@globat.com>
Organization: Globat
To: Robert Watson <rwatson@freebsd.org>
Date: Wed, 23 Feb 2005 10:44:54 -0800
User-Agent: KMail/1.5.4
References: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200502231044.54801.drice@globat.com>
cc: freebsd-performance@freebsd.org
Subject: Re: High traffic NFS performance and availability problem
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 23 Feb 2005 18:44:57 -0000

Where can I find the MPSAFE version of the amr PERC driver.
I checked the release notes for 5.3-STABLE and it makes no refrence to
the amr driver being MPSAFE.


On Monday 21 February 2005 01:26 pm, Robert Watson wrote:
> On Mon, 21 Feb 2005, David Rice wrote:
> > Here are the snapshots of the output you requested. These are from the
> > NFS server. We have just upgraded them to 5.3-RELEASE as so many have
> > recomended.  Hope that makes them more stable. The performance still
> > needs some attention.
>
> In the top output below, it looks like there's a lot of contention on
> Giant.  In 5.3-RELEASE and before, the amr driver is not MPSAFE, but my
> understanding is that in 5-STABLE, it has been made MPSAFE, which may make
> quite a difference in performance.  I pinged Scott Long, who did the work
> on the driver, and he indicated that backporting the patch to run on
> -RELEASE would be quite difficult, so an upgrade to 5-STABLE is the best
> way to get the changes.  I believe that you can build a 5-STABLE kernel
> and run with a 5.3-RELEASE user space to avoid having to commit to a full
> upgrade to see if that helps or not.
>
> Two other observations:
>
> - It looks like the amr storage array is pretty busy, which may be part of
>   the issue.
>
> - It looks like you have four processors, suggesting a two-processor Xeon
>   with hyper-threading turned on. For many workloads, hyper-threading does
>   not improve performance, so you may want to try turning that off in the
>   BIOS to see if that helps.
>
> Robert N M Watson
>
> > Thank You
> >
> > -------------------------------------------------------------------------
> >------------------------- D USERNAME PRI NICE   SIZE    RES STATE  C  
> > TIME   WCPU    CPU COMMAND 4 users    Load  5.28 19.37 28.00             
> >     Feb 21 12:18
> >
> > Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP
> > PAGER Tot   Share      Tot    Share    Free         in  out     in  out
> > Act   19404    2056    90696     3344   45216 count
> > All 1020204    4280  4015204     7424         pages
> >                                                           zfod  
> > Interrupts Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow 
> >   7226 total 5128  5  60861    3  14021584    9      152732 wire       
> > 4: sio0 23228 act         6: fdc0 30.2%Sys  11.8%Intr  0.0%User  0.0%Nice
> > 58.0%Idl   803616 inact   128 8: rtc
> >
> > |    |    |    |    |    |    |    |    |    |      43556 cache       13:
> > |    |    |    |    |    |    |    |    |    | npx
> >
> > ===============++++++                                1660 free        15:
> > ata daefr  6358 16: bge Namei         Name-cache    Dir-cache            
> >         prcfr     1 17: bge Calls     hits    %     hits    %            
> >         react       18: mpt 1704      971   57       11    1             
> >        pdwak       19: mpt 5342 pdpgs   639 24: amr Disks amrd0   da0
> > pass0 pass1 pass2                       intrn   100 0: clk KB/t  22.41 
> > 0.00  0.00  0.00  0.00                114288 buf
> > tps     602     0     0     0     0                   510 dirtybuf
> > MB/s  13.16  0.00  0.00  0.00  0.00                 70235 desiredvnodes
> > % busy  100     0     0     0     0                 20543 numvnodes
> >                                                      7883 freevnodes
> > -------------------------------------------------------------------------
> >---------------- last pid: 10330;  load averages: 14.69, 11.81, 18.62
> > up 0+09:01:13  12:32:57
> > 226 processes: 5 running, 153 sleeping, 57 waiting, 11 lock
> > CPU states:  0.1% user,  0.0% nice, 66.0% system, 24.3% interrupt,  9.6%
> > idle Mem: 23M Active, 774M Inact, 150M Wired, 52M Cache, 112M Buf, 1660K
> > Free Swap: 1024M Total, 124K Used, 1024M Free
> >
> >   PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU
> > COMMAND 63 root     -44 -163     0K    12K WAIT   0 147:05 45.07% 45.07%
> > swi1: net 30 root     -68 -187     0K    12K WAIT   0 101:39 32.32%
> > 32.32% irq16: bge0
> >    12 root     117    0     0K    12K CPU2   2 329:09 19.58% 19.58% idle:
> > cpu2 11 root     116    0     0K    12K CPU3   3 327:29 19.24% 19.24%
> > idle: cpu3 13 root     114    0     0K    12K RUN    1 263:39 16.89%
> > 16.89% idle: cpu1 14 root     109    0     0K    12K CPU0   0 228:50
> > 12.06% 12.06% idle: cpu0 368 root       4    0  1220K   740K *Giant 3 
> > 45:27  7.52%  7.52% nfsd 366 root       4    0  1220K   740K *Giant 0 
> > 48:52  7.28%  7.28% nfsd 364 root       4    0  1220K   740K *Giant 3 
> > 53:01  7.13%  7.13% nfsd 367 root      -8    0  1220K   740K biord  3 
> > 41:22  7.08%  7.08% nfsd 372 root       4    0  1220K   740K *Giant 0 
> > 28:54  7.08%  7.08% nfsd 365 root      -1    0  1220K   740K *Giant 3 
> > 51:53  6.93%  6.93% nfsd 370 root      -1    0  1220K   740K nfsslp 0 
> > 32:49  6.84%  6.84% nfsd 369 root      -8    0  1220K   740K biord  1 
> > 36:40  6.49%  6.49% nfsd 371 root       4    0  1220K   740K *Giant 0 
> > 25:14  6.45%  6.45% nfsd 374 root      -1    0  1220K   740K nfsslp 2 
> > 22:31  6.45%  6.45% nfsd 377 root       4    0  1220K   740K *Giant 2 
> > 17:21  5.52%  5.52% nfsd 376 root      -4    0  1220K   740K *Giant 2 
> > 15:45  5.37%  5.37% nfsd 373 root      -4    0  1220K   740K ufs    3 
> > 19:38  5.18%  5.18% nfsd 378 root       4    0  1220K   740K *Giant 2 
> > 13:55  4.54%  4.54% nfsd 379 root      -8    0  1220K   740K biord  3 
> > 12:41  4.49%  4.49% nfsd 380 root       4    0  1220K   740K -      2 
> > 11:26  4.20%  4.20% nfsd 3 root      -8    0     0K    12K -      1 
> > 21:21  4.05%  4.05% g_up 4 root      -8    0     0K    12K -      0 
> > 20:05  3.96%  3.96% g_down 381 root       4    0  1220K   740K -      3  
> > 9:28  3.66%  3.66% nfsd 382 root       4    0  1220K   740K -      1 
> > 10:13  3.47%  3.47% nfsd 385 root      -1    0  1220K   740K nfsslp 3  
> > 7:21  3.17%  3.17% nfsd 38 root     -64 -183     0K    12K *Giant 0 
> > 14:45  3.12%  3.12% irq24: amr0
> >   384 root       4    0  1220K   740K -      3   8:40  3.12%  3.12% nfsd
> >    72 root     -24 -143     0K    12K WAIT   2  16:50  2.98%  2.98%
> > swi6:+ 383 root      -8    0  1220K   740K biord  2   7:57  2.93%  2.93%
> > nfsd 389 root       4    0  1220K   740K -      2   5:31  2.64%  2.64%
> > nfsd 390 root      -8    0  1220K   740K biord  3   5:54  2.59%  2.59%
> > nfsd 387 root      -8    0  1220K   740K biord  0   6:40  2.54%  2.54%
> > nfsd 386 root      -8    0  1220K   740K biord  1   6:22  2.44%  2.44%
> > nfsd 392 root       4    0  1220K   740K -      3   4:27  2.10%  2.10%
> > nfsd 388 root      -4    0  1220K   740K *Giant 2   4:45  2.05%  2.05%
> > nfsd 395 root       4    0  1220K   740K -      0   3:59  2.05%  2.05%
> > nfsd 391 root       4    0  1220K   740K -      2   5:10  1.95%  1.95%
> > nfsd 393 root       4    0  1220K   740K sbwait 1   4:13  1.56%  1.56%
> > nfsd 398 root       4    0  1220K   740K -      2   3:31  1.56%  1.56%
> > nfsd 399 root       4    0  1220K   740K -      3   3:12  1.56%  1.56%
> > nfsd 401 root       4    0  1220K   740K -      1   2:57  1.51%  1.51%
> > nfsd 403 root       4    0  1220K   740K -      0   3:04  1.42%  1.42%
> > nfsd 406 root       4    0  1220K   740K -      1   2:27  1.37%  1.37%
> > nfsd 397 root       4    0  1220K   740K -      3   3:16  1.27%  1.27%
> > nfsd 396 root       4    0  1220K   740K -      2   3:42  1.22%  1.22%
> > nfsd
> >
> > On Saturday 19 February 2005 04:23 am, Robert Watson wrote:
> > > On Thu, 17 Feb 2005, David Rice wrote:
> > > > Typicly we have 7 client boxes mounting storage from a single file
> > > > server.  Each client box servers 1000 web sites and associate email.
> > > > We have done the basic NFS tuning (ie: Read write size optimization
> > > > and kernel tuning)
> > >
> > > How many nfsd's are you running with?
> > >
> > > If you run systat -vmstat 1 on your server under high load, could you
> > > send us the output?  In particular, I'm interested in knowing how the
> > > system is spending its time, the paging level, I/O throughput on
> > > devices, and the systat -vmstat summary screen provides a good summary
> > > of this and more.  A few snapshots of "gstat" output would also be very
> > > helpful.  As would a snapshot or two of "top -S" output.  This will
> > > give us a picture of how the system is spending its time.
> > >
> > > > 2. Client boxes have high load averages and sometimes crashes due to
> > > > slow NFS performance.
> > >
> > > Could you be more specific about the crash failure mode?
> > >
> > > > 3. File servers that randomly crash with "Fatal trap 12: page fault
> > > > while in kernel mode"
> > >
> > > Could you make sure you're running with at least the latest 5.3 patch
> > > level on the server, which includes some NFS server stability fixes,
> > > and also look at sliding to the head of 5-STABLE?  There are a number
> > > of performance and stability improvements that may be relevant there.
> > >
> > > Could you provide serial console output of the full panic message, trap
> > > details, compile the kernel with KDB+DDB, and include a full stack
> > > trace? I'm happy to try to help debug these problems.
> > >
> > > > 4. With soft updates enabled during FSCK the fileserver will freeze
> > > > with all NFS processs in the "snaplck" state. We disabled soft
> > > > updates because of this.
> > >
> > > If it's possible to do get some more information, it would be quite
> > > helpful.  In particular, could you compile the server box with
> > > DDB+KDB+BREAK_TO_DEBUGGER, breka into the serial debugger when it
> > > appears wedged, and put the contents of "show lockedvnods", "ps", and
> > > "trace <pid>" of any processes listed in "show lockedvnods" output,
> > > that would be great.  A crash dump would also be very helpful.  For
> > > some hints on the information that is necessary here, take a look at
> > > the handbook chapter on kernel debugging and reporting kernel bugs, and
> > > my recent post to current@ diagnosing a similar bug.
> > >
> > > If you e-enable soft updates but leave bgfsck disabled, does that
> > > correct this stability problem?
> > >
> > > In any case, I'm happy to help try to figure out what's going on --
> > > some of the above information for stability and performance problems
> > > would be quite helpful in tracking it down.
> > >
> > > Robert N M Watson
>
> _______________________________________________
> freebsd-performance@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-performance
> To unsubscribe, send any mail to
> "freebsd-performance-unsubscribe@freebsd.org"

From owner-freebsd-performance@FreeBSD.ORG  Wed Feb 23 18:53:54 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 98FF216A4CE; Wed, 23 Feb 2005 18:53:54 +0000 (GMT)
Received: from pooker.samsco.org (pooker.samsco.org [168.103.85.57])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 8936A43D3F; Wed, 23 Feb 2005 18:53:53 +0000 (GMT)
	(envelope-from scottl@samsco.org)
Received: from [192.168.254.11] (junior-wifi.samsco.home [192.168.254.11])
	(authenticated bits=0)
	by pooker.samsco.org (8.13.1/8.13.1) with ESMTP id j1NIrujD056486;
	Wed, 23 Feb 2005 11:53:57 -0700 (MST)
	(envelope-from scottl@samsco.org)
Message-ID: <421CD0CA.10601@samsco.org>
Date: Wed, 23 Feb 2005 11:51:54 -0700
From: Scott Long <scottl@samsco.org>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.5) Gecko/20050218
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: David Rice <drice@globat.com>
References: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
	<200502231044.54801.drice@globat.com>
In-Reply-To: <200502231044.54801.drice@globat.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, score=-2.8 required=3.8 tests=ALL_TRUSTED autolearn=failed 
	version=3.0.2
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on pooker.samsco.org
X-Mailman-Approved-At: Wed, 23 Feb 2005 19:20:47 +0000
cc: freebsd-performance@freebsd.org
cc: Robert Watson <rwatson@freebsd.org>
Subject: Re: High traffic NFS performance and availability problem
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 23 Feb 2005 18:53:54 -0000

David,

Sorry for the mis-information about the AMR status earlier in the
thread.  I forgot that I was holding off on merging the MPSAFE work to
5-STABLE for a bit.  LSI is getting involved in active maintainership
again, and I'm working with them to review all of the changes so far and
fix some of the bugs that I accidentally introduced.  Hopefully we'll
have a resolution by the end of the week, after which I'll prepare the
updated driver for inclusion in 5.4.

Scott

David Rice wrote:
> Where can I find the MPSAFE version of the amr PERC driver.
> I checked the release notes for 5.3-STABLE and it makes no refrence to
> the amr driver being MPSAFE.
> 
> 
> On Monday 21 February 2005 01:26 pm, Robert Watson wrote:
> 
>>On Mon, 21 Feb 2005, David Rice wrote:
>>
>>>Here are the snapshots of the output you requested. These are from the
>>>NFS server. We have just upgraded them to 5.3-RELEASE as so many have
>>>recomended.  Hope that makes them more stable. The performance still
>>>needs some attention.
>>
>>In the top output below, it looks like there's a lot of contention on
>>Giant.  In 5.3-RELEASE and before, the amr driver is not MPSAFE, but my
>>understanding is that in 5-STABLE, it has been made MPSAFE, which may make
>>quite a difference in performance.  I pinged Scott Long, who did the work
>>on the driver, and he indicated that backporting the patch to run on
>>-RELEASE would be quite difficult, so an upgrade to 5-STABLE is the best
>>way to get the changes.  I believe that you can build a 5-STABLE kernel
>>and run with a 5.3-RELEASE user space to avoid having to commit to a full
>>upgrade to see if that helps or not.
>>
>>Two other observations:
>>
>>- It looks like the amr storage array is pretty busy, which may be part of
>>  the issue.
>>
>>- It looks like you have four processors, suggesting a two-processor Xeon
>>  with hyper-threading turned on. For many workloads, hyper-threading does
>>  not improve performance, so you may want to try turning that off in the
>>  BIOS to see if that helps.
>>
>>Robert N M Watson
>>
>>
>>>Thank You
>>>
>>>-------------------------------------------------------------------------
>>>------------------------- D USERNAME PRI NICE   SIZE    RES STATE  C  
>>>TIME   WCPU    CPU COMMAND 4 users    Load  5.28 19.37 28.00             
>>>    Feb 21 12:18
>>>
>>>Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP
>>>PAGER Tot   Share      Tot    Share    Free         in  out     in  out
>>>Act   19404    2056    90696     3344   45216 count
>>>All 1020204    4280  4015204     7424         pages
>>>                                                          zfod  
>>>Interrupts Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow 
>>>  7226 total 5128  5  60861    3  14021584    9      152732 wire       
>>>4: sio0 23228 act         6: fdc0 30.2%Sys  11.8%Intr  0.0%User  0.0%Nice
>>>58.0%Idl   803616 inact   128 8: rtc
>>>
>>>|    |    |    |    |    |    |    |    |    |      43556 cache       13:
>>>|    |    |    |    |    |    |    |    |    | npx
>>>
>>>===============++++++                                1660 free        15:
>>>ata daefr  6358 16: bge Namei         Name-cache    Dir-cache            
>>>        prcfr     1 17: bge Calls     hits    %     hits    %            
>>>        react       18: mpt 1704      971   57       11    1             
>>>       pdwak       19: mpt 5342 pdpgs   639 24: amr Disks amrd0   da0
>>>pass0 pass1 pass2                       intrn   100 0: clk KB/t  22.41 
>>>0.00  0.00  0.00  0.00                114288 buf
>>>tps     602     0     0     0     0                   510 dirtybuf
>>>MB/s  13.16  0.00  0.00  0.00  0.00                 70235 desiredvnodes
>>>% busy  100     0     0     0     0                 20543 numvnodes
>>>                                                     7883 freevnodes
>>>-------------------------------------------------------------------------
>>>---------------- last pid: 10330;  load averages: 14.69, 11.81, 18.62
>>>up 0+09:01:13  12:32:57
>>>226 processes: 5 running, 153 sleeping, 57 waiting, 11 lock
>>>CPU states:  0.1% user,  0.0% nice, 66.0% system, 24.3% interrupt,  9.6%
>>>idle Mem: 23M Active, 774M Inact, 150M Wired, 52M Cache, 112M Buf, 1660K
>>>Free Swap: 1024M Total, 124K Used, 1024M Free
>>>
>>>  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU
>>>COMMAND 63 root     -44 -163     0K    12K WAIT   0 147:05 45.07% 45.07%
>>>swi1: net 30 root     -68 -187     0K    12K WAIT   0 101:39 32.32%
>>>32.32% irq16: bge0
>>>   12 root     117    0     0K    12K CPU2   2 329:09 19.58% 19.58% idle:
>>>cpu2 11 root     116    0     0K    12K CPU3   3 327:29 19.24% 19.24%
>>>idle: cpu3 13 root     114    0     0K    12K RUN    1 263:39 16.89%
>>>16.89% idle: cpu1 14 root     109    0     0K    12K CPU0   0 228:50
>>>12.06% 12.06% idle: cpu0 368 root       4    0  1220K   740K *Giant 3 
>>>45:27  7.52%  7.52% nfsd 366 root       4    0  1220K   740K *Giant 0 
>>>48:52  7.28%  7.28% nfsd 364 root       4    0  1220K   740K *Giant 3 
>>>53:01  7.13%  7.13% nfsd 367 root      -8    0  1220K   740K biord  3 
>>>41:22  7.08%  7.08% nfsd 372 root       4    0  1220K   740K *Giant 0 
>>>28:54  7.08%  7.08% nfsd 365 root      -1    0  1220K   740K *Giant 3 
>>>51:53  6.93%  6.93% nfsd 370 root      -1    0  1220K   740K nfsslp 0 
>>>32:49  6.84%  6.84% nfsd 369 root      -8    0  1220K   740K biord  1 
>>>36:40  6.49%  6.49% nfsd 371 root       4    0  1220K   740K *Giant 0 
>>>25:14  6.45%  6.45% nfsd 374 root      -1    0  1220K   740K nfsslp 2 
>>>22:31  6.45%  6.45% nfsd 377 root       4    0  1220K   740K *Giant 2 
>>>17:21  5.52%  5.52% nfsd 376 root      -4    0  1220K   740K *Giant 2 
>>>15:45  5.37%  5.37% nfsd 373 root      -4    0  1220K   740K ufs    3 
>>>19:38  5.18%  5.18% nfsd 378 root       4    0  1220K   740K *Giant 2 
>>>13:55  4.54%  4.54% nfsd 379 root      -8    0  1220K   740K biord  3 
>>>12:41  4.49%  4.49% nfsd 380 root       4    0  1220K   740K -      2 
>>>11:26  4.20%  4.20% nfsd 3 root      -8    0     0K    12K -      1 
>>>21:21  4.05%  4.05% g_up 4 root      -8    0     0K    12K -      0 
>>>20:05  3.96%  3.96% g_down 381 root       4    0  1220K   740K -      3  
>>>9:28  3.66%  3.66% nfsd 382 root       4    0  1220K   740K -      1 
>>>10:13  3.47%  3.47% nfsd 385 root      -1    0  1220K   740K nfsslp 3  
>>>7:21  3.17%  3.17% nfsd 38 root     -64 -183     0K    12K *Giant 0 
>>>14:45  3.12%  3.12% irq24: amr0
>>>  384 root       4    0  1220K   740K -      3   8:40  3.12%  3.12% nfsd
>>>   72 root     -24 -143     0K    12K WAIT   2  16:50  2.98%  2.98%
>>>swi6:+ 383 root      -8    0  1220K   740K biord  2   7:57  2.93%  2.93%
>>>nfsd 389 root       4    0  1220K   740K -      2   5:31  2.64%  2.64%
>>>nfsd 390 root      -8    0  1220K   740K biord  3   5:54  2.59%  2.59%
>>>nfsd 387 root      -8    0  1220K   740K biord  0   6:40  2.54%  2.54%
>>>nfsd 386 root      -8    0  1220K   740K biord  1   6:22  2.44%  2.44%
>>>nfsd 392 root       4    0  1220K   740K -      3   4:27  2.10%  2.10%
>>>nfsd 388 root      -4    0  1220K   740K *Giant 2   4:45  2.05%  2.05%
>>>nfsd 395 root       4    0  1220K   740K -      0   3:59  2.05%  2.05%
>>>nfsd 391 root       4    0  1220K   740K -      2   5:10  1.95%  1.95%
>>>nfsd 393 root       4    0  1220K   740K sbwait 1   4:13  1.56%  1.56%
>>>nfsd 398 root       4    0  1220K   740K -      2   3:31  1.56%  1.56%
>>>nfsd 399 root       4    0  1220K   740K -      3   3:12  1.56%  1.56%
>>>nfsd 401 root       4    0  1220K   740K -      1   2:57  1.51%  1.51%
>>>nfsd 403 root       4    0  1220K   740K -      0   3:04  1.42%  1.42%
>>>nfsd 406 root       4    0  1220K   740K -      1   2:27  1.37%  1.37%
>>>nfsd 397 root       4    0  1220K   740K -      3   3:16  1.27%  1.27%
>>>nfsd 396 root       4    0  1220K   740K -      2   3:42  1.22%  1.22%
>>>nfsd
>>>
>>>On Saturday 19 February 2005 04:23 am, Robert Watson wrote:
>>>
>>>>On Thu, 17 Feb 2005, David Rice wrote:
>>>>
>>>>>Typicly we have 7 client boxes mounting storage from a single file
>>>>>server.  Each client box servers 1000 web sites and associate email.
>>>>>We have done the basic NFS tuning (ie: Read write size optimization
>>>>>and kernel tuning)
>>>>
>>>>How many nfsd's are you running with?
>>>>
>>>>If you run systat -vmstat 1 on your server under high load, could you
>>>>send us the output?  In particular, I'm interested in knowing how the
>>>>system is spending its time, the paging level, I/O throughput on
>>>>devices, and the systat -vmstat summary screen provides a good summary
>>>>of this and more.  A few snapshots of "gstat" output would also be very
>>>>helpful.  As would a snapshot or two of "top -S" output.  This will
>>>>give us a picture of how the system is spending its time.
>>>>
>>>>
>>>>>2. Client boxes have high load averages and sometimes crashes due to
>>>>>slow NFS performance.
>>>>
>>>>Could you be more specific about the crash failure mode?
>>>>
>>>>
>>>>>3. File servers that randomly crash with "Fatal trap 12: page fault
>>>>>while in kernel mode"
>>>>
>>>>Could you make sure you're running with at least the latest 5.3 patch
>>>>level on the server, which includes some NFS server stability fixes,
>>>>and also look at sliding to the head of 5-STABLE?  There are a number
>>>>of performance and stability improvements that may be relevant there.
>>>>
>>>>Could you provide serial console output of the full panic message, trap
>>>>details, compile the kernel with KDB+DDB, and include a full stack
>>>>trace? I'm happy to try to help debug these problems.
>>>>
>>>>
>>>>>4. With soft updates enabled during FSCK the fileserver will freeze
>>>>>with all NFS processs in the "snaplck" state. We disabled soft
>>>>>updates because of this.
>>>>
>>>>If it's possible to do get some more information, it would be quite
>>>>helpful.  In particular, could you compile the server box with
>>>>DDB+KDB+BREAK_TO_DEBUGGER, breka into the serial debugger when it
>>>>appears wedged, and put the contents of "show lockedvnods", "ps", and
>>>>"trace <pid>" of any processes listed in "show lockedvnods" output,
>>>>that would be great.  A crash dump would also be very helpful.  For
>>>>some hints on the information that is necessary here, take a look at
>>>>the handbook chapter on kernel debugging and reporting kernel bugs, and
>>>>my recent post to current@ diagnosing a similar bug.
>>>>
>>>>If you e-enable soft updates but leave bgfsck disabled, does that
>>>>correct this stability problem?
>>>>
>>>>In any case, I'm happy to help try to figure out what's going on --
>>>>some of the above information for stability and performance problems
>>>>would be quite helpful in tracking it down.
>>>>
>>>>Robert N M Watson
>>
>>_______________________________________________
>>freebsd-performance@freebsd.org mailing list
>>http://lists.freebsd.org/mailman/listinfo/freebsd-performance
>>To unsubscribe, send any mail to
>>"freebsd-performance-unsubscribe@freebsd.org"
> 
> 
> 

From owner-freebsd-performance@FreeBSD.ORG  Wed Feb 23 20:19:23 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 64F2E16A4CE
	for <freebsd-performance@freebsd.org>;
	Wed, 23 Feb 2005 20:19:23 +0000 (GMT)
Received: from mh1.centtech.com (moat3.centtech.com [207.200.51.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5D03B43D49
	for <freebsd-performance@freebsd.org>;
	Wed, 23 Feb 2005 20:19:22 +0000 (GMT)
	(envelope-from anderson@centtech.com)
Received: from [10.177.171.220] (neutrino.centtech.com [10.177.171.220])
	by mh1.centtech.com (8.13.1/8.13.1) with ESMTP id j1NKJLuE076061;
	Wed, 23 Feb 2005 14:19:21 -0600 (CST)
	(envelope-from anderson@centtech.com)
Message-ID: <421CE545.3010805@centtech.com>
Date: Wed, 23 Feb 2005 14:19:17 -0600
From: Eric Anderson <anderson@centtech.com>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.7.5) Gecko/20050210
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Scott Long <scottl@samsco.org>
References: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
	<200502231044.54801.drice@globat.com> <421CD0CA.10601@samsco.org>
In-Reply-To: <421CD0CA.10601@samsco.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: ClamAV 0.82/721/Tue Feb 22 08:01:26 2005 on mh1.centtech.com
X-Virus-Status: Clean
cc: freebsd-performance@freebsd.org
Subject: Re: High traffic NFS performance and availability problem
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 23 Feb 2005 20:19:23 -0000

Scott Long wrote:
> David,
> 
> Sorry for the mis-information about the AMR status earlier in the
> thread.  I forgot that I was holding off on merging the MPSAFE work to
> 5-STABLE for a bit.  LSI is getting involved in active maintainership
> again, and I'm working with them to review all of the changes so far and
> fix some of the bugs that I accidentally introduced.  Hopefully we'll
> have a resolution by the end of the week, after which I'll prepare the
> updated driver for inclusion in 5.4.

Scott - let me just say *THANKS!* - that is truly good news!  I'm a heavy user of these devices, and I sleep much better knowing there is some backing/help from the manufacturer.  

I have to ask - is there any work being done on a monitoring tool for these cards?  Forgive me if I'm overlooking one that already exists, but I couldn't find one that would work on these cards.  (Pointers are welcome!)

Eric


-- 
------------------------------------------------------------------------
Eric Anderson        Sr. Systems Administrator        Centaur Technology
I have seen the future and it is just like the present, only longer.
------------------------------------------------------------------------

From owner-freebsd-performance@FreeBSD.ORG  Wed Feb 23 22:06:43 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id EB41D16A4CE; Wed, 23 Feb 2005 22:06:43 +0000 (GMT)
Received: from corp.globat.com (corp.globat.com [216.193.201.6])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 2E09F43D53; Wed, 23 Feb 2005 22:06:43 +0000 (GMT)
	(envelope-from drice@globat.com)
Received: from globat.com (globat [66.159.202.156])
	by corp.globat.com (8.12.11/8.12.9) with ESMTP id j1NM6RGs005671;
	Wed, 23 Feb 2005 14:06:39 -0800 (PST)
	(envelope-from drice@globat.com)
From: David Rice <drice@globat.com>
Organization: Globat
To: Scott Long <scottl@samsco.org>
Date: Wed, 23 Feb 2005 14:06:25 -0800
User-Agent: KMail/1.5.4
References: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
	<200502231044.54801.drice@globat.com> <421CD0CA.10601@samsco.org>
In-Reply-To: <421CD0CA.10601@samsco.org>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200502231406.25461.drice@globat.com>
cc: freebsd-performance@freebsd.org
cc: Robert Watson <rwatson@freebsd.org>
Subject: Re: High traffic NFS performance and availability problem
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 23 Feb 2005 22:06:44 -0000

We are willing to be a test site for the new amr driver. We have several
NFS servers running 5.3-RELEASE that have 1.3TB of disk under high load. 
Also looking for a management utility for the PERC4 under FreeBSD.
Thanks do much to all the people that have responded to this thread. Special 
thanks to Scott for his work on this driver.


On Wednesday 23 February 2005 10:51 am, Scott Long wrote:
> David,
>
> Sorry for the mis-information about the AMR status earlier in the
> thread.  I forgot that I was holding off on merging the MPSAFE work to
> 5-STABLE for a bit.  LSI is getting involved in active maintainership
> again, and I'm working with them to review all of the changes so far and
> fix some of the bugs that I accidentally introduced.  Hopefully we'll
> have a resolution by the end of the week, after which I'll prepare the
> updated driver for inclusion in 5.4.
>
> Scott
>
> David Rice wrote:
> > Where can I find the MPSAFE version of the amr PERC driver.
> > I checked the release notes for 5.3-STABLE and it makes no refrence to
> > the amr driver being MPSAFE.
> >
> > On Monday 21 February 2005 01:26 pm, Robert Watson wrote:
> >>On Mon, 21 Feb 2005, David Rice wrote:
> >>>Here are the snapshots of the output you requested. These are from the
> >>>NFS server. We have just upgraded them to 5.3-RELEASE as so many have
> >>>recomended.  Hope that makes them more stable. The performance still
> >>>needs some attention.
> >>
> >>In the top output below, it looks like there's a lot of contention on
> >>Giant.  In 5.3-RELEASE and before, the amr driver is not MPSAFE, but my
> >>understanding is that in 5-STABLE, it has been made MPSAFE, which may
> >> make quite a difference in performance.  I pinged Scott Long, who did
> >> the work on the driver, and he indicated that backporting the patch to
> >> run on -RELEASE would be quite difficult, so an upgrade to 5-STABLE is
> >> the best way to get the changes.  I believe that you can build a
> >> 5-STABLE kernel and run with a 5.3-RELEASE user space to avoid having to
> >> commit to a full upgrade to see if that helps or not.
> >>
> >>Two other observations:
> >>
> >>- It looks like the amr storage array is pretty busy, which may be part
> >> of the issue.
> >>
> >>- It looks like you have four processors, suggesting a two-processor Xeon
> >>  with hyper-threading turned on. For many workloads, hyper-threading
> >> does not improve performance, so you may want to try turning that off in
> >> the BIOS to see if that helps.
> >>
> >>Robert N M Watson
> >>
> >>>Thank You
> >>>
> >>>------------------------------------------------------------------------
> >>>- ------------------------- D USERNAME PRI NICE   SIZE    RES STATE  C
> >>> TIME   WCPU    CPU COMMAND 4 users    Load  5.28 19.37 28.00
> >>>    Feb 21 12:18
> >>>
> >>>Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP
> >>>PAGER Tot   Share      Tot    Share    Free         in  out     in  out
> >>>Act   19404    2056    90696     3344   45216 count
> >>>All 1020204    4280  4015204     7424         pages
> >>>                                                          zfod
> >>>Interrupts Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow
> >>>  7226 total 5128  5  60861    3  14021584    9      152732 wire
> >>>4: sio0 23228 act         6: fdc0 30.2%Sys  11.8%Intr  0.0%User 
> >>> 0.0%Nice 58.0%Idl   803616 inact   128 8: rtc
> >>>
> >>>|    |    |    |    |    |    |    |    |    |      43556 cache      
> >>>|    |    |    |    |    |    |    |    |    | 13: npx
> >>>
> >>>===============++++++                                1660 free       
> >>> 15: ata daefr  6358 16: bge Namei         Name-cache    Dir-cache
> >>>        prcfr     1 17: bge Calls     hits    %     hits    %
> >>>        react       18: mpt 1704      971   57       11    1
> >>>       pdwak       19: mpt 5342 pdpgs   639 24: amr Disks amrd0   da0
> >>>pass0 pass1 pass2                       intrn   100 0: clk KB/t  22.41
> >>>0.00  0.00  0.00  0.00                114288 buf
> >>>tps     602     0     0     0     0                   510 dirtybuf
> >>>MB/s  13.16  0.00  0.00  0.00  0.00                 70235 desiredvnodes
> >>>% busy  100     0     0     0     0                 20543 numvnodes
> >>>                                                     7883 freevnodes
> >>>------------------------------------------------------------------------
> >>>- ---------------- last pid: 10330;  load averages: 14.69, 11.81, 18.62
> >>> up 0+09:01:13  12:32:57
> >>>226 processes: 5 running, 153 sleeping, 57 waiting, 11 lock
> >>>CPU states:  0.1% user,  0.0% nice, 66.0% system, 24.3% interrupt,  9.6%
> >>>idle Mem: 23M Active, 774M Inact, 150M Wired, 52M Cache, 112M Buf, 1660K
> >>>Free Swap: 1024M Total, 124K Used, 1024M Free
> >>>
> >>>  PID USERNAME PRI NICE   SIZE    RES STATE  C   TIME   WCPU    CPU
> >>>COMMAND 63 root     -44 -163     0K    12K WAIT   0 147:05 45.07% 45.07%
> >>>swi1: net 30 root     -68 -187     0K    12K WAIT   0 101:39 32.32%
> >>>32.32% irq16: bge0
> >>>   12 root     117    0     0K    12K CPU2   2 329:09 19.58% 19.58%
> >>> idle: cpu2 11 root     116    0     0K    12K CPU3   3 327:29 19.24%
> >>> 19.24% idle: cpu3 13 root     114    0     0K    12K RUN    1 263:39
> >>> 16.89% 16.89% idle: cpu1 14 root     109    0     0K    12K CPU0   0
> >>> 228:50 12.06% 12.06% idle: cpu0 368 root       4    0  1220K   740K
> >>> *Giant 3 45:27  7.52%  7.52% nfsd 366 root       4    0  1220K   740K
> >>> *Giant 0 48:52  7.28%  7.28% nfsd 364 root       4    0  1220K   740K
> >>> *Giant 3 53:01  7.13%  7.13% nfsd 367 root      -8    0  1220K   740K
> >>> biord  3 41:22  7.08%  7.08% nfsd 372 root       4    0  1220K   740K
> >>> *Giant 0 28:54  7.08%  7.08% nfsd 365 root      -1    0  1220K   740K
> >>> *Giant 3 51:53  6.93%  6.93% nfsd 370 root      -1    0  1220K   740K
> >>> nfsslp 0 32:49  6.84%  6.84% nfsd 369 root      -8    0  1220K   740K
> >>> biord  1 36:40  6.49%  6.49% nfsd 371 root       4    0  1220K   740K
> >>> *Giant 0 25:14  6.45%  6.45% nfsd 374 root      -1    0  1220K   740K
> >>> nfsslp 2 22:31  6.45%  6.45% nfsd 377 root       4    0  1220K   740K
> >>> *Giant 2 17:21  5.52%  5.52% nfsd 376 root      -4    0  1220K   740K
> >>> *Giant 2 15:45  5.37%  5.37% nfsd 373 root      -4    0  1220K   740K
> >>> ufs    3 19:38  5.18%  5.18% nfsd 378 root       4    0  1220K   740K
> >>> *Giant 2 13:55  4.54%  4.54% nfsd 379 root      -8    0  1220K   740K
> >>> biord  3 12:41  4.49%  4.49% nfsd 380 root       4    0  1220K   740K -
> >>>      2 11:26  4.20%  4.20% nfsd 3 root      -8    0     0K    12K -    
> >>>  1 21:21  4.05%  4.05% g_up 4 root      -8    0     0K    12K -      0
> >>> 20:05  3.96%  3.96% g_down 381 root       4    0  1220K   740K -      3
> >>> 9:28  3.66%  3.66% nfsd 382 root       4    0  1220K   740K -      1
> >>> 10:13  3.47%  3.47% nfsd 385 root      -1    0  1220K   740K nfsslp 3
> >>> 7:21  3.17%  3.17% nfsd 38 root     -64 -183     0K    12K *Giant 0
> >>> 14:45  3.12%  3.12% irq24: amr0
> >>>  384 root       4    0  1220K   740K -      3   8:40  3.12%  3.12% nfsd
> >>>   72 root     -24 -143     0K    12K WAIT   2  16:50  2.98%  2.98%
> >>>swi6:+ 383 root      -8    0  1220K   740K biord  2   7:57  2.93%  2.93%
> >>>nfsd 389 root       4    0  1220K   740K -      2   5:31  2.64%  2.64%
> >>>nfsd 390 root      -8    0  1220K   740K biord  3   5:54  2.59%  2.59%
> >>>nfsd 387 root      -8    0  1220K   740K biord  0   6:40  2.54%  2.54%
> >>>nfsd 386 root      -8    0  1220K   740K biord  1   6:22  2.44%  2.44%
> >>>nfsd 392 root       4    0  1220K   740K -      3   4:27  2.10%  2.10%
> >>>nfsd 388 root      -4    0  1220K   740K *Giant 2   4:45  2.05%  2.05%
> >>>nfsd 395 root       4    0  1220K   740K -      0   3:59  2.05%  2.05%
> >>>nfsd 391 root       4    0  1220K   740K -      2   5:10  1.95%  1.95%
> >>>nfsd 393 root       4    0  1220K   740K sbwait 1   4:13  1.56%  1.56%
> >>>nfsd 398 root       4    0  1220K   740K -      2   3:31  1.56%  1.56%
> >>>nfsd 399 root       4    0  1220K   740K -      3   3:12  1.56%  1.56%
> >>>nfsd 401 root       4    0  1220K   740K -      1   2:57  1.51%  1.51%
> >>>nfsd 403 root       4    0  1220K   740K -      0   3:04  1.42%  1.42%
> >>>nfsd 406 root       4    0  1220K   740K -      1   2:27  1.37%  1.37%
> >>>nfsd 397 root       4    0  1220K   740K -      3   3:16  1.27%  1.27%
> >>>nfsd 396 root       4    0  1220K   740K -      2   3:42  1.22%  1.22%
> >>>nfsd
> >>>
> >>>On Saturday 19 February 2005 04:23 am, Robert Watson wrote:
> >>>>On Thu, 17 Feb 2005, David Rice wrote:
> >>>>>Typicly we have 7 client boxes mounting storage from a single file
> >>>>>server.  Each client box servers 1000 web sites and associate email.
> >>>>>We have done the basic NFS tuning (ie: Read write size optimization
> >>>>>and kernel tuning)
> >>>>
> >>>>How many nfsd's are you running with?
> >>>>
> >>>>If you run systat -vmstat 1 on your server under high load, could you
> >>>>send us the output?  In particular, I'm interested in knowing how the
> >>>>system is spending its time, the paging level, I/O throughput on
> >>>>devices, and the systat -vmstat summary screen provides a good summary
> >>>>of this and more.  A few snapshots of "gstat" output would also be very
> >>>>helpful.  As would a snapshot or two of "top -S" output.  This will
> >>>>give us a picture of how the system is spending its time.
> >>>>
> >>>>>2. Client boxes have high load averages and sometimes crashes due to
> >>>>>slow NFS performance.
> >>>>
> >>>>Could you be more specific about the crash failure mode?
> >>>>
> >>>>>3. File servers that randomly crash with "Fatal trap 12: page fault
> >>>>>while in kernel mode"
> >>>>
> >>>>Could you make sure you're running with at least the latest 5.3 patch
> >>>>level on the server, which includes some NFS server stability fixes,
> >>>>and also look at sliding to the head of 5-STABLE?  There are a number
> >>>>of performance and stability improvements that may be relevant there.
> >>>>
> >>>>Could you provide serial console output of the full panic message, trap
> >>>>details, compile the kernel with KDB+DDB, and include a full stack
> >>>>trace? I'm happy to try to help debug these problems.
> >>>>
> >>>>>4. With soft updates enabled during FSCK the fileserver will freeze
> >>>>>with all NFS processs in the "snaplck" state. We disabled soft
> >>>>>updates because of this.
> >>>>
> >>>>If it's possible to do get some more information, it would be quite
> >>>>helpful.  In particular, could you compile the server box with
> >>>>DDB+KDB+BREAK_TO_DEBUGGER, breka into the serial debugger when it
> >>>>appears wedged, and put the contents of "show lockedvnods", "ps", and
> >>>>"trace <pid>" of any processes listed in "show lockedvnods" output,
> >>>>that would be great.  A crash dump would also be very helpful.  For
> >>>>some hints on the information that is necessary here, take a look at
> >>>>the handbook chapter on kernel debugging and reporting kernel bugs, and
> >>>>my recent post to current@ diagnosing a similar bug.
> >>>>
> >>>>If you e-enable soft updates but leave bgfsck disabled, does that
> >>>>correct this stability problem?
> >>>>
> >>>>In any case, I'm happy to help try to figure out what's going on --
> >>>>some of the above information for stability and performance problems
> >>>>would be quite helpful in tracking it down.
> >>>>
> >>>>Robert N M Watson
> >>
> >>_______________________________________________
> >>freebsd-performance@freebsd.org mailing list
> >>http://lists.freebsd.org/mailman/listinfo/freebsd-performance
> >>To unsubscribe, send any mail to
> >>"freebsd-performance-unsubscribe@freebsd.org"

From owner-freebsd-performance@FreeBSD.ORG  Thu Feb 24 07:02:24 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9347A16A4CF
	for <freebsd-performance@freebsd.org>;
	Thu, 24 Feb 2005 07:02:24 +0000 (GMT)
Received: from web26809.mail.ukl.yahoo.com (web26809.mail.ukl.yahoo.com
	[217.146.176.85])
	by mx1.FreeBSD.org (Postfix) with SMTP id BA5F243D4C
	for <freebsd-performance@freebsd.org>;
	Thu, 24 Feb 2005 07:02:23 +0000 (GMT)
	(envelope-from cguttesen@yahoo.dk)
Received: (qmail 2874 invoked by uid 60001); 24 Feb 2005 07:02:22 -0000
Message-ID: <20050224070222.2872.qmail@web26809.mail.ukl.yahoo.com>
Received: from [194.248.174.58] by web26809.mail.ukl.yahoo.com via HTTP;
	Thu, 24 Feb 2005 08:02:22 CET
Date: Thu, 24 Feb 2005 08:02:22 +0100 (CET)
From: Claus Guttesen <cguttesen@yahoo.dk>
To: David Rice <drice@globat.com>, Scott Long <scottl@samsco.org>
In-Reply-To: <200502231406.25461.drice@globat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
cc: freebsd-performance@freebsd.org
cc: Robert Watson <rwatson@freebsd.org>
Subject: Re: High traffic NFS performance and availability problem
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 24 Feb 2005 07:02:24 -0000

> We are willing to be a test site for the new amr
> driver. We have several
> NFS servers running 5.3-RELEASE that have 1.3TB of
> disk under high load. 

Me too. Have a Dell PE 2850 with a LSILogic PERC 4e/Di
controller which is *not* yet in production, approx.
138 MB of storage.

regards
Claus

From owner-freebsd-performance@FreeBSD.ORG  Thu Feb 24 11:18:07 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D627116A4CE
	for <freebsd-performance@freebsd.org>;
	Thu, 24 Feb 2005 11:18:07 +0000 (GMT)
Received: from ford.blinkenlights.nl (ford.blinkenlights.nl [213.204.211.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5844943D31
	for <freebsd-performance@freebsd.org>;
	Thu, 24 Feb 2005 11:18:07 +0000 (GMT)
	(envelope-from sten@blinkenlights.nl)
Received: from tea.blinkenlights.nl (tea.blinkenlights.nl
	[IPv6:2001:960:301:3:a00:20ff:fe85:fa39])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by ford.blinkenlights.nl (Postfix) with ESMTP id 807B03F294;
	Thu, 24 Feb 2005 12:18:05 +0100 (CET)
Received: by tea.blinkenlights.nl (Postfix, from userid 101)
	id 19A26285; Thu, 24 Feb 2005 12:18:05 +0100 (CET)
Received: from localhost (localhost [127.0.0.1])
	by tea.blinkenlights.nl (Postfix) with ESMTP
	id 0E17D18F; Thu, 24 Feb 2005 12:18:05 +0100 (CET)
Date: Thu, 24 Feb 2005 12:18:04 +0100 (CET)
From: Sten Spans <sten@blinkenlights.nl>
To: Eric Anderson <anderson@centtech.com>
In-Reply-To: <421CE545.3010805@centtech.com>
Message-ID: <Pine.SOC.4.61.0502241213450.16100@tea.blinkenlights.nl>
References: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org>
 <200502231044.54801.drice@globat.com> <421CD0CA.10601@samsco.org>
 <421CE545.3010805@centtech.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
cc: Scott Long <scottl@samsco.org>
cc: freebsd-performance@freebsd.org
Subject: Re: High traffic NFS performance and availability problem
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 24 Feb 2005 11:18:07 -0000

On Wed, 23 Feb 2005, Eric Anderson wrote:

> Scott - let me just say *THANKS!* - that is truly good news!  I'm a heavy 
> user of these devices, and I sleep much better knowing there is some 
> backing/help from the manufacturer.

Quite.

> I have to ask - is there any work being done on a monitoring tool for these 
> cards?  Forgive me if I'm overlooking one that already exists, but I couldn't 
> find one that would work on these cards.  (Pointers are welcome!)

The freebsd4 binary works for me ( I only need raid status info ).

zaphod# ./amrcontrol status 0

#######################################################
###   LSI MEGARAID LOGICAL DRIVE INFORMATION    ###
#######################################################

##Logical drives 0
RAID= RAID5
Status= OPTIMAL
Rebuild Rate= 30


or with a wrapper:

zaphod# telnet 127.0.0.1 666
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Status= OPTIMAL
Connection closed by foreign host.


http://people.freebsd.org/~emoore/MegaRAID_SCSI/amrcontrol/

-- 
Sten Spans

"There is a crack in everything, that's how the light gets in."
Leonard Cohen - Anthem

From owner-freebsd-performance@FreeBSD.ORG  Fri Feb 25 05:54:37 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B3CEB16A4CE
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 05:54:37 +0000 (GMT)
Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.206])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 31CAE43D55
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 05:54:37 +0000 (GMT)
	(envelope-from linicks@gmail.com)
Received: by rproxy.gmail.com with SMTP id 1so383273rny
	for <freebsd-performance@freebsd.org>;
	Thu, 24 Feb 2005 21:54:36 -0800 (PST)
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws;
	s=beta; d=gmail.com;
	h=received:message-id:date:from:reply-to:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:references;
	b=VwbDqiDQoHT4HAfljY2c5yQyvtpYC57c32GpWrsApnJDa5BALu8jTGkIoZDkj7lzK8yX7xyRDP/Vtnimx6t5Rxl88ppGbi1NXLsezQNmGggxfxnqn4YQVO/Jy7zSxNnFyYDnryxGq1D9Tapc/JcBBQJPa8rC4lcn74C1aYgL9ok=
Received: by 10.38.179.80 with SMTP id b80mr263453rnf;
        Thu, 24 Feb 2005 21:54:36 -0800 (PST)
Received: by 10.38.8.69 with HTTP; Thu, 24 Feb 2005 21:54:36 -0800 (PST)
Message-ID: <dc9ba044050224215466c95faa@mail.gmail.com>
Date: Thu, 24 Feb 2005 22:54:36 -0700
From: Nick Pavlica <linicks@gmail.com>
To: freebsd-performance@freebsd.org
In-Reply-To: <dc9ba044050203143647cee0c2@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
References: <dc9ba044050203143647cee0c2@mail.gmail.com>
Subject: Re: My disk I/O testing methods for FreeBSD 5.3 ...
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: Nick Pavlica <linicks@gmail.com>
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 25 Feb 2005 05:54:37 -0000

All,
  I have recently performed my suite of tests on 5 (RELENG_5 2/23/05)
and have noticed almost identical performance results to my original
tests.  Are there any outstanding code changes that will help improve
performance before 5.4 starts to ship?  Has anyone identified any
trouble areas?  ...

Much Thanks!
--Nick


On Thu, 3 Feb 2005 15:36:37 -0700, Nick Pavlica <linicks@gmail.com> wrote:
> All,
>    I would like to share the methods that I have been using in my disk
> I/O testing. The detailed results of these tests have been posted to
> the performance and questions mailing lists under the title " FreeBSD
> 5.3 I/O Performance / Linux 2.6.10 | Continued Discussion".  I
> originally started this testing as due diligence in an up coming
> project.  As a result of this testing I discovered an elegant
> operating system that I enjoy working with.
> 
> Intent Of This Testing:
> 1)To measure the disk I/O performance of various operating systems for
> use as a production database server.
> 2)Help improve the disk I/O performance of FreeBSD 5.x and greater by
> assisting the FreeBSD development team in identifying possible
> performance issues, and provide them with data to measure the success
> of various changes to the operating system.
> 
> Operating Systems tested:
> Fedora Core 3 with EXT3, and XFS.  I tested with and with out patches.
> SUSE Enterprise Server 9 with Riser FS.
> FreeBSD 4.11R
> FreeBSD 5.3R, RELENG_5_3, RELENG_5
> NetBSD 2.0R
> OpenBSD 3.6R
> 
> Test Hardware:
> Compaq DeskPro,  PIII 800, 384Mb Ram, 10Gb IDE HD.
> Dell PE 2400, Dual PIII 550, 512Mb Ram, (2)10K,LVD SCSI, RAID 1, PERC
> 2SI controller with 64Mb ram.
> Dell PE SC400, 2.4Ghz P4, 256MB Ram, 40Gb IDE HD.
> Dell 4600, 2.8 Ghz P4 with HT, 512MB Ram, 80GB IDE HD.
> 
> Installation Notes:
>   It's my intention to test these Operating Systems using as many of
> the default installation options as possible with no special tuning.
> The only deviations in my previous testing were as follows: The #linux
> xfs option was used when installing Fedora so that I could use XFS,
> and a special test where I installed  5.3R with UFS instead of UFS2 (I
> didn't see any improvement when using UFS).  I installed FreeBSD using
> the standard install option, and used the auto allocate features for
> partitioning and slicing.  I installed Fedora with the stock server
> packages and created a 100Mb /boot, 512Mb swap, and allocated the
> remaining space to /.  I tested FreeBSD5.3R and FC3R with and without
> updates.  I used cvsup to update FreeBSD and yum update to update
> Fedora.  I didn't do any updating to FreeBSD4.11R, NetBSD2.0, and
> OpenBSD3.6.
> 
> I used the following utilities/tools in my testing:
> DD
> CP
> IOSTAT (iostat -d 2)
> Bonnie++
> TOP
> SQL,PL, PSQL
> Postgresql 8.0
> 
> DD Example Tests:
> - #time dd bs=1024 if=/dev/zero of=tstfile count=1M
> - #time dd bs=1024 if=/dev/zero of=tstfile count=2M
> - #time dd bs=1024 if=/dev/zero of=tstfile count=3M
> 
> Bonnie++ Example Tests:
> #bonnie++ -u root -s 1024 -r 512 -n 5
> #bonnie++ -u root -s 2048 -r 512 -n 5
> #bonnie++ -u root -s 3072 -r 512 -n 5
> 
> CP  Example Tests:
> #time cp tstfile tstfile2
> 
> SQL, PL, PSQL Example Tests:
> 
> CREATE TABLE test1 (
>     thedate TIMESTAMP,
>     astring VARCHAR(200),
>     anumber INTEGER
> );
> 
> CREATE FUNCTION build_data() RETURNS integer AS '
>     DECLARE
>         i INTEGER DEFAULT 0;
>         curtime TIMESTAMP;
>     BEGIN
>         FOR i IN 1..1000000 LOOP
>             curtime := ''now'';
>             INSERT INTO test1 VALUES (curtime, ''test string'', i);
>         END LOOP;
>         RETURN 1;
>     END;
> ' LANGUAGE 'plpgsql';
> 
> SELECT build_data();
> Then the following script is run under the time program to ascertain
> how long it takes to run:
> CREATE TABLE test2  (
>     thedate TIMESTAMP,
>     astring VARCHAR(200),
>     anumber INTEGER
> );
> CREATE TABLE test3 AS SELECT * FROM test1;
> INSERT INTO test2 SELECT * FROM test1 WHERE ((anumber % 2) = 0);
> DELETE FROM test3 WHERE ((anumber % 2) = 0);
> DELETE FROM test3 WHERE ((anumber % 13) = 0);
> CREATE TABLE test4 AS
>  SELECT test1.thedate AS t1date,
>         test2.thedate AS t2date,
>         test1.astring AS t1string,
>         test2.astring AS t2string,
>         test1.anumber AS t1number,
>         test2.anumber AS t2number
>  FROM test1 JOIN test2 ON test1.anumber=test2.anumber;
> UPDATE test3 SET thedate='now' WHERE ((anumber % 5) = 0);
> DROP TABLE test4;
> CREATE TABLE test4 AS SELECT * FROM test1;
> DELETE FROM test4 WHERE ((anumber % 27) = 0);
> VACUUM ANALYZE;
> VACUUM FULL;
> DROP TABLE test4;
> DROP TABLE test3;
> DROP TABLE test2;
> VACUUM FULL;
> 
> Example FS TAB:
> 
> minime# cat /etc/fstab
> # Device                Mountpoint      FStype  Options         Dump    Pass#
> /dev/ad0s1b             none            swap    sw              0       0
> /dev/ad0s1a             /               ufs     rw              1       1
> /dev/ad0s1e             /tmp            ufs     rw              2       2
> /dev/ad0s1f             /usr            ufs     rw              2       2
> /dev/ad0s1d             /var            ufs     rw              2       2
> /dev/acd0               /cdrom          cd9660  ro,noauto       0       0
> 
> Verification Of Test:
>    I have been able to get consistent results in all of my testing.
> However, I think the best verification would be to have as many people
> as possible test the disk I/O performance on a range of hardware,
> testing methods, and configurations.
> 
> Summary Of Results:
>   The results of my testing have consistently demonstrated that
> FreeBSD5.3+ has dramatically slower disk I/O performance than all of
> the other operating systems that were tested.  FreeBSD 4.11R was the
> performance leader followed by Fedora C3 with XFS.  All of the BSD
> distributions, with the exception of 5.3+, were able to consistently
> demonstrate a throughput of 56-58Mb/s sustained throughput, while 5.3+
> consistently demonstrated a throughput of 12-15Mb/s (58 -15 = 43 ?).
> 
> Please let me know if you need any additional details.
> 
> Thanks!
> --Nick Pavlica
>

From owner-freebsd-performance@FreeBSD.ORG  Fri Feb 25 07:02:16 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 45F6816A4CE
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 07:02:16 +0000 (GMT)
Received: from istanbul.enderunix.org (freefall.marmara.edu.tr
	[193.140.143.23])
	by mx1.FreeBSD.org (Postfix) with SMTP id 135FB43D5D
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 07:02:15 +0000 (GMT)
	(envelope-from simsek@istanbul.enderunix.org)
Received: (qmail 35462 invoked by uid 1013); 25 Feb 2005 07:02:46 -0000
X-Mail-Scanner: Scanned by qSheff 0.8-p3 against viruses and spams
	(http://www.enderunix.org/qsheff/)
Message-ID: <20050225070246.35459.qmail@istanbul.enderunix.org>
References: <dc9ba044050203143647cee0c2@mail.gmail.com>
            <dc9ba044050224215466c95faa@mail.gmail.com>
In-Reply-To: <dc9ba044050224215466c95faa@mail.gmail.com>
From: "Baris Simsek" <simsek@enderunix.org>
To: freebsd-performance@freebsd.org
Date: Fri, 25 Feb 2005 09:02:45 +0200
Mime-Version: 1.0
Content-Type: text/plain; format=flowed; charset="iso-8859-9"
Content-Transfer-Encoding: 7bit
Subject: unix domain sockets vs. internet sockets
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 25 Feb 2005 07:02:16 -0000

Hi, 

I am coding a daemon program. I am not sure about which type of sockets i 
should use. Could you compare ip sockets and unix domain sockets? My main 
criterions are performance and protocol load. What are the differences 
between impelementations of them at kernel level? 

Baris Simsek
http://www.enderunix.org/simsek/

From owner-freebsd-performance@FreeBSD.ORG  Fri Feb 25 07:21:10 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0714116A4CE
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 07:21:10 +0000 (GMT)
Received: from gandalf.online.bg (gandalf.online.bg [217.75.128.9])
	by mx1.FreeBSD.org (Postfix) with SMTP id 559F943D39
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 07:21:08 +0000 (GMT)
	(envelope-from roam@ringlet.net)
Received: (qmail 3454 invoked from network); 25 Feb 2005 07:21:04 -0000
Received: from unknown (HELO straylight.ringlet.net) (213.16.36.104)
  by gandalf.online.bg with SMTP; 25 Feb 2005 07:21:04 -0000
Received: (qmail 2096 invoked by uid 1000); 25 Feb 2005 07:21:05 -0000
Date: Fri, 25 Feb 2005 09:21:05 +0200
From: Peter Pentchev <roam@ringlet.net>
To: Baris Simsek <simsek@enderunix.org>
Message-ID: <20050225072105.GA1139@straylight.m.ringlet.net>
References: <dc9ba044050203143647cee0c2@mail.gmail.com>
	<dc9ba044050224215466c95faa@mail.gmail.com>
	<20050225070246.35459.qmail@istanbul.enderunix.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="1yeeQ81UyVL57Vl7"
Content-Disposition: inline
In-Reply-To: <20050225070246.35459.qmail@istanbul.enderunix.org>
User-Agent: Mutt/1.5.8i
cc: freebsd-performance@freebsd.org
Subject: Re: unix domain sockets vs. internet sockets
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 25 Feb 2005 07:21:10 -0000


--1yeeQ81UyVL57Vl7
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Fri, Feb 25, 2005 at 09:02:45AM +0200, Baris Simsek wrote:
> Hi,=20
>=20
> I am coding a daemon program. I am not sure about which type of
> sockets i should use. Could you compare ip sockets and unix domain
> sockets?  My main criterions are performance and protocol load.

The main point you should be thinking about is - should your daemon be
accessible by clients running on remote machines?  If so, Unix-domain
sockets are *definitely* not what you want, since they are, by design,
limited to connections on the same machine.  This is actually what makes
them a lot more efficient to use.

However, it would not be too hard to write your program so it is pretty
much independent of the type of sockets used - that's the point of the
Berkeley *sockets* intreface :)  If you drive carefully around the very
few things you can do *only* with Unix-domain sockets (for instance,
credential passing), and the very few things you can do *only* with
Internet-domain sockets (e.g. accept filters), and you handle the
address size/representation issue carefully (which in theory you should
anyway, what with IPv6 just around the corner... it seems ;), then your
program should have no trouble with listening on both/either type of
socket.  Actually, a lot of programs do that already, getting the best
of both worlds - the efficiency of Unix-domain sockets for local clients
and the ease of use of Internet-domain sockets for remote connections.

> What are the differences between impelementations of them at kernel
> level?=20

I'll have to let someone else answer that one :)

G'luck,
Peter

--=20
Peter Pentchev	roam@ringlet.net    roam@cnsys.bg    roam@FreeBSD.org
PGP key:	http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint	FDBA FD79 C26F 3C51 C95E  DF9E ED18 B68D 1619 4553
This sentence contains exactly threee erors.

--1yeeQ81UyVL57Vl7
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (FreeBSD)

iD8DBQFCHtHh7Ri2jRYZRVMRAhxbAJ0SV+5IlROF/DbuWo9wiN/fF3pkgACeOjcY
abaL/GGTSzgT/t6LMbhwsII=
=x73A
-----END PGP SIGNATURE-----

--1yeeQ81UyVL57Vl7--

From owner-freebsd-performance@FreeBSD.ORG  Fri Feb 25 10:29:14 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CDBEF16A4CE
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 10:29:14 +0000 (GMT)
Received: from cyrus.watson.org (cyrus.watson.org [204.156.12.53])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5171943D1D
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 10:29:14 +0000 (GMT)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by cyrus.watson.org (Postfix) with SMTP id 725CA46B3C;
	Fri, 25 Feb 2005 05:29:13 -0500 (EST)
Date: Fri, 25 Feb 2005 10:27:27 +0000 (GMT)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: Baris Simsek <simsek@enderunix.org>
In-Reply-To: <20050225070246.35459.qmail@istanbul.enderunix.org>
Message-ID: <Pine.NEB.3.96L.1050225101348.25686B-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: freebsd-performance@freebsd.org
Subject: Re: unix domain sockets vs. internet sockets
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 25 Feb 2005 10:29:14 -0000


On Fri, 25 Feb 2005, Baris Simsek wrote:

> I am coding a daemon program. I am not sure about which type of sockets
> i should use. Could you compare ip sockets and unix domain sockets? My
> main criterions are performance and protocol load. What are the
> differences between impelementations of them at kernel level?

There are a few differences that might be of interest, in addition to the
already pointed out difference that if you start out using IP sockets, you
don't have to migrate to them later when you want inter-machine
connectivity: 

- UNIX domain sockets use the file system as the address name space.  This
  means you can use UNIX file permissions to control access to communicate
  with them.  I.e., you can limit what other processes can connect to the
  daemon -- maybe one user can, but the web server can't, or the like.
  With IP sockets, the ability to connect to your daemon is exposed off
  the current system, so additional steps may have to be taken for
  security.  On the other hand, you get network transparency.  With UNIX
  domain sockets, you can actually retrieve the credential of the process
  that created the remote socket, and use that for access control also,
  which can be quite convenient on multi-user systems.

- IP sockets over localhost are basically looped back network on-the-wire
  IP.  There is intentionally "no special knowledge" of the fact that the
  connection is to the same system, so no effort is made to bypass the
  normal IP stack mechanisms for performance reasons.  For example,
  transmission over TCP will always involve two context switches to get to
  the remote socket, as you have to switch through the netisr, which
  occurs following the "loopback" of the packet through the synthetic
  loopback interface.  Likewise, you get all the overhead of ACKs, TCP
  flow control, encapsulation/decapsulation, etc.  Routing will be
  performed in order to decide if the packets go to the localhost.
  Large sends will have to be broken down into MTU-size datagrams, which
  also adds overhead for large writes.  It's really TCP, it just goes over
  a loopback interface by virtue of a special address, or discovering that
  the address requested is served locally rather than over an ethernet
  (etc). 

- UNIX domain sockets have explicit knowledge that they're executing on
  the same system.  They avoid the extra context switch through the
  netisr, and a sending thread will write the stream or datagrams directly
  into the receiving socket buffer.  No checksums are calculated, no
  headers are inserted, no routing is performed, etc.  Because they have
  access to the remote socket buffer, they can also directly provide
  feedback to the sender when it is filling, or more importantly,
  emptying, rather than having the added overhead of explicit
  acknowledgement and window changes.  The one piece of functionality that
  UNIX domain sockets don't provide that TCP does is out-of-band data.  In
  practice, this is an issue for almost noone.

In general, the argument for implementing over TCP is that it gives you
location independence and immediate portability -- you can move the client
or the daemon, update an address, and it will "just work".  The sockets
layer provides a reasonable abstraction of communications services, so
it's not hard to write an application so that the connection/binding
portion knows about TCP and UNIX domain sockets, and all the rest just
uses the socket it's given.  So if you're looking for performance locally,
I think UNIX domain sockets probably best meet your need.  Many people
will code to TCP anyway because performance is often less critical, and
the network portability benefit is substantial.

Right now, the UNIX domain socket code is covered by a subsystem lock; I
have a version that used more fine-grain locking, but have not yet
evaluated the performance impact of those changes.  I've you're running in
an SMP environment with four processors, it could be that those changes
might positively impact performance, so if you'd like the patches, let me
know.  Right now they're on my schedule to start testing, but not on the
path for inclusion in FreeBSD 5.4.  The primary benefit of greater
granularity would be if you had many pairs of threads/processes
communicating across processors using UNIX domain sockets, and as a result
there was substantial contention on the UNIX domain socket subsystem lock. 
The patches don't increase the cost of normal send/receive operations, but
due add extra mutex operations in the listen/accept/connect/bind paths.

Robert N M Watson

From owner-freebsd-performance@FreeBSD.ORG  Fri Feb 25 15:33:10 2005
Return-Path: <owner-freebsd-performance@FreeBSD.ORG>
Delivered-To: freebsd-performance@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D452716A4CE
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 15:33:10 +0000 (GMT)
Received: from gate.bitblocks.com (bitblocks.com [209.204.185.216])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9353E43D49
	for <freebsd-performance@freebsd.org>;
	Fri, 25 Feb 2005 15:33:10 +0000 (GMT)
	(envelope-from bakul@bitblocks.com)
Received: from bitblocks.com (localhost [127.0.0.1])
	by gate.bitblocks.com (8.13.1/8.13.1) with ESMTP id j1PFX4OS056850;
	Fri, 25 Feb 2005 07:33:04 -0800 (PST)
	(envelope-from bakul@bitblocks.com)
Message-Id: <200502251533.j1PFX4OS056850@gate.bitblocks.com>
To: "Baris Simsek" <simsek@enderunix.org>
In-reply-to: Your message of "Fri, 25 Feb 2005 09:02:45 +0200."
             <20050225070246.35459.qmail@istanbul.enderunix.org> 
Date: Fri, 25 Feb 2005 07:33:04 -0800
From: Bakul Shah <bakul@BitBlocks.com>
cc: freebsd-performance@freebsd.org
Subject: Re: unix domain sockets vs. internet sockets 
X-BeenThere: freebsd-performance@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Performance/tuning <freebsd-performance.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-performance>
List-Post: <mailto:freebsd-performance@freebsd.org>
List-Help: <mailto:freebsd-performance-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-performance>,
	<mailto:freebsd-performance-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 25 Feb 2005 15:33:10 -0000

> I am coding a daemon program. I am not sure about which type of sockets i 
> should use. Could you compare ip sockets and unix domain sockets? My main 
> criterions are performance and protocol load. What are the differences 
> between impelementations of them at kernel level? 

If you *don't want* remote processes access to your daemon,
use unix sockets and you don't have to worry about security
issues as much.  This is the main reason for choosing one
over the other.  You should also structure your code so that
you can change the choice later (as well as use SSL if
necessary).

Answering what you didn't ask :-)
Socket implementation performance differences is the wrong
thing to worry about this early in the game.  Instead of
trying to make it `as fast as possible', it might make more
sense to think about what is the expected load (or setting
load goals) and how to meet the required performance by
making sure the machine has enough resources and making sure
you can measure relevant performance parameters at any time.
How you modularize your daemon, protocol design (if you have
control over it), available memory, disk speed and data
organization (if you are accessing lots of data), algorithm
design, whether you can distribute load across machines etc.
will have a much bigger influence on overall performance than
the choice of socket type.  You *never* have enough time to
optimize everything so setting realistic performance goals
helps you meet them sooner -- you fix the top N bottlenecks
(which change as you tighten things) and stop when done!