Date: Wed, 23 Feb 2005 11:51:54 -0700
From: Scott Long <scottl@samsco.org>
To: David Rice <drice@globat.com>
Cc: Robert Watson <rwatson@freebsd.org>
Subject: Re: High traffic NFS performance and availability problem
Message-ID: <421CD0CA.10601@samsco.org>
In-Reply-To: <200502231044.54801.drice@globat.com>
References: <Pine.NEB.3.96L.1050221212016.30319D-100000@fledge.watson.org> <200502231044.54801.drice@globat.com>
David,

Sorry for the misinformation about the AMR status earlier in the
thread. I forgot that I was holding off on merging the MPSAFE work to
5-STABLE for a bit. LSI is getting involved in active maintainership
again, and I'm working with them to review all of the changes so far
and fix some of the bugs that I accidentally introduced. Hopefully
we'll have a resolution by the end of the week, after which I'll
prepare the updated driver for inclusion in 5.4.

Scott

David Rice wrote:
> Where can I find the MPSAFE version of the amr PERC driver?
> I checked the release notes for 5.3-STABLE and they make no reference to
> the amr driver being MPSAFE.
>
>
> On Monday 21 February 2005 01:26 pm, Robert Watson wrote:
>
>>On Mon, 21 Feb 2005, David Rice wrote:
>>
>>>Here are the snapshots of the output you requested. These are from the
>>>NFS server. We have just upgraded them to 5.3-RELEASE as so many have
>>>recommended. Hope that makes them more stable. The performance still
>>>needs some attention.
>>
>>In the top output below, it looks like there's a lot of contention on
>>Giant. In 5.3-RELEASE and before, the amr driver is not MPSAFE, but my
>>understanding is that in 5-STABLE, it has been made MPSAFE, which may make
>>quite a difference in performance. I pinged Scott Long, who did the work
>>on the driver, and he indicated that backporting the patch to run on
>>-RELEASE would be quite difficult, so an upgrade to 5-STABLE is the best
>>way to get the changes. I believe that you can build a 5-STABLE kernel
>>and run with a 5.3-RELEASE user space to avoid having to commit to a full
>>upgrade to see if that helps or not.
>>
>>Two other observations:
>>
>>- It looks like the amr storage array is pretty busy, which may be part of
>>  the issue.
>>
>>- It looks like you have four processors, suggesting a two-processor Xeon
>>  with hyper-threading turned on.
>>  For many workloads, hyper-threading does
>>  not improve performance, so you may want to try turning that off in the
>>  BIOS to see if that helps.
>>
>>Robert N M Watson
>>
>>
>>>Thank You
>>>
>>>-------------------------------------------------------------------------
>>>D USERNAME PRI NICE SIZE RES STATE C TIME WCPU CPU COMMAND
>>>    4 users    Load  5.28 19.37 28.00                  Feb 21 12:18
>>>
>>>Mem:KB REAL VIRTUAL VN PAGER SWAP
>>>PAGER Tot Share Tot Share Free in out in out
>>>Act 19404 2056 90696 3344 45216 count
>>>All 1020204 4280 4015204 7424 pages
>>> zfod
>>>Interrupts Proc:r p d s w Csw Trp Sys Int Sof Flt cow
>>> 7226 total 5128 5 60861 3 14021584 9 152732 wire
>>>4: sio0 23228 act 6: fdc0 30.2%Sys 11.8%Intr 0.0%User 0.0%Nice
>>>58.0%Idl 803616 inact 128 8: rtc
>>>
>>>| | | | | | | | | | 43556 cache 13:
>>>| | | | | | | | | | npx
>>>
>>>===============++++++ 1660 free 15:
>>>ata daefr 6358 16: bge Namei Name-cache Dir-cache
>>> prcfr 1 17: bge Calls hits % hits %
>>> react 18: mpt 1704 971 57 11 1
>>> pdwak 19: mpt 5342 pdpgs 639 24: amr Disks amrd0 da0
>>>pass0 pass1 pass2 intrn 100 0: clk KB/t 22.41
>>>0.00 0.00 0.00 0.00 114288 buf
>>>tps 602 0 0 0 0 510 dirtybuf
>>>MB/s 13.16 0.00 0.00 0.00 0.00 70235 desiredvnodes
>>>% busy 100 0 0 0 0 20543 numvnodes
>>> 7883 freevnodes
>>>-------------------------------------------------------------------------
>>>last pid: 10330;  load averages: 14.69, 11.81, 18.62  up 0+09:01:13 12:32:57
>>>226 processes: 5 running, 153 sleeping, 57 waiting, 11 lock
>>>CPU states: 0.1% user, 0.0% nice, 66.0% system, 24.3% interrupt, 9.6% idle
>>>Mem: 23M Active, 774M Inact, 150M Wired, 52M Cache, 112M Buf, 1660K Free
>>>Swap: 1024M Total, 124K Used, 1024M Free
>>>
>>>  PID USERNAME PRI NICE   SIZE    RES STATE  C    TIME   WCPU    CPU COMMAND
>>>   63 root     -44 -163     0K    12K WAIT   0  147:05 45.07% 45.07% swi1: net
>>>   30 root     -68 -187     0K    12K WAIT   0  101:39 32.32% 32.32% irq16: bge0
>>>   12 root     117    0     0K    12K CPU2   2  329:09 19.58% 19.58% idle: cpu2
>>>   11 root     116    0     0K    12K CPU3   3  327:29 19.24% 19.24% idle: cpu3
>>>   13 root     114    0     0K    12K RUN    1  263:39 16.89% 16.89% idle: cpu1
>>>   14 root     109    0     0K    12K CPU0   0  228:50 12.06% 12.06% idle: cpu0
>>>  368 root       4    0  1220K   740K *Giant 3   45:27  7.52%  7.52% nfsd
>>>  366 root       4    0  1220K   740K *Giant 0   48:52  7.28%  7.28% nfsd
>>>  364 root       4    0  1220K   740K *Giant 3   53:01  7.13%  7.13% nfsd
>>>  367 root      -8    0  1220K   740K biord  3   41:22  7.08%  7.08% nfsd
>>>  372 root       4    0  1220K   740K *Giant 0   28:54  7.08%  7.08% nfsd
>>>  365 root      -1    0  1220K   740K *Giant 3   51:53  6.93%  6.93% nfsd
>>>  370 root      -1    0  1220K   740K nfsslp 0   32:49  6.84%  6.84% nfsd
>>>  369 root      -8    0  1220K   740K biord  1   36:40  6.49%  6.49% nfsd
>>>  371 root       4    0  1220K   740K *Giant 0   25:14  6.45%  6.45% nfsd
>>>  374 root      -1    0  1220K   740K nfsslp 2   22:31  6.45%  6.45% nfsd
>>>  377 root       4    0  1220K   740K *Giant 2   17:21  5.52%  5.52% nfsd
>>>  376 root      -4    0  1220K   740K *Giant 2   15:45  5.37%  5.37% nfsd
>>>  373 root      -4    0  1220K   740K ufs    3   19:38  5.18%  5.18% nfsd
>>>  378 root       4    0  1220K   740K *Giant 2   13:55  4.54%  4.54% nfsd
>>>  379 root      -8    0  1220K   740K biord  3   12:41  4.49%  4.49% nfsd
>>>  380 root       4    0  1220K   740K -      2   11:26  4.20%  4.20% nfsd
>>>    3 root      -8    0     0K    12K -      1   21:21  4.05%  4.05% g_up
>>>    4 root      -8    0     0K    12K -      0   20:05  3.96%  3.96% g_down
>>>  381 root       4    0  1220K   740K -      3    9:28  3.66%  3.66% nfsd
>>>  382 root       4    0  1220K   740K -      1   10:13  3.47%  3.47% nfsd
>>>  385 root      -1    0  1220K   740K nfsslp 3    7:21  3.17%  3.17% nfsd
>>>   38 root     -64 -183     0K    12K *Giant 0   14:45  3.12%  3.12% irq24: amr0
>>>  384 root       4    0  1220K   740K -      3    8:40  3.12%  3.12% nfsd
>>>   72 root     -24 -143     0K    12K WAIT   2   16:50  2.98%  2.98% swi6:+
>>>  383 root      -8    0  1220K   740K biord  2    7:57  2.93%  2.93% nfsd
>>>  389 root       4    0  1220K   740K -      2    5:31  2.64%  2.64% nfsd
>>>  390 root      -8    0  1220K   740K biord  3    5:54  2.59%  2.59% nfsd
>>>  387 root      -8    0  1220K   740K biord  0    6:40  2.54%  2.54% nfsd
>>>  386 root      -8    0  1220K   740K biord  1    6:22  2.44%  2.44% nfsd
>>>  392 root       4    0  1220K   740K -      3    4:27  2.10%  2.10% nfsd
>>>  388 root      -4    0  1220K   740K *Giant 2    4:45  2.05%  2.05% nfsd
>>>  395 root       4    0  1220K   740K -      0    3:59  2.05%  2.05% nfsd
>>>  391 root       4    0  1220K   740K -      2    5:10  1.95%  1.95% nfsd
>>>  393 root       4    0  1220K   740K sbwait 1    4:13  1.56%  1.56% nfsd
>>>  398 root       4    0  1220K   740K -      2    3:31  1.56%  1.56% nfsd
>>>  399 root       4    0  1220K   740K -      3    3:12  1.56%  1.56% nfsd
>>>  401 root       4    0  1220K   740K -      1    2:57  1.51%  1.51% nfsd
>>>  403 root       4    0  1220K   740K -      0    3:04  1.42%  1.42% nfsd
>>>  406 root       4    0  1220K   740K -      1    2:27  1.37%  1.37% nfsd
>>>  397 root       4    0  1220K   740K -      3    3:16  1.27%  1.27% nfsd
>>>  396 root       4    0  1220K   740K -      2    3:42  1.22%  1.22% nfsd
>>>
>>>On Saturday 19 February 2005 04:23 am, Robert Watson wrote:
>>>
>>>>On Thu, 17 Feb 2005, David Rice wrote:
>>>>
>>>>>Typically we have 7 client boxes mounting storage from a single file
>>>>>server. Each client box serves 1000 web sites and associated email.
>>>>>We have done the basic NFS tuning (i.e., read/write size optimization
>>>>>and kernel tuning).
>>>>
>>>>How many nfsd's are you running with?
>>>>
>>>>If you run systat -vmstat 1 on your server under high load, could you
>>>>send us the output? In particular, I'm interested in knowing how the
>>>>system is spending its time, the paging level, I/O throughput on
>>>>devices, and the systat -vmstat summary screen provides a good summary
>>>>of this and more. A few snapshots of "gstat" output would also be very
>>>>helpful. As would a snapshot or two of "top -S" output. This will
>>>>give us a picture of how the system is spending its time.
>>>>
>>>>
>>>>>2. Client boxes have high load averages and sometimes crash due to
>>>>>slow NFS performance.
>>>>
>>>>Could you be more specific about the crash failure mode?
>>>>
>>>>
>>>>>3. File servers that randomly crash with "Fatal trap 12: page fault
>>>>>while in kernel mode"
>>>>
>>>>Could you make sure you're running with at least the latest 5.3 patch
>>>>level on the server, which includes some NFS server stability fixes,
>>>>and also look at sliding to the head of 5-STABLE?
>>>>There are a number of performance and stability improvements that may
>>>>be relevant there.
>>>>
>>>>Could you provide serial console output of the full panic message, trap
>>>>details, compile the kernel with KDB+DDB, and include a full stack
>>>>trace? I'm happy to try to help debug these problems.
>>>>
>>>>
>>>>>4. With soft updates enabled during FSCK the fileserver will freeze
>>>>>with all NFS processes in the "snaplck" state. We disabled soft
>>>>>updates because of this.
>>>>
>>>>If it's possible to get some more information, it would be quite
>>>>helpful. In particular, if you could compile the server kernel with
>>>>DDB+KDB+BREAK_TO_DEBUGGER, break into the serial debugger when it
>>>>appears wedged, and send the output of "show lockedvnods", "ps", and
>>>>"trace <pid>" for any processes listed in the "show lockedvnods" output,
>>>>that would be great. A crash dump would also be very helpful. For
>>>>some hints on the information that is necessary here, take a look at
>>>>the handbook chapter on kernel debugging and reporting kernel bugs, and
>>>>my recent post to current@ diagnosing a similar bug.
>>>>
>>>>If you re-enable soft updates but leave bgfsck disabled, does that
>>>>correct this stability problem?
>>>>
>>>>In any case, I'm happy to help try to figure out what's going on --
>>>>some of the above information for stability and performance problems
>>>>would be quite helpful in tracking it down.
>>>>
>>>>Robert N M Watson
>>
>>_______________________________________________
>>freebsd-performance@freebsd.org mailing list
>>http://lists.freebsd.org/mailman/listinfo/freebsd-performance
>>To unsubscribe, send any mail to
>>"freebsd-performance-unsubscribe@freebsd.org"
>
>
>
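
Robert's reading of the quoted "top -S" snapshot (many nfsd threads
blocked on Giant) can also be checked mechanically. What follows is
only a minimal Python sketch and not something posted in the thread:
it tallies the STATE column for rows of one command, assuming the
12-column layout seen in the snapshot (PID USERNAME PRI NICE SIZE RES
STATE C TIME WCPU CPU COMMAND). The name tally_states and the SAMPLE
constant are illustrative; the sample rows are copied from the quoted
output.

```python
from collections import Counter

# A handful of rows copied from the "top -S" snapshot quoted in the thread.
SAMPLE = """\
  368 root       4    0  1220K   740K *Giant 3   45:27  7.52%  7.52% nfsd
  367 root      -8    0  1220K   740K biord  3   41:22  7.08%  7.08% nfsd
  370 root      -1    0  1220K   740K nfsslp 0   32:49  6.84%  6.84% nfsd
  373 root      -4    0  1220K   740K ufs    3   19:38  5.18%  5.18% nfsd
  380 root       4    0  1220K   740K -      2   11:26  4.20%  4.20% nfsd
   38 root     -64 -183     0K    12K *Giant 0   14:45  3.12%  3.12% irq24: amr0
"""


def tally_states(top_text, command="nfsd"):
    """Count wait states (STATE column) for rows whose COMMAND matches.

    Assumes whitespace-separated rows with STATE as the 7th field and
    COMMAND starting at the 12th; other lines are skipped.
    """
    counts = Counter()
    for line in top_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a process row.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        state = fields[6]
        cmd = " ".join(fields[11:])  # COMMAND may contain spaces
        if cmd == command:
            counts[state] += 1
    return counts


if __name__ == "__main__":
    for state, n in tally_states(SAMPLE).most_common():
        print(f"{state:8s} {n}")
```

Running the same function over a full snapshot saved to a file gives a
quick count of how many nfsd threads sit in "*Giant" versus "biord" or
"nfsslp" at that instant. Comparing counts across several snapshots is
more telling than any single one.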