Date:      Mon, 31 Jul 2000 08:41:14 -0700 (PDT)
From:      Matt Dillon <dillon@earth.backplane.com>
To:        gallatin@FreeBSD.ORG
Cc:        freebsd-stable@FreeBSD.ORG
Subject:   Re: NFS server running out of bufs & locking up
Message-ID:  <200007311541.IAA89271@earth.backplane.com>
References:   <14721.48065.766815.376959@grasshopper.cs.duke.edu>

    I've only got weekends available but I will try to look into it.

						-Matt

:I have an NFS server which I updated to RELENG_4 (as of Jul 13th) from
:4.0-current (as of Jul 13th 1999).  Since the upgrade, it has locked
:up 3 times; it had been up for 180+ days prior to the upgrade.
:
:The machine serves a large (64GB) volume striped across 4 ATA drives
:with CCD mounted with soft updates.  When it locks up, it is getting
:beaten upon by a compute farm of 50+ Solaris boxes running NFS over
:TCP (via 100Mb ethernet).
:
:When it locks, most procs are waiting in biofre, and the nfsd's are
:waiting on inode.  I've been unable to get a dump; the most I have
:is ps from ddb (appended below).  It's somewhat interesting that
:3 of the nfsds are waiting on the same inode.
:
:Stopped at      siointr1+0xb1:  jmp     siointr1+0x1a0
:db> ps
:  pid   proc     addr    uid  ppid  pgrp  flag stat wmesg   wchan   cmd
:  494 d28fb2a0 d292a000    0   120   120 002004  3  biofre c02aa6d8 amd
:  493 d0ad9c20 d2869000    0   373   492 004006  3  biofre c02aa6d8 grep
:  473 d0ad7380 d28df000 1597   320   473 004006  3  biofre c02aa6d8 netdump_server
:  460 d28fbe00 d28fc000 1597   458   460 004086  3   ttyin c1729630 tcsh
:  458 d0ad76c0 d28c8000    0   205   205 000084  3  select c02be9ec sshd1
:  394 d28fb440 d290f000    1     1   394 000104  3  biofre c02aa6d8 portmap
:  373 d0ad6ea0 d28eb000    0   285   373 2004086  3  opause d28eb108 tcsh
:  320 d0ad7040 d28e4000 1597   317   320 2004086  3  opause d28e4108 tcsh
:  317 d0ad71e0 d28e9000    0   205   205 000084  3  select c02be9ec sshd1
:  285 d0ad7520 d28db000 1387   283   285 2004086  3  opause d28db108 tcsh
:  283 d0ad7860 d28c4000    0   205   205 000084  3  select c02be9ec sshd1
:  260 d0ad7a00 d28bf000 1387   259   260 004106  3  biofre c02aa6d8 systat
:  259 d0ad83c0 d2899000 1387   233   259 004186  3  select c02be9ec xterm
:  233 d0ada440 d284e000 1387   230   233 004006  3   inode c16f8000 tcsh
:  230 d0ad8220 d289d000    0   205   205 000004  3  biofre c02aa6d8 sshd1
:  223 d0ada5e0 d284b000    0     1   223 004006  3  biofre c02aa6d8 getty
:  218 d0ad7ba0 d28bd000    0     1   218 000084  3  sbwait d0668acc zhm
:  205 d0ad7d40 d28b1000    0     1   205 000084  3  select c02be9ec sshd1
:  147 d0ad8080 d28a1000    0     1   147 2000184  3   pause d28a1108 sendmail
:  144 d0ad7ee0 d28a4000    0     1   144 000084  3  nanslp c02aa580 cron
:  142 d0ad9a80 d286c000    0     1   142 000084  3  select c02be9ec inetd
:  120 d0ad8be0 d2889000    0     1   120 000084  3  select c02be9ec amd
:  115 d0ad8560 d2895000    0     1   110 000084  3  nfsidl c02c0d4c nfsiod
:  114 d0ad8700 d2892000    0     1   110 000084  3  nfsidl c02c0d48 nfsiod
:  113 d0ad88a0 d288f000    0     1   110 000084  3  nfsidl c02c0d44 nfsiod
:  112 d0ad8a40 d288c000    0     1   110 000084  3  nfsidl c02c0d40 nfsiod
:  108 d0ad8d80 d2886000    0     1   108 000084  3  select c02be9ec rpc.statd
:  105 d0ad8f20 d2882000    0   100   100 000004  3   inode c16c3400 nfsd
:  104 d0ad90c0 d287f000    0   100   100 000004  3   inode c16c3400 nfsd
:  103 d0ad9260 d287c000    0   100   100 000004  3   inode c16c3400 nfsd
:  102 d0ad9400 d2878000    0   100   100 000004  3  biofre c02aa6d8 nfsd
:  100 d0ad95a0 d2875000    0     1   100 000084  3  accept d06663f6 nfsd
:   98 d0ad9740 d2872000    0     1    98 000084  3  select c02be9ec mountd
:   92 d0ad98e0 d286f000    0     1    92 000084  3  select c02be9ec ypbind
:   87 d0ad9dc0 d2866000    0     1    87 000084  3  select c02be9ec ntpd
:   80 d0ad9f60 d285c000    0     1    80 000084  3  select c02be9ec syslogd
:   33 d0ada100 d2858000    0     1    33 2000084  3   pause d2858108 adjkerntz
:   25 d0ada2a0 d2855000    0     1    25 000084  3  mfsidl d0ad3d00 mount_mfs
:    5 d0ada780 d0ae7000    0     0     0 000204  3  biofre c02aa6d8 syncer
:    4 d0ada920 d0ae5000    0     0     0 100204  3  psleep c02aa6a8 bufdaemon
:    3 d0adaac0 d0ae3000    0     0     0 000204  3  psleep c02b5fa0 vmdaemon
:    2 d0adac60 d0ae1000    0     0     0 100204  3  psleep c029c8b8 pagedaemon
:    1 d0adae00 d0adf000    0     0     1 004284  3    wait d0adae00 init
:    0 c02bdd80 c0322000    0     0     0 000204  3   sched c02bdd80 swapper
:
:
:About 30 seconds before this lockup, I was looking at how much buf
:space is available:
:
:#sysctl -a | grep buf
:kern.ipc.maxsockbuf: 262144
:kern.ipc.sockbuf_waste_factor: 8
:kern.ipc.mbuf_wait: 32
:kern.ipc.nmbufs: 10240
:vfs.nfs.bufpackets: 0
:vfs.numdirtybuffers: 18
:vfs.hidirtybuffers: 796
:vfs.numfreebuffers: 3083
:vfs.lofreebuffers: 177
:vfs.hifreebuffers: 354
:vfs.runningbufspace: 32768
:vfs.maxbufspace: 50872320
:vfs.hibufspace: 50216960
:vfs.lobufspace: 50151424
:vfs.bufspace: 50151424
:vfs.maxmallocbufspace: 2510848
:vfs.bufmallocspace: 4096
:vfs.getnewbufcalls: 512584
:vfs.getnewbufrestarts: 0
:vfs.bufdefragcnt: 0
:vfs.buffreekvacnt: 0
:vfs.bufreusecnt: 3061
:vfs.reassignbufcalls: 426411
:vfs.reassignbufloops: 0
:vfs.reassignbufsortgood: 144583
:vfs.reassignbufsortbad: 4776
:vfs.reassignbufmethod: 1
:vfs.aio.max_buf_aio: 16
:vfs.aio.num_buf_aio: 0
:debug.bpf_bufsize: 4096
:debug.bpf_maxbufsize: 524288
:machdep.msgbuf: 
:machdep.msgbuf_clear: 0
:
:
:I have appended my config file & boot messages.
:
:Thanks for any help you can give,
:
:Drew
:
:------------------------------------------------------------------------------
:Andrew Gallatin, Sr Systems Programmer	http://www.cs.duke.edu/~gallatin
:Duke University				Email: gallatin@cs.duke.edu
:Department of Computer Science		Phone: (919) 660-6590
:
:
:Copyright (c) 1992-2000 The FreeBSD Project.
:Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
:        The Regents of the University of California. All rights reserved.
:FreeBSD 4.0-STABLE #0: Thu Jul 13 12:11:33 EDT 2000
:    gallatin@grits.cs.duke.edu:/usr/src/sys/compile/NFSSERVER
:Timecounter "i8254"  frequency 1193182 Hz
:Timecounter "TSC"  frequency 451024727 Hz
:CPU: Pentium II/Pentium II Xeon/Celeron (451.02-MHz 686-class CPU)
:  Origin = "GenuineIntel"  Id = 0x652  Stepping = 2
:  Features=0x183f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR>
:real memory  = 402640896 (393204K bytes)
:avail memory = 388145152 (379048K bytes)
:Preloaded elf kernel "kernel" at 0xc030f000.
:ccd0-3: Concatenated disk drivers
:Pentium Pro MTRR support enabled
:npx0: <math processor> on motherboard
:npx0: INT 16 interface
:pcib0: <Intel 82443BX (440 BX) host to PCI bridge> on motherboard
:pci0: <PCI bus> on pcib0
:pcib1: <Intel 82443BX (440 BX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
:pci1: <PCI bus> on pcib1
:isab0: <Intel 82371AB PCI to ISA bridge> at device 4.0 on pci0
:isa0: <ISA bus> on isab0
:atapci0: <Intel PIIX4 ATA33 controller> port 0xd800-0xd80f at device 4.1 on pci0
:ata0: at 0x1f0 irq 14 on atapci0
:pci0: <Intel 82371AB/EB (PIIX4) USB controller> at 4.2
:chip1: <Intel 82371AB Power management controller> port 0xe800-0xe80f at device 4.3 on pci0
:atapci1: <Promise ATA33 controller> port 0xa800-0xa81f,0xb004-0xb007,0xb400-0xb407,0xb804-0xb807,0xd000-0xd007 irq 12 at device 9.0 on pci0
:ata2: at 0xd000 on atapci1
:ata3: at 0xb400 on atapci1
:fxp0: <Intel Pro 10/100B/100+ Ethernet> port 0xa400-0xa41f mem 0xe2000000-0xe20fffff,0xe3000000-0xe3000fff irq 10 at device 10.0 on pci0
:fxp0: Ethernet address 00:a0:c9:e7:95:bb
:atapci2: <Promise ATA33 controller> port 0x8800-0x881f,0x9004-0x9007,0x9400-0x9407,0x9804-0x9807,0xa000-0xa007 irq 11 at device 12.0 on pci0
:ata4: at 0xa000 on atapci2
:ata5: at 0x9400 on atapci2
:fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
:fdc0: FIFO enabled, 8 bytes threshold
:fd0: <1440-KB 3.5" drive> on fdc0 drive 0
:atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
:sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
:sio0: type 16550A, console
:sio1 at port 0x2f8-0x2ff irq 3 on isa0
:sio1: type 16550A
:ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
:ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
:ppc0: FIFO with 16/16/9 bytes threshold
:ppi0: <Parallel I/O> on ppbus0
:lpt0: <Printer> on ppbus0
:lpt0: Interrupt-driven port
:plip0: <PLIP network interface> on ppbus0
:ad0: 4892MB <QUANTUM FIREBALL EX5.1A> [10602/15/63] at ata0-master using UDMA33
:ad1: 16479MB <Maxtor 91728D8> [33483/16/63] at ata2-master using UDMA33
:ad2: 16479MB <Maxtor 91728D8> [33483/16/63] at ata3-master using UDMA33
:ad3: 16479MB <Maxtor 91728D8> [33483/16/63] at ata4-master using UDMA33
:ad4: 16479MB <Maxtor 91728D8> [33483/16/63] at ata5-master using UDMA33
:Mounting root from ufs:/dev/ad0s1a
:WARNING: / was not properly dismounted
:
:
:## NFSSERVER
:machine		i386
:cpu		I686_CPU
:ident		NFSSERVER
:maxusers	128
:
:#makeoptions	DEBUG=-g		#Build kernel with gdb(1) debug symbols
:
:options 	INET			#InterNETworking
:options 	FFS			#Berkeley Fast Filesystem
:options 	FFS_ROOT		#FFS usable as root device [keep this!]
:options 	SOFTUPDATES		#Enable FFS soft updates support
:options 	MFS			#Memory Filesystem
:options 	NFS			#Network Filesystem
:options 	MSDOSFS			#MSDOS Filesystem
:options 	CD9660			#ISO 9660 Filesystem
:options 	CD9660_ROOT		#CD-ROM usable as root, CD9660 required
:options 	PROCFS			#Process filesystem
:options 	COMPAT_43		#Compatible with BSD 4.3 [KEEP THIS!]
:options 	SCSI_DELAY=1500	#Delay (in ms) before probing SCSI
:options 	UCONSOLE		#Allow users to grab the console
:options 	USERCONFIG		#boot -c editor
:options 	VISUAL_USERCONFIG	#visual boot -c editor
:options 	KTRACE			#ktrace(1) support
:options 	SYSVSHM			#SYSV-style shared memory
:options 	SYSVMSG			#SYSV-style message queues
:options 	SYSVSEM			#SYSV-style semaphores
:options 	P1003_1B		#Posix P1003_1B real-time extensions
:options 	_KPOSIX_PRIORITY_SCHEDULING
:options		ICMP_BANDLIM		#Rate limit bad replies
:options 	KBD_INSTALL_CDEV	# install a CDEV entry in /dev
:
:# To make an SMP kernel, the next two are needed
:#options 	SMP			# Symmetric MultiProcessor Kernel
:#options 	APIC_IO			# Symmetric (APIC) I/O
:# Optionally these may need tweaked, (defaults shown):
:#options 	NCPU=2			# number of CPUs
:#options 	NBUS=4			# number of busses
:#options 	NAPIC=1			# number of IO APICs
:#options 	NINTR=24		# number of INTs
:
:device		isa
:device		pci
:
:# Floppy drives
:device		fdc0	at isa? port IO_FD1 irq 6 drq 2
:device		fd0	at fdc0 drive 0
:
:# ATA and ATAPI devices
:device		ata0	at isa? port IO_WD1 irq 14
:device		ata1	at isa? port IO_WD2 irq 15
:device		ata
:device		atadisk			# ATA disk drives
:device		atapicd			# ATAPI CDROM drives
:
:
:# SCSI Controllers
:#device		ahb		# EISA AHA1742 family
:device		ahc		# AHA2940 and onboard AIC7xxx devices
:#device		amd		# AMD 53C974 (Teckram DC-390(T))
:#device		dpt		# DPT Smartcache - See LINT for options!
:#device		isp		# Qlogic family
:#device		ncr		# NCR/Symbios Logic
:device		sym		# NCR/Symbios Logic (newer chipsets)
:options		SYM_SETUP_LP_PROBE_MAP=0x40
:				# Allow ncr to attach legacy NCR devices when 
:				# both sym and ncr are configured
:
:#device		adv0	at isa?
:#device		adw
:#device		bt0	at isa?
:#device		aha0	at isa?
:#device		aic0	at isa?
:
:# SCSI peripherals
:device		scbus		# SCSI bus (required)
:device		da		# Direct Access (disks)
:device		sa		# Sequential Access (tape etc)
:device		cd		# CD
:device		pass		# Passthrough device (direct SCSI access)
:
:# atkbdc0 controls both the keyboard and the PS/2 mouse
:device		atkbdc0	at isa? port IO_KBD
:device		atkbd0	at atkbdc? irq 1 flags 0x1
:device		psm0	at atkbdc? irq 12
:
:device		vga0	at isa?
:
:# splash screen/screen saver
:pseudo-device	splash
:
:# syscons is the default console driver, resembling an SCO console
:device		sc0	at isa? flags 0x100
:
:# Floating point support - do not disable.
:device		npx0	at nexus? port IO_NPX irq 13
:
:# Serial (COM) ports
:device		sio0	at isa? port IO_COM1 flags 0x10 irq 4
:device		sio1	at isa? port IO_COM2 irq 3
:
:# Parallel port
:device		ppc0	at isa? irq 7
:device		ppbus		# Parallel port bus (required)
:device		lpt		# Printer
:device		plip		# TCP/IP over parallel
:device		ppi		# Parallel port interface device
:device		vpo		# Requires scbus and da
:
:
:# PCI Ethernet NICs.
:device		de		# DEC/Intel DC21x4x (``Tulip'')
:device		fxp		# Intel EtherExpress PRO/100B (82557, 82558)
:
:# Pseudo devices - the number indicates how many units to allocate.
:pseudo-device	loop		# Network loopback
:pseudo-device	ether		# Ethernet support
:pseudo-device	pty		# Pseudo-ttys (telnet etc)
:
:# The `bpf' pseudo-device enables the Berkeley Packet Filter.
:# Be aware of the administrative consequences of enabling this!
:pseudo-device	bpf		#Berkeley packet filter
:
:pseudo-device	ccd	4	#Concatenated disk driver
:
:# Size of the kernel message buffer.  Should be N * pagesize.
:options 	MSGBUF_SIZE=40960
:
:#
:# Enable the kernel debugger.
:#
:options 	DDB
:
:#
:# Don't drop into DDB for a panic. Intended for unattended operation
:# where you may want to drop to DDB from the console, but still want
:# the machine to recover from a panic
:#
:options 	DDB_UNATTENDED
:
:# Options for serial drivers that support consoles (only for sio now):
:options 	BREAK_TO_DEBUGGER	#a BREAK on a comconsole goes to
:					#DDB, if available.


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-stable" in the body of the message



