Date: Thu, 10 Aug 2006 14:07:51 +0300
From: Pavel Merdin <freebsd-stable1@merdin.com>
To: freebsd-stable@freebsd.org
Subject: 6-stable locking problem
Message-ID: <292315388.20060810140751@fotki.com>
Hello.

There's a problem with a very busy server (an ad server; the CPU is close to 0% idle most of the time).

Configuration:

  Dual AMD Opteron 252 2.6GHz
  Chipset: AMD 8131
  Integrated LAN Controller: Broadcom BCM5704 dual-channel GbE Gigabit
  Adaptec AIC-7902W Ultra 320 SCSI controller
  amr0: <LSILogic MegaRAID 1.53>

We tried both 6.1-RELEASE and 6-STABLE amd64 kernels. (The bge driver is always from recent stable, with full Broadcom support.)

The server hangs one or more times a day. It even hangs for some time right after the boot sequence finishes (when the "login:" prompt appears). During a hang everything stops, even the keyboard (interrupts). We have already removed PREEMPTION and Linux support.

Sometimes the server panics with:

Sleeping thread (tid 100006, pid 4) owns a non-sleepable lock
panic: sleeping thread
cpuid=0
KDB: enter: panic

and hangs there without even starting a debugger. pid 4 seems to be [g_down].

Today I compiled a kernel with INVARIANTS and WITNESS. Right after the boot sequence I got the following:

Aug 10 04:37:09 ad1 kernel: lock order reversal: (Giant after non-sleepable)
Aug 10 04:37:09 ad1 kernel: 1st 0xffffff026c4ebe70 AMR List Lock (AMR List Lock) @ dev/amr/amr.c:403
Aug 10 04:37:09 ad1 kernel: 2nd 0xffffffff8073adc0 Giant (Giant) @ vm/vm_contig.c:579
Aug 10 04:37:09 ad1 kernel: KDB: stack backtrace:
Aug 10 04:37:09 ad1 kernel: kdb_backtrace() at kdb_backtrace+0x37
Aug 10 04:37:09 ad1 kernel: witness_checkorder() at witness_checkorder+0x6fb
Aug 10 04:37:09 ad1 kernel: _mtx_lock_flags() at _mtx_lock_flags+0x9a
Aug 10 04:37:09 ad1 kernel: contigmalloc() at contigmalloc+0x57
Aug 10 04:37:09 ad1 kernel: alloc_bounce_pages() at alloc_bounce_pages+0x75
Aug 10 04:37:09 ad1 kernel: bus_dmamap_create() at bus_dmamap_create+0x149
Aug 10 04:37:09 ad1 kernel: amr_alloccmd_cluster() at amr_alloccmd_cluster+0x102
Aug 10 04:37:09 ad1 kernel: amr_alloccmd() at amr_alloccmd+0x55
Aug 10 04:37:09 ad1 kernel: amr_bio_command() at amr_bio_command+0x27
Aug 10 04:37:09 ad1 kernel: amr_startio() at amr_startio+0x6a
Aug 10 04:37:09 ad1 kernel: amr_submit_bio() at amr_submit_bio+0x51
Aug 10 04:37:09 ad1 kernel: amrd_strategy() at amrd_strategy+0x23
Aug 10 04:37:09 ad1 kernel: g_disk_start() at g_disk_start+0x17d
Aug 10 04:37:09 ad1 kernel: g_io_schedule_down() at g_io_schedule_down+0x189
Aug 10 04:37:09 ad1 kernel: g_down_procbody() at g_down_procbody+0x80
Aug 10 04:37:09 ad1 kernel: fork_exit() at fork_exit+0xdf
Aug 10 04:37:09 ad1 kernel: fork_trampoline() at fork_trampoline+0xe
Aug 10 04:37:09 ad1 kernel: --- trap 0, rip = 0, rsp = 0xffffffffb8e8bd00, rbp = 0 ---

Any advice (other than a suggestion to switch to Linux)?

-- 
/ Pavel Merdin
Fotki Inc.