From owner-freebsd-bugs@FreeBSD.ORG Tue Jun 15 20:40:01 2010 Return-Path: Delivered-To: freebsd-bugs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B2523106566C for ; Tue, 15 Jun 2010 20:40:01 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8AE338FC0A for ; Tue, 15 Jun 2010 20:40:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o5FKe191046901 for ; Tue, 15 Jun 2010 20:40:01 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o5FKe1iW046900; Tue, 15 Jun 2010 20:40:01 GMT (envelope-from gnats) Resent-Date: Tue, 15 Jun 2010 20:40:01 GMT Resent-Message-Id: <201006152040.o5FKe1iW046900@freefall.freebsd.org> Resent-From: FreeBSD-gnats-submit@FreeBSD.org (GNATS Filer) Resent-To: freebsd-bugs@FreeBSD.org Resent-Reply-To: FreeBSD-gnats-submit@FreeBSD.org, Dmitry Pryanishnikov Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D99321065678 for ; Tue, 15 Jun 2010 20:38:38 +0000 (UTC) (envelope-from nobody@FreeBSD.org) Received: from www.freebsd.org (www.freebsd.org [IPv6:2001:4f8:fff6::21]) by mx1.freebsd.org (Postfix) with ESMTP id C70938FC13 for ; Tue, 15 Jun 2010 20:38:38 +0000 (UTC) Received: from www.freebsd.org (localhost [127.0.0.1]) by www.freebsd.org (8.14.3/8.14.3) with ESMTP id o5FKccwh024100 for ; Tue, 15 Jun 2010 20:38:38 GMT (envelope-from nobody@www.freebsd.org) Received: (from nobody@localhost) by www.freebsd.org (8.14.3/8.14.3/Submit) id o5FKccvn024098; Tue, 15 Jun 2010 20:38:38 GMT (envelope-from nobody) Message-Id: <201006152038.o5FKccvn024098@www.freebsd.org> Date: Tue, 15 Jun 2010 20:38:38 GMT From: Dmitry Pryanishnikov To: freebsd-gnats-submit@FreeBSD.org X-Send-Pr-Version: www-3.1 Cc: Subject: kern/147890: [regression] ufs-related lock problem in RELENG_8 (18.04.2010 -> 20.04.2010) X-BeenThere: freebsd-bugs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Bug reports List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Jun 2010 20:40:01 -0000 >Number: 147890 >Category: kern >Synopsis: [regression] ufs-related lock problem in RELENG_8 (18.04.2010 -> 20.04.2010) >Confidential: no >Severity: serious >Priority: medium >Responsible: freebsd-bugs >State: open >Quarter: >Keywords: >Date-Required: >Class: sw-bug >Submitter-Id: current-users >Arrival-Date: Tue Jun 15 20:40:01 UTC 2010 >Closed-Date: >Last-Modified: >Originator: Dmitry Pryanishnikov >Release: RELENG_8 >Organization: Home >Environment: FreeBSD lynx.homenet 8.0-STABLE FreeBSD 8.0-STABLE #0: Tue Jun 15 15:05:05 EEST 2010 root@lynx.homenet:/databig/obj/databig/ftp/RELENG_8/src.100418/sys/lynx i386 >Description: After upgrade RELENG_8 as of 10.04.2010 -> 14.06.2010 my desktop (previously running w/o problems for days) started to lock pretty fast (within 1-2 hours of uptime). I've narrowed down the time window of the problem appearance: kernel built from the sources as of date=2010.04.18.00.00.00 runs OK, one as of date=2010.04.20.00.00.00 already locks. Nothing "special" is required in order to trigger the condition: after boot I just log in as a non-root (via xdm - it starts xfce4 environment), then start the Thunar File Manager and GQview. Then I switch to text console (crash dump often doesn't complete when console is running X) and just do 'ps axww|grep D' from time to time. Eventually processes thunar-vfs-mime-cleaner-1 (stuck in uninterruptible wait state) start to accumulate. In this state, simple 'sync' also hangs the issuing shell, and (the most painful result) file operations (file creations/moves/...) don't get synched to the disk. Then deadlock resolver (after debug.deadl kres.slptime_threshold seconds) detects the deadlock, which allows to get crash dump. I've got valid dumps both for kernel built from 14.06.2010 and from 20.04.2010 sources, they look similar. Here is the crash of the latter: panic: deadlkres: possible deadlock detected for 0xcb52e4a0, blocked for 294235 ticks (I've lowered debug.deadlkres.slptime_threshold manually after detecting hung thunar-vfs-mime-cleaner-1 processes) (kgdb) print *(struct thread *)0xcb52e4a0 $1 = {td_lock = 0xc07db49c, td_proc = 0xcb5002a8, td_plist = {tqe_next = 0x0, tqe_prev = 0xcb5002b0}, td_runq = {tqe_next = 0x0, tqe_prev = 0xc07ce834}, td_slpq = {tqe_next = 0xcb52db90, tqe_prev = 0xc7383018}, td_lockq = {tqe_next = 0x0, tqe_prev = 0x0}, td_cpuset = 0xc702ce38, td_sel = 0x0, td_sleepqueue = 0x0, td_turnstile = 0xc7ca2c00, td_umtxq = 0xcabc3940, td_tid = 100186, td_sigqueue = {sq_signals = {__bits = {0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xcb52e500}, sq_proc = 0xcb5002a8, sq_flags = 1}, td_flags = 4, td_inhibitors = 2, td_pflags = 0, td_dupfd = -1, td_sqqueue = 0, td_wchan = 0xcb3a9df4, td_wmesg = 0xc0735016 "ufs", td_lastcpu = 2 '\002', td_oncpu = 255 'Ъ', td_owepreempt = 0 '\0', td_tsqueue = 0 '\0', td_locks = 0, td_rw_rlocks = 0, td_lk_slocks = 0, td_blocked = 0x0, td_lockname = 0x0, td_contested = {lh_first = 0x0}, td_sleeplocks = 0xc0908618, td_intr_nesting_level = 0, td_pinned = 0, td_ucred = 0xc7668580, td_estcpu = 0, td_slptick = 703512, td_blktick = 0, td_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 224, ru_ixrss = 12, ru_idrss = 4, ru_isrss = 128, ru_minflt = 145, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 1, ru_nivcsw = 10}, td_incruntime = 0, td_runtime = 23985873, td_pticks = 0, td_sticks = 0, td_iticks = 0, td_uticks = 0, td_intrval = 0, td_oldsigmask = {__bits = {0, 0, 0, 0}}, td_sigmask = {__bits = {0, 0, 0, 0}}, td_generation = 11, td_sigstk = {ss_sp = 0x0, ss_size = 0, ss_flags = 4}, td_xsig = 0, td_profil_addr = 0, td_profil_ticks = 0, td_name = "initial thread\000-cle", td_fpop = 0x0, td_dbgflags = 0, td_ng_outbound = 0, td_osd = { osd_nslots = 0, osd_slots = 0x0, osd_next = {le_next = 0x0, le_prev = 0x0}}, td_rqindex = 32 ' ', td_base_pri = 128 '\200', td_priority = 128 '\200', td_pri_class = 3 '\003', td_user_pri = 128 '\200', td_base_user_pri = 128 '\200', td_pcb = 0xeb6a3d90, td_state = TDS_INHIBITED, td_retval = {0, 1}, td_slpcallout = {c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, td_frame = 0xeb6a3d38, td_kstack_obj = 0xcb4ca3b8, td_kstack = 3949600768, td_kstack_pages = 2, td_unused1 = 0x0, td_unused2 = 0, td_unused3 = 0, td_critnest = 1, td_md = {md_spinlock_count = 1, md_saved_flags = 582}, td_sched = 0xcb52e6bc, td_ar = 0x0, td_syscalls = 11802, td_lprof = {{lh_first = 0x0}, {lh_first = 0x0}}, td_dtrace = 0x0, td_errno = 0, td_vnet = 0x0, td_vnet_lpush = 0x0} (kgdb) print *(struct proc *)0xcb5002a8 $2 = {p_list = {le_next = 0xcabb3d48, le_prev = 0xcabb1000}, p_threads = {tqh_first = 0xcb52e4a0, tqh_last = 0xcb52e4a8}, p_slock = {lock_object = {lo_name = 0xc073b098 "process slock", lo_flags = 720896, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, p_ucred = 0xc7668580, p_fd = 0xc781b900, p_fdtol = 0x0, p_stats = 0xc781ce00, p_limit = 0xc7826500, p_limco = {c_links = {sle = {sle_next = 0x0}, tqe = { tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0xcb500330, c_flags = 0, c_cpu = 0}, p_sigacts = 0xcabea000, p_flag = 268451840, p_state = PRS_NORMAL, p_pid = 2044, p_hash = {le_next = 0x0, le_prev = 0xc6fe9ff0}, p_pglist = {le_next = 0xcabb3d48, le_prev = 0xc7800db8}, p_pptr = 0xc7043000, p_sibling = {le_next = 0xcabb3d48, le_prev = 0xc7043084}, p_children = { lh_first = 0x0}, p_mtx = {lock_object = {lo_name = 0xc073b08b "process lock", lo_flags = 21168128, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, p_ksi = 0xc750f410, p_sigqueue = {sq_signals = {__bits = { 0, 0, 0, 0}}, sq_kill = {__bits = {0, 0, 0, 0}}, sq_list = {tqh_first = 0x0, tqh_last = 0xcb500368}, sq_proc = 0xcb5002a8, sq_flags = 1}, p_oppid = 0, p_vmspace = 0xc76febc8, p_swtick = 703502, p_realtimer = {it_interval = {tv_sec = 0, tv_usec = 0}, it_value = {tv_sec = 0, tv_usec = 0}}, p_ru = {ru_utime = {tv_sec = 0, tv_usec = 0}, ru_stime = {tv_sec = 0, tv_usec = 0}, ru_maxrss = 0, ru_ixrss = 0, ru_idrss = 0, ru_isrss = 0, ru_minflt = 0, ru_majflt = 0, ru_nswap = 0, ru_inblock = 0, ru_oublock = 0, ru_msgsnd = 0, ru_msgrcv = 0, ru_nsignals = 0, ru_nvcsw = 0, ru_nivcsw = 0}, p_rux = { rux_runtime = 23985873, rux_uticks = 0, rux_sticks = 1, rux_iticks = 0, rux_uu = 0, rux_su = 9993, rux_tu = 9993}, p_crux = {rux_runtime = 0, rux_uticks = 0, rux_sticks = 0, rux_iticks = 0, rux_uu = 0, rux_su = 0, rux_tu = 0}, p_profthreads = 0, p_exitthreads = 0, p_traceflag = 0, p_tracevp = 0x0, p_tracecred = 0x0, p_textvp = 0xcb524218, p_lock = 0, p_sigiolst = {slh_first = 0x0}, p_sigparent = 20, p_sig = 0, p_code = 0, p_stops = 0, p_stype = 0, p_step = 0 '\0', p_pfsflags = 0 '\0', p_nlminfo = 0x0, p_aioinfo = 0x0, p_singlethread = 0x0, p_suspcount = 0, p_xthread = 0x0, p_boundary_count = 0, p_pendingcnt = 0, p_itimers = 0x0, p_magic = 3203398350, p_osrel = 800504, p_comm = "thunar-vfs-mime-cle", p_pgrp = 0xc7bd7040, p_sysent = 0xc07a4220, p_args = 0xc7bd6d40, p_cpulimit = 9223372036854775807, p_nice = 0 '\0', p_fibnum = 0, p_xstat = 0, p_klist = {kl_list = {slh_first = 0x0}, kl_lock = 0xc04d47e0 , kl_unlock = 0xc04d4810 , kl_assert_locked = 0xc04d4b60 , kl_assert_unlocked = 0xc04d4b30 , kl_lockarg = 0xcb500330}, p_numthreads = 1, p_md = {md_ldt = 0x0}, p_itcallout = { c_links = {sle = {sle_next = 0x0}, tqe = {tqe_next = 0x0, tqe_prev = 0x0}}, c_time = 0, c_arg = 0x0, c_func = 0, c_lock = 0x0, c_flags = 16, c_cpu = 0}, p_acflag = 0, p_peers = 0x0, p_leader = 0xcb5002a8, p_emuldata = 0x0, p_label = 0x0, p_sched = 0xcb500550, p_ktr = {stqh_first = 0x0, stqh_last = 0xcb500538}, p_mqnotifier = {lh_first = 0x0}, p_dtrace = 0x0, p_pwait = {cv_description = 0xc073b8cc "ppwait", cv_waiters = 0}} I'm ready to run any additional kgdb commands against those crash dumps, to try patches on this desktop etc... Here is the kernel config: ident LYNX makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols options INCLUDE_CONFIG_FILE # Include this file in kernel cpu I686_CPU # aka Pentium Pro(tm) options SCHED_ULE options SMP # Symmetric MultiProcessor Kernel options PREEMPTION # Enable kernel thread preemption options MUTEX_DEBUG options WITNESS device apic # I/O apic options COMPAT_FREEBSD4 options COMPAT_FREEBSD5 options COMPAT_FREEBSD6 options COMPAT_FREEBSD7 options SYSVSHM options SYSVSEM options SYSVMSG options KDB options KDB_TRACE options DDB options DDB_NUMSYM options BREAK_TO_DEBUGGER # BREAK on a serial console goes to DDB options SYSCTL_DEBUG options KTRACE #kernel tracing options INVARIANTS options INVARIANT_SUPPORT options DEADLKRES #software deadlock resolver thread options INET #Internet communications protocols options INET6 #IPv6 communications protocols options IPSEC #IP security (requires device crypto) options IPSEC_NAT_T #NAT-T support, UDP encap of ESP device ether #Generic Ethernet device loop #Network loopback device device bpf #Berkeley packet filter device enc #IPsec interface options IPFIREWALL #firewall options IPFIREWALL_VERBOSE #enable logging to syslogd(8) options IPFIREWALL_VERBOSE_LIMIT=100 #limit verbosity options IPFIREWALL_FORWARD #packet destination changes options IPFIREWALL_NAT #ipfw kernel nat support options LIBALIAS # libalias library, performing NAT options BPF_JITTER #support for BPF just-in-time compiler options FFS #Fast filesystem options SOFTUPDATES options UFS_EXTATTR options UFS_ACL options UFS_DIRHASH options UFS_GJOURNAL # Gjournal-based UFS journaling support. options QUOTA #enable disk quotas options SUIDDIR #The suiddir hack options _KPOSIX_PRIORITY_SCHEDULING options P1003_1B_SEMAPHORES device pty #Pseudo ttys options MSGBUF_SIZE=131072 device crypto # core crypto support device pci device atkbdc # AT keyboard controller device atkbd # AT keyboard device psm # PS/2 mouse options KBD_INSTALL_CDEV # install a CDEV entry in /dev device vga # VGA video card driver device splash # Splash screen and screen saver support options X86BIOS # x86 real mode BIOS emulator, required by atkbdc/dpms/vesa options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed. device sc options SC_HISTORY_SIZE=20000 options SC_TWOBUTTON_MOUSE device uart # Generic UART driver device ata device atadisk # ATA disk drives options ATA_STATIC_ID # Static device numbering device scbus # SCSI bus (required for SCSI) device da # Direct Access (disks) device cd # CD device pass # Passthrough device (direct SCSI access) options SND_PCM_64 # Process 32bit samples through 64bit integer And the loader.conf: autoboot_delay="5" # Delay in seconds before autobooting beastie_disable="YES" # Turn the beastie boot menu on and off kernel="kernel" # /boot sub-directory containing kernel and modules kern.hz="1000" # Set the kernel interval timer rate kern.maxdsiz="1G" # Set the max data size kern.maxssiz="1G" # Set the max stack size kern.maxtsiz="1G" # Set the max text size kern.ipc.semmni=80 # Number of semaphore identifiers kern.ipc.semmns=480 # Maximum number of semaphores in the system hw.ata.atapi_dma=1 # Using DMA for ATAPI devices hw.snd.verbose=2 # Make /dev/sndstat more verbose hw.snd.maxautovchans=4 # Enable software sound mixer hint.uart.0.flags="0x10" # device is potential system console debug.minidump=1 # Use minidumps debug.witness.watch=0 # Turn off the witness ipdivert_load="YES" accf_data_load="YES" # Wait for data accept filter accf_dns_load="YES" # Wait for full DNS request accept filter accf_http_load="YES" # Wait for full HTTP request accept filter if_tap_load="YES" # Ethernet tunnel software network interface if_tun_load="YES" # Tunnel driver (user process ppp) if_vlan_load="YES" # IEEE 802.1Q VLAN network interface miibus_load="YES" # miibus support, needed for some drivers if_em_load="YES" # Intel(R) PRO/1000 Gigabit Ethernet snd_hda_load="YES" # Intel High Definition Audio usb_load="YES" # USB subsystem uhci_load="YES" # UHCI USB Host Controller driver ehci_load="YES" # USB Enhanced Host Controller driver ugen_load="YES" # USB generic device, if all else fails ... uhid_load="YES" # Human Interface Devices ukbd_load="YES" # Keyboard ulpt_load="YES" # Printer ums_load="YES" # Mouse umass_load="YES" # Mass Storage Devices ucom_load="YES" # USB tty support: uplcom_load="YES" # Prolific PL-2303/2303X/2303HX serial adapters uftdi_load="YES" # FTDI FT2232C, FT8U100AX and FT8U232AM chips uark_load="YES" # Arkmicro Technologies ARK3116 based USB serial adapter snd_uaudio_load="YES" # USB audio umodem_load="YES" # Modems random_load="YES" cpufreq_load="YES" coretemp_load="YES" ichwd_load="YES" smbus_load="YES" smb_load="YES" ichsmb_load="YES" speaker_load="YES" atapicam_load="YES" fdc_load="YES" drm_load="YES" radeon_load="YES" vesa_load="YES" acpi_video_load="YES" firewire_load="YES" >How-To-Repeat: >Fix: >Release-Note: >Audit-Trail: >Unformatted: