From owner-freebsd-hackers@FreeBSD.ORG Sat Nov 20 16:51:26 2010
From: Mikolaj Golub <to.my.trociny@gmail.com>
To: freebsd-hackers@freebsd.org
Date: Sat, 20 Nov 2010 18:22:06 +0200
Message-ID: <86pqu0nexd.fsf@kopusha.home.net>
Subject: flowtable_cleaner/flowtable_flush livelock
Hi,

Running something like below under VirtualBox (CURRENT, VIMAGE):

echo "creating jail and iface"
jail -c name="${JAIL}" vnet persist
ifconfig "${EPAIR}" create
ifconfig "${EPAIR}b" vnet "${JAIL}"
sleep 1

echo "destroying jail and iface"
# below is a race
jail -r "${JAIL}" &
ifconfig "${EPAIR}a" destroy
wait

I will frequently get a livelock (it might also crash, but that may be a
different story) between these three threads in the flowtable code:

 1308 1183 1183    0  D+      flowclea 0xc101a314 ifconfig
 1307 1183 1183    0  R+      jail
   18    0    0    0  RL      [flowcleaner]

Thread 100075 at 0xc2685b40:
 proc (pid 1308): 0xc28e4aa0
 name: ifconfig
 stack: 0xc8742000-0xc8743fff
 flags: 0x20804  pflags: 0
 state: INHIBITED: {SLEEPING}
 wmesg: flowcleanwait  wchan: 0xc101a314
 priority: 138
 container lock: sleepq chain (0xc0ebee0c)

Tracing command ifconfig pid 1308 tid 100075 td 0xc2685b40
sched_switch(c2685b40,0,104,191,4b654535,...) at sched_switch+0x3d3
mi_switch(104,0,c0d299f4,1f3,0,...) at mi_switch+0x200
sleepq_switch(c2685b40,0,c0d299f4,268,c2685b40,...) at sleepq_switch+0x15f
sleepq_wait(c101a314,0,c87439c0,1,0,...) at sleepq_wait+0x63
_cv_wait(c101a314,c101a31c,c87439f8,17,0,...) at _cv_wait+0x243
flowtable_flush(0,c1ef0000,c0d353e4,38e,40,...) at flowtable_flush+0x90
if_detach_internal(c8743a68,c0999d7d,c1ef0000,c1ef0000,c8743aa4,...) at if_detach_internal+0x43d
if_detach(c1ef0000) at if_detach+0x10
ether_ifdetach(c1ef0000,1,c8743aa4,c099309e,c0d35665,...) at ether_ifdetach+0x3d
epair_clone_destroy(c2963c40,c1ef0000,c0d359fd,105,c2963c70,...) at epair_clone_destroy+0x6b
if_clone_destroyif(c2963c40,c1ef0000,c0d359fd,e0,c08cfc1d,...) at if_clone_destroyif+0x147
if_clone_destroy(c1fee8e0,19c,3,c2685b40,c0d52bad,...)
    at if_clone_destroy+0x147
ifioctl(c2564680,80206979,c1fee8e0,c2685b40,c08a6a31,...) at ifioctl+0x621
soo_ioctl(c1ff5d90,80206979,c1fee8e0,c1d83200,c2685b40,...) at soo_ioctl+0x427
kern_ioctl(c2685b40,3,80206979,c1fee8e0,743cec,...) at kern_ioctl+0x20d
ioctl(c2685b40,c8743cec,c2685b40,c8743d28,c0d2a23d,...) at ioctl+0x134
syscallenter(c2685b40,c8743ce4,c8743ce4,0,c0eb0c40,...) at syscallenter+0x2c3
syscall(c8743d28) at syscall+0x4f
Xint0x80_syscall() at Xint0x80_syscall+0x21
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x281c73f3, esp = 0xbfbfe46c, ebp = 0xbfbfe488 ---

Thread 100050 at 0xc20032d0:
 proc (pid 1307): 0xc267f7f8
 name: jail
 stack: 0xc43fd000-0xc43fefff
 flags: 0x4  pflags: 0
 state: RUNQ
 priority: 137
 container lock: sched lock 0 (0xc0eb0c40)

Tracing pid 1307 tid 100050 td 0xc20032d0
sched_switch(c20032d0,0,602,18c,4b69c645,...) at sched_switch+0x3d3
mi_switch(602,0,c0d25710,cd,0,...) at mi_switch+0x200
critical_exit(c0e6a98c,1,c0e6a98c,c43fea20,0,...) at critical_exit+0xa8
intr_event_handle(c1dbfe80,c43fea20,ff6b36c5,c20032d0,1,...) at intr_event_handle+0x115
intr_execute_handlers(c0e6a98c,c43fea20,c20032d0,c101a314,c43fea64,...) at intr_execute_handlers+0x49
atpic_handle_intr(1,c43fea20) at atpic_handle_intr+0x7c
Xatpic_intr1() at Xatpic_intr1+0x22
--- interrupt, eip = 0xc0c30cfb, esp = 0xc43fea60, ebp = 0xc43fea64 ---
spinlock_exit(c0eb0c40,4,c0d236ac,109,c091cf25,39248) at spinlock_exit+0x2b
_mtx_unlock_spin_flags(c0eb0c40,0,c0d299f4,26a) at _mtx_unlock_spin_flags+0x12d
sleepq_wait(c101a314,0,c43feadc,1,0,...) at sleepq_wait+0x85
_cv_wait(c101a314,c101a31c,c43feb14,17,0,...) at _cv_wait+0x243
flowtable_flush(0,c1ef0400,c0d353e4,38e,c1d42dc0,...) at flowtable_flush+0x90
if_detach_internal(8,c0d37941,117,0,c0d204c3,...) at if_detach_internal+0x43d
if_vmove(c1ef0400,c1d720c0,117,115,0,...) at if_vmove+0x1b
vnet_destroy(c1d5d260,c0d204c3,9c6,9b8,17,...) at vnet_destroy+0x163
prison_deref(c08b7d2b,c253c028,0,c0d204c3,2,...)
    at prison_deref+0x3a2
prison_remove_one(c0e20060,1,c0d204c3,83f,c0c3d6cf,...) at prison_remove_one+0x53
jail_remove(c20032d0,c43fecec,c20032d0,c43fed28,c0d2a23d,...) at jail_remove+0x266
syscallenter(c20032d0,c43fece4,c43fece4,0,c0eb0c40,...) at syscallenter+0x2c3
syscall(c43fed28) at syscall+0x4f
Xint0x80_syscall() at Xint0x80_syscall+0x21
--- syscall (508, FreeBSD ELF32, jail_remove), eip = 0x280efa1b, esp = 0xbfbfebdc, ebp = 0xbfbfeca8 ---

Thread 100037 at 0xc1fc6870:
 proc (pid 18): 0xc1fd47f8
 name: flowcleaner
 stack: 0xc43c6000-0xc43c7fff
 flags: 0x4  pflags: 0x200000
 state: RUNQ
 priority: 160
 container lock: sched lock 0 (0xc0eb0c40)

Tracing pid 18 tid 100037 td 0xc1fc6870
sched_switch(c1fc6870,0,104,191,3d6a1775,...) at sched_switch+0x3d3
mi_switch(104,0,c0d299f4,1f3,0,...) at mi_switch+0x200
sleepq_switch(c1fc6870,0,c0d299f4,28b,c1fc6870,...) at sleepq_switch+0x15f
sleepq_timedwait(c101a314,0,c43c7ca0,1,0,...) at sleepq_timedwait+0x6b
_cv_timedwait(c101a314,c101a31c,7d0,620,c1fc6870,...) at _cv_timedwait+0x252
flowtable_cleaner(0,c43c7d28,c0d20004,33b,c1fd47f8,...) at flowtable_cleaner+0x255
fork_exit(c0990630,0,c43c7d28) at fork_exit+0xb8
fork_trampoline() at fork_trampoline+0x8

In net/flowtable.c we have two functions:

static void
flowtable_cleaner(void)
{
	...
	while (1) {
		...
		flowclean_cycles++;
		mtx_lock(&flowclean_lock);
		cv_broadcast(&flowclean_cv);
		cv_timedwait(&flowclean_cv, &flowclean_lock, flowclean_freq);
		mtx_unlock(&flowclean_lock);
	}
}

static void
flowtable_flush(void *unused __unused)
{
	uint64_t start;

	mtx_lock(&flowclean_lock);
	start = flowclean_cycles;
	while (start == flowclean_cycles) {
		cv_broadcast(&flowclean_cv);
		cv_wait(&flowclean_cv, &flowclean_lock);
	}
	mtx_unlock(&flowclean_lock);
}

It looks like when two threads enter flowtable_flush() simultaneously,
they keep waking each other up and never give the flowcleaner thread
(which sits on the run queue, I suppose because it has a higher priority
number, i.e. a lower priority) a chance to run and update the
flowclean_cycles counter.
I added a print in the flowtable_flush() loop to check my assumption and got:

flowtable_flush: start(C43FEB14): 23; flowclean_cycles: 23
flowtable_flush: start(C87439F8): 23; flowclean_cycles: 23
flowtable_flush: start(C43FEB14): 23; flowclean_cycles: 23
flowtable_flush: start(C87439F8): 23; flowclean_cycles: 23
flowtable_flush: start(C43FEB14): 23; flowclean_cycles: 23
flowtable_flush: start(C87439F8): 23; flowclean_cycles: 23
flowtable_flush: start(C43FEB14): 23; flowclean_cycles: 23
flowtable_flush: start(C87439F8): 23; flowclean_cycles: 23
...

So the question is: who is at fault here? ULE? flowtable? Or jail/epair,
which should not allow flowtable_flush() to be entered simultaneously?

-- 
Mikolaj Golub