From owner-svn-src-stable-11@freebsd.org Tue Nov 5 19:15:43 2019 Return-Path: Delivered-To: svn-src-stable-11@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id D98091B1920; Tue, 5 Nov 2019 19:15:43 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 476zsC5WMwz3R1B; Tue, 5 Nov 2019 19:15:43 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id A147A195F2; Tue, 5 Nov 2019 19:15:43 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id xA5JFh9m044400; Tue, 5 Nov 2019 19:15:43 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id xA5JFhvo044399; Tue, 5 Nov 2019 19:15:43 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201911051915.xA5JFhvo044399@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Tue, 5 Nov 2019 19:15:43 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r354366 - stable/11/sys/cddl/contrib/opensolaris/uts/common/zmod X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/cddl/contrib/opensolaris/uts/common/zmod X-SVN-Commit-Revision: 354366 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Nov 2019 19:15:43 -0000 Author: mav Date: Tue Nov 5 19:15:43 2019 New Revision: 354366 URL: https://svnweb.freebsd.org/changeset/base/354366 Log: MFC r354159: FreeBSD'fy ZFS zlib zalloc/zfree callbacks. The previous code came from OpenSolaris, which in my understanding require allocation size to be known to free memory. To store that size previous code allocated additional 8 byte header. But I have noticed that zlib with present settings allocates 64KB context buffers for each call, that could be efficiently cached by UMA, but addition of those 8 bytes makes them fall back to physical RAM allocations, that cause huge overhead and lock congestion on small blocks. Since FreeBSD's free() does not have the size argument, switching to it solves the problem, increasing write speed to ZVOLs with 4KB block size and GZIP compression on my 40-threads test system from ~60MB/s to ~600MB/s. Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/zmod/zmod_subr.c Modified: stable/11/sys/cddl/contrib/opensolaris/uts/common/zmod/zmod_subr.c ============================================================================== --- stable/11/sys/cddl/contrib/opensolaris/uts/common/zmod/zmod_subr.c Tue Nov 5 19:14:17 2019 (r354365) +++ stable/11/sys/cddl/contrib/opensolaris/uts/common/zmod/zmod_subr.c Tue Nov 5 19:15:43 2019 (r354366) @@ -28,41 +28,23 @@ #include #include -#include +#include -struct zchdr { - uint_t zch_magic; - uint_t zch_size; -}; - -#define ZCH_MAGIC 0x3cc13cc1 - /*ARGSUSED*/ void * zcalloc(void *opaque, uint_t items, uint_t size) { - size_t nbytes = sizeof (struct zchdr) + items * size; - struct zchdr *z = kobj_zalloc(nbytes, KM_NOWAIT|KM_TMP); + void *ptr; - if (z == NULL) - return (NULL); - - z->zch_magic = ZCH_MAGIC; - z->zch_size = nbytes; - - return (z + 1); + ptr = malloc((size_t)items * size, M_SOLARIS, M_NOWAIT); + return ptr; } /*ARGSUSED*/ void zcfree(void *opaque, void *ptr) { - struct zchdr *z = ((struct zchdr *)ptr) - 1; - - if (z->zch_magic != ZCH_MAGIC) - panic("zcfree region corrupt: hdr=%p ptr=%p", (void *)z, ptr); - - kobj_free(z, z->zch_size); + free(ptr, M_SOLARIS); } void From owner-svn-src-stable-11@freebsd.org Wed Nov 6 14:33:16 2019 Return-Path: Delivered-To: svn-src-stable-11@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 594E21B77C4; Wed, 6 Nov 2019 14:33:16 +0000 (UTC) (envelope-from imp@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 477TXr1ctYz4BnC; Wed, 6 Nov 2019 14:33:16 +0000 (UTC) (envelope-from imp@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 1AF612671B; Wed, 6 Nov 2019 14:33:16 +0000 (UTC) (envelope-from imp@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id xA6EXFr8058504; Wed, 6 Nov 2019 14:33:15 GMT (envelope-from imp@FreeBSD.org) Received: (from imp@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id xA6EXFt5058503; Wed, 6 Nov 2019 14:33:15 GMT (envelope-from imp@FreeBSD.org) Message-Id: <201911061433.xA6EXFt5058503@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: imp set sender to imp@FreeBSD.org using -f From: Warner Losh Date: Wed, 6 Nov 2019 14:33:15 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r354392 - stable/11 X-SVN-Group: stable-11 X-SVN-Commit-Author: imp X-SVN-Commit-Paths: stable/11 X-SVN-Commit-Revision: 354392 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Nov 2019 14:33:16 -0000 Author: imp Date: Wed Nov 6 14:33:15 2019 New Revision: 354392 URL: https://svnweb.freebsd.org/changeset/base/354392 Log: Update to indicate this is for stable/11, not current. Direct commit because it's not relevant to any other branch. Reported by: Jamie Landeg-Jones Modified: stable/11/UPDATING Modified: stable/11/UPDATING ============================================================================== --- stable/11/UPDATING Wed Nov 6 14:32:00 2019 (r354391) +++ stable/11/UPDATING Wed Nov 6 14:33:15 2019 (r354392) @@ -1,4 +1,4 @@ -Updating Information for FreeBSD current users. +Updating Information for FreeBSD stable/11 users. This file is maintained and copyrighted by M. Warner Losh . See end of file for further details. For commonly done items, please see the From owner-svn-src-stable-11@freebsd.org Wed Nov 6 17:59:19 2019 Return-Path: Delivered-To: svn-src-stable-11@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 12CEF1BBDE3; Wed, 6 Nov 2019 17:59:19 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 477Z6Z6nKSz4QZb; Wed, 6 Nov 2019 17:59:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id CC3C4C8D; Wed, 6 Nov 2019 17:59:18 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id xA6HxIKi085065; Wed, 6 Nov 2019 17:59:18 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id xA6HxISr085064; Wed, 6 Nov 2019 17:59:18 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201911061759.xA6HxISr085064@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 6 Nov 2019 17:59:18 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r354404 - stable/11/sys/kern X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/kern X-SVN-Commit-Revision: 354404 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Nov 2019 17:59:19 -0000 Author: mav Date: Wed Nov 6 17:59:18 2019 New Revision: 354404 URL: https://svnweb.freebsd.org/changeset/base/354404 Log: MFC r305368 (by markj): Micro-optimize sleepq_signal(). Lift a comparison out of the loop that finds the highest-priority thread on the queue. Modified: stable/11/sys/kern/subr_sleepqueue.c Directory Properties: stable/11/ (props changed) Modified: stable/11/sys/kern/subr_sleepqueue.c ============================================================================== --- stable/11/sys/kern/subr_sleepqueue.c Wed Nov 6 17:49:38 2019 (r354403) +++ stable/11/sys/kern/subr_sleepqueue.c Wed Nov 6 17:59:18 2019 (r354404) @@ -909,9 +909,9 @@ sleepq_signal(void *wchan, int flags, int pri, int que * been sleeping the longest since threads are always added to * the tail of sleep queues. */ - besttd = NULL; + besttd = TAILQ_FIRST(&sq->sq_blocked[queue]); TAILQ_FOREACH(td, &sq->sq_blocked[queue], td_slpq) { - if (besttd == NULL || td->td_priority < besttd->td_priority) + if (td->td_priority < besttd->td_priority) besttd = td; } MPASS(besttd != NULL); From owner-svn-src-stable-11@freebsd.org Wed Nov 6 18:02:21 2019 Return-Path: Delivered-To: svn-src-stable-11@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id 26E941BC017; Wed, 6 Nov 2019 18:02:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 477ZB50FzRz4R0V; Wed, 6 Nov 2019 18:02:21 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id E01B9E48; Wed, 6 Nov 2019 18:02:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id xA6I2KFX089725; Wed, 6 Nov 2019 18:02:20 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id xA6I2Jva089715; Wed, 6 Nov 2019 18:02:19 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201911061802.xA6I2Jva089715@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 6 Nov 2019 18:02:19 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r354405 - in stable/11: share/man/man9 sys/kern sys/sys X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: in stable/11: share/man/man9 sys/kern sys/sys X-SVN-Commit-Revision: 354405 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Nov 2019 18:02:21 -0000 Author: mav Date: Wed Nov 6 18:02:18 2019 New Revision: 354405 URL: https://svnweb.freebsd.org/changeset/base/354405 Log: MFC r349220: Add wakeup_any(), cheaper wakeup_one() for taskqueue(9). wakeup_one() and underlying sleepq_signal() spend additional time trying to be fair, waking thread with highest priority, sleeping longest time. But in case of taskqueue there are many absolutely identical threads, and any fairness between them is quite pointless. It makes even worse, since round-robin wakeups not only make previous CPU affinity in scheduler quite useless, but also hide from user chance to see CPU bottlenecks, when sequential workload with one request at a time looks evenly distributed between multiple threads. This change adds new SLEEPQ_UNFAIR flag to sleepq_signal(), making it wakeup thread that went to sleep last, but no longer in context switch (to avoid immediate spinning on the thread lock). On top of that new wakeup_any() function is added, equivalent to wakeup_one(), but setting the flag. On top of that taskqueue(9) is switchied to wakeup_any() to wakeup its threads. As result, on 72-core Xeon v4 machine sequential ZFS write to 12 ZVOLs with 16KB block size spend 34% less time in wakeup_any() and descendants then it was spending in wakeup_one(), and total write throughput increased by ~10% with the same as before CPU usage. Modified: stable/11/share/man/man9/Makefile stable/11/share/man/man9/sleep.9 stable/11/share/man/man9/sleepqueue.9 stable/11/sys/kern/kern_synch.c stable/11/sys/kern/subr_sleepqueue.c stable/11/sys/kern/subr_taskqueue.c stable/11/sys/sys/queue.h stable/11/sys/sys/sleepqueue.h stable/11/sys/sys/systm.h Directory Properties: stable/11/ (props changed) Modified: stable/11/share/man/man9/Makefile ============================================================================== --- stable/11/share/man/man9/Makefile Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/share/man/man9/Makefile Wed Nov 6 18:02:18 2019 (r354405) @@ -1648,7 +1648,8 @@ MLINKS+=sleep.9 msleep.9 \ sleep.9 tsleep.9 \ sleep.9 tsleep_sbt.9 \ sleep.9 wakeup.9 \ - sleep.9 wakeup_one.9 + sleep.9 wakeup_one.9 \ + sleep.9 wakeup_any.9 MLINKS+=sleepqueue.9 init_sleepqueues.9 \ sleepqueue.9 sleepq_abort.9 \ sleepqueue.9 sleepq_add.9 \ Modified: stable/11/share/man/man9/sleep.9 ============================================================================== --- stable/11/share/man/man9/sleep.9 Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/share/man/man9/sleep.9 Wed Nov 6 18:02:18 2019 (r354405) @@ -25,7 +25,7 @@ .\" .\" $FreeBSD$ .\" -.Dd March 4, 2018 +.Dd June 19, 2019 .Dt SLEEP 9 .Os .Sh NAME @@ -38,7 +38,9 @@ .Nm pause_sbt , .Nm tsleep , .Nm tsleep_sbt , -.Nm wakeup +.Nm wakeup , +.Nm wakeup_one , +.Nm wakeup_any .Nd wait for events .Sh SYNOPSIS .In sys/param.h @@ -70,6 +72,8 @@ .Fn wakeup "void *chan" .Ft void .Fn wakeup_one "void *chan" +.Ft void +.Fn wakeup_any "void *chan" .Sh DESCRIPTION The functions .Fn tsleep , @@ -79,8 +83,9 @@ The functions .Fn pause_sig , .Fn pause_sbt , .Fn wakeup , +.Fn wakeup_one , and -.Fn wakeup_one +.Fn wakeup_any handle event-based thread blocking. If a thread must wait for an external event, it is put to sleep by @@ -252,9 +257,10 @@ function is a wrapper around .Fn tsleep that suspends execution of the current thread for the indicated timeout. The thread can not be awakened early by signals or calls to -.Fn wakeup +.Fn wakeup , +.Fn wakeup_one or -.Fn wakeup_one . +.Fn wakeup_any . The .Fn pause_sig function is a variant of @@ -263,8 +269,8 @@ which can be awakened early by signals. .Pp The .Fn wakeup_one -function makes the first thread in the queue that is sleeping on the -parameter +function makes the first highest priority thread in the queue that is +sleeping on the parameter .Fa chan runnable. This reduces the load when a large number of threads are sleeping on @@ -292,6 +298,16 @@ to pay particular attention to ensure that no other threads wait on the same .Fa chan . +.Pp +The +.Fn wakeup_any +function is similar to +.Fn wakeup_one , +except that it makes runnable last thread on the queue (sleeping less), +ignoring fairness. +It can be used when threads sleeping on the +.Fa chan +are known to be identical and there is no reason to be fair. .Pp If the timeout given by .Fa timo Modified: stable/11/share/man/man9/sleepqueue.9 ============================================================================== --- stable/11/share/man/man9/sleepqueue.9 Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/share/man/man9/sleepqueue.9 Wed Nov 6 18:02:18 2019 (r354405) @@ -23,7 +23,7 @@ .\" .\" $FreeBSD$ .\" -.Dd September 22, 2014 +.Dd June 19, 2019 .Dt SLEEPQUEUE 9 .Os .Sh NAME @@ -291,7 +291,8 @@ and functions. The .Fn sleepq_signal -function awakens the highest priority thread sleeping on a wait channel while +function awakens the highest priority thread sleeping on a wait channel +(if SLEEPQ_UNFAIR flag is set, thread that went to sleep recently) while .Fn sleepq_broadcast awakens all of the threads sleeping on a wait channel. The Modified: stable/11/sys/kern/kern_synch.c ============================================================================== --- stable/11/sys/kern/kern_synch.c Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/sys/kern/kern_synch.c Wed Nov 6 18:02:18 2019 (r354405) @@ -369,6 +369,19 @@ wakeup_one(void *ident) kick_proc0(); } +void +wakeup_any(void *ident) +{ + int wakeup_swapper; + + sleepq_lock(ident); + wakeup_swapper = sleepq_signal(ident, SLEEPQ_SLEEP | SLEEPQ_UNFAIR, + 0, 0); + sleepq_release(ident); + if (wakeup_swapper) + kick_proc0(); +} + static void kdb_switch(void) { Modified: stable/11/sys/kern/subr_sleepqueue.c ============================================================================== --- stable/11/sys/kern/subr_sleepqueue.c Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/sys/kern/subr_sleepqueue.c Wed Nov 6 18:02:18 2019 (r354405) @@ -119,7 +119,7 @@ __FBSDID("$FreeBSD$"); * c - sleep queue chain lock */ struct sleepqueue { - TAILQ_HEAD(, thread) sq_blocked[NR_SLEEPQS]; /* (c) Blocked threads. */ + struct threadqueue sq_blocked[NR_SLEEPQS]; /* (c) Blocked threads. */ u_int sq_blockedcnt[NR_SLEEPQS]; /* (c) N. of blocked threads. */ LIST_ENTRY(sleepqueue) sq_hash; /* (c) Chain and free list. */ LIST_HEAD(, sleepqueue) sq_free; /* (c) Free queues. */ @@ -885,12 +885,14 @@ sleepq_init(void *mem, int size, int flags) } /* - * Find the highest priority thread sleeping on a wait channel and resume it. + * Find thread sleeping on a wait channel and resume it. */ int sleepq_signal(void *wchan, int flags, int pri, int queue) { + struct sleepqueue_chain *sc; struct sleepqueue *sq; + struct threadqueue *head; struct thread *td, *besttd; int wakeup_swapper; @@ -903,16 +905,33 @@ sleepq_signal(void *wchan, int flags, int pri, int que KASSERT(sq->sq_type == (flags & SLEEPQ_TYPE), ("%s: mismatch between sleep/wakeup and cv_*", __func__)); - /* - * Find the highest priority thread on the queue. If there is a - * tie, use the thread that first appears in the queue as it has - * been sleeping the longest since threads are always added to - * the tail of sleep queues. - */ - besttd = TAILQ_FIRST(&sq->sq_blocked[queue]); - TAILQ_FOREACH(td, &sq->sq_blocked[queue], td_slpq) { - if (td->td_priority < besttd->td_priority) + head = &sq->sq_blocked[queue]; + if (flags & SLEEPQ_UNFAIR) { + /* + * Find the most recently sleeping thread, but try to + * skip threads still in process of context switch to + * avoid spinning on the thread lock. + */ + sc = SC_LOOKUP(wchan); + besttd = TAILQ_LAST_FAST(head, thread, td_slpq); + while (besttd->td_lock != &sc->sc_lock) { + td = TAILQ_PREV_FAST(besttd, head, thread, td_slpq); + if (td == NULL) + break; besttd = td; + } + } else { + /* + * Find the highest priority thread on the queue. If there + * is a tie, use the thread that first appears in the queue + * as it has been sleeping the longest since threads are + * always added to the tail of sleep queues. + */ + besttd = td = TAILQ_FIRST(head); + while ((td = TAILQ_NEXT(td, td_slpq)) != NULL) { + if (td->td_priority < besttd->td_priority) + besttd = td; + } } MPASS(besttd != NULL); thread_lock(besttd); Modified: stable/11/sys/kern/subr_taskqueue.c ============================================================================== --- stable/11/sys/kern/subr_taskqueue.c Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/sys/kern/subr_taskqueue.c Wed Nov 6 18:02:18 2019 (r354405) @@ -802,7 +802,7 @@ taskqueue_thread_enqueue(void *context) tqp = context; tq = *tqp; - wakeup_one(tq); + wakeup_any(tq); } TASKQUEUE_DEFINE(swi, taskqueue_swi_enqueue, NULL, Modified: stable/11/sys/sys/queue.h ============================================================================== --- stable/11/sys/sys/queue.h Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/sys/sys/queue.h Wed Nov 6 18:02:18 2019 (r354405) @@ -760,6 +760,10 @@ struct { \ #define TAILQ_PREV(elm, headname, field) \ (*(((struct headname *)((elm)->field.tqe_prev))->tqh_last)) +#define TAILQ_PREV_FAST(elm, head, type, field) \ + ((elm)->field.tqe_prev == &(head)->tqh_first ? NULL : \ + __containerof((elm)->field.tqe_prev, QUEUE_TYPEOF(type), field.tqe_next)) + #define TAILQ_REMOVE(head, elm, field) do { \ QMD_SAVELINK(oldnext, (elm)->field.tqe_next); \ QMD_SAVELINK(oldprev, (elm)->field.tqe_prev); \ Modified: stable/11/sys/sys/sleepqueue.h ============================================================================== --- stable/11/sys/sys/sleepqueue.h Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/sys/sys/sleepqueue.h Wed Nov 6 18:02:18 2019 (r354405) @@ -83,6 +83,7 @@ struct thread; #define SLEEPQ_SX 0x03 /* Used by an sx lock. */ #define SLEEPQ_LK 0x04 /* Used by a lockmgr. */ #define SLEEPQ_INTERRUPTIBLE 0x100 /* Sleep is interruptible. */ +#define SLEEPQ_UNFAIR 0x200 /* Unfair wakeup order. */ void init_sleepqueues(void); int sleepq_abort(struct thread *td, int intrval); Modified: stable/11/sys/sys/systm.h ============================================================================== --- stable/11/sys/sys/systm.h Wed Nov 6 17:59:18 2019 (r354404) +++ stable/11/sys/sys/systm.h Wed Nov 6 18:02:18 2019 (r354405) @@ -422,6 +422,7 @@ int pause_sbt(const char *wmesg, sbintime_t sbt, sbint _sleep((chan), NULL, (pri), (wmesg), (bt), (pr), (flags)) void wakeup(void * chan); void wakeup_one(void * chan); +void wakeup_any(void * chan); /* * Common `struct cdev *' stuff are declared here to avoid #include poisoning From owner-svn-src-stable-11@freebsd.org Wed Nov 6 18:15:20 2019 Return-Path: Delivered-To: svn-src-stable-11@mailman.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mailman.nyi.freebsd.org (Postfix) with ESMTP id BF9211BC668; Wed, 6 Nov 2019 18:15:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) server-signature RSA-PSS (4096 bits) client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "Let's Encrypt Authority X3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 477ZT45RtLz4SG0; Wed, 6 Nov 2019 18:15:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org (repo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 9E20C1076; Wed, 6 Nov 2019 18:15:20 +0000 (UTC) (envelope-from mav@FreeBSD.org) Received: from repo.freebsd.org ([127.0.1.37]) by repo.freebsd.org (8.15.2/8.15.2) with ESMTP id xA6IFKZh001023; Wed, 6 Nov 2019 18:15:20 GMT (envelope-from mav@FreeBSD.org) Received: (from mav@localhost) by repo.freebsd.org (8.15.2/8.15.2/Submit) id xA6IFKER001021; Wed, 6 Nov 2019 18:15:20 GMT (envelope-from mav@FreeBSD.org) Message-Id: <201911061815.xA6IFKER001021@repo.freebsd.org> X-Authentication-Warning: repo.freebsd.org: mav set sender to mav@FreeBSD.org using -f From: Alexander Motin Date: Wed, 6 Nov 2019 18:15:20 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-11@freebsd.org Subject: svn commit: r354406 - stable/11/sys/kern X-SVN-Group: stable-11 X-SVN-Commit-Author: mav X-SVN-Commit-Paths: stable/11/sys/kern X-SVN-Commit-Revision: 354406 X-SVN-Commit-Repository: base MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-11@freebsd.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: SVN commit messages for only the 11-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Nov 2019 18:15:20 -0000 Author: mav Date: Wed Nov 6 18:15:20 2019 New Revision: 354406 URL: https://svnweb.freebsd.org/changeset/base/354406 Log: MFC r354241: Some more taskqueue optimizations. - Optimize enqueue for two task priority values by adding new tq_hint field, pointing to the last task inserted into the middle of the list. In case of more then two priority values it should halve average search. - Move tq_active insert/remove out of the taskqueue_run_locked loop. Instead of dirtying few shared cache lines per task introduce different mechanism to drain active tasks, based on task sequence number counter, that uses only cache lines already present in cache. Since the new mechanism does not need ordering, switch tq_active from TAILQ to LIST. - Move static and dynamic struct taskqueue fields into different cache lines. Move lock into its own cache line, so that heavy lock spinning by multiple waiting threads would not affect the running thread. - While there, correct some TQ_SLEEP() wait messages. This change fixes certain ZFS write workloads, causing huge congestion on taskqueue lock. Those workloads combine some large block writes to saturate the pool and trigger allocation throttling, which uses higher priority tasks to requeue the delayed I/Os, with many small blocks to generate deep queue of small tasks for taskqueue to sort. Sponsored by: iXsystems, Inc. Modified: stable/11/sys/kern/subr_gtaskqueue.c stable/11/sys/kern/subr_taskqueue.c Modified: stable/11/sys/kern/subr_gtaskqueue.c ============================================================================== --- stable/11/sys/kern/subr_gtaskqueue.c Wed Nov 6 18:02:18 2019 (r354405) +++ stable/11/sys/kern/subr_gtaskqueue.c Wed Nov 6 18:15:20 2019 (r354406) @@ -55,24 +55,24 @@ static void gtaskqueue_thread_loop(void *arg); TASKQGROUP_DEFINE(softirq, mp_ncpus, 1); struct gtaskqueue_busy { - struct gtask *tb_running; - TAILQ_ENTRY(gtaskqueue_busy) tb_link; + struct gtask *tb_running; + u_int tb_seq; + LIST_ENTRY(gtaskqueue_busy) tb_link; }; -static struct gtask * const TB_DRAIN_WAITER = (struct gtask *)0x1; - struct gtaskqueue { STAILQ_HEAD(, gtask) tq_queue; + LIST_HEAD(, gtaskqueue_busy) tq_active; + u_int tq_seq; + int tq_callouts; + struct mtx_padalign tq_mutex; gtaskqueue_enqueue_fn tq_enqueue; void *tq_context; char *tq_name; - TAILQ_HEAD(, gtaskqueue_busy) tq_active; - struct mtx tq_mutex; struct thread **tq_threads; int tq_tcount; int tq_spin; int tq_flags; - int tq_callouts; taskqueue_callback_fn tq_callbacks[TASKQUEUE_NUM_CALLBACKS]; void *tq_cb_contexts[TASKQUEUE_NUM_CALLBACKS]; }; @@ -111,12 +111,11 @@ gtask_dump(struct gtask *gtask) #endif static __inline int -TQ_SLEEP(struct gtaskqueue *tq, void *p, struct mtx *m, int pri, const char *wm, - int t) +TQ_SLEEP(struct gtaskqueue *tq, void *p, const char *wm) { if (tq->tq_spin) - return (msleep_spin(p, m, wm, t)); - return (msleep(p, m, pri, wm, t)); + return (msleep_spin(p, (struct mtx *)&tq->tq_mutex, wm, 0)); + return (msleep(p, &tq->tq_mutex, 0, wm, 0)); } static struct gtaskqueue * @@ -140,7 +139,7 @@ _gtaskqueue_create(const char *name, int mflags, } STAILQ_INIT(&queue->tq_queue); - TAILQ_INIT(&queue->tq_active); + LIST_INIT(&queue->tq_active); queue->tq_enqueue = enqueue; queue->tq_context = context; queue->tq_name = tq_name; @@ -163,7 +162,7 @@ gtaskqueue_terminate(struct thread **pp, struct gtaskq while (tq->tq_tcount > 0 || tq->tq_callouts > 0) { wakeup(tq); - TQ_SLEEP(tq, pp, &tq->tq_mutex, PWAIT, "taskqueue_destroy", 0); + TQ_SLEEP(tq, pp, "gtq_destroy"); } } @@ -174,7 +173,7 @@ gtaskqueue_free(struct gtaskqueue *queue) TQ_LOCK(queue); queue->tq_flags &= ~TQ_FLAGS_ACTIVE; gtaskqueue_terminate(queue->tq_threads, queue); - KASSERT(TAILQ_EMPTY(&queue->tq_active), ("Tasks still running?")); + KASSERT(LIST_EMPTY(&queue->tq_active), ("Tasks still running?")); KASSERT(queue->tq_callouts == 0, ("Armed timeout tasks")); mtx_destroy(&queue->tq_mutex); free(queue->tq_threads, M_GTASKQUEUE); @@ -239,7 +238,7 @@ gtaskqueue_drain_tq_queue(struct gtaskqueue *queue) * have completed or are currently executing. */ while (t_barrier.ta_flags & TASK_ENQUEUED) - TQ_SLEEP(queue, &t_barrier, &queue->tq_mutex, PWAIT, "-", 0); + TQ_SLEEP(queue, &t_barrier, "gtq_qdrain"); } /* @@ -250,32 +249,25 @@ gtaskqueue_drain_tq_queue(struct gtaskqueue *queue) static void gtaskqueue_drain_tq_active(struct gtaskqueue *queue) { - struct gtaskqueue_busy tb_marker, *tb_first; + struct gtaskqueue_busy *tb; + u_int seq; - if (TAILQ_EMPTY(&queue->tq_active)) + if (LIST_EMPTY(&queue->tq_active)) return; /* Block taskq_terminate().*/ queue->tq_callouts++; - /* - * Wait for all currently executing taskqueue threads - * to go idle. - */ - tb_marker.tb_running = TB_DRAIN_WAITER; - TAILQ_INSERT_TAIL(&queue->tq_active, &tb_marker, tb_link); - while (TAILQ_FIRST(&queue->tq_active) != &tb_marker) - TQ_SLEEP(queue, &tb_marker, &queue->tq_mutex, PWAIT, "-", 0); - TAILQ_REMOVE(&queue->tq_active, &tb_marker, tb_link); + /* Wait for any active task with sequence from the past. */ + seq = queue->tq_seq; +restart: + LIST_FOREACH(tb, &queue->tq_active, tb_link) { + if ((int)(tb->tb_seq - seq) <= 0) { + TQ_SLEEP(queue, tb->tb_running, "gtq_adrain"); + goto restart; + } + } - /* - * Wakeup any other drain waiter that happened to queue up - * without any intervening active thread. - */ - tb_first = TAILQ_FIRST(&queue->tq_active); - if (tb_first != NULL && tb_first->tb_running == TB_DRAIN_WAITER) - wakeup(tb_first); - /* Release taskqueue_terminate(). */ queue->tq_callouts--; if ((queue->tq_flags & TQ_FLAGS_ACTIVE) == 0) @@ -306,40 +298,27 @@ static void gtaskqueue_run_locked(struct gtaskqueue *queue) { struct gtaskqueue_busy tb; - struct gtaskqueue_busy *tb_first; struct gtask *gtask; KASSERT(queue != NULL, ("tq is NULL")); TQ_ASSERT_LOCKED(queue); tb.tb_running = NULL; + LIST_INSERT_HEAD(&queue->tq_active, &tb, tb_link); - while (STAILQ_FIRST(&queue->tq_queue)) { - TAILQ_INSERT_TAIL(&queue->tq_active, &tb, tb_link); - - /* - * Carefully remove the first task from the queue and - * clear its TASK_ENQUEUED flag - */ - gtask = STAILQ_FIRST(&queue->tq_queue); - KASSERT(gtask != NULL, ("task is NULL")); + while ((gtask = STAILQ_FIRST(&queue->tq_queue)) != NULL) { STAILQ_REMOVE_HEAD(&queue->tq_queue, ta_link); gtask->ta_flags &= ~TASK_ENQUEUED; tb.tb_running = gtask; + tb.tb_seq = ++queue->tq_seq; TQ_UNLOCK(queue); KASSERT(gtask->ta_func != NULL, ("task->ta_func is NULL")); gtask->ta_func(gtask->ta_context); TQ_LOCK(queue); - tb.tb_running = NULL; wakeup(gtask); - - TAILQ_REMOVE(&queue->tq_active, &tb, tb_link); - tb_first = TAILQ_FIRST(&queue->tq_active); - if (tb_first != NULL && - tb_first->tb_running == TB_DRAIN_WAITER) - wakeup(tb_first); } + LIST_REMOVE(&tb, tb_link); } static int @@ -348,7 +327,7 @@ task_is_running(struct gtaskqueue *queue, struct gtask struct gtaskqueue_busy *tb; TQ_ASSERT_LOCKED(queue); - TAILQ_FOREACH(tb, &queue->tq_active, tb_link) { + LIST_FOREACH(tb, &queue->tq_active, tb_link) { if (tb->tb_running == gtask) return (1); } @@ -386,7 +365,7 @@ gtaskqueue_drain(struct gtaskqueue *queue, struct gtas TQ_LOCK(queue); while ((gtask->ta_flags & TASK_ENQUEUED) || task_is_running(queue, gtask)) - TQ_SLEEP(queue, gtask, &queue->tq_mutex, PWAIT, "-", 0); + TQ_SLEEP(queue, gtask, "gtq_drain"); TQ_UNLOCK(queue); } @@ -511,7 +490,7 @@ gtaskqueue_thread_loop(void *arg) */ if ((tq->tq_flags & TQ_FLAGS_ACTIVE) == 0) break; - TQ_SLEEP(tq, tq, &tq->tq_mutex, 0, "-", 0); + TQ_SLEEP(tq, tq, "-"); } gtaskqueue_run_locked(tq); /* @@ -537,7 +516,7 @@ gtaskqueue_thread_enqueue(void *context) tqp = context; tq = *tqp; - wakeup_one(tq); + wakeup_any(tq); } Modified: stable/11/sys/kern/subr_taskqueue.c ============================================================================== --- stable/11/sys/kern/subr_taskqueue.c Wed Nov 6 18:02:18 2019 (r354405) +++ stable/11/sys/kern/subr_taskqueue.c Wed Nov 6 18:15:20 2019 (r354406) @@ -54,24 +54,25 @@ static void taskqueue_swi_enqueue(void *); static void taskqueue_swi_giant_enqueue(void *); struct taskqueue_busy { - struct task *tb_running; - TAILQ_ENTRY(taskqueue_busy) tb_link; + struct task *tb_running; + u_int tb_seq; + LIST_ENTRY(taskqueue_busy) tb_link; }; -struct task * const TB_DRAIN_WAITER = (struct task *)0x1; - struct taskqueue { STAILQ_HEAD(, task) tq_queue; + LIST_HEAD(, taskqueue_busy) tq_active; + struct task *tq_hint; + u_int tq_seq; + int tq_callouts; + struct mtx_padalign tq_mutex; taskqueue_enqueue_fn tq_enqueue; void *tq_context; char *tq_name; - TAILQ_HEAD(, taskqueue_busy) tq_active; - struct mtx tq_mutex; struct thread **tq_threads; int tq_tcount; int tq_spin; int tq_flags; - int tq_callouts; taskqueue_callback_fn tq_callbacks[TASKQUEUE_NUM_CALLBACKS]; void *tq_cb_contexts[TASKQUEUE_NUM_CALLBACKS]; }; @@ -114,12 +115,11 @@ _timeout_task_init(struct taskqueue *queue, struct tim } static __inline int -TQ_SLEEP(struct taskqueue *tq, void *p, struct mtx *m, int pri, const char *wm, - int t) +TQ_SLEEP(struct taskqueue *tq, void *p, const char *wm) { if (tq->tq_spin) - return (msleep_spin(p, m, wm, t)); - return (msleep(p, m, pri, wm, t)); + return (msleep_spin(p, (struct mtx *)&tq->tq_mutex, wm, 0)); + return (msleep(p, &tq->tq_mutex, 0, wm, 0)); } static struct taskqueue * @@ -143,7 +143,7 @@ _taskqueue_create(const char *name, int mflags, snprintf(tq_name, TASKQUEUE_NAMELEN, "%s", (name) ? name : "taskqueue"); STAILQ_INIT(&queue->tq_queue); - TAILQ_INIT(&queue->tq_active); + LIST_INIT(&queue->tq_active); queue->tq_enqueue = enqueue; queue->tq_context = context; queue->tq_name = tq_name; @@ -194,7 +194,7 @@ taskqueue_terminate(struct thread **pp, struct taskque while (tq->tq_tcount > 0 || tq->tq_callouts > 0) { wakeup(tq); - TQ_SLEEP(tq, pp, &tq->tq_mutex, PWAIT, "taskqueue_destroy", 0); + TQ_SLEEP(tq, pp, "tq_destroy"); } } @@ -205,7 +205,7 @@ taskqueue_free(struct taskqueue *queue) TQ_LOCK(queue); queue->tq_flags &= ~TQ_FLAGS_ACTIVE; taskqueue_terminate(queue->tq_threads, queue); - KASSERT(TAILQ_EMPTY(&queue->tq_active), ("Tasks still running?")); + KASSERT(LIST_EMPTY(&queue->tq_active), ("Tasks still running?")); KASSERT(queue->tq_callouts == 0, ("Armed timeout tasks")); mtx_destroy(&queue->tq_mutex); free(queue->tq_threads, M_TASKQUEUE); @@ -231,21 +231,30 @@ taskqueue_enqueue_locked(struct taskqueue *queue, stru } /* - * Optimise the case when all tasks have the same priority. + * Optimise cases when all tasks use small set of priorities. + * In case of only one priority we always insert at the end. + * In case of two tq_hint typically gives the insertion point. + * In case of more then two tq_hint should halve the search. */ prev = STAILQ_LAST(&queue->tq_queue, task, ta_link); if (!prev || prev->ta_priority >= task->ta_priority) { STAILQ_INSERT_TAIL(&queue->tq_queue, task, ta_link); } else { - prev = NULL; - for (ins = STAILQ_FIRST(&queue->tq_queue); ins; - prev = ins, ins = STAILQ_NEXT(ins, ta_link)) + prev = queue->tq_hint; + if (prev && prev->ta_priority >= task->ta_priority) { + ins = STAILQ_NEXT(prev, ta_link); + } else { + prev = NULL; + ins = STAILQ_FIRST(&queue->tq_queue); + } + for (; ins; prev = ins, ins = STAILQ_NEXT(ins, ta_link)) if (ins->ta_priority < task->ta_priority) break; - if (prev) + if (prev) { STAILQ_INSERT_AFTER(&queue->tq_queue, prev, task, ta_link); - else + queue->tq_hint = task; + } else STAILQ_INSERT_HEAD(&queue->tq_queue, task, ta_link); } @@ -362,6 +371,7 @@ taskqueue_drain_tq_queue(struct taskqueue *queue) */ TASK_INIT(&t_barrier, USHRT_MAX, taskqueue_task_nop_fn, &t_barrier); STAILQ_INSERT_TAIL(&queue->tq_queue, &t_barrier, ta_link); + queue->tq_hint = &t_barrier; t_barrier.ta_pending = 1; /* @@ -369,7 +379,7 @@ taskqueue_drain_tq_queue(struct taskqueue *queue) * have completed or are currently executing. */ while (t_barrier.ta_pending != 0) - TQ_SLEEP(queue, &t_barrier, &queue->tq_mutex, PWAIT, "-", 0); + TQ_SLEEP(queue, &t_barrier, "tq_qdrain"); return (1); } @@ -381,32 +391,25 @@ taskqueue_drain_tq_queue(struct taskqueue *queue) static int taskqueue_drain_tq_active(struct taskqueue *queue) { - struct taskqueue_busy tb_marker, *tb_first; + struct taskqueue_busy *tb; + u_int seq; - if (TAILQ_EMPTY(&queue->tq_active)) + if (LIST_EMPTY(&queue->tq_active)) return (0); /* Block taskq_terminate().*/ queue->tq_callouts++; - /* - * Wait for all currently executing taskqueue threads - * to go idle. - */ - tb_marker.tb_running = TB_DRAIN_WAITER; - TAILQ_INSERT_TAIL(&queue->tq_active, &tb_marker, tb_link); - while (TAILQ_FIRST(&queue->tq_active) != &tb_marker) - TQ_SLEEP(queue, &tb_marker, &queue->tq_mutex, PWAIT, "-", 0); - TAILQ_REMOVE(&queue->tq_active, &tb_marker, tb_link); + /* Wait for any active task with sequence from the past. */ + seq = queue->tq_seq; +restart: + LIST_FOREACH(tb, &queue->tq_active, tb_link) { + if ((int)(tb->tb_seq - seq) <= 0) { + TQ_SLEEP(queue, tb->tb_running, "tq_adrain"); + goto restart; + } + } - /* - * Wakeup any other drain waiter that happened to queue up - * without any intervening active thread. - */ - tb_first = TAILQ_FIRST(&queue->tq_active); - if (tb_first != NULL && tb_first->tb_running == TB_DRAIN_WAITER) - wakeup(tb_first); - /* Release taskqueue_terminate(). */ queue->tq_callouts--; if ((queue->tq_flags & TQ_FLAGS_ACTIVE) == 0) @@ -438,42 +441,31 @@ static void taskqueue_run_locked(struct taskqueue *queue) { struct taskqueue_busy tb; - struct taskqueue_busy *tb_first; struct task *task; int pending; KASSERT(queue != NULL, ("tq is NULL")); TQ_ASSERT_LOCKED(queue); tb.tb_running = NULL; + LIST_INSERT_HEAD(&queue->tq_active, &tb, tb_link); - while (STAILQ_FIRST(&queue->tq_queue)) { - TAILQ_INSERT_TAIL(&queue->tq_active, &tb, tb_link); - - /* - * Carefully remove the first task from the queue and - * zero its pending count. - */ - task = STAILQ_FIRST(&queue->tq_queue); - KASSERT(task != NULL, ("task is NULL")); + while ((task = STAILQ_FIRST(&queue->tq_queue)) != NULL) { STAILQ_REMOVE_HEAD(&queue->tq_queue, ta_link); + if (queue->tq_hint == task) + queue->tq_hint = NULL; pending = task->ta_pending; task->ta_pending = 0; tb.tb_running = task; + tb.tb_seq = ++queue->tq_seq; TQ_UNLOCK(queue); KASSERT(task->ta_func != NULL, ("task->ta_func is NULL")); task->ta_func(task->ta_context, pending); TQ_LOCK(queue); - tb.tb_running = NULL; wakeup(task); - - TAILQ_REMOVE(&queue->tq_active, &tb, tb_link); - tb_first = TAILQ_FIRST(&queue->tq_active); - if (tb_first != NULL && - tb_first->tb_running == TB_DRAIN_WAITER) - wakeup(tb_first); } + LIST_REMOVE(&tb, tb_link); } void @@ -491,7 +483,7 @@ task_is_running(struct taskqueue *queue, struct task * struct taskqueue_busy *tb; TQ_ASSERT_LOCKED(queue); - TAILQ_FOREACH(tb, &queue->tq_active, tb_link) { + LIST_FOREACH(tb, &queue->tq_active, tb_link) { if (tb->tb_running == task) return (1); } @@ -520,8 +512,11 @@ taskqueue_cancel_locked(struct taskqueue *queue, struc u_int *pendp) { - if (task->ta_pending > 0) + if (task->ta_pending > 0) { STAILQ_REMOVE(&queue->tq_queue, task, task, ta_link); + if (queue->tq_hint == task) + queue->tq_hint = NULL; + } if (pendp != NULL) *pendp = task->ta_pending; task->ta_pending = 0; @@ -570,7 +565,7 @@ taskqueue_drain(struct taskqueue *queue, struct task * TQ_LOCK(queue); while (task->ta_pending != 0 || task_is_running(queue, task)) - TQ_SLEEP(queue, task, &queue->tq_mutex, PWAIT, "-", 0); + TQ_SLEEP(queue, task, "tq_drain"); TQ_UNLOCK(queue); } @@ -776,7 +771,7 @@ taskqueue_thread_loop(void *arg) */ if ((tq->tq_flags & TQ_FLAGS_ACTIVE) == 0) break; - TQ_SLEEP(tq, tq, &tq->tq_mutex, 0, "-", 0); + TQ_SLEEP(tq, tq, "-"); } taskqueue_run_locked(tq); /*