From owner-freebsd-hackers Sun Jun 23 19: 0:57 2002
Delivered-To: freebsd-hackers@freebsd.org
Received: from rwcrmhc52.attbi.com (rwcrmhc52.attbi.com [216.148.227.88])
	by hub.freebsd.org (Postfix) with ESMTP id 752A437B40E
	for ; Sun, 23 Jun 2002 19:00:22 -0700 (PDT)
Received: from InterJet.elischer.org ([12.232.206.8])
	by rwcrmhc52.attbi.com
	(InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP
	id <20020624020021.CIGE2751.rwcrmhc52.attbi.com@InterJet.elischer.org>;
	Mon, 24 Jun 2002 02:00:21 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id SAA49450;
	Sun, 23 Jun 2002 18:42:45 -0700 (PDT)
Date: Sun, 23 Jun 2002 18:42:43 -0700 (PDT)
From: Julian Elischer
To: Jonathan Lemon
Cc: dillon@apollo.backplane.com, hackers@freebsd.org
Subject: Re: Bug in wakeup() (stable and current) ?
In-Reply-To: <200206232158.g5NLw9c49030@prism.flugsvamp.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Sun, 23 Jun 2002, Jonathan Lemon wrote:

> In article you write:
> >:I'm pretty sure you only need to 'goto restart' if you call into
> >:maybe_resched() as someone else may have manipulated the queues.
> >:
> >:The 'restart' label is only in there for restarting in case one of
> >:the functions called may change the lists; if we restart _every_
> >:time we'll traverse the same procs where p->p_wchan != ident over
> >:and over needlessly.
> >:
> >:-Alfred
> >
> > Look at the code carefully.  It's *removing* the element from the list,
> > then conditionally restarting, rather than removing the element from the
> > list and unconditionally restarting.  The only reason it works at all
> > is because sys/queue.h does not clear out the pointers in the node
> > that was just removed.
> > The code is just plain wrong, though, because
> > the queue mechanisms make no such (documented) guarantee.
>
> Looks like the original damage happened in r1.21, where the temporary
> variable (used to hold the next item on the list) was replaced by a
> dereference through the pointer of the item that was just removed.
>
> The code works simply because it relies on TAILQ_REMOVE() not changing
> the tqe_next pointer.  I suppose that this should either be documented,
> or the loop changed back to use a temp variable:
>
> 	for (td = TAILQ_FIRST(qp); td != NULL; td = tdq) {
> 		tdq = TAILQ_NEXT(td, td_slpq);
> 		...
> 	}

I just added debug code in the TAILQ code that sets the forward
pointer to -1. Since Matt had this it's possible that this is what
hit him? I do this to stop people accessing things that they
shouldn't be counting on.

>
> --
> Jonathan
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message
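
For illustration only, here is a minimal userland sketch of the
temp-variable loop quoted above. It is not the kernel's wakeup() code.
The struct thread below is a cut-down, hypothetical stand-in, and
wakeup_sketch() and demo_wakeup() are names invented for this example.
The explicit store of -1 into tqe_next imitates by hand the kind of
debug poisoning described in the reply; the actual debug change to the
TAILQ macros is not shown in this thread. Because the successor is
saved before TAILQ_REMOVE() runs, the loop never reads the poisoned
link.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical, cut-down stand-in for the kernel's thread structure. */
struct thread {
	const void *td_wchan;		/* channel the thread sleeps on */
	int td_awake;			/* set once the thread is woken */
	TAILQ_ENTRY(thread) td_slpq;	/* sleep queue linkage */
};

TAILQ_HEAD(slpquehead, thread);

/*
 * Remove every thread sleeping on ident from the queue and mark it
 * awake.  Returns the number of threads woken.
 */
int
wakeup_sketch(struct slpquehead *qp, const void *ident)
{
	struct thread *td, *tdq;
	int woken = 0;

	assert(qp != NULL);
	for (td = TAILQ_FIRST(qp); td != NULL; td = tdq) {
		/* Save the successor before td can leave the list. */
		tdq = TAILQ_NEXT(td, td_slpq);
		if (td->td_wchan != ident)
			continue;
		TAILQ_REMOVE(qp, td, td_slpq);
		/*
		 * Assumption: poison the stale forward link by hand, the
		 * way a debug build of the macros might.  Nothing below
		 * reads it, so the traversal is unaffected.
		 */
		td->td_slpq.tqe_next = (void *)-1;
		td->td_wchan = NULL;
		td->td_awake = 1;
		woken++;
	}
	return (woken);
}

/*
 * Build a three-entry queue, wake the two sleepers on chan_a, and
 * check that only the sleeper on chan_b is left.  Returns the number
 * woken, or -1 if the queue ended up in an unexpected state.
 */
int
demo_wakeup(void)
{
	struct slpquehead q;
	struct thread t[3];
	int chan_a = 0, chan_b = 0;
	int i, woken;

	TAILQ_INIT(&q);
	t[0].td_wchan = &chan_a;
	t[1].td_wchan = &chan_b;
	t[2].td_wchan = &chan_a;
	for (i = 0; i < 3; i++) {
		t[i].td_awake = 0;
		TAILQ_INSERT_TAIL(&q, &t[i], td_slpq);
	}

	woken = wakeup_sketch(&q, &chan_a);

	if (TAILQ_FIRST(&q) != &t[1] ||
	    TAILQ_NEXT(&t[1], td_slpq) != NULL ||
	    !t[0].td_awake || t[1].td_awake || !t[2].td_awake)
		return (-1);
	return (woken);
}
```

Rewriting the loop step as td = TAILQ_NEXT(td, td_slpq) after the
removal would read the poisoned link and walk off to address -1, which
is the failure the debug code is meant to provoke.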