From: "Conrad J. Sabatier"
To: Don Lewis
Cc: freebsd-current@FreeBSD.org
Date: Mon, 26 Jul 2004 17:15:24 -0500 (CDT)
Subject: Re: Questionable code in sys/dev/sound/pcm/channel.c
In-Reply-To: <200407262155.i6QLtNuZ058373@gw.catspoiler.org>
Organization: A Rag-Tag Band of Drug-Crazed Hippies
Reply-To: conrads@cox.net

On 26-Jul-2004 Don Lewis wrote:
> On 26 Jul, Conrad J. Sabatier wrote:
>> I'm a little perplexed at the following bit of logic in chn_write()
>> (which is where the "interrupt timeout, channel dead" messages are
>> being generated).
>>
>> Within an else branch within the main while loop, we have:
>>
>>         else {
>>                 timeout = (hz * sndbuf_getblksz(bs)) /
>>                     (sndbuf_getspd(bs) * sndbuf_getbps(bs));
>>                 if (timeout < 1)
>>                         timeout = 1;
>>                 timeout = 1;
>>
>> Why the formulaic calculation of timeout, if it's simply going to be
>> unconditionally set to 1 immediately afterwards anyway?  What's
>> going on here?
>
> Hmn, looks bogus to me.  I think the intention is to round timeout up
> to 1 if the result of the formula is zero.  The final assignment
> statement looks bogus to me.  Maybe a too short timeout is the
> source of this problem.
>
> It looks like this assignment appeared in rev 1.65.

Hmm, your guess is as good as (or probably better than) mine.  :-)
A little more in the way of comments certainly wouldn't hurt.

>> Also, at the end of the function:
>>
>>         if (count <= 0) {
>>                 c->flags |= CHN_F_DEAD;
>>                 printf("%s: play interrupt timeout, channel dead\n",
>>                     c->name);
>>         }
>>
>>         return ret;
>> }
>>
>> Could it be that the conditional test is wrong here?  Perhaps
>> we should be using (count < 0) instead?
>>
>> I don't know.  I'm having no small difficulty understanding this
>> code, but these two items caught my attention.
>
> I ran into the same problem when I was looking at the code a few days
> ago.
>
> BTW, the trace output that was posted showed write() returning 0
> immediately before the failure occurred.

Are you referring to the truss output I posted a few days ago?  The
thing of it is, though, that the original "channel dead" message had
already occurred in a previous run of madplay (which wasn't traced), so
it's really hard to say if there's any useful info to be obtained from
tracing a later run, after the pcm device was already "broken".

So far, I still haven't gotten the error with the new kernel I'm testing.
I wouldn't say absolutely that that single patch (of the final
conditional test) is "the fix", but it may help in the meantime.

-- 
Conrad J. Sabatier -- "In Unix veritas"
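
For reference, the timeout arithmetic quoted from chn_write() can be
tried in isolation.  What follows is only a minimal userland sketch, not
the driver code: the function name play_timeout_ticks() and its
parameter names are hypothetical, and it assumes that sndbuf_getblksz()
returns a block size in bytes, sndbuf_getspd() a sample rate in Hz, and
sndbuf_getbps() a count of bytes per sample.  It keeps the clamp to 1
and leaves out the unconditional "timeout = 1" that was questioned.

```c
#include <assert.h>

/*
 * Standalone sketch of the timeout formula quoted from chn_write().
 * hz, blksz, spd and bps are hypothetical stand-ins for the kernel's
 * hz and the sndbuf_getblksz(), sndbuf_getspd() and sndbuf_getbps()
 * results; the units are assumptions, see the text above.
 */
int
play_timeout_ticks(int hz, int blksz, int spd, int bps)
{
	int timeout;

	assert(spd > 0 && bps > 0);	/* avoid dividing by zero */

	/* Ticks to play one block: hz * bytes / (bytes per second). */
	timeout = (hz * blksz) / (spd * bps);

	/* Integer division may truncate to zero; wait at least one tick. */
	if (timeout < 1)
		timeout = 1;

	return (timeout);
}
```

With hz = 1000, a 4096-byte block, 44100 Hz and 4 bytes per sample, this
gives 23 ticks, so forcing the value to 1 afterwards would shorten the
wait considerably under those assumed numbers.  With hz = 100 and a
1024-byte block at 48000 Hz, the division truncates to 0 and the clamp
raises it to 1.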