From owner-freebsd-current@FreeBSD.ORG Sat Jan 17 17:14:08 2004 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id ED2A216A4CE; Sat, 17 Jan 2004 17:14:07 -0800 (PST) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9C31F43D3F; Sat, 17 Jan 2004 17:14:03 -0800 (PST) (envelope-from truckman@FreeBSD.org) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.9p2/8.12.9) with ESMTP id i0I1Dt7E052190; Sat, 17 Jan 2004 17:13:59 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <200401180113.i0I1Dt7E052190@gw.catspoiler.org> Date: Sat, 17 Jan 2004 17:13:55 -0800 (PST) From: Don Lewis To: shoesoft@gmx.net In-Reply-To: <1074367329.2465.6.camel@shoeserv.freebsd> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii cc: cg@FreeBSD.org cc: current@FreeBSD.org Subject: Re: sound/pcm/* bugs (was: Re: page fault panic tracked down (selwakeuppri()) - really sound/pcm/*) X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 18 Jan 2004 01:14:08 -0000 On 17 Jan, Stefan Ehmann wrote: > On Sat, 2004-01-17 at 14:10, Stefan Ehmann wrote: >> On Sat, 2004-01-17 at 10:15, Don Lewis wrote: >> > I think this problem can be caused by a transient malloc() M_NOWAIT >> > failure. >> > >> > Changes in this version of the patch: >> > >> > When increasing blksz in chn_setblocksize(), increase the size >> > of bufsoft before increasing bufhard, and decrease bufsoft after >> > decreasing bufhard in an attempt to avoid the buffer overflow in >> > feed_vchan_s16(). >> > >> > Buffers are now allocated using M_WAITOK, requiring locks to >> > be dropped for the malloc() calls. This required adding a mutex >> > parameter to sndbuf_remalloc(). >> > >> > Locking order between parent and child channels is changed. The >> > parent is now locked first and then the child. The list of >> > children is protected by the parent's lock. There are still >> > potential race conditions in the vchan creation/destruction code >> > because all the locks have to be dropped in order to call >> > malloc(), etc. >> > >> > Locking cleaned up to eliminate the need for MTX_RECURSE. >> > >> > A new mutex allocator for the channel mutexes was added that >> > initializes the mutexes with the MTX_DUPOK flags so that multiple >> > channel mutexes of the same type can be held at once. >> > >> > Locking simplification in dsp_ioctl(). >> > >> > Added KASSERTs in chn_wrfeed(). >> >> Some more observations: >> >> - I haven't run the patched kernel long enough to see if it actually >> prevents panics. >> >> - Sometimes when trying to open dsp it fails at the first attempt, but >> works if I try again. I had similar experiences when I first used vchans >> but these problems have been gone for long. >> >> - This also seems to lock up one of the vchans each time this happens (I >> haven't really tested this yet but I'm pretty sure). For instance, if I >> try to set the number of vchans to anything lower than 3 right now it >> fails saying device busy. My guess would be that those vchans were >> locked but never unlocked for some reason. >> >> - If I use esd as output device (e.g. for ogg123 because it doesn't work >> properly with the default output) an I stop the player, it takes some >> seconds before the sound actually stops playing (until the buffer is >> written to dsp). This doesn't happen with unpatched pcm module. > > Seems like some of my interpretation were wrong. > > The failing at the first try opening /dev/dsp seems to be related to the > last point. If output is sent to esd by ogg123 (e.g) is terminated but > sound is played for some more seconds. If the next file is opened the > device is still busy. Try changing the two instances of maxblksz = sndbuf_getblksz(b); in chn_setblocksize() to maxblksz = sndbuf_getsize(b) / blkcnt; and change the condition for executing the sndbuf_remalloc() call from if (sndbuf_getblksz(bs) != maxblksz) { to if (sndbuf_getblksz(bs) != maxblksz || sndbuf_getblkcnt(bs) != blkcnt) { It seems that the blkcnt for bufsoft never gets set in the non-vchan case, and gets ignored in the vchan case because the call to chn_setblocksize() that tries to set it passes 0 for the blksz parameter. In either case the code in chn_setblocksize() that calculates blkcnt and blksz from sndbuf_getbps(bs), sndbuf_getspd(bs)), chn_targetirqrate, and CHN_2NDBUFMAXSIZE gets invoked, resulting in a request for a CHN_2NDBUFMAXSIZE-size buffer and wierd values for blkcnt and blksz. I typically see blkcnt getting set to 32. If you combine this larger than expect blkcnt with the blksz from bufhard, which has a much smaller blksz, bufsoft is likely to end up with a huge buffer. With the above change, blkcnt will still be wierd, but its effects will be papered over and bufsoft's size will get capped at CHN_2NDBUFMAXSIZE (131072), so there should only be about 1.5 seconds of buffered audio. > Still have no clue why there are some non-used vchans busy. I haven't seen that here. There is also now the issue of merging in phk's recent commit to the sound code.