Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 05 Jan 2009 22:20:03 -0800
From:      Sean Bruno <sean.bruno@dsl-only.net>
To:        Dieter <freebsd@sopwith.solgatos.com>
Cc:        freebsd-firewire@FreeBSD.org, bug-followup@FreeBSD.org
Subject:   Re: kern/118093: [firewire] firewire bus reset hogs CPU, causing data to be lost
Message-ID:  <1231222803.21260.17.camel@localhost.localdomain>
In-Reply-To: <200901060152.BAA21475@sopwith.solgatos.com>
References:  <200901060152.BAA21475@sopwith.solgatos.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, 2009-01-05 at 17:52 +0000, Dieter wrote:
> In message <1231181797.21260.8.camel@localhost.localdomain>, Sean Bruno writes:
> 
> > @@ -1388,6 +1387,7 @@
> >  	struct fw_device *fwdev;
> >  
> >  	s = splfw();
> > +	FW_GLOCK(fc);
> >  	fc->status = FWBUSEXPLORE;
> >  
> >  	/* Invalidate all devices, just after bus reset. */
> > @@ -1396,6 +1396,7 @@
> >  			fwdev->status = FWDEVINVAL;
> >  			fwdev->rcnt = 0;
> >  		}
> > +	FW_GUNLOCK(fc);
> >  	splx(s);
> >  
> >  	wakeup((void *)fc);
> 
> If the (null these days) spl calls have been replaced with FW_GLOCK/FW_GUNLOCK
> calls, can the spl calls be removed now?
> 
They most definitely can be removed.  However, they are a roadmap to
where locks SHOULD be.  I'm using them as guideposts through the code to
see if I can figure out what's missing.

> > @@ -1922,6 +1924,8 @@
> >  	u_int i;
> >  	struct firewire_comm *fc = (struct firewire_comm *)sc;
> >  
> > +	FW_GLOCK(fc);
> > +	FW_GUNLOCK(fc);
> >  	if (stat & OHCI_INT_DMA_IR) {
> >  		irstat = atomic_readandclear_int(&sc->irstat);
> >  		for(i = 0; i < fc->nisodma ; i++){
> 
> This needs to be reviewed by someone who understands what is protected by
> getting a lock and then immediately releasing it.  I have no clue.
> 
Indeed, it's gross.  I only sent this patch over as a "test" of sorts to
validate that the issue I'm looking into(locking in the firewire driver)
is actually something related to what you reported in this issue.

> > @@ -1969,8 +1973,8 @@
> >  			OWRITE(sc, OHCI_LNKCTLCLR, OHCI_CNTL_CYCTIMER);
> >  #endif
> >  			OWRITE(sc, FWOHCI_INTMASKCLR,  OHCI_INT_CYC_LOST);
> > -			device_printf(fc->dev, "too many cycle lost, "
> > -			 "no cycle master presents?\n");
> > +			device_printf(fc->dev, "%s: too many cycle lost, "
> > +			 "no cycle master presents?\n", __func__);
> >  		}
> 
> Should this be "too many cycles lost, " and  "no cycle master present?\n"
> 
Who knows.  I haven't reviewed this specific chunk of code to understand
what it really means as per the FW specs.
> >  	uint32_t fun;
> >  
> > +	FW_GLOCK(fc);
> >  	device_printf(fc->dev, "Initiate bus reset\n");
> >  	sc = (struct fwohci_softc *)fc;
> >  
> > @@ -2487,6 +2495,7 @@
> >  	fun |= FW_PHY_ISBR | FW_PHY_RHB;
> >  	fun = fwphy_wrdata(sc, FW_PHY_ISBR_REG, fun);
> >  #endif
> > +	FW_GUNLOCK(fc);
> >  }
> 
> Does the lock need to protect the printf?

These are gross, overpowered, way to heavy handed locks that I'm playing
with.  I need to prevent pre-emption of certain events while they are in
progress.  One of these events is the firewire's assertion of "bus
reset" on the firewire device.  I see the h/w interrupt firing before
this code can actually complete, causing the driver to be confused on
occasion.


Thanks for surveying this code with me, it helps to see what other
people's eyes can pick up.  I hope to have this in working order
soon-ish.

Sean




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1231222803.21260.17.camel>