Date:      Tue, 08 Mar 2005 15:56:15 +0100
From:      Søren Schmidt <sos@DeepCore.dk>
To:        Nate Lawson <nate@root.org>
Cc:        current@freebsd.org
Subject:   Re: patch: fix ata panic with Thinkpad CD and DVD drives
Message-ID:  <422DBD0F.2070206@DeepCore.dk>
In-Reply-To: <422DBA9E.8060502@root.org>
References:  <422225D6.5020009@root.org> <422D84FF.1010707@DeepCore.dk> <422DBA9E.8060502@root.org>

Nate Lawson wrote:
> Søren Schmidt wrote:
>
>> Nate Lawson wrote:
>>
>>> If you've been having "memory modified after free" panics on -current
>>> and have a Thinkpad, the attached patch should fix things for you.  A
>>> quick check of RELENG_5 indicates that the bug is probably there also
>>> but I haven't tested for it there.
>>>
>>> The bug is triggered by timeouts in the ata_getparam() probe path.
>>> The ata_timeout() fires and ata_end_transaction() is called to get
>>> the status.  However, it continues down into ata_pio_read() even
>>> though there is no data available since we had a timeout, not read
>>> completion.  ata_pio_read() reads 512 bytes of probably bogus
>>> data.  The important problem is that it also advances donecount.  On
>>> subsequent timeouts (note there are 4 below), donecount advances into
>>> unallocated memory and so subsequent ata_pio_read() calls overwrite
>>> 512 bytes of someone else's memory.
>>>
>>> The fix is to exit immediately if ATA_R_TIMEOUT is set after reading
>>> the status in ata_end_transaction().  It shouldn't go into
>>> ata_pio_read() if there was a timeout.  The patch does this.
>>>
>>> However, it only handles PIO timeouts since I wasn't sure the best
>>> way to proceed for unwinding DMA state and the like for the other
>>> cases. This is enough to fix the overwrite and subsequent panic on my
>>> systems.  I've run heavy IO stress and DVD accesses for a while and
>>> no further panics.
>>>
>>> While looking into this, I found another potential problem.  In one
>>> reinjection case, donecount wasn't reset to 0.  The patch for
>>> ata-queue.c does this and I think it's necessary but don't hit this
>>> case in testing so I can't be sure.  Finally, there's one whitespace
>>> nit that helps with clarity.
>>>
>>> These are similar bugs to one found back in August that had the same
>>> effect.  Here's the closest reference I could find in the mail
>>> archives for this:
>>> http://lists.freebsd.org/mailman/htdig/freebsd-current/2004-August/033033.html
>>
>> Just a note from here, these bugs are fixed in ATA mkIII so you could
>> just have gleaned the solution from there (or maybe you did :))
>
> Nope, but I'm glad you can corroborate these fixes are correct.

Actually I can't, I haven't looked at what was committed since I had already
fixed these problems in the mkIII patches floating around.
Anyhow, it's in there and the committer has to deal with it until/if I
commit mkIII to -current; I'm out of the loop until then...

-- 

-Søren
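
The overrun described in the quoted report can be modelled in a few lines of
userland C. This is only a minimal sketch under stated assumptions, not the
actual ata(4) code: struct request, pio_read(), end_transaction(),
simulate() and R_TIMEOUT are hypothetical stand-ins for the real request
structure, ata_pio_read(), ata_end_transaction() and ATA_R_TIMEOUT, and the
model counts would-be overruns instead of actually writing past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512
#define R_TIMEOUT   0x01        /* hypothetical stand-in for ATA_R_TIMEOUT */

/* Hypothetical, heavily simplified request; not the real struct ata_request. */
struct request {
    unsigned char *data;        /* caller's buffer */
    size_t bytecount;           /* size of that buffer */
    size_t donecount;           /* bytes "transferred" so far */
    int flags;                  /* R_TIMEOUT set when the command timed out */
};

/* Number of reads that would have landed outside the buffer. */
static int overruns;

/*
 * Stand-in for ata_pio_read(): stores one sector at data + donecount and
 * advances donecount. To keep the demo safe, an out-of-bounds store is
 * only counted, never performed.
 */
static void pio_read(struct request *req)
{
    assert(req->data != NULL);
    if (req->donecount + SECTOR_SIZE > req->bytecount)
        overruns++;             /* the real code would scribble on foreign memory */
    else
        memset(req->data + req->donecount, 0xA5, SECTOR_SIZE);
    req->donecount += SECTOR_SIZE;
}

/*
 * Stand-in for ata_end_transaction(). With check_timeout == 0 it models
 * the path described in the report; with check_timeout != 0 it returns
 * early on a timeout, which is the fix the report describes.
 */
static void end_transaction(struct request *req, int check_timeout)
{
    if (check_timeout && (req->flags & R_TIMEOUT))
        return;                 /* no data arrived, so do not read */
    pio_read(req);
}

/* Run `timeouts` timed-out probes against one 512-byte buffer. */
int simulate(int timeouts, int check_timeout)
{
    unsigned char buf[SECTOR_SIZE];
    struct request req = { buf, sizeof(buf), 0, R_TIMEOUT };

    overruns = 0;
    for (int i = 0; i < timeouts; i++)
        end_transaction(&req, check_timeout);
    return overruns;
}
```

With four timeouts, as in the report, the unchecked path fills the buffer on
the first call and would overrun it on each of the remaining three, while the
checked path never reads at all. Resetting donecount to 0 before a request is
reinjected, the second change mentioned in the report, would address the same
class of problem from the other side.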



