Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 27 Aug 2009 00:38:12 +0100
From:      Andrew Brampton <brampton+freebsd@gmail.com>
To:        Attilio Rao <attilio@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: [patch] rw_try_wlock does not set recursive bit when recursing
Message-ID:  <d41814900908261638w297be37ax53462163fd7e13ed@mail.gmail.com>
In-Reply-To: <3bbf2fe10908261557i286f5ec5q58eb4fec1abbfc7b@mail.gmail.com>
References:  <d41814900908261537r1726c420p11d3547afde98bbc@mail.gmail.com> <3bbf2fe10908261557i286f5ec5q58eb4fec1abbfc7b@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
2009/8/26 Attilio Rao <attilio@freebsd.org>:
> 2009/8/27 Andrew Brampton <brampton+freebsd@gmail.com>:
>> Hi,
>> The following sequence of commands fails on line 4 with an assertion
>> that the lock is not currently held:
>>
>> 1: rw_wlock(&rw);
>> 2: if ( rw_try_wlock(&rw) )
>> 3: =C2=A0 =C2=A0rw_wunlock(&rw);
>> 4: rw_wunlock(&rw);
>>
>> This is because after line 3 is executed the rw lock is no longer
>> held. I tracked this bug down to _rw_try_wlock which correctly
>> increments rw_recurse, but does not set the RW_LOCK_RECURSED bit.
>> Without this bit the third line unlocks the lock and leaves it in a
>> unlocked state (when it should still be locked). Adding a line to set
>> this bit makes _rw_try_wlock match the code in _rw_wlock_hard.
>
> Sorry, but I really don't understand how that can be a bug.
> On STABLE_7, RW_LOCK_RECURSED is not used for checking if the lock is
> recursed or not.
> it just got set for improving debugging and eventually we decided to
> drop it for 8.0.
>
> However, for deciding if a lock is recursed or not in both STABLE_7
> and HEAD we used just checking against the recursion count which is
> correctly handled by the function.
>
> What you describe can't be the problem.
>
> Attilio
>

Ok, so I have had a better look at the code in CURRENT, and compared
it to the code in STABLE_7. I apologise but I think I mixed up my
sources somewhere and the problem does not appear in CURRENT. The
problem does however occur in STABLE_7, and I can explain that below.

1: rw_wlock(&rw);
2: if ( rw_try_wlock(&rw) )
3:    rw_wunlock(&rw);
4: rw_wunlock(&rw);

Line 1, _rw_wlock gets called which calls __rw_wlock. This in turn
calls _rw_write_lock which changes rw->rw_lock to tid
Line 2, _rw_try_wlock gets called, which check if the lock is already
held (which it is), if so it then rw->rw_recurse++
Line 3, _rw_wunlock gets called which calls __rw_wunlock, which then
calls _rw_write_unlock. Now _rw_write_unlock trys to unlock the lock
by checking if rw->rw_lock is tid, if so it sets the lock to
RW_UNLOCKED. This is where the problem occurs, there is no check on
rx->rw_recurse.

Now, if we used this code instead:
1: rw_wlock(&rw);
2: if ( rw_try_wlock(&rw) )
3:    rw_wunlock(&rw);
4: rw_wunlock(&rw);

The order goes:
Line 1, _rw_wlock gets called which calls __rw_wlock. This in turn
calls _rw_write_lock which changes rw->rw_lock to tid
Line 2, _rw_wlock gets called which calls __rw_wlock. __rw_wlock
checks if rw->rw_lock is RW_UNLOCKED, otherwise it ends up calling
_rw_wlock_hard. Inside _rw_wlock_hard it checks if the lock is already
held, if so it increments rw_recurse, and changes rw->rw_lock to
(rw->rw_lock | RW_LOCK_RECURSED) by calling atomic_set_ptr.
Line 3, _rw_wunlock gets called which calls __rw_wunlock, which then
calls _rw_write_unlock. Now _rw_write_unlock trys to unlock the lock
by checking if rw->rw_lock is tid. This is where things differ from
the previous code, rw->rw_lock is NOT tid, it is now (tid |
RW_LOCK_RECURSED), and thus the code drops into _rw_wunlock_hard which
decrements rx->rw_recurse and removes the RW_LOCK_RECURSED flag if
rw_recurse equals zero.

Hopefully you see how not setting the RW_LOCK_RECURSED flag causes a
problem in rw_wunlock. So this can be fixed by either back porting the
CURRENT code to STABLE_7, or using the one line fix in my patch. As I
already stated, I am using my patch and the sample code I have given
no longer panics.

Sorry for the confusion
Andrew



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?d41814900908261638w297be37ax53462163fd7e13ed>