Date:      Tue, 31 May 2005 16:47:19 -0700
From:      steve@Watt.COM (Steve Watt)
To:        efinleywork@efinley.com
Cc:        stable@freebsd.org
Subject:   Re: 5.4-RC2 freezing - ATA related?
Message-ID:  <200505312347.j4VNlKgF064965@wattres.watt.com>
In-Reply-To: <08dc01c55d47$d7697100$37cba1cd@emerytelcom.com>
References:  <001801c55a14$609720d0$37cba1cd@emerytelcom.com> <20050516195859.GA59189@server.vk2pj.dyndns.org> <042501c55ba7$360fac30$37cba1cd@emerytelcom.com> <20050518194356.GA2129@cirb503493.alcatel.com.au>

In <08dc01c55d47$d7697100$37cba1cd@emerytelcom.com>, efinleywork@efinley.com writes:
>From: "Peter Jeremy" <PeterJeremy@optushome.com.au>
>> On Wed, 2005-May-18 06:43:37 -0600, Elliot Finley wrote:
>> >Had the system lock up again.  This is with the new ATA mkIII patches on
>> >http://people.freebsd.org/~sos/ATA.
>> >
>> >I didn't get the crashdump (forgot to set dumpdev), but I did get 'ps' and
>> >'show lockedvnods' output from DDB.  The output is in the form of
>> >screenshots combined into a single .pdf which can be accessed here
>> >http://www.efinley.com/Binder1.pdf
>>
>> That shows a deadlock-to-root in your /dev/ar0s1a (presumably root)
>> filesystem.  The perl process (pid 487) has an exclusive lock on
>> the FS mountpoint - this is blocking 130 other processes.  Pid 487
>> is itself waiting on another filesystem lock (you can't determine
>> the actual lock tree without more poking around kernel memory).
>>
>> The vnode locks are held by processes:
>>  PID   name        waiting on
>>  487  perl       [ufs c3c1c1b4]
>>   57  syncer     [snaplk c535f500]  (holds 2 locks)
>>  476  perl       [ufs c87e4f1c]
>>  489  perl       [snaplk c535f500]  (holds 2 locks)
>> 3337  mksnap_ffs [getblk d77656f4]
>>
>> Looking through the process list, cron has started a "dump -L" which
>> is trying to create a filesystem snapshot.  That has wedged on
>> "getblk" (trying to perform physical disk I/O) and is probably the
>> root of your problem.  Nothing else is waiting on physical I/O.
>>
>> I'd say that your first guess was right:  This is a bug in the ATA
>> code and is probably a job for sos.
>
>I took the -L option off of my dump command in my daily dump script.  I've
>gone two days without locking up which is unusual.  I think that may be what
>was tickling the bug that was locking me up.

This is a filesystem lock problem, not an ATA driver problem.  I analyzed
it, and posted the results to -hackers last week, with the subject "snapshots
and innds".

The problem is that there is an invariant being broken in msync() -- Kirk
describes it fully in his reply to my message.
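
For reference, the wait table quoted above can be grouped by wait channel to
see which locks have more than one waiter.  This is only a minimal sketch in
Python using the five rows from the quoted analysis; the function name and the
tuple layout are illustrative assumptions, not anything DDB itself provides.

```python
from collections import defaultdict

# Rows copied from the quoted table:
# (pid, process name, wait message, wait channel address).
WAITERS = [
    (487, "perl", "ufs", "c3c1c1b4"),
    (57, "syncer", "snaplk", "c535f500"),
    (476, "perl", "ufs", "c87e4f1c"),
    (489, "perl", "snaplk", "c535f500"),
    (3337, "mksnap_ffs", "getblk", "d77656f4"),
]


def group_by_wait_channel(rows):
    """Map each (wait message, address) pair to the PIDs waiting on it."""
    groups = defaultdict(list)
    for pid, _name, wmesg, address in rows:
        groups[(wmesg, address)].append(pid)
    return dict(groups)


if __name__ == "__main__":
    for (wmesg, address), pids in group_by_wait_channel(WAITERS).items():
        print(f"{wmesg} {address}: waited on by {pids}")
```

In this data only snaplk c535f500 has two waiters (pids 57 and 489), and
mksnap_ffs is the only process in getblk.  The grouping says nothing about who
holds each lock; as the quoted analysis notes, that takes more poking around
kernel memory.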

-- 
Steve Watt KD6GGD  PP-ASEL-IA          ICBM: 121W 56' 57.8" / 37N 20' 14.9"
 Internet: steve @ Watt.COM                         Whois: SW32
   Free time?  There's no such thing.  It just comes in varying prices...
