Date:      Fri, 11 Jul 2008 00:59:33 -0700
From:      Jo Rhett <hostmaster@netconsonance.com>
To:        FreeBSD Stable <freebsd-stable@FreeBSD.org>
Subject:   how to get more logging from GEOM?
Message-ID:  <C278655C-4FFB-4A8E-9501-2B84283E324D@netconsonance.com>

About 10 days ago one of my personal machines started hanging at  
random.  This is the first bit of instability I've ever experienced on  
this machine (2+ years running).

FreeBSD triceratops.netconsonance.com 6.2-RELEASE-p11 FreeBSD 6.2-RELEASE-p11 #0: Wed Feb 13 06:44:57 UTC 2008     root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC  i386

After about 2 weeks of watching it carefully I've learned almost
nothing.  It's not a disk failure (AFAIK), it's not CPU overheating (I'm
now running healthd without complaints), and it's not tied to any given
network traffic.  However, it does appear to accompany heavy CPU/disk
activity.  It usually dies when indexing my websites at night (but not
always) and it sometimes dies when compiling programs.  Heavy disk
activity alone isn't enough to do the job, as backups proceed without
problems.  Heavy CPU by itself isn't enough to do it either.  But if
I start compiling things and keep going a while, it will eventually
hang.

My best guess is that geom is having a problem and locking up.   
There's no log entry before failure to back this idea up, but I think  
this because during boot I see the following:

ad0: 286168MB <Seagate ST3300622A 3.AAH> at ata0-master UDMA100
GEOM_MIRROR: Device gm0 created (id=575427344).
GEOM_MIRROR: Device gm0: provider ad0 detected.
ad1: 286168MB <Seagate ST3300622A 3.AAH> at ata0-slave UDMA100
GEOM_MIRROR: Device gm0: provider ad1 detected.
GEOM_MIRROR: Device gm0: provider ad1 activated.
GEOM_MIRROR: Device gm0: provider mirror/gm0 launched.
GEOM_MIRROR: Device gm0: rebuilding provider ad0.

Every time it is rebuilding ad0.   Every single boot in the last two  
weeks.

Is there any way to get more logging from geom, to confirm or deny this
theory?

Is there anything else I should be looking at?

FWIW, this never happened before the p11 patch to 6.2.   I don't know  
if that is related or not.

Obviously, I can't upgrade to 6.3 if heavy cpu/disk activity kills the  
system.

No, I don't have any other insights.  I'm not prone to posting "duh  
help me please!" posts, so I'm quite a bit frustrated by this one.

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source  
and other randomness




