Date:      Mon, 01 Jun 1998 21:06:24 -0400 (EDT)
From:      Simon Shapiro <shimon@simon-shapiro.org>
To:        tcobb <tcobb@staff.circle.net>
Cc:        "freebsd-current@freebsd.org" <freebsd-current@FreeBSD.ORG>, "freebsd-scsi@freebsd.org" <freebsd-scsi@FreeBSD.ORG>
Subject:   RE: DPT Redux
Message-ID:  <XFMail.980601210624.shimon@simon-shapiro.org>
In-Reply-To: <509A2986E5C5D111B7DD0060082F32A402FAE8@freya.circle.net>

I am simply tired of this thread.
There will be no further response from me on this issue.

The author displays a complete lack of understanding of how the FreeBSD
kernel, the SCSI abstraction layer, the DPT driver, and the DPT firmware
operate and interact.  While this lack of knowledge is understandable, the
attempt to diagnose the problem in this context is irritating.

Simon

On 30-May-98 tcobb wrote:
> I won't respond to each of Simon's many emails over the past 24 hours,
> simply because most of them were out-of-context reactions to a thread
> that grew from my original DPT post yesterday.
> 
> Instead, I think that the most productive thing is to provide a bit more
> of the information requested.
> 
> The system is using a single PM3334UW/2 with drives configured in the
> following logical arrays:
> 
> 2 1GB drives as RAID-1  (sd0)
> 7 4GB drives as RAID-5  (sd1)
> 1 4GB hot swap
> 
> Event #1:
> 1 of the RAID-5 drives fails, and the DPT hardware begins to auto-rebuild
> with the hot swap drive.
> The DPT driver freezes access to sd1; the system remains running, but
> access to sd1 hangs.
> 
> I shut down and rebooted the machine (SYNC failed on shutdown).
> Allowed FreeBSD to boot, it returned the following for sd1
> sd1: <DPT RAID-5 07M0> type 0 fixed SCSI 2
> sd1: Direct-Access 0MB (1 512 byte sectors)
> 
> Then, system continued booting and finally panic'd with a "Page Fault in
> Supervisor Mode" error prior to mounting drives.
> 
> I then booted the system with a DOS floppy, used DPTmgr to examine
> array.  The array was complete, but in degraded mode.  It had begun
> rebuilding itself, which specs say can happen in the background while
> other accesses are going on.  I tested redundancy info on the array AND
> tested random reads on the array -- all succeeded.  
> 
> So, I exited DPTmgr, and tried booting back to FreeBSD, same problem as
> above occurred (0MB 1 sector, panic).  Then, I rebooted into DOS and let
> the DPT card run its rebuild from there.  It completed about 1.5 hours
> later, and showed the array optimal. 
> 
> I then rebooted into FreeBSD which showed the correct info again.
> 
> Event #2:
> This was the next day.  Hard drive fails in array (this was the ex-hot
> swap from above).  This leaves the array with no hotswap to insert, but
> no data lost.  The array is now again in degraded mode.  The card
> screams bloody murder.  HOWEVER, the DPT driver does NOT hang on access
> to the sd1 partition.  I successfully shut down the machine (SYNC
> succeeded this time).  I inserted a new hard drive into the array so that
> the DPT hardware will begin rebuilding with this new drive.  On reboot,
> FreeBSD showed the same results as above (0MB, 1 sector, panic).
> Rebooting back to DOS and running DPTmgr showed that the array was in
> degraded mode, but that no data was lost and that redundancy information
> was all there.  It automatically began rebuilding with the new drive.  I
> tested rebooting into FreeBSD, same results (0/1/panic).  Rebooted back
> to DOS, allowed the hardware to finish its rebuild (1.5 hours), rebooted
> to FreeBSD and it showed the correct results.
> 
> 
> So, here's the summary for those of you who've stayed with me.
> 
> With RAID-5 and a HOT SWAP drive, a single drive failure caused the DPT
> driver in FreeBSD to hang on access to the partition.  This appears to
> be because DPT was doing a background rebuild automatically.
> 
> With RAID-5 and NO hot swap drive, a single drive failure does NOT cause
> the DPT driver in FreeBSD to hang on access to the partition.  This
> appears to be because DPT was NOT doing a background rebuild -- there
> being no drives to rebuild into.
> 
> With RAID-5 and a new drive to rebuild on, the DPT hardware begins
> automatic rebuilds of the array.  However, in these conditions the DPT
> driver (or other FreeBSD component) does not correctly sense the size
> information and panics the kernel during bootup.  This symptom goes away
> after the rebuild is complete.  This symptom does not appear when in DOS
> under the same circumstances.  DOS DPTmgr checks show the array of the
> correct size.  BIOS bootup screen for DPT shows the array of the correct
> size. 
> 
> The super-summary is that it appears that the DPT driver or other FreeBSD
> code component is not correctly coordinating with the DPT hardware (or
> sensing status properly) when the DPT hardware is doing a background
> rebuild of the array.
> 
> This array has been running non-stop since November 1997.  Cabling is
> good.  Active terminators and custom cables created by Granite are used.
> Seagate and Micropolis drives are used.  The RAID-5 array is in an
> external rackmount case.
> 
> -Troy Cobb
>  Circle Net, Inc.
>  http://www.circle.net
> 
> 
> Here's the dmesg output, trimmed to show relevant data.
> 
> FreeBSD 3.0-CURRENT #0: Sun May 24 04:30:04 EDT 1998
>     root@kali.circle.net:/usr/src/sys/compile/BENZAITEN-4
> CPU: Pentium (232.67-MHz 586-class CPU)
>   Origin = "GenuineIntel"  Id = 0x543  Stepping=3
>   Features=0x8001bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,MMX>
> real memory  = 134217728 (131072K bytes)
> avail memory = 128147456 (125144K bytes)
> DEVFS: ready for devices
> DPT:  RAID Manager driver, Version 1.0.5
> Probing for devices on PCI bus 0:
> DPT:  PCI SCSI HBA Driver, version 1.4.2
> chip0: <Intel 82437VX PCI cache memory controller> rev 0x02 on pci0.0.0
> chip1: <Intel 82371SB PCI to ISA bridge> rev 0x01 on pci0.7.0
> dpt0: <DPT Caching SCSI RAID Controller> rev 0x02 int a irq 9 on pci0.20.0
> dpt0: DPT type 3, model PM3334UW firmware 07M0, Protocol 0 
>       on port 6310 with Write-Back cache.  LED = 0000 0000 
> dpt0: Enabled Options:
>       Recover Lost Interrupts
>       Collect Metrics
>       Optimize CPU Cache
> dpt0: waiting for scsi devices to settle
> scbus0 at dpt0 bus 0
> dpt0: Initializing Lost IRQ Timer
> sd0 at scbus0 target 0 lun 0
> sd0: <DPT RAID-1 07M0> type 0 fixed SCSI 2
> sd0: Direct-Access 1029MB (2109328 512 byte sectors)
> dpt0: waiting for scsi devices to settle
> scbus1 at dpt0 bus 1
> sd1 at scbus1 target 2 lun 0
> sd1: <DPT RAID-5 07M0> type 0 fixed SCSI 2
> sd1: Direct-Access 20503MB (41990720 512 byte sectors)
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-scsi" in the body of the message

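The size figures in the quoted probe lines can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not code from the FreeBSD kernel or the DPT driver. It assumes that the "NMB" figure in an sd probe line is the sector count times the sector size, truncated to whole mebibytes, which is consistent with every line quoted above. The function names and the `min_sectors` threshold are hypothetical and chosen only for illustration.

```python
import re

# Matches probe lines of the form quoted in this thread, e.g.
#   sd1: Direct-Access 20503MB (41990720 512 byte sectors)
PROBE_RE = re.compile(
    r"(?P<dev>sd\d+): Direct-Access (?P<mb>\d+)MB "
    r"\((?P<sectors>\d+) (?P<size>\d+) byte sectors\)"
)


def parse_probe_line(line):
    """Return (device, reported_mb, sectors, sector_size), or None if no match."""
    m = PROBE_RE.search(line)
    if m is None:
        return None
    return (m["dev"], int(m["mb"]), int(m["sectors"]), int(m["size"]))


def capacity_mb(sectors, sector_size):
    """Capacity in whole mebibytes (assumption: the kernel truncates)."""
    return sectors * sector_size // (1024 * 1024)


def looks_bogus(sectors, min_sectors=2048):
    """Flag implausibly small volumes; min_sectors is an arbitrary 1 MiB floor."""
    return sectors < min_sectors


if __name__ == "__main__":
    lines = [
        "sd0: Direct-Access 1029MB (2109328 512 byte sectors)",
        "sd1: Direct-Access 20503MB (41990720 512 byte sectors)",
        "sd1: Direct-Access 0MB (1 512 byte sectors)",
    ]
    for line in lines:
        dev, reported, sectors, size = parse_probe_line(line)
        status = "BOGUS" if looks_bogus(sectors) else "ok"
        print(dev, reported, capacity_mb(sectors, size), status)
```

Under that assumption the healthy lines are self-consistent (2109328 sectors gives 1029MB, 41990720 sectors gives 20503MB), while the line seen during the rebuild reports a single 512-byte sector, which truncates to 0MB. A check of this kind only identifies the bad geometry report; it says nothing about which component produced it.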
---


Sincerely Yours, 

Simon Shapiro                                           Shimon@Simon-Shapiro.ORG
                                                        770.265.7340



