Date: Sun, 18 Apr 2004 16:59:59 -0800
From: "Peter A. Giessel"
To: freebsd-questions@freebsd.org
Subject: vinum drives crash in 5.2.1 but work in 4.9

I have a rather large (ok, I'm insane, it's that large) Vinum array, which works fine in 4.9 but crashes in 5.2.1. I don't think it's vinum's fault, but I could be wrong.

My question is: any ideas as to why the drives crash when accessed and can't be labeled (other than my boot drive) in 5.2.1, but work fine in 4.9?

More info about my setup follows:

In 4.9 the array works and mounts, and I can read and write to it just fine, but in 5.2.1 it crashes when it tries to access the disks. (By the way, everything is backed up and I can wipe these drives if need be.) I'm trying to get this to work in 5.x because Samba 3 needs 5.x for some features to work.

Anyway, when I typed "vinum start" at the root prompt in 5.2.1, I got the following panic:
_______________________________________________
Fatal trap 18: integer divide fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x8:0xc07ecd3b
stack pointer       = 0x10:0xe8181954
frame pointer       = 0x10:0xe81819d4
code segment        = base 0x0, limit 0xfffff, type 0x1b
                    = DPL 0, pres 1, def32 1, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process     = 562 (vinum)
trap number         = 18
panic: integer divide fault
cpuid = 0;
syncing disks, buffers remaining... 1261
panic: bremfree: removing a buffer not on a queue
cpuid = 0;
Uptime: 1m24s
_______________________________________________

When I rebooted and just typed "vinum" at the root prompt, I captured the following output while ssh'd into the machine:
_______________________________________________
vinum -> l
0 drives:
0 volumes:
0 plexes:
0 subdisks:
vinum -> start
vinum -> l
12 drives:
D four      State: up     /dev/ad19s1h  A: 47/190779 MB (0%)
D three     State: up     /dev/ad18s1h  A: 47/190843 MB (0%)
D two       State: up     /dev/ad17s1h  A: 0/190732 MB (0%)
D one       State: up     /dev/ad16s1h  A: 0/190732 MB (0%)
D eleven    State: up     /dev/ad13s1h  A: 47/190779 MB (0%)
D five      State: up     /dev/ad12s1h  A: 47/190843 MB (0%)
D eight     State: up     /dev/ad11s1h  A: 47/190779 MB (0%)
D nine      State: down   /dev/ad10s1h  A: 47/190843 MB (0%)
D seven     State: up     /dev/ad9s1h   A: 0/190732 MB (0%)
D six       State: up     /dev/ad8s1h   A: 0/190732 MB (0%)
D twelve    State: up     /dev/ad5s1h   A: 0/190732 MB (0%)
D ten       State: up     /dev/ad4s1h   A: 47/190843 MB (0%)

1 volumes:
V array     State: up     Plexes: 2     Size: 931 GB

2 plexes:
P array.p0  R5  State: up        Subdisks: 6  Size: 931 GB
P array.p1  R5  State: degraded  Subdisks: 6  Size: 931 GB

12 subdisks:
S array.p0.s0  State: up       D: one     Size: 186 GB
S array.p0.s1  State: up       D: two     Size: 186 GB
S array.p0.s2  State: up       D: three   Size: 186 GB
S array.p0.s3  State: up       D: four    Size: 186 GB
S array.p0.s4  State: up       D: five    Size: 186 GB
S array.p0.s5  State: up       D: eleven  Size: 186 GB
S array.p1.s0  State: up       D: six     Size: 186 GB
S array.p1.s1  State: up       D: seven   Size: 186 GB
S array.p1.s2  State: up       D: eight   Size: 186 GB
S array.p1.s3  State: crashed  D: nine    Size: 186 GB
S array.p1.s4  State: up       D: ten     Size: 186 GB
S array.p1.s5  State: up       D: twelve  Size: 186 GB
vinum -> start nine
vinum -> l
12 drives:
D four      State: up     /dev/ad19s1h  A: 47/190779 MB (0%)
D three     State: down   /dev/ad18s1h  A: 47/190843 MB (0%)
D two       State: up     /dev/ad17s1h  A: 0/190732 MB (0%)
D one       State: down   /dev/ad16s1h  A: 0/190732 MB (0%)
D eleven    State: up     /dev/ad13s1h  A: 47/190779 MB (0%)
D five      State: down   /dev/ad12s1h  A: 47/190843 MB (0%)
D eight     State: down   /dev/ad11s1h  A: 47/190779 MB (0%)
D nine      State: down   /dev/ad10s1h  A: 190779/190779 MB (100%)
D seven     State: up     /dev/ad9s1h   A: 0/190732 MB (0%)
D six       State: up     /dev/ad8s1h   A: 0/190732 MB (0%)
D twelve    State: up     /dev/ad5s1h   A: 0/190732 MB (0%)
D ten       State: down   /dev/ad4s1h   A: 47/190843 MB (0%)

[etc... snipped because this is getting really long. Basically, all the subdisks associated with the "down" drives changed to "crashed".]

vinum -> stop
vinum unloaded
_______________________________________________

Not sure what to do next, I tried to look at some of the disks' labels:

# bsdlabel ad4s1
bsdlabel: /dev/ad4s1 read: Input/output error

But when I look at my boot drive:

# bsdlabel ad0s1
# /dev/ad0s1:
8 partitions:
#        size   offset    fstype   [fsize bsize bps/cpg]
  a:   524288        0    4.2BSD     2048 16384 32776
  b:  4194304   524288      swap
  c: 39102273        0    unused        0     0         # "raw" part, don't edit
  d:   524288  4718592    4.2BSD     2048 16384 32776
  e:   524288  5242880    4.2BSD     2048 16384 32776
  f: 33335105  5767168    4.2BSD     2048 16384 28552

My dmesg output is as follows:
_______________________________________________
# dmesg
Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California.
All rights reserved.
FreeBSD 5.2.1-RELEASE #0: Mon Feb 23 20:45:55 GMT 2004
    root@wv1u.btc.adaptec.com:/usr/obj/usr/src/sys/GENERIC
Preloaded elf kernel "/boot/kernel/kernel" at 0xc0a35000.
Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0a35294.
ACPI APIC Table:
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) MP 2400+ (2000.08-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x681  Stepping = 1
  Features=0x383fbff
  AMD Features=0xc0480000
real memory  = 1073217536 (1023 MB)
avail memory = 1033003008 (985 MB)
ioapic0 irqs 0-23 on motherboard
Pentium Pro MTRR support enabled
npx0: [FAST]
npx0: on motherboard
npx0: INT 16 interface
acpi0: on motherboard
pcibios: BIOS version 2.10
Using $PIR table, 14 entries at 0xc00fdee0
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
Timecounter "ACPI-safe" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x8008-0x800b on acpi0
acpi_cpu0: on acpi0
acpi_cpu1: on acpi0
device_probe_and_attach: acpi_cpu1 attach returned 6
acpi_button0: on acpi0
pcib0: port 0x8080-0x80ff,0x8000-0x807f,0xcf8-0xcff iomem 0xd8000-0xdbfff on acpi0
pci0: on pcib0
agp0: port 0x1060-0x1063 mem 0xe8500000-0xe8500fff,0xec000000-0xefffffff at device 0.0 on pci0
pcib1: at device 1.0 on pci0
pci1: on pcib1
pci1: at device 5.0 (no driver attached)
isab0: at device 7.0 on pci0
isa0: on isab0
atapci0: port 0xf000-0xf00f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata0: [MPSAFE]
ata1: at 0x170 irq 15 on atapci0
ata1: [MPSAFE]
pci0: at device 7.3 (no driver attached)
atapci1: port 0x1050-0x105f,0x1064-0x1067,0x1068-0x106f,0x1070-0x1073,0x1078-0x107f mem 0xe8020000-0xe80200ff irq 21 at device 9.0 on pci0
atapci1: [MPSAFE]
ata2: at 0xe8020000 on atapci1
ata2: [MPSAFE]
ata3: at 0xe8020000 on atapci1
ata3: [MPSAFE]
em0: port 0x1000-0x103f mem 0xe8000000-0xe801ffff irq 23 at device 11.0 on pci0
em0: Speed:N/A Duplex:N/A
pcib2: at device 16.0 on pci0
pci2: on pcib2
ohci0: mem 0xe8220000-0xe8220fff irq 19 at device 0.0 on pci2
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
atapci2: port 0x3040-0x304f,0x3070-0x3073,0x3078-0x307f,0x3074-0x3077,0x3080-0x3087 mem 0xe8222000-0xe82220ff irq 16 at device 4.0 on pci2
atapci2: [MPSAFE]
ata4: at 0xe8222000 on atapci2
ata4: [MPSAFE]
ata5: at 0xe8222000 on atapci2
ata5: [MPSAFE]
atapci3: port 0x3050-0x305f,0x3088-0x308b,0x3090-0x3097,0x308c-0x308f,0x3098-0x309f mem 0xe8222400-0xe82224ff irq 17 at device 5.0 on pci2
atapci3: [MPSAFE]
ata6: at 0xe8222400 on atapci3
ata6: [MPSAFE]
ata7: at 0xe8222400 on atapci3
ata7: [MPSAFE]
atapci4: port 0x3060-0x306f,0x30a0-0x30a3,0x30a8-0x30af,0x30a4-0x30a7,0x30b0-0x30b7 mem 0xe8222800-0xe82228ff irq 18 at device 6.0 on pci2
atapci4: [MPSAFE]
ata8: at 0xe8222800 on atapci4
ata8: [MPSAFE]
ata9: at 0xe8222800 on atapci4
ata9: [MPSAFE]
fxp0: port 0x3000-0x303f mem 0xe8200000-0xe821ffff,0xe8221000-0xe8221fff irq 18 at device 8.0 on pci2
fxp0: Ethernet address 00:e0:81:25:02:ab
miibus0: on fxp0
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
atkbdc0: port 0x64,0x60 irq 1 on acpi0
atkbd0: flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
sio0 port 0x3f8-0x3ff irq 4 on acpi0
sio0: type 16550A
sio1 port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
ppc0 port 0x778-0x77f,0x378-0x37f irq 7 drq 3 on acpi0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/9 bytes threshold
ppbus0: on ppc0
plip0: on ppbus0
lpt0: on ppbus0
lpt0: Interrupt-driven port
ppi0: on ppbus0
fdc0: port 0x3f7,0x3f0-0x3f5 irq 6 drq 2 on acpi0
acpi_cpu1: on acpi0
device_probe_and_attach: acpi_cpu1 attach returned 6
orm0: