Date:      Wed, 18 Oct 2000 08:05:56 -0700 (PDT)
From:      rasmith@aristotle.tamu.edu
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   kern/22086: DMA errors during intensive disk activity on vinum volume
Message-ID:  <20001018150556.5946F37B4E5@hub.freebsd.org>


>Number:         22086
>Category:       kern
>Synopsis:       DMA errors during intensive disk activity on vinum volume
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Oct 18 08:10:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator:     Robin Smith
>Release:        4.1.1-RELEASE #1
>Organization:
Dept. of Philosophy, Texas A&M University
>Environment:
FreeBSD test-drive.tamu.edu 4.1.1-RELEASE FreeBSD 4.1.1-RELEASE #1: Tue Oct 10 18:09:51 CDT 2000     root@test-drive.tamu.edu:/usr/src/sys/compile/TEST-DRIVE  i386
>Description:
The system is a dual PIII with 512MB of RAM and four ATA disks (see the
dmesg output below); a vinum raid5 volume, striped across all four disks,
is mounted as /usr.  Heavy access to this volume causes DMA errors; if the
system does not recover (i.e. a sync fails after multiple attempts), a
reboot is triggered.  The initial error message is of the following form:

Oct 18 08:36:19 test-drive /kernel: ata0-slave: zero length DMA transfer attempted

This message may mention either controller.  If there is no recovery,
then either the terminal hangs indefinitely or the system restarts.

From a hang caused by 'du /usr' (which worked its way to /usr/src/games
before giving up):

Oct 18 08:36:19 test-drive /kernel: ata0-slave: zero length DMA transfer attempted
Oct 18 08:36:29 test-drive /kernel: ad1: WRITE command timeout - resetting
Oct 18 08:36:29 test-drive /kernel: ata0: resetting devices .. done
Oct 18 08:36:29 test-drive /kernel: ata0-slave: zero length DMA transfer attempted

The error message "zero length DMA transfer attempted" is found in 
/
>How-To-Repeat:
Any intensive disk activity on the vinum raid5 volume triggers it.  A good
choice is 'du /usr'.

Here is some tracking information on the error message "zero length DMA transfer":

512: grep -C10 "zero length DMA transfer" /usr/src/sys/dev/ata/ata-dma.c 
ata_dmasetup(struct ata_softc *scp, int32_t device, 
             int8_t *data, int32_t count, int32_t flags)
###                                ^^^^  ARGUMENT 4
{
    struct ata_dmaentry *dmatab;
    u_int32_t dma_count, dma_base;
    int i = 0;

    if (((uintptr_t)data & 1) || (count & 1))
        return -1;
##### FOR COUNT=0
    if (!count) {
        ata_printf(scp, device, "zero length DMA transfer attempted\n");
        return -1;
    }
    
    dmatab = scp->dmatab[ATA_DEV(device)];
    dma_base = vtophys(data);
    dma_count = min(count, (PAGE_SIZE - ((uintptr_t)data & PAGE_MASK)));
    data += dma_count;
    count -= dma_count;

    while (count) {
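
For reference, the entry checks in the excerpt above can be modelled in
isolation.  This is only a minimal user-space sketch, not the kernel code:
dma_args_ok and first_segment_len are hypothetical names, and the page size
is assumed to be 4096 bytes as on i386.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u                  /* assumption: i386 page size */
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1u)

/*
 * Mirrors the checks at the top of ata_dmasetup(): returns 0 when the
 * transfer would be accepted, -1 when it would be rejected.
 */
static int
dma_args_ok(uintptr_t data, int32_t count)
{
    if ((data & 1) || (count & 1))      /* odd address or odd length */
        return -1;
    if (!count)                         /* the "zero length DMA transfer" case */
        return -1;
    return 0;
}

/*
 * Length of the first DMA segment: it runs from data to the end of the
 * page containing data, or covers the whole transfer if that is shorter.
 */
static uint32_t
first_segment_len(uintptr_t data, int32_t count)
{
    uint32_t to_page_end;

    assert(count > 0);                  /* precondition: checks above passed */
    to_page_end = SKETCH_PAGE_SIZE - (uint32_t)(data & SKETCH_PAGE_MASK);
    return (uint32_t)count < to_page_end ? (uint32_t)count : to_page_end;
}
```

In this model the message can only be produced by a caller that passes
count == 0; an odd count or an odd buffer address is rejected earlier,
without any message.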


Called by:

516: grep dmasetup /usr/src/sys/dev/ata/* |less
/usr/src/sys/dev/ata/ata-all.h:int32_t ata_dmasetup(struct ata_softc *, int32_t, int8_t *, int32_t, int32_t);
/usr/src/sys/dev/ata/ata-disk.c:            !ata_dmasetup(adp->controller, adp->unit, 
/usr/src/sys/dev/ata/ata-dma.c:ata_dmasetup(struct ata_softc *scp, int32_t device, 
/usr/src/sys/dev/ata/ata-dma.c:ata_dmasetup(struct ata_softc *scp, int32_t device,
/usr/src/sys/dev/ata/atapi-all.c:       !ata_dmasetup(atp->controller, atp->unit,


In ad_transfer(struct ad_request *request):


524: grep -C6 -n dmasetup /usr/src/sys/dev/ata/ata-disk.c
394-
395-    devstat_start_transaction(&adp->stats);
396-
397-    /* does this drive & transfer work with DMA ? */
398-    request->flags &= ~ADR_F_DMA_USED;
399-    if ((adp->controller->mode[ATA_DEV(adp->unit)] >= ATA_DMA) &&
400:        !ata_dmasetup(adp->controller, adp->unit, 
401-                      (void *)request->data, request->bytecount,
#######                                          ^^^^^^^^^ = count ####
402-                      (request->flags & ADR_F_READ))) {
403-        request->flags |= ADR_F_DMA_USED;
404-        cmd = request->flags&ADR_F_READ ? ATA_C_READ_DMA : ATA_C_WRITE_DMA;
405-        request->currentsize = request->bytecount;
406-    }


The only call to ata_dmasetup in ata-disk is from ad_transfer.

There are two calls to ad_transfer:

At line 570:

    else {
        request->bytecount -= request->currentsize;
        request->donecount += request->currentsize;
        if (request->bytecount > 0) {
            ad_transfer(request);
            return ATA_OP_CONTINUES;
        }

At line 290:

     while (request.bytecount > 0) {
            ad_transfer(&request);


Both are protected by comparisons; bytecount==0 should never happen.
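
The reasoning behind that claim can be sketched as a small stand-alone
model of the issue/complete cycle.  This is not the driver code: the
sketch_* names are hypothetical, the per-command cap of 256 sectors of
512 bytes is an assumption, and the model is single-threaded, so it says
nothing about whether the request fields could change between the check
and the call on a real SMP kernel.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_CHUNK (256 * 512)    /* assumption: per-command transfer cap */

struct sketch_request {
    int32_t bytecount;      /* bytes still to transfer */
    int32_t donecount;      /* bytes already transferred */
    int32_t currentsize;    /* size of the chunk in flight */
};

/*
 * Stand-in for ad_transfer(): picks the size of the next chunk.  The
 * assert encodes the claim that bytecount is never 0 at this point.
 */
static void
sketch_transfer(struct sketch_request *r)
{
    assert(r->bytecount > 0);
    r->currentsize = r->bytecount < SKETCH_MAX_CHUNK ?
        r->bytecount : SKETCH_MAX_CHUNK;
}

/*
 * Stand-in for the callers: issue a chunk, account for it, and re-issue
 * only while bytes remain.  Returns the number of transfers issued.
 */
static int
sketch_run(int32_t total)
{
    struct sketch_request r = { total, 0, 0 };
    int issued = 0;

    while (r.bytecount > 0) {
        sketch_transfer(&r);
        issued++;
        r.bytecount -= r.currentsize;
        r.donecount += r.currentsize;
    }
    return issued;
}
```

With these guards the assert never fires for any starting size, including
0, which is why a zero count reaching ata_dmasetup from this path is
unexpected.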

root@test-drive [/usr/home/rasmith]
525: grep -C6 -n dmasetup /usr/src/sys/dev/ata/atapi-all.c
254-    if ((atp->controller->mode[ATA_DEV(atp->unit)] >= ATA_DMA) &&
255-    (request->ccb[0] == ATAPI_READ ||
256-     request->ccb[0] == ATAPI_READ_BIG ||
257-     ((request->ccb[0] == ATAPI_WRITE ||
258-       request->ccb[0] == ATAPI_WRITE_BIG) &&
259-      !(atp->controller->flags & ATA_ATAPI_DMA_RO))) &&
260:    !ata_dmasetup(atp->controller, atp->unit,
261-                  (void *)request->data, request->bytecount,
###########                                  ^^^^^^^^^^^^^ = count ####
262-                  request->flags & ATPR_F_READ)) {
263-    request->flags |= ATPR_F_DMA_USED;
264-    }
265-
266-    /* start ATAPI operation */

518: cat /var/log/mount.today 
/dev/ad0s1a     /       ufs rw  1 1
/dev/ad0s1e     /mnt/tmp        ufs rw  2 2
/dev/ad0s1f     /var    ufs rw  2 2
/dev/vinum/raid5        /usr    ufs rw  2 2
procfs  /proc   procfs rw       0 0

520: cat /var/log/dmesg.today

Copyright (c) 1992-2000 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.1.1-RELEASE #1: Tue Oct 10 18:09:51 CDT 2000
    root@test-drive.tamu.edu:/usr/src/sys/compile/TEST-DRIVE
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (501.14-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x683  Stepping = 3
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,XMM>
real memory  = 536870912 (524288K bytes)
avail memory = 518524928 (506372K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
IOAPIC #0 intpin 16 -> irq 5
IOAPIC #0 intpin 17 -> irq 11
IOAPIC #0 intpin 18 -> irq 10
IOAPIC #0 intpin 19 -> irq 9
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc041d000.
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Intel 82443GX host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
pcib2: <Intel 82443GX (440 GX) PCI-PCI (AGP) bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib2
pci1: <Trident model 9750 VGA-compatible display device> at 0.0 irq 11
isab0: <Intel 82371AB PCI to ISA bridge> at device 7.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX4 ATA33 controller> port 0xffa0-0xffaf at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: <Intel 82371AB/EB (PIIX4) USB controller> port 0xef80-0xef9f irq 9 at device 7.2 on pci0
usb0: <Intel 82371AB/EB (PIIX4) USB controller> on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
Timecounter "PIIX"  frequency 3579545 Hz
chip1: <Intel 82371AB Power management controller> port 0x440-0x44f at device 7.3 on pci0
xl0: <3Com 3c905B-TX Fast Etherlink XL> port 0xec00-0xec7f mem 0xfebfef80-0xfebfefff irq 5 at device 15.0 on pci0
xl0: Ethernet address: 00:01:02:be:6d:85
miibus0: <MII bus> on xl0
xlphy0: <3Com internal media interface> on miibus0
xlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
xl1: <3Com 3c905B-TX Fast Etherlink XL> port 0xe480-0xe4ff mem 0xfebfef00-0xfebfef7f irq 11 at device 16.0 on pci0
xl1: Ethernet address: 00:01:02:be:6d:41
miibus1: <MII bus> on xl1
xlphy1: <3Com internal media interface> on miibus1
xlphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ahc0: <Adaptec 29160 Ultra160 SCSI adapter> port 0xe800-0xe8ff mem 0xfebff000-0xfebfffff irq 10 at device 18.0 on pci0
aic7892: Wide Channel A, SCSI Id=7, 32/255 SCBs
pcib1: <Intel 82443GX host to AGP bridge> on motherboard
pci2: <PCI bus> on pcib1
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: model MouseMan+, device ID 0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
lip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
SMP: AP CPU #1 Launched!
ad0: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-master using UDMA33
ad1: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata0-slave using UDMA33
ad2: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata1-master using UDMA33
ad3: 29314MB <IBM-DTLA-307030> [59560/16/63] at ata1-slave using UDMA33
Waiting 15 seconds for SCSI devices to settle
Mounting root from ufs:/dev/ad0s1a
WARNING: / was not properly dismounted
vinum: loaded
vinum: reading configuration from /dev/ad3s1e
vinum: updating configuration from /dev/ad2s1e
vinum: updating configuration from /dev/ad1s1e
vinum: updating configuration from /dev/ad0s1h
cd0 at ahc0 bus 0 target 4 lun 0
cd0: <TOSHIBA CD-ROM XM-6401TA 1001> Removable CD-ROM SCSI-2 device 
cd0: 20.000MB/s transfers (20.000MHz, offset 16)
cd0: cd present [328499 x 2048 byte records]


>Fix:
None known

>Release-Note:
>Audit-Trail:
>Unformatted:





