Skip site navigation (1)Skip section navigation (2)
Date:      Sun, 13 Jan 2013 18:13:58 +0100
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Alexander Motin <mav@FreeBSD.org>
Cc:        Kurt Lidl <lidl@pix.net>, freebsd-sparc64@FreeBSD.org
Subject:   Re: smartmontools panics 9.1-RELEASE on sunfire 240
Message-ID:  <20130113171358.GL26039@alchemy.franken.de>
In-Reply-To: <50F2E57C.1090305@FreeBSD.org>
References:  <20130104051914.GA22613@pix.net> <20130104235336.GB37999@alchemy.franken.de> <20130105013224.GA31361@pix.net> <20130105015242.GB26039@alchemy.franken.de> <20130106021923.GE1410@funkthat.com> <20130106031245.GC26039@alchemy.franken.de> <50EBE936.4000806@FreeBSD.org> <20130113155516.GK26039@alchemy.franken.de> <50F2E57C.1090305@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, Jan 13, 2013 at 06:49:00PM +0200, Alexander Motin wrote:
> On 13.01.2013 17:55, Marius Strobl wrote:
> > On Tue, Jan 08, 2013 at 11:39:02AM +0200, Alexander Motin wrote:
> >> On 06.01.2013 05:12, Marius Strobl wrote:
> >>> On Sat, Jan 05, 2013 at 06:19:23PM -0800, John-Mark Gurney wrote:
> >>>> Marius Strobl wrote this message on Sat, Jan 05, 2013 at 02:52 +0100:
> >>>>> On Fri, Jan 04, 2013 at 08:32:24PM -0500, Kurt Lidl wrote:
> >>>>>> On Sat, Jan 05, 2013 at 12:53:36AM +0100, Marius Strobl wrote:
> >>>>>>> Uhm, probably an userland buffer which isn't even 16-bit aligned.
> >>>>>>> If that's the cause, the attached patch hopefully should at least
> >>>>>>> prevent the panic. If it does, smartmontools still need to be fixed
> >>>>>>> though.
> >>>>>>
> >>>>>> You patch prevents the panic from happening.
> >>>>>> When I try to start smartd now, it reports:
> >>>>>>
> >>>>>> root@host-98: /usr/local/etc/rc.d/smartd onestart
> >>>>>> Starting smartd.
> >>>>>> smartd: cam_send_ccb: Invalid argument
> >>>>>> /usr/local/etc/rc.d/smartd: WARNING: failed to start smartd
> >>>>>>
> >>>>>> I had updated the kernel on the machine to 9-STABLE, and
> >>>>>> verified that without this patch, the crash still happened with
> >>>>>> a 9-STABLE kernel, in addition to 9.1-RELEASE kernel.
> >>>>>>
> >>>>>> My kernel now identifies itself as:
> >>>>>> FreeBSD 9.1-STABLE (GENERIC) #1 r245044:245048M: Fri Jan  4 20:19:50 EST 2013
> >>>>>>
> >>>>>> -Kurt
> >>>>>>
> >>>>>>> Index: cam_periph.c
> >>>>>>> ===================================================================
> >>>>>>> --- cam_periph.c	(revision 245046)
> >>>>>>> +++ cam_periph.c	(working copy)
> >>>>>>> @@ -744,6 +744,9 @@ cam_periph_mapmem(union ccb *ccb, struct cam_perip
> >>>>>>>  		if ((ccb->ccb_h.flags & CAM_DIR_MASK) == CAM_DIR_NONE)
> >>>>>>>  			return(0);
> >>>>>>>  
> >>>>>>> +		if ((uintptr_t)ccb->ataio.data_ptr % sizeof(uint16_t) != 0)
> >>>>>>> +			return (EINVAL);
> >>>>>>> +
> >>>>>>>  		data_ptrs[0] = &ccb->ataio.data_ptr;
> >>>>>>>  		lengths[0] = ccb->ataio.dxfer_len;
> >>>>>>>  		dirs[0] = ccb->ccb_h.flags & CAM_DIR_MASK;
> >>>>>>
> >>>>>
> >>>>> Alexander, are you okay with this approach or do you have a better
> >>>>> idea how to handle this? In any case, it doesn't seem to make sense
> >>>>> to teach the kernel how to cope with arbitrarily aligned buffers for
> >>>>> ATA.
> >>>>
> >>>> Shouldn't we make it dependant on the __NO_STRICT_ALIGNMENT define so
> >>>> that it won't immediately break other arches?
> >>>>
> >>>
> >>> No, not doing so tremendously helps ensuring that the software is
> >>> properly written (apart from compact flash, ATA devices really
> >>> only support 16-bit and 32-bit accesses) and judging the history
> >>> of the patches in the smartmontools port it apparently already
> >>> has to care about proper alignment for SCSI anyway. It would also
> >>> not be the first time the smartmontools port is blown out of the
> >>> water :)
> >>
> >> That patch would do the trick, but I can't say that I like it. Yes,
> >> there are many things tied to 16-bit in ATA world: both legacy ATA DMA
> >> and AHCI require 16-bit aligned data, legacy ATA DMA also require even
> >> transfer size. On the other side, Silicon Image siis(4) chips have no
> >> such limitations, that makes it chip-specific, not system-specific
> >> problem.
> > 
> > My reading of 3.2.9 of the ATA-7 specification actually is different
> > as it says that apart from CFA devices, both DMA and PIO transfers
> > are 16-bit and generally there's no guarantee that there's anything
> > like a DMA engine or busdma code that can handle unaligned transfers
> > in between, especially for PIO.
> 
> Yes, transfers are 16-bit, but odd-sized transfers are zero-padded by
> transmitter. PIO may drop extra byte, but with DMA that extra received
> byte of padding may overwrite something important.
> 
> >> Also for other DMA cases alignment is handled by busdma code
> >> via bounce buffers -- not free, but transparent for user.
> > 
> > The main purpose of DMA bounce buffers is address range translation
> > in case a device can't access the whole system memory, thus the
> > sparc64 busdma code doesn't use bounce buffers as that job is done
> > by the IOMMU. AFAICT, no MD bus_dmamap_load_uio() implementation
> > currently handles unaligned transfers as this is not just about
> > unaligned buffers in RAM but also the appropriate cache flushing
> > necessary for these and just happens to work on x86.
> 
> Hmm. Pity.
> 
> >> For PIO
> >> transfers data alignment is just outside of the ATA specification and
> >> depends on implementation.
> > 
> > I don't agree here as it is hard to imagine how an ATA controller
> > should be able to handle unaligned PIO (assuming your statement
> > regarding siis(4) driven chips refers to DMA only and unaligned PIO
> > hasn't actually been implemented in some ATA controllers).
> 
> siis(4) uses DMA for all data transfers, including PIO emulation.

Ah, PIO emulation is something I had not thought of.

> And it
> has no address restrictions.  Speaking about legacy ATA controller PIO,
> what it the problem to read 16-bit value from aligned controller port
> and then write it to RAM on odd address in two halves? With PIO
> performance is out of question any way.

I see no real technical problem in doing so, it's just unpleasant.

> 
> >> I think better solution would be to implement support for misaligned PIO
> >> in ata(4) driver. It would cost just about dozen or two lines and should
> >> be quite trivial. Probably I should do it last time this alignment
> >> problem appeared.
> > 
> > In the end I don't really care as long as it works but from a sheer
> > performance point of view it seems wrong to bounce PIO in the kernel
> > if it is trivial for applications to care about the alignment required
> > by the protocol they want to talk in the first place.
> 
> I don't mind if application will manage some alignment, but so far it
> doesn't.
> 

That's hardly an excuse, it seems the majority of applications today
aren't written with support for architectures with strict alignment
requirements in mind and only tested on x86 :) AFAICT cdrtools in
contrast are written with that in mind.

Marius




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20130113171358.GL26039>