From: John Murphy
To: newbies@freebsd.org
Date: Sat, 24 Apr 2004 01:00:22 +0100
Message-ID: <523j80dedv1lqpnmpphkfv80vguaqh2hp4@4ax.com>
Reply-To: jfm@blueyonder.co.uk
Subject: New day, new drive (Was 'going small')

I said "It's pointless to have such a nice new thing spoilt by a clunky
old 1.3GByte disk drive which is so fat I can't get the lid on"

She said "Well what about that one you said you could use?"

"It's broken" I exclaimed.

"You are supposed to be able to fix things" She continued...

Anyway, the 40G Fujitsu arrived double bubble wrapped today so I went
ahead and attempted to re-install 5.2.1 and got very similar errors to
the ones I saw on the "broken" drive. The messages were the same but
the locations were slightly different.

I remembered something about creating a small FAT16 slice helping in
such situations so I verified all the partitions/slices with Ranish
Partition Manager[1] and then deleted all the partition records and
created a 64K or so partition. No improvement.

I'm doing this on my main (only) desktop therefore no resources
available for consultation without using hers. No means of copying
error messages[2] but it was something like:

ad0:WARNING - write UDMA ICRC error
ad0:FAILURE - write dma status=51 error 84
ad0:FAILURE - write dma status=51LBA=65
(to)
ad0:FAILURE - write dma status=51LBA=78

and so on; each time four blocks possibly located at slice boundaries.

My first thought was to write to questions@freebsd.org and CC hackers@
and a few others, but thought I'd better consult the documentation
first ;)

In Section 3 Open issues at:
http://www.uk.freebsd.org/releases/5.2.1R/errata.html

It says:

  (9 Jan 2004, updated 28 Feb 2004) In some cases, ATA devices may
  behave erratically, particularly SATA devices. Reported symptoms
  include command timeouts or missing interrupts. These problems
  appear to be timing-dependent, making them rather difficult to
  isolate. Workarounds include:

  * Turn off ATA DMA using the ``safe mode'' option of the bootloader
    or the hw.ata.ata_dma sysctl variable.

  * Use the host's BIOS setup options to put the ATA controller in its
    ``legacy mode'', if available.

  * Disable ACPI, for example using the ``safe mode'' option of the
    bootloader or using the hint.acpi.0.disabled kernel environment
    variable.

So I tried the middle way first as the BIOS had certainly got the drive
geometry wrong anyway - so I set it to the C/H/S fbsd fdisk & ranish
had calculated. No change.

The 5.2.1 boot process draws an ASCII beastie and gives several boot
options triggerable by a number key press, so I pressed 3 for safe
mode. Entered ufs:ad0s1a at the next prompt and installed without
error.

Getting somewhere at last, or so I thought, but the first attempt to
boot the installation was riddled with UDMA ICRC errors. I let the
errors run and eventually they stopped at a login prompt.

So I logged in as rad0:WARNING - write UDMA ICRC error
oad0:FAILURE - write dma status=51 error 84
oad0:FAILURE - write dma status=51LBA=65
t

Managed to re-boot or re-set, can't remember which, and tried booting
in safe mode but the file system was truly trashed.

Re-installed (in safe (no dma) mode) and booted the installed OS in
safe mode - whoopee, five vr0 watchdog timeouts after starting sshd but
stability at last.

So where's the sysctl to turn off dma settable from? /etc/sysctl.conf?
Nope. It's not writeable from there. Added 'hw.ata.ata_dma="0"' to
/boot/loader.conf and the rest was easy.

It's just sitting on the LAN, it's as quiet as a Lamb. Should I enable
dma?[3] NO DON'T! I hear you say.

[1] Ranish Partition Manager http://www.ranish.com/part/ is just so
    handy sometimes. Boots swiftly from a floppy and even recognises
    freebsd partitions. Ranish failed to verify partitions on the
    "broken" drive.

[2] There's probably a way to write all error messages to a file on a
    floppy or to some other safe media. I bet developers do it all the
    time.

[3] Yeah I know - I shouldn't ask technical questions on newbies@. :)

-- 
John.
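
For anyone skimming the archive, the loader-level workarounds mentioned
above can be written down roughly as below. This is a sketch, not a
tested recipe: the hw.ata.ata_dma line is the one quoted in the
message, while the hint.acpi.0.disabled line is an assumption based on
the errata wording and was not needed here, so it is left commented
out.

```
# /boot/loader.conf
hw.ata.ata_dma="0"          # turn off ATA DMA (the setting that worked above)
#hint.acpi.0.disabled="1"   # assumption: only if the ACPI workaround is wanted too
```

Once the machine is up, 'sysctl hw.ata.ata_dma' should report the value
in effect. As noted in the message, it could not be set from
/etc/sysctl.conf on this install, which is why it goes in
/boot/loader.conf instead.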