From owner-freebsd-amd64@FreeBSD.ORG Sun Oct 12 16:50:35 2008
Message-ID: <48F2217C.9010000@samsco.org>
Date: Sun, 12 Oct 2008 10:10:36 -0600
From: Scott Long
To: Jeremy Chadwick
Cc: freebsd-stable@FreeBSD.org, freebsd-amd64@FreeBSD.org
Subject: Re: am2 MBs - 4g + SCSI wipes out root partition
References: <200810101429.37244.joao@matik.com.br>
 <20081011113057.7402300c@ernst.jennejohn.org>
 <20081011101316.GA58119@icarus.home.lan>
 <20081011164529.198f32c6@ernst.jennejohn.org>
 <20081011144817.GB64861@icarus.home.lan>
 <48F0D3B5.6070602@egr.msu.edu>
 <20081011165250.GA67552@icarus.home.lan>
In-Reply-To: <20081011165250.GA67552@icarus.home.lan>
List-Id: Porting FreeBSD to the AMD64 platform

Jeremy Chadwick wrote:
> On Sat, Oct 11, 2008 at 12:26:29PM -0400, Adam McDougall wrote:
>> Jeremy Chadwick wrote:
>>> On Sat, Oct 11, 2008 at 04:45:29PM +0200, Gary Jennejohn wrote:
>>>> On Sat, 11 Oct 2008 03:13:16 -0700
>>>> Jeremy Chadwick wrote:
>>>>
>>>>> On Sat, Oct 11, 2008 at 11:30:57AM +0200, Gary Jennejohn wrote:
>>>>>> On Fri, 10 Oct 2008 14:29:37 -0300
>>>>>> JoaoBR wrote:
>>>>>>
>>>>>>> I tried MBs as Asus, Abit and Gigabyte all same result
>>>>>>>
>>>>>>> Same hardware with SATA works perfect
>>>>>>>
>>>>>>> Same hardware with scsi up to 3.5Gigs installed works perfect
>>>>>>>
>>>>>>> what calls my attention that all this MBs do not have the
>>>>>>> memory hole remapping feature so the complete 4gigs are
>>>>>>> available what normally was not the case with amd64 Mbs for the
>>>>>>> Athlon 64 CPUs
>>>>>>>
>>>>>>> some has an opinion if this is a freebsd issue or MB failure or
>>>>>>> scsi drv problem?
>>>>>>>
>>>>>> It's a driver problem. If you want to use SCSI then you'll have to limit
>>>>>> memory to 3.5 GB.
>>>>> What you're saying is that Adaptec and LSI Logic SCSI controllers behave
>>>>> badly (and can cause data loss) on amd64 systems which contain more than
>>>>> 3.5GB of RAM. This is a very big claim.
>>>>>
>>>>> Have you talked to Scott Long about this?
>>>>>
>>>>> Please expand on this, and provide evidence or references. I need to
>>>>> document this in my Wiki if it is indeed true.
>>>>>
>>>> See the freebsd-scsi thread with Subject "data corruption with ahc driver
>>>> and 4GB of memory using a FBSD-8 64-bit installation?" from Wed, 30 Jan
>>>> 2008.
>>>>
>>>> This was for ahc, but the bit-rot which Scott mentions in his reply might
>>>> also apply to the LSI Logic controllers.
>>>>
>>>> Basically the driver doesn't correctly handle DMA above 4GB. Since the PCI
>>>> hole gets mapped above 4GB it causes problems. The (S)ATA drivers don't seem
>>>> to have this problem.
>>> Thank you -- this is the exact information I was looking for.
>>>
>>> I will update my Wiki page to reflect this quite major problem.
>>>
>> I am using some LSI (mpt driver) ultra4 (U320 scsi) and LSI SAS
>> controllers in FreeBSD 7.x amd64 with 20G of ram, and Adaptec (aac
>> driver) with a 5th generation RAID card with 8G of ram, both have no
>> such corruption problems. Providing this as a counter-example just to
>> document some evidence of which products seem to work fine.
>
> Is your LSI SAS controller driven by mpt(4) or mfi(4)?
>

I can personally vouch for MPT and MFI drivers working just fine with >4GB.

> Let's break down what we know for sure at this point:
>
> aac(4) - not affected

Works fine.

> aha(4) - unknown
> ahb(4) - unknown

These two will likely be using bounce buffers and should work, albeit
slowly.

> ahc(4) - affected
> ahd(4) - unknown; no one answered the OP's question in the thread

Both ahc and ahd were designed _AND_TESTED_ to work with >4GB. If they
don't work anymore, it's due to unintended bitrot.

> asr(4) - unknown

Danger! Achtung! Beware of Dog!

> ips(4) - unknown

I'm pretty sure this works just fine.

> mpt(4) - not affected
> mfi(4) - unknown

Both work just fine.

> sym(4) - unknown

This has had problems in the past, but I think that it might have been
fixed recently.

You forgot to mention isp(4), which also works just fine.

>
> Could the problem be specific to certain firmware revisions on the
> cards?

ahc/ahd use custom "firmware" that is part of the driver. Their BIOS
can be flashed, but that does little to affect OS operation of the card.
So, "firmware revisions" have nothing to do with whatever this problem is.

Please do keep in mind that 32bit vs 64bit support, and by corollary, >4GB
support, is something that is completely isolated on a per-driver basis.
Trying to draw patterns between drivers to say, "FreeBSD SCSI support is
broken," is not valid. In fact, traditionally, SCSI drivers in general
have had the best support because they are so much more common in the
high-end systems that need the support. Out of your whole list, the only
card to explicitly stay away from is the asr(4) family, but that's been
known for years. If ahc and/or ahd has problems, we need someone willing
to dig into code and trace through the DMA path.

Scott
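
Illustration (not part of the original message): the thread mentions
drivers that fall back on bounce buffers when a controller cannot DMA
above 4GB. The C sketch below is a standalone simulation of that idea,
not FreeBSD code. It does not use the kernel's bus_dma(9) interface, and
every name in it (struct dma_buf, needs_bounce, dma_prepare,
DMA_LIMIT_32BIT) is hypothetical. "Physical addresses" here are plain
numbers attached to ordinary buffers. The extra memcpy on each transfer
is the reason such drivers work "albeit slowly".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: highest physical address a 32-bit-only DMA engine can reach. */
#define DMA_LIMIT_32BIT 0xFFFFFFFFULL

/* Hypothetical descriptor: a buffer plus its simulated physical address. */
struct dma_buf {
    uint64_t       paddr; /* simulated physical address of the first byte */
    size_t         len;   /* number of valid bytes in data */
    unsigned char *data;  /* CPU-visible contents */
};

/* Return 1 if any byte of the buffer lies above the device's limit. */
static int
needs_bounce(const struct dma_buf *b, uint64_t limit)
{
    if (b->len == 0)
        return 0;
    if (b->paddr > limit)
        return 1;
    /* Is the last byte past the limit? Written to avoid overflow. */
    return (uint64_t)(b->len - 1) > limit - b->paddr;
}

/*
 * Pick the buffer the device should be pointed at for a write.
 * If src is reachable, it is used directly. Otherwise its contents are
 * copied into the caller-supplied low-memory bounce buffer, which has
 * room for bounce_cap bytes. Returns NULL if bouncing is impossible.
 */
static const struct dma_buf *
dma_prepare(const struct dma_buf *src, struct dma_buf *bounce,
    size_t bounce_cap, uint64_t limit)
{
    assert(src != NULL && bounce != NULL);

    if (!needs_bounce(src, limit))
        return src;               /* device can reach it: no copy */
    if (src->len > bounce_cap)
        return NULL;              /* bounce buffer too small */

    memcpy(bounce->data, src->data, src->len); /* the extra, slow copy */
    bounce->len = src->len;

    /* The bounce buffer itself must be reachable by the device. */
    return needs_bounce(bounce, limit) ? NULL : bounce;
}
```

In this sketch, a driver that skipped the needs_bounce() check and handed
a high address straight to a 32-bit device would have the address
truncated, which is one way data could land in the wrong place. Whether
that is what actually happens in ahc or ahd is not established in this
thread.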