From owner-freebsd-sparc64@FreeBSD.ORG Sun Dec 30 03:24:06 2012
Return-Path:
Delivered-To: freebsd-sparc64@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id 9997E89C
	for ; Sun, 30 Dec 2012 03:24:06 +0000 (UTC)
	(envelope-from lidl@hydra.pix.net)
Received: from hydra.pix.net (hydra.pix.net [IPv6:2001:470:e254::3c])
	by mx1.freebsd.org (Postfix) with ESMTP id 516A88FC0C
	for ; Sun, 30 Dec 2012 03:24:06 +0000 (UTC)
Received: from hydra.pix.net (localhost [127.0.0.1])
	by hydra.pix.net (8.14.5/8.14.5) with ESMTP id qBU3O3u5029483;
	Sat, 29 Dec 2012 22:24:03 -0500 (EST)
	(envelope-from lidl@hydra.pix.net)
X-Virus-Status: Clean
X-Virus-Scanned: clamav-milter 0.97.6 at mail.pix.net
Received: (from lidl@localhost)
	by hydra.pix.net (8.14.5/8.14.5/Submit) id qBU3O3JM029482;
	Sat, 29 Dec 2012 22:24:03 -0500 (EST)
	(envelope-from lidl)
Date: Sat, 29 Dec 2012 22:24:03 -0500
From: Kurt Lidl
To: Chris Ross
Subject: Re: Changes to kern.geom.debugflags?
Message-ID: <20121230032403.GA29164@pix.net>
References: <7AA0B5D0-D49C-4D5A-8FA0-AA57C091C040@distal.com>
	<6A0C1005-F328-4C4C-BB83-CA463BD85127@distal.com>
	<20121225232507.GA47735@alchemy.franken.de>
	<8D01A854-97D9-4F1F-906A-7AB59BF8850B@distal.com>
	<6FC4189B-85FA-466F-AA00-C660E9C16367@distal.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <6FC4189B-85FA-466F-AA00-C660E9C16367@distal.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-sparc64@freebsd.org, Marius Strobl
X-BeenThere: freebsd-sparc64@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Porting FreeBSD to the Sparc
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 30 Dec 2012 03:24:06 -0000

On Sat, Dec 29, 2012 at 03:01:55PM -0500, Chris Ross wrote:
> On Dec 27, 2012, at 1:15 PM, Chris Ross wrote:
>
> > On Dec 27, 2012, at 10:43 AM, Chris Ross wrote:
> >>>> FreeBSD/sparc64 ZFS boot block
> >> Boot path: /pci@1c,600000/scsi@2/disk@1,0:a
> >> Consoles: Open Firmware console
> >> ERROR: Last Trap: Division by Zero
> >>
> >> {1} ok ctrace
> >> No saved state
> >> {1} ok
> >>
> >> Anything else you can suggest to get debugging information out of zfsloader?
> >
> > So, I've started with the tiring process of "printf debugging". I have gotten out of
> > the loader code, and can show that it's inside of the call to the zfs devsw dv_init()
> > call where it's failing.
>
> Okay. Many many iterations, and I found out where it's crashing. In
> sys/boot/zfs/zfsimpl.c, in dnode_read(), the first line of the while loop
> is:
>
>     uint64_t bn = offset / bsize;
>
> And, bsize is calculated from:
>
>     int bsize = dnode->dn_datablkszsec << SPA_MINBLOCKSHIFT;
>
> When running the code, though, I can confirm that bsize is 0 before
> the divide is hit, thus causing the divide by zero trap.
>
> I'm going to guess this is a problem with dnode->dn_datablkszsec.
> Has anything changed recently in zfs_fmtdev, or more likely zfs_get_root()
> or objset_get_dnode(), which is the callchain right before dnode_read() ?

Well, there has been a big set of bugfixes integrated into zfs since
the 9.1-RELEASE, particularly revision r243674:

------------------------------------------------------------------------
r243674 | mm | 2012-11-29 09:05:04 -0500 (Thu, 29 Nov 2012) | 223 lines

Merge ZFS feature flags support and related bugfixes:
236884, 237001, 237119, 237458, 237972, 238113, 238391, 238422, 238926,
238950, 238951, 239389, 239394, 239620, 239774, 239953, 239958, 239967,
239968, 240063, 240133, 240153, 240303, 240345, 240415, 240955, 241655,
243014, 243505, 243506

[ long, long description elided ]
------------------------------------------------------------------------

However, I think it's more likely that the revision below (r243243),
and in particular the merged r239068, is the probable cause of the
issue you are seeing. I don't particularly care for giant patches
like this, where it's hard to figure out exactly what piece of what
patch did what, but that's the way it goes.

So, to answer your question, "Yes, lots of stuff has changed recently
with ZFS".

-Kurt

------------------------------------------------------------------------
r243243 | ae | 2012-11-18 12:09:29 -0500 (Sun, 18 Nov 2012) | 135 lines

MFC 239054,239057,239058,239060,239066,239067,239068,239070,239073,
239087,239088,239127,239210,239211,239230,239231,239232,239243,
239292,239293,239294,239325,240272,240273,240274,240275,240276,
240277,240335,240481,241023,241047,241053,241065,241068,241069,
241070,241164,241809,241876

239054: Create the interface to work with various partition tables
from the loader(8). The following partition tables are supported:
BSD label, GPT, MBR, EBR and VTOC8.

239057: Remove unused variables.

239058: Introduce new API to work with disks from the loader's
drivers. It uses new API from the part.c to work with partition
tables.

239060: When GPT signature is invalid in the primary GPT header, then
try to read backup GPT header.

239066: Add offset field to the i386_devdesc structure to be
compatible with disk_devdesc structure. Update biosdisk driver to the
new disk API.

239067: Remove unneeded flag.

239068: Teach the ZFS use new partitions API when probing. Note: now
ZFS does probe only for partitions with type "freebsd-zfs" and
"freebsd".

239070: Add simple test program that uses the partition tables
handling code. It is useful to test and debug how boot loader handles
partition tables metadata.

239073: Bump USERBOOT_VERSION.

239087: Add to the debug output the offset from the parent
partitioning scheme.

239088: Fix start offset calculation for the EBR partitions.

239127: As it turned out, there are some installations, where BSD
label contains partitions with type zero. And it has worked. So,
allow detect these partitions.

239210: Add more debug messages.

239211: Add another debug message.

239230: Unbreak booting from the true dedicated disks. When we open
the disk, check the type of partition table, that has been detected.
If this is BSD label, then we assume this is DD mode.

239231: Remove colons from the debug message, device name returned by
the disk_fmtdev() already has the colons.

239232: Restore the old behaviour. If requested partition is a BSD
slice, but d_partition isn't explicitly set, then try to open BSD
label and its first partition.

239243: After r239066, reinitialize v86.ctl and v86.addr for int 13
EDD probing in sys/boot/i386/libi386/biosdisk.c. Otherwise, when
DISK_DEBUG is enabled, the DEBUG() macros will clobber those fields,
and cause the probing to always fail mysteriously when debugging is
enabled.

239292: Explicitly terminate the string after strncpy(3).

239293: Rework r239232 to unbreak ZFS detection on MBR slices.

239294: Some BIOSes return incorrect number of sectors, make checks
less strictly, to do not lost some partitions.

239325: Add comment why the code has been disabled.

240272: Make struct uboot_devdesc compatible with struct
disk_devdesc.

240273: Use disk_fmtdev() and disk_parsedev() functions from the new
DISK API.

240274: Update uboot's disk driver to use new DISK API.

240275: Build disk.c only when DISK_SUPPORT is enabled.

240276: Update according to the change of struct uboot_devdesc.

240277: Handle LOADER_NO_DISK_SUPPORT knob in the arm and powerpc
ubldr.

240335: Slightly reduce an overhead for the open() call in the
zfsloader. libstand(3) tries to detect file system in the predefined
order, but zfsloader usually is used for the booting from ZFS, and
there is no need to try detect several file system types for each
open() call.

240481: The MBR data is not necessarily aligned. This is a problem on
ARM.

241023: Make the loader a bit smarter, when it tries to open disk and
the slice number is not exactly specified. When the disk has MBR,
also try to read BSD label after ptable_getpart() call. When the disk
has GPT, also set d_partition to 255. Mostly, this is how it worked
before.

241047: Disable splitfs support, since we aren't support floppies for
a long time. This slightly reduces an overhead, when loader tries to
open file that doesn't exist.

241053: Almost each time when loader opens a file, this leads to
calling disk_open(). Very often this is called several times for one
file. This leads to reading partition table metadata for each call.
To reduce the number of disk I/O we have a simple block cache, but it
is very dumb and more than half of I/O operations related to reading
metadata, misses this cache. Introduce new cache layer to resolve
this problem. It is independent and doesn't need initialization like
bcache, and will work by default for all loaders which use the new
DISK API. A successful disk_open() call to each new disk or partition
produces new entry in the cache.
Even more, when disk was already open, now opening of any nested
partitions does not require reading top level partition table. So, if
without this cache, partition table metadata was read around 20-50
times during boot, now it reads only once. This affects the booting
from GPT and MBR from the UFS.

241065: Fix disk_cleanup() to work without DISK_DEBUG too.

241068: Reduce the number of attempts to detect proper kld format for
the amd64 loader.

241069: Remember the file format of the last loaded module and try to
use it for next files.

241070: Fix the style.

241164: Replace all references to loader_callbacks_v1 with
loader_callbacks.

241809: Add the flags parameter to the disk_open() function and
DISK_F_NOCACHE flag, that disables the caching of partition tables
metadata. Use this flag for floppies in the libi386/biosdisk driver.

241876: When loader tries to open GPT partition, but partition table
is not GPT, then try automatically detect an appropriate partition
type.
------------------------------------------------------------------------

>
> - Chris
>
> _______________________________________________
> freebsd-sparc64@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-sparc64
> To unsubscribe, send any mail to "freebsd-sparc64-unsubscribe@freebsd.org"
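
For reference, the divide-by-zero described in the quoted message can
be reproduced in isolation. What follows is only a minimal sketch, not
the code in sys/boot/zfs/zfsimpl.c and not a fix from the FreeBSD tree.
The struct sketch_dnode type and the sketch_block_number() helper are
hypothetical names made up for illustration; only the two expressions
quoted above (the shift and the divide) and the value of
SPA_MINBLOCKSHIFT come from the real code. The sketch checks bsize
before dividing, so a dnode whose dn_datablkszsec is 0 yields an error
return instead of a trap.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimum ZFS block shift: 1 << 9 = 512-byte sectors. */
#define SPA_MINBLOCKSHIFT 9

/*
 * Hypothetical stand-in for the on-disk dnode; it carries only the
 * field discussed in this thread.
 */
struct sketch_dnode {
	uint16_t dn_datablkszsec;	/* data block size in 512-byte sectors */
};

/*
 * Hypothetical helper: compute the number of the block containing
 * `offset`, using the same two expressions as the quoted dnode_read()
 * lines. Returns 0 on success and stores the block number in *bn.
 * Returns -1 when the dnode reports a zero block size, which is the
 * case that would otherwise trap on the divide.
 */
static int
sketch_block_number(const struct sketch_dnode *dnode, uint64_t offset,
    uint64_t *bn)
{
	int bsize = dnode->dn_datablkszsec << SPA_MINBLOCKSHIFT;

	if (bsize == 0)
		return (-1);
	*bn = offset / (uint64_t)bsize;
	return (0);
}
```

A check like this only turns the trap into a reportable error. It does
not explain why dn_datablkszsec reads back as 0 in the first place,
which is the actual question in this thread.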