From owner-freebsd-fs@FreeBSD.ORG Mon Apr 20 05:25:36 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3B05C106566B; Mon, 20 Apr 2009 05:25:36 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id CCF288FC12; Mon, 20 Apr 2009 05:25:35 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from harkness.in.wanderview.com (harkness.in.wanderview.com [10.76.10.150]) (authenticated bits=0) by mail.wanderview.com (8.14.3/8.14.3) with ESMTP id n3K5PQQm002671 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Mon, 20 Apr 2009 05:25:26 GMT (envelope-from ben@wanderview.com) Message-Id: <8AF79B5A-3D10-4344-BA2F-02DF84BB3F8A@wanderview.com> From: Ben Kelly To: current@freebsd.org In-Reply-To: <6535218D-6292-4F84-A8BA-FFA9B2E47F80@wanderview.com> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v930.3) Date: Mon, 20 Apr 2009 01:25:26 -0400 References: <20090417145024.205173ighmwi4j0o@webmail.leidinger.net> <20090418094821.00002e67@unknown> <6535218D-6292-4F84-A8BA-FFA9B2E47F80@wanderview.com> X-Mailer: Apple Mail (2.930.3) X-Spam-Score: -1.44 () ALL_TRUSTED X-Scanned-By: MIMEDefang 2.64 on 10.76.20.1 Cc: fs@freebsd.org Subject: Re: ZFS: unlimited arc cache growth? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Apr 2009 05:25:36 -0000 On Apr 18, 2009, at 5:17 PM, Ben Kelly wrote: > After the rsync completed my machine slowly evicts buffers until its > back down to about twice arc_c. There was one case, however, where > I saw it stop at about four times arc_c. 
In that case it was
> failing to evict buffers due to a missed lock. It's not clear yet if
> it was a buffer lock or hash lock. When this happens you'll see the
> arcstats.mutex_missed sysctl go up. I'm going to see if I can track
> down why this is occurring under idle conditions. That seems
> suspicious to me.

Sorry to reply to my own mail, but I found some more information I thought I would share.

First, the missed mutex problem was an error on my part. I had accidentally deleted a rather important line when I was instrumenting the code earlier. Once this was replaced, the missed mutex count dropped back to a more reasonable level.

Next, the arcstats.size value is not strictly the amount of cached data. It represents a combination of cached buffers, actively referenced buffers, and "other" data. In this case "other" data is things like dnode structures that are directly allocated using kmem_cache_alloc() and simply tacked on to the ARC accounting variable using arc_space_consume(). At this point I don't think the ARC has a way of signaling memory pressure to these "other" data users.

The amount of memory the ARC has cached that can actually be freed is limited to buffers it internally allocated that have zero active references. This consists of the data and metadata lists for the MRU and MFU caches.

On my server right now I have an arc_c_max of about 40MB. After running a simple find(1) over /usr/src I ended up with the following memory usage:

  arcstats.size = 132MB
  anonymous inflight buffers = 212KB
  MRU referenced buffers = 80MB
  MFU referenced buffers = 1KB
  dbuf structure "other" data = 8MB
  dnode structure "other" data = 25MB
  unknown "other" data (probably dbuf related) ~= 18MB
  evictable buffer data = 3KB

So right now the ARC has done the best it can to free up data. If you define the cache as storing only inactive data, then basically the ARC has emptied the cache completely. This just isn't visible from the exported arcstats.size variable.
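As a rough cross-check of the figures quoted above, the following Python sketch (not part of the original message) adds up the listed categories and shows how small the evictable share is. The dictionary keys are illustrative labels, 1MB is assumed to mean 1024KB, and the "unknown" entry is only approximate in the message, so the total is an estimate rather than an exact reproduction of arcstats.size.

```python
# Figures quoted in the message, converted to KB.
# Assumption: 1 MB = 1024 KB; the "unknown" entry is approximate ("~= 18MB").
KB_PER_MB = 1024

breakdown_kb = {
    "anonymous inflight buffers": 212,
    "MRU referenced buffers": 80 * KB_PER_MB,
    "MFU referenced buffers": 1,
    "dbuf structure other data": 8 * KB_PER_MB,
    "dnode structure other data": 25 * KB_PER_MB,
    "unknown other data": 18 * KB_PER_MB,
    "evictable buffer data": 3,
}


def summarize(parts_kb):
    """Return (total KB, evictable KB, evictable fraction of the total)."""
    total_kb = sum(parts_kb.values())
    evictable_kb = parts_kb["evictable buffer data"]
    return total_kb, evictable_kb, evictable_kb / total_kb


total_kb, evictable_kb, fraction = summarize(breakdown_kb)
print(f"accounted total: {total_kb / KB_PER_MB:.1f} MB")
print(f"evictable: {evictable_kb} KB ({fraction:.4%} of the total)")
```

The categories sum to roughly 131MB, in line with the reported arcstats.size of 132MB, while the evictable portion is a negligible fraction of it.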
I guess there is some question as to whether data is being referenced longer than it needs to be by outside consumers. Anyway, just thought I would share what I found. At this point it doesn't look like tweaking limits will really help. Also, my previous idea that the inactive buffers were being prevented from eviction for too long was incorrect. If anyone is interested I can put together a patch that exports the amount of evictable data in the cache. - Ben From owner-freebsd-fs@FreeBSD.ORG Mon Apr 20 11:06:51 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0132106564A for ; Mon, 20 Apr 2009 11:06:51 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id BD4CE8FC1C for ; Mon, 20 Apr 2009 11:06:51 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3KB6pEw033007 for ; Mon, 20 Apr 2009 11:06:51 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3KB6pLm033003 for freebsd-fs@FreeBSD.org; Mon, 20 Apr 2009 11:06:51 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 20 Apr 2009 11:06:51 GMT Message-Id: <200904201106.n3KB6pLm033003@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 20 Apr 2009 11:06:52 -0000 Note: to view an 
individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.  Description
--------------------------------------------------------------------------------
o kern/133676  fs     [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs     [smbfs] [panic] panic: ffs_truncate: read-only filesys
o kern/133373  fs     [zfs] umass attachment causes ZFS checksum errors, dat
o kern/133174  fs     [msdosfs] [patch] msdosfs must support utf-encoded int
o kern/133150  fs     [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/133134  fs     [zfs] Missing ZFS zpool labels
o kern/132960  fs     [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs     [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132551  fs     [zfs] ZFS locks up on extattr_list_link syscall
o kern/132397  fs     reboot causes filesystem corruption (failure to sync b
o kern/132337  fs     [zfs] [panic] kernel panic in zfs_fuid_create_cred
o kern/132331  fs     [ufs] [lor] LOR ufs and syncer
o kern/132145  fs     [panic] File System Hard Crashes
f kern/132068  fs     [zfs] page fault when using ZFS over NFS on 7.1-RELEAS
o kern/131995  fs     [nfs] Failure to mount NFSv4 server
o kern/131360  fs     [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs     [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs     makefs: error "Bad file descriptor" on the mount poin
o kern/131086  fs     [ext2fs] [patch] mkfs.ext2 creates rotten partition
o kern/131084  fs     [xfs] xfs destroys itself after copying data
o kern/131081  fs     [zfs] User cannot delete a file when a ZFS dataset is
o kern/130979  fs     [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs     [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs     [iconv] usermount fails on fs that need iconv
o kern/130210  fs     [nullfs] Error by check nullfs
o bin/130105   fs     [zfs] zfs send -R dumps core
o kern/129760  fs     [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129231  fs     [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs     [panic] non-userfriendly panic when trying to mount(8)
f kern/128829  fs     smbd(8) causes periodic panic on 7-RELEASE
o kern/128633  fs     [zfs] [lor] lock order reversal in zfs
o kern/128514  fs     [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad
f kern/128173  fs     [ext2fs] ls gives "Input/output error" on mounted ext3
o kern/127420  fs     [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127213  fs     [tmpfs] sendfile on tmpfs data corruption
o kern/127029  fs     [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs     [ufs] [panic] Kernel panics while mounting an UFS file
f kern/125536  fs     [ext2fs] ext 2 mounts cleanly but fails on commands li
o kern/125149  fs     [nfs] [panic] changing into .zfs dir from nfs client c
f kern/124621  fs     [ext3] [patch] Cannot mount ext2fs partition
o kern/122888  fs     [zfs] zfs hang w/ prefetch on, zil off while running t
o bin/122172   fs     [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121072   fs     [smbfs] mount_smbfs(8) cannot normally convert the cha
o bin/118249   fs     mv(1): moving a directory changes its mtime
o kern/116170  fs     [panic] Kernel panic when mounting /tmp
o kern/114955  fs     [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs     [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs     [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs     [patch] [request] add -d option to umount(8) to detach
o bin/113838   fs     [patch] [request] mount(8): add support for relative p
o bin/113049   fs     [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs     [smbfs] [patch] smbfs and caching problems (resolves b
o kern/94769   fs     [ufs] Multiple file deletions on multi-snapshotted fil
o kern/93942   fs     [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs
[ffs] [hang] Filling a filesystem while creating a sna o kern/89991 fs [ufs] softupdates with mount -ur causes fs UNREFS o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc 59 problems total. From owner-freebsd-fs@FreeBSD.ORG Tue Apr 21 17:01:43 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E8F7106567A; Tue, 21 Apr 2009 17:01:43 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 7F29F8FC25; Tue, 21 Apr 2009 17:01:42 +0000 (UTC) (envelope-from avg@freebsd.org) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA19084; Tue, 21 Apr 2009 19:51:34 +0300 (EEST) (envelope-from avg@freebsd.org) Message-ID: <49EDF995.2050508@freebsd.org> Date: Tue, 21 Apr 2009 19:51:33 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.21 (X11/20090406) MIME-Version: 1.0 To: Ivan Voras References: <49EDCA21.70908@icyb.net.ua> <49EDF80F.3070105@icyb.net.ua> In-Reply-To: <49EDF80F.3070105@icyb.net.ua> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Apr 2009 17:01:43 -0000 on 21/04/2009 19:45 Andriy Gapon said the following: > Maybe this is a check against disk space being re-used for some other fs and > super-block staying sufficiently intact. 
But, OTOH, fs_fsize and fs_size could
> still match the raw media in this case too.
> If some extra sanity checks are needed in addition to magic then
> fs_bmask/fs_fmask/fs_bshift/fs_fshift and/or any other derived fields could be used.
>

BTW, right now I put this in my local tree:

diff --git a/sys/geom/label/g_label_ufs.c b/sys/geom/label/g_label_ufs.c
index 8510fc0..0cffb8d 100644
--- a/sys/geom/label/g_label_ufs.c
+++ b/sys/geom/label/g_label_ufs.c
@@ -83,10 +83,10 @@ g_label_ufs_taste_common(struct g_consumer *cp, char *label, size_t size, int wh
 			continue;
 		/* Check for magic and make sure things are the right size */
 		if (fs->fs_magic == FS_UFS1_MAGIC && fs->fs_fsize > 0 &&
-		    pp->mediasize / fs->fs_fsize == fs->fs_old_size) {
+		    pp->mediasize / fs->fs_fsize >= fs->fs_old_size) {
 			/* Valid UFS1. */
 		} else if (fs->fs_magic == FS_UFS2_MAGIC && fs->fs_fsize > 0 &&
-		    pp->mediasize / fs->fs_fsize == fs->fs_size) {
+		    pp->mediasize / fs->fs_fsize >= fs->fs_size) {
 			/* Valid UFS2. */
 		} else {
 			g_free(fs);

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Tue Apr 21 17:01:45 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E955106567A for ; Tue, 21 Apr 2009 17:01:45 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 689E58FC14 for ; Tue, 21 Apr 2009 17:01:44 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA18979; Tue, 21 Apr 2009 19:45:04 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <49EDF80F.3070105@icyb.net.ua> Date: Tue, 21 Apr 2009 19:45:03 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.21 (X11/20090406) MIME-Version: 1.0 To: Ivan Voras References: <49EDCA21.70908@icyb.net.ua>
In-Reply-To: X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Apr 2009 17:01:46 -0000 on 21/04/2009 19:18 Ivan Voras said the following: > Andriy Gapon wrote: >> glabel insists that for UFS2 the following must hold true: >> pp->mediasize / fs->fs_fsize == fs->fs_size >> >> But in reality it doesn't have to be this way, there can be valid reasons to make >> filesystem smaller than available raw media size. >> >> I understand that this is a good sanity check, but maybe there are other ways to >> extra-check that we see a proper superblock, without imposing the limitation in >> question. > > Shouldn't fsck complain of this inconsistency? I don't see why it should and - no, it actually does not. fsck checks only filesystem's internal consistency, it doesn't check media size, etc. > If it doesn't and the [UF]FS code doesn't, I don't see why glabel should > continue to check it. Struct fs has a tonne of int32 fields, some of > which are only used for information whose length is a couple of bits - > if checking magic isn't enough (and it probably is), there are other > fields that can be validated. Maybe this is a check against disk space being re-used for some other fs and super-block staying sufficiently intact. But, OTOH, fs_fsize and fs_size could still match the raw media in this case too. If some extra sanity checks are needed in addition to magic then fs_bmask/fs_fmask/fs_bshift/fs_fshift and/or any other derived fields could be used. 
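The idea of validating derived superblock fields can be illustrated with a small Python sketch. This is not glabel code and not part of the original message. It assumes the conventional relations between the sizes and their derived fields (shift = log2(size), mask = ~(size - 1) as a 32-bit value), and the dictionary layout and the function name are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # compare masks as 32-bit values


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def derived_fields_consistent(sb):
    """Check that shift/mask fields agree with fs_bsize and fs_fsize.

    `sb` is a plain dict standing in for a parsed superblock (hypothetical
    layout). Assumed relations: fs_bshift = log2(fs_bsize),
    fs_bmask = ~(fs_bsize - 1), and likewise for the fragment fields.
    """
    bsize, fsize = sb["fs_bsize"], sb["fs_fsize"]
    if not (is_power_of_two(bsize) and is_power_of_two(fsize)):
        return False
    if fsize > bsize:
        return False
    return (
        sb["fs_bshift"] == bsize.bit_length() - 1
        and sb["fs_fshift"] == fsize.bit_length() - 1
        and (sb["fs_bmask"] & MASK32) == (~(bsize - 1) & MASK32)
        and (sb["fs_fmask"] & MASK32) == (~(fsize - 1) & MASK32)
    )


good = {
    "fs_bsize": 16384, "fs_fsize": 2048,
    "fs_bshift": 14, "fs_fshift": 11,
    "fs_bmask": -16384, "fs_fmask": -2048,
}
stale = dict(good, fs_bshift=13)  # one derived field no longer matches

print(derived_fields_consistent(good))
print(derived_fields_consistent(stale))
```

A check of this kind depends only on the superblock itself, so unlike the media-size comparison it does not reject a filesystem that is deliberately smaller than its provider.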
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Tue Apr 21 23:00:26 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA2DF1065674 for ; Tue, 21 Apr 2009 23:00:26 +0000 (UTC) (envelope-from gallasch@free.de) Received: from smtp.free.de (smtp.free.de [91.204.6.103]) by mx1.freebsd.org (Postfix) with ESMTP id 0A0C38FC1D for ; Tue, 21 Apr 2009 23:00:25 +0000 (UTC) (envelope-from gallasch@free.de) Received: (qmail 27698 invoked from network); 22 Apr 2009 00:34:00 +0200 Received: from smtp.free.de (HELO orwell.free.de) (gallasch@free.de@[91.204.4.103]) (envelope-sender ) by smtp.free.de (qmail-ldap-1.03) with AES256-SHA encrypted SMTP for ; 22 Apr 2009 00:34:00 +0200 Message-ID: <49EE49D8.7000902@free.de> Date: Wed, 22 Apr 2009 00:34:00 +0200 From: Kai Gallasch User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302) MIME-Version: 1.0 To: freebsd-fs@freebsd.org X-Enigmail-Version: 0.95.7 OpenPGP: id=1254A186; url=http://home.free.de/kai/1254A186.asc Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 21 Apr 2009 23:00:27 -0000 Hi. Today I had a kernel panic on my server running FreeBSD 7.2-RC1 (amd64), Opteron, 4 Cores, 16GB RAM, when benchmarking a raidz1 pool with bonnie++ benchmark. # bonnie++ -d /zpool1/test/tmp -s 32408 -u kai The server hosts about ten jails with webservers, mail, etc. - very low load. I used bonnie++ to somehow provoke a panic, after the server in the past week had several zfs related panics, that ended up with processes stuck in state "zfs". 
The pattern was always that after booting the server kept running for about a day and then crashed or became unusable.

Some sysctl values that I saved during such a "process stuck in zfs" state:

kern.maxvnodes: 120000
kern.minvnodes: 25000
vm.stats.vm.v_vnodepgsout: 48
vm.stats.vm.v_vnodepgsin: 33500
vm.stats.vm.v_vnodeout: 48
vm.stats.vm.v_vnodein: 27299
vfs.freevnodes: 25000
vfs.wantfreevnodes: 25000
vfs.numvnodes: 93765
debug.sizeof.vnode: 504
vfs.zfs.arc_min: 37545216
vfs.zfs.arc_max: 901085184
vfs.zfs.mdcomp_disable: 0
vfs.zfs.prefetch_disable: 0
vfs.zfs.zio.taskq_threads: 0
vfs.zfs.recover: 0
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.debug: 1
kstat.zfs.misc.arcstats.hits: 22067589
kstat.zfs.misc.arcstats.misses: 4824470
kstat.zfs.misc.arcstats.demand_data_hits: 5661546
kstat.zfs.misc.arcstats.demand_data_misses: 2512832
kstat.zfs.misc.arcstats.demand_metadata_hits: 13533858
kstat.zfs.misc.arcstats.demand_metadata_misses: 1606419
kstat.zfs.misc.arcstats.prefetch_data_hits: 157869
kstat.zfs.misc.arcstats.prefetch_data_misses: 252444
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 2714316
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 452775
kstat.zfs.misc.arcstats.mru_hits: 10229954
kstat.zfs.misc.arcstats.mru_ghost_hits: 19863
kstat.zfs.misc.arcstats.mfu_hits: 9008171
kstat.zfs.misc.arcstats.mfu_ghost_hits: 159664
kstat.zfs.misc.arcstats.deleted: 4570138
kstat.zfs.misc.arcstats.recycle_miss: 579604
kstat.zfs.misc.arcstats.mutex_miss: 37379
kstat.zfs.misc.arcstats.evict_skip: 90360
kstat.zfs.misc.arcstats.hash_elements: 87460
kstat.zfs.misc.arcstats.hash_elements_max: 248398
kstat.zfs.misc.arcstats.hash_collisions: 2006655
kstat.zfs.misc.arcstats.hash_chains: 11410
kstat.zfs.misc.arcstats.hash_chain_max: 7
kstat.zfs.misc.arcstats.p: 617419234
kstat.zfs.misc.arcstats.c: 746412403
kstat.zfs.misc.arcstats.c_min: 37545216
kstat.zfs.misc.arcstats.c_max: 901085184
kstat.zfs.misc.arcstats.size: 615520768

My sysctl.conf:

# 12328 (default) -> 18000
kern.maxfiles=18000
# 5547 (default) -> 2000
kern.maxprocperuid=2000
# 11095 (default) -> 5000
kern.maxfilesperproc=5000
# postgresql
kern.ipc.shmall=32768
kern.ipc.shmmax=134217728
kern.ipc.semmap=256
security.jail.sysvipc_allowed=1
kern.ipc.shm_use_phys=1
vfs.zfs.debug=1
# default 100000
kern.maxvnodes=120000

The crash today (while running bonnie++) gave me some new data:

vfs.freevnodes: 24973
vfs.numvnodes: 35789
kstat.zfs.misc.arcstats.hits: 7086527
kstat.zfs.misc.arcstats.misses: 193683
kstat.zfs.misc.arcstats.demand_data_hits: 5599886
kstat.zfs.misc.arcstats.demand_data_misses: 82250
kstat.zfs.misc.arcstats.demand_metadata_hits: 1159851
kstat.zfs.misc.arcstats.demand_metadata_misses: 29224
kstat.zfs.misc.arcstats.prefetch_data_hits: 156004
kstat.zfs.misc.arcstats.prefetch_data_misses: 39321
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 170786
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 42888
kstat.zfs.misc.arcstats.mru_hits: 717887
kstat.zfs.misc.arcstats.mru_ghost_hits: 16917
kstat.zfs.misc.arcstats.mfu_hits: 6089477
kstat.zfs.misc.arcstats.mfu_ghost_hits: 14084
kstat.zfs.misc.arcstats.deleted: 269579
kstat.zfs.misc.arcstats.recycle_miss: 32480
kstat.zfs.misc.arcstats.mutex_miss: 814
kstat.zfs.misc.arcstats.evict_skip: 1687376
kstat.zfs.misc.arcstats.hash_elements: 2263
kstat.zfs.misc.arcstats.hash_elements_max: 65758
kstat.zfs.misc.arcstats.hash_collisions: 51235
kstat.zfs.misc.arcstats.hash_chains: 9
kstat.zfs.misc.arcstats.hash_chain_max: 4
kstat.zfs.misc.arcstats.p: 29036496
kstat.zfs.misc.arcstats.c: 37545216
kstat.zfs.misc.arcstats.c_min: 37545216
kstat.zfs.misc.arcstats.c_max: 901085184
kstat.zfs.misc.arcstats.size: 401183744

On the console I found:

panic: kmem_malloc(131072): kmem_map too small: 1152401408 total allocated
cpuid = 1

In /usr/src/UPDATING I read: [..]
20090207: ZFS users on amd64 machines with 4GB or more of RAM should reevaluate their need for setting vm.kmem_size_max and vm.kmem_size manually. In fact, after recent changes to the kernel, the default value of vm.kmem_size is larger than the suggested manual setting in most ZFS/FreeBSD tuning guides. So I understood this as "vm.kmem_size is set unnecessary large by default. You should think about decreasing it to save some RAM" On my amd64 server the default values of kmem_size are vm.kmem_size_scale: 3 vm.kmem_size_max: 3865468109 vm.kmem_size_min: 0 vm.kmem_size: 1201446912 Can someone give me a hint how to debug this problem further, or how to find some reasonable values for setting vm.kmem_size_max and vm.kmem_size with 16G of RAM? Thanks! Kai.
From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 07:07:12 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4412D1065672; Wed, 22 Apr 2009 07:07:12 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 138928FC16; Wed, 22 Apr 2009 07:07:10 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA04661; Wed, 22 Apr 2009 10:07:09 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1] helo=edge.pp.kiev.ua) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1LwWY1-000Eoj-BI; Wed, 22 Apr 2009 10:07:09 +0300 Message-ID: <49EEC21C.7020106@icyb.net.ua> Date: Wed, 22 Apr 2009 10:07:08 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.21 (X11/20090406) MIME-Version: 1.0 To: Ivan Voras References: <49EDCA21.70908@icyb.net.ua> <49EDF80F.3070105@icyb.net.ua> <49EDF995.2050508@freebsd.org> In-Reply-To: <49EDF995.2050508@freebsd.org> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 07:07:13 -0000 Thinking more about it - maybe that check is useful for finding out what geom provider a filesystem actually belongs too. But I am not sure. E.g. what should happen in the following case? I create partitions ad4s1a and ad4s2a.
I create gmirror rootgm using these partitions. I create a filesystem on rootgm with label rootfs. Right now, with my local patch, during boot glabel seems to do "tasting" before gmirror is activated and so it thinks that rootfs is the label of the filesystem on ad4s1a. I think that this wouldn't have happened without my patch. But, OTOH, I think that this is not a problem of the patch; it is a problem of glabel starting before gmirror. But this is unsolvable in principle - what if gmirror is started later manually? So after all the current code makes the most sense for the most common usage pattern. And thus I shall shut up :-)

-- 
Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 08:31:54 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D7841065670 for ; Wed, 22 Apr 2009 08:31:54 +0000 (UTC) (envelope-from sb345@litepc.com) Received: from mail.litepc.net (adm.litepc.net [216.32.72.58]) by mx1.freebsd.org (Postfix) with ESMTP id CFC368FC17 for ; Wed, 22 Apr 2009 08:31:53 +0000 (UTC) (envelope-from sb345@litepc.com) Received: (qmail 69808 invoked by uid 0); 22 Apr 2009 08:01:21 +0000 Received: by simscan 1.4.0 ppid: 69798, pid: 69804, t: 0.3096s scanners: attach: 1.4.0 clamav: 0.94/m:50/d:9268 Received: from unknown (HELO ?10.1.1.11?)
(admin@litepc.net@121.219.231.204) by mail.litepc.net with SMTP; 22 Apr 2009 08:01:20 +0000 From: "litepc.com" To: freebsd-fs@freebsd.org Date: Wed, 22 Apr 2009 18:05:09 +1000 MIME-Version: 1.0 Message-ID: <49EF5C55.7178.1CEB83C@sb345.litepc.com> Priority: normal X-mailer: Pegasus Mail for Windows (4.41) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body Subject: Clarrification on fs block size X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 08:31:54 -0000

Hello,

I'm trying to track down files that are using bad disk blocks as reported by SMART drive tests. I'm struggling to identify which inodes are using which disk sectors because the various utilities appear to define "blocks" differently.

In the context of smartctl, fdisk, and bsdlabel, a "disk block" is a 512-byte sector. In the context of the UFS file system, a "file system block" is 16384 bytes and a "fragment" is 2048 bytes.

So to my mind this means there are 32 x 512-byte blocks in each 16384-byte file system block.

However, dumpfs reports "fsbtodb 2", which means a disk block = file system block * 2^2, so there are 4 disk blocks in each file system block. This is verified using the fsdb "blocks" command to list the block numbers assigned to an inode, which then must be multiplied by 4 to use the fsdb "findblk" command to find the correct inode.

This seems to indicate that a "file system block" to dumpfs and fsdb must be equivalent to a 2048-byte "fragment". Is this correct?

What is confusing is that if dumpfs reports "bsize" as 16384, then the "b" in "bsize" and the "b" in "fsbtodb" appear to be different "block" definitions.

Can anyone clarify?
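The arithmetic in question can be written out as a short Python sketch (not part of the original message). It follows the reading proposed above, namely that the unit shifted by fsbtodb is the 2048-byte fragment, which matches 2048 / 512 = 4 = 2^2. The function names mirror the fsbtodb/dbtofsb naming but are plain illustrations, and the LBA and partition start in the example are hypothetical values. Whether findblk expects a partition-relative 512-byte block number should be confirmed against fsdb(8).

```python
SECTOR_SIZE = 512   # bytes per disk block (smartctl, fdisk, bsdlabel)
FSIZE = 2048        # bytes per UFS fragment, as quoted in the message
BSIZE = 16384       # bytes per UFS block, as quoted in the message
FSBTODB = 2         # shift reported by dumpfs as "fsbtodb 2"


def fsbtodb(frag_no):
    """Fragment-sized file system address -> 512-byte disk block number."""
    return frag_no << FSBTODB


def dbtofsb(disk_block):
    """512-byte disk block number -> containing fragment address."""
    return disk_block >> FSBTODB


def lba_to_partition_block(lba, partition_start):
    """Absolute LBA -> block offset from the start of the partition."""
    if lba < partition_start:
        raise ValueError("LBA lies before the start of the partition")
    return lba - partition_start


# Hypothetical example values, not taken from the message.
bad_lba = 1_000_000
partition_start = 63

rel_block = lba_to_partition_block(bad_lba, partition_start)
frag = dbtofsb(rel_block)
print(f"partition-relative disk block: {rel_block}")
print(f"fragment address: {frag} (first disk block {fsbtodb(frag)})")
```

Under these assumptions a full 16384-byte block still spans 32 sectors; it simply covers 8 consecutive fragment addresses, each of which maps to 4 sectors.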
I want to be sure that I can take the corrupt LBA address identified by smartctl, then locate the correct file system and adjusted offset using bsdlabel, and then plug this block number straight into fsdb's "findblk" command to identify which inode owns the corrupted block. If fsdb's findblk is expecting some other definition of "disk block" then it's not going to locate the correct inode!

From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 10:05:16 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 641E31065673 for ; Wed, 22 Apr 2009 10:05:16 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id 1CA118FC1B for ; Wed, 22 Apr 2009 10:05:16 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1LwYbQ-0004N1-2B for freebsd-fs@freebsd.org; Wed, 22 Apr 2009 09:18:48 +0000 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 22 Apr 2009 09:18:48 +0000 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 22 Apr 2009 09:18:48 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Wed, 22 Apr 2009 11:18:33 +0200 Lines: 33 Message-ID: References: <49EE49D8.7000902@free.de> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigDE3D04931FA566D0B99BF078" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 2.0.0.21 (X11/20090318) In-Reply-To: <49EE49D8.7000902@free.de> X-Enigmail-Version: 0.95.0 Sender: news Subject: Re: FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small" X-BeenThere:
freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 10:05:16 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigDE3D04931FA566D0B99BF078 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Kai Gallasch wrote: > Hi. > > Today I had a kernel panic on my server running FreeBSD 7.2-RC1 (amd64), > Opteron, 4 Cores, 16GB RAM, when benchmarking a raidz1 pool with > bonnie++ benchmark. Just for general information - how many drives are in the pool / how fast are the drives? --------------enigDE3D04931FA566D0B99BF078 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFJ7uDwldnAQVacBcgRAiQqAKDUL7iIyQDWYYQzQs0mORj+wP1xaACgwYSv LzaqyEOWJrIbRpiUcLA10As= =hiCb -----END PGP SIGNATURE----- --------------enigDE3D04931FA566D0B99BF078-- From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 10:15:42 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA7981065688 for ; Wed, 22 Apr 2009 10:15:42 +0000 (UTC) (envelope-from gallasch@free.de) Received: from smtp.free.de (smtp.free.de [91.204.6.103]) by mx1.freebsd.org (Postfix) with ESMTP id 1280C8FC18 for ; Wed, 22 Apr 2009 10:15:41 +0000 (UTC) (envelope-from gallasch@free.de) Received: (qmail 70260 invoked from network); 22 Apr 2009 12:15:40 +0200 Received: from smtp.free.de (HELO orwell.free.de) (gallasch@free.de@[91.204.4.103]) (envelope-sender ) by smtp.free.de (qmail-ldap-1.03) with AES256-SHA encrypted SMTP for ; 22 Apr 2009 12:15:40 +0200 Message-ID: 
<49EEEE4C.1030601@free.de> Date: Wed, 22 Apr 2009 12:15:40 +0200 From: Kai Gallasch User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302) MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <49EE49D8.7000902@free.de> In-Reply-To: X-Enigmail-Version: 0.95.7 OpenPGP: id=1254A186; url=http://home.free.de/kai/1254A186.asc Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: Re: FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 10:15:43 -0000 Ivan Voras wrote: > Kai Gallasch wrote: >> Hi. >> >> Today I had a kernel panic on my server running FreeBSD 7.2-RC1 (amd64), >> Opteron, 4 Cores, 16GB RAM, when benchmarking a raidz1 pool with >> bonnie++ benchmark. > > Just for general information - how many drives are in the pool / how > fast are the drives? raidz1 with 4 x Compaq 147GB, 10K RPM, SCSI-3. This is how the drives show up in dmesg. They are on their own SCSI bus, connected to an mpt HBA. 
da2 at mpt0 bus 0 target 2 lun 0 da2: Fixed Direct Access SCSI-3 device da2: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit) da2: Command Queueing Enabled da2: 140014MB (286749488 512 byte sectors: 255H 63S/T 17849C) da3 at mpt0 bus 0 target 3 lun 0 da3: Fixed Direct Access SCSI-3 device da3: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit) da3: Command Queueing Enabled da3: 140014MB (286749488 512 byte sectors: 255H 63S/T 17849C) da4 at mpt0 bus 0 target 4 lun 0 da4: Fixed Direct Access SCSI-3 device da4: 320.000MB/s transfers (160.000MHz DT, offset 63, 16bit) da4: Command Queueing Enabled da4: 140014MB (286749488 512 byte sectors: 255H 63S/T 17849C) da5 at mpt0 bus 0 target 5 lun 0 da5: Fixed Direct Access SCSI-3 device da5: 320.000MB/s transfers (160.000MHz DT, offset 127, 16bit) da5: Command Queueing Enabled da5: 140014MB (286749488 512 byte sectors: 255H 63S/T 17849C) From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 10:22:59 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48C79106566C for ; Wed, 22 Apr 2009 10:22:59 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id DC7C88FC0C for ; Wed, 22 Apr 2009 10:22:58 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c122-107-120-227.carlnfd1.nsw.optusnet.com.au (c122-107-120-227.carlnfd1.nsw.optusnet.com.au [122.107.120.227]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id n3MAMs0w024718 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 22 Apr 2009 20:22:56 +1000 Date: Wed, 22 Apr 2009 20:22:54 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: "litepc.com" In-Reply-To: <49EF5C55.7178.1CEB83C@sb345.litepc.com> Message-ID: <20090422190944.K59813@delplex.bde.org> References: 
<49EF5C55.7178.1CEB83C@sb345.litepc.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org Subject: Re: Clarrification on fs block size X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 10:22:59 -0000 On Wed, 22 Apr 2009, litepc.com wrote: > I'm trying to track down files that are using bad disk blocks as > reported by SMART drive tests > > I'm struggling to identify which inodes are using which disk sectors > because the various utilities appear to define "blocks" differently. > > In the context of smartctl, fdisk, and bsdlabel a "disk block" is a > 512 byte sector > > In the context of UFS file system a "file system block" is 16384 > bytes and a "fragment" is 2048 bytes Actually, ffs has 2 types of blocks, "logical blocks" of configurable size (default 16384) and ordinary "blocks" ("fragments") of configurable size (default 2048). Logical blocks are used mainly within files and ordinary blocks are used in most other contexts, in particular for all block numbers in metadata. Block numbers in metadata need to have the smaller units so that they can address fragments. > So to my mind this means there are 32 x 512byte blocks in each 16384 > byte file system block. > > However... > > dumpfs reports "fsbtodb 2" which means a disk block = file system > block * 2^2 so there are 4 disk blocks in each file system block - > this is verified using the fsdb "blocks" command to list block > numbers assigned to an inode...which then must be multiplied by 4 to > use the fsdb "findblk" command to find the correct inode. 4 is the conversion factor between ordinary ffs blocks of size 2048 and virtual disk blocks of size 512 (actual disk blocks may have a different size, though 512 is normal, perhaps due to virtualization in the disk itself). 
> Which seems to indicate that a "file system block" to dumpfs and fsdb > must be equivalent to a 2048 byte "fragment". Is this correct? Yes. > What is confusing is that if dumpfs reports "bsize" as 16384 then the > "b" in "bsize" and "b" in "fsbtodb" appear to be different "block" > definitions. It's confusing in ffs sources too. > I want to be sure that I can take the identified corrupt LBA address > in smartctl, then locate the correct file system and adjusted offset > using bsdlabel and then plug this block number straight into fsdb's > "findblk" command to identify which inode owns the corrupted block. > If fsdb's findblk is expecting some other definition of "disk block" > then it's not going to locate the correct inode! "findblk" seems to convert from and to virtual disk block units, so you don't need to know anything about either of ffs's block sizes. This is a strange interface since its blocks have different units from the ordinary block numbers printed by the "blocks" command. "findblk" seems to be the only command in fsdb that does these conversions. 
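The unit conversions discussed above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a tool from the thread: the function names, the example LBA and the partition offset are hypothetical, and it assumes 512-byte disk blocks, fsbtodb = 2, and that the offset passed in is the full distance (in sectors) from the start of the disk to the start of the file system, as read from fdisk/bsdlabel.

```python
SECTOR_SIZE = 512  # bytes per "disk block", as used by smartctl/fdisk/bsdlabel


def lba_to_partition_sector(lba, partition_offset):
    """Turn an absolute LBA into a sector number relative to the file system.

    partition_offset is the (assumed) start of the file system on the disk,
    in 512-byte sectors.
    """
    if lba < partition_offset:
        raise ValueError("LBA lies before the start of the partition")
    return lba - partition_offset


def sector_to_fragment(sector, fsbtodb):
    """Disk block -> ffs fragment number (the unit fsdb's "blocks" prints)."""
    return sector >> fsbtodb


def fragment_to_sector(fragment, fsbtodb):
    """ffs fragment number -> first disk block of that fragment."""
    return fragment << fsbtodb


def sectors_per_unit(size_bytes):
    """How many 512-byte disk blocks fit in a fragment or logical block."""
    return size_bytes // SECTOR_SIZE


if __name__ == "__main__":
    fsbtodb = 2                      # from dumpfs, as quoted in the thread
    bad_lba = 1000063                # hypothetical LBA from a SMART test
    offset = 63                      # hypothetical partition start
    sector = lba_to_partition_sector(bad_lba, offset)
    print("partition-relative sector:", sector)
    print("fragment number:", sector_to_fragment(sector, fsbtodb))
    print("sectors per 2048-byte fragment:", sectors_per_unit(2048))
    print("sectors per 16384-byte logical block:", sectors_per_unit(16384))
```

If "findblk" really does take disk block units as described, the partition-relative sector is the number to hand to it, while the fragment number is what should match the output of "blocks".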
Bruce From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 10:30:23 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0501F1065670 for ; Wed, 22 Apr 2009 10:30:23 +0000 (UTC) (envelope-from gary.jennejohn@freenet.de) Received: from mout2.freenet.de (mout2.freenet.de [IPv6:2001:748:100:40::2:4]) by mx1.freebsd.org (Postfix) with ESMTP id 94DD48FC0A for ; Wed, 22 Apr 2009 10:30:22 +0000 (UTC) (envelope-from gary.jennejohn@freenet.de) Received: from [195.4.92.18] (helo=8.mx.freenet.de) by mout2.freenet.de with esmtpa (ID gary.jennejohn@freenet.de) (port 25) (Exim 4.69 #88) id 1LwZif-0002Zy-90; Wed, 22 Apr 2009 12:30:21 +0200 Received: from tb821.t.pppool.de ([89.55.184.33]:33573 helo=ernst.jennejohn.org) by 8.mx.freenet.de with esmtpa (ID gary.jennejohn@freenet.de) (port 25) (Exim 4.69 #79) id 1LwZie-0005QL-Tt; Wed, 22 Apr 2009 12:30:21 +0200 Date: Wed, 22 Apr 2009 12:30:20 +0200 From: Gary Jennejohn To: Kai Gallasch Message-ID: <20090422123020.42b756c1@ernst.jennejohn.org> In-Reply-To: <49EE49D8.7000902@free.de> References: <49EE49D8.7000902@free.de> X-Mailer: Claws Mail 3.7.1 (GTK+ 2.14.7; amd64-portbld-freebsd8.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: gary.jennejohn@freenet.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 10:30:23 -0000 On Wed, 22 Apr 2009 00:34:00 +0200 Kai Gallasch wrote: [snip a lot of stuff] > In /usr/src/UPDATING I read: > > [..] > > 20090207: > ZFS users on amd64 machines with 4GB or more of RAM should > reevaluate their need for setting vm.kmem_size_max and > vm.kmem_size manually. 
In fact, after recent changes to the > kernel, the default value of vm.kmem_size is larger than the > suggested manual setting in most ZFS/FreeBSD tuning guides. > > So I understood this as "vm.kmem_size is set unnecessary large by > default. You should think about decreasing it to save some RAM" > > On my amd64 server the default values of kmem_size are > > vm.kmem_size_scale: 3 > vm.kmem_size_max: 3865468109 > vm.kmem_size_min: 0 > vm.kmem_size: 1201446912 > > Can someone give me a hint how to debug this problem further, or how to > find some reasonable values for setting vm.kmem_size_max and > vm.kmem_size with 16G of RAM? > Hmm, I wonder whether this applies to 7.2-RC1. I don't know whether the kernel changes have been committed to 7.2 or whether they were already present when we started work on 7.2 because I haven't been paying much attention. On my 8-current amd64 machine with only 4GB of RAM I see larger values than you see with 16GB: sysctl vm.kmem_size_max vm.kmem_size_max: 4509713203 sysctl vm.kmem_size vm.kmem_size: 1335824384 --- Gary Jennejohn From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 12:16:21 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5D0EF106567D for ; Wed, 22 Apr 2009 12:16:21 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id DE33A8FC2F for ; Wed, 22 Apr 2009 12:16:20 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1LwbN8-0003aa-7p for freebsd-fs@freebsd.org; Wed, 22 Apr 2009 12:16:14 +0000 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 22 Apr 2009 12:16:14 +0000 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Wed, 22 
Apr 2009 12:16:14 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Wed, 22 Apr 2009 14:16:06 +0200 Lines: 72 Message-ID: References: <49EE49D8.7000902@free.de> <20090422123020.42b756c1@ernst.jennejohn.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig07E2043828666B905F83E9AB" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 2.0.0.21 (X11/20090318) In-Reply-To: <20090422123020.42b756c1@ernst.jennejohn.org> X-Enigmail-Version: 0.95.0 Sender: news Subject: Re: FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 12:16:21 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig07E2043828666B905F83E9AB Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Gary Jennejohn wrote: > On Wed, 22 Apr 2009 00:34:00 +0200 > Kai Gallasch wrote: > > [snip a lot of stuff] >> In /usr/src/UPDATING I read: >> >> [..] >> >> 20090207: >> ZFS users on amd64 machines with 4GB or more of RAM should >> reevaluate their need for setting vm.kmem_size_max and >> vm.kmem_size manually. In fact, after recent changes to the >> kernel, the default value of vm.kmem_size is larger than the >> suggested manual setting in most ZFS/FreeBSD tuning guides. >> >> So I understood this as "vm.kmem_size is set unnecessary large by >> default. 
You should think about decreasing it to save some RAM" >> >> On my amd64 server the default values of kmem_size are >> >> vm.kmem_size_scale: 3 >> vm.kmem_size_max: 3865468109 >> vm.kmem_size_min: 0 >> vm.kmem_size: 1201446912 >> >> Can someone give me a hint how to debug this problem further, or how to >> find some reasonable values for setting vm.kmem_size_max and >> vm.kmem_size with 16G of RAM? >> > > Hmm, I wonder whether this applies to 7.2-RC1. I don't know whether > the kernel changes have been committed to 7.2 or whether they were > already present when we started work on 7.2 because I haven't been > paying much attention. 7.2 was branched last Friday - quick browsing of commit messages doesn't find any relevant new development between Friday and now. > On my 8-current amd64 machine with only 4GB of RAM I see larger values > than you see with 16GB: > > sysctl vm.kmem_size_max > vm.kmem_size_max: 4509713203 > sysctl vm.kmem_size > vm.kmem_size: 1335824384 Ok, but remember that ZFS in -CURRENT is very different from ZFS in -STABLE. 
--------------enig07E2043828666B905F83E9AB Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFJ7wqGldnAQVacBcgRAijhAKD3ET5uLbJ1UXaWHeohoWYa6V4WOgCgu2pS ecpjDMvuSi5Z47w2v15d34o= =2XQI -----END PGP SIGNATURE----- --------------enig07E2043828666B905F83E9AB-- From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 13:06:16 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5B8FA1065670; Wed, 22 Apr 2009 13:06:16 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 116228FC1B; Wed, 22 Apr 2009 13:06:14 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA16460; Wed, 22 Apr 2009 16:06:13 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <49EF1645.70704@icyb.net.ua> Date: Wed, 22 Apr 2009 16:06:13 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.21 (X11/20090406) MIME-Version: 1.0 To: Ivan Voras References: <49EDCA21.70908@icyb.net.ua> <49EDF80F.3070105@icyb.net.ua> In-Reply-To: X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 13:06:16 -0000 on 21/04/2009 21:43 Ivan Voras said the following: > Andriy Gapon wrote: >> I don't see why it should and - no, it actually does not. >> fsck checks only filesystem's internal consistency, it doesn't check media size, etc. > > Well yes, if the number of blocks is really incorrect it should be > visible from the arrangement of the metadata but still - that makes the > field almost useless doesn't it? How do you mean? The field tells the filesystem size, how it can be useless? -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 13:11:05 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BDAFC1065670; Wed, 22 Apr 2009 13:11:05 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 8BA228FC08; Wed, 22 Apr 2009 13:11:04 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA16562; Wed, 22 Apr 2009 16:11:02 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <49EF1766.7030401@icyb.net.ua> Date: Wed, 22 Apr 2009 16:11:02 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.21 (X11/20090406) MIME-Version: 1.0 To: Ivan Voras References: <49EDCA21.70908@icyb.net.ua> <49EDF80F.3070105@icyb.net.ua> <49EF1645.70704@icyb.net.ua> <9bbcef730904220608y73cbf2d2s6921b05c1978a121@mail.gmail.com> In-Reply-To: <9bbcef730904220608y73cbf2d2s6921b05c1978a121@mail.gmail.com> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: 
freebsd-fs@freebsd.org, Andriy Gapon , freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 13:11:06 -0000 on 22/04/2009 16:08 Ivan Voras said the following: > 2009/4/22 Andriy Gapon : >> on 21/04/2009 21:43 Ivan Voras said the following: >>> Andriy Gapon wrote: >>>> I don't see why it should and - no, it actually does not. >>>> fsck checks only filesystem's internal consistency, it doesn't check media size, etc. >>> Well yes, if the number of blocks is really incorrect it should be >>> visible from the arrangement of the metadata but still - that makes the >>> field almost useless doesn't it? >> How do you mean? >> The field tells the filesystem size, how it can be useless? > > If nothing checks it and everything works, I'd say it's usefulness is > a bit limited... ufs driver doesn't check it, the driver *uses* it, so... 
:-) -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 13:36:35 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EA57F106564A; Wed, 22 Apr 2009 13:36:35 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id AC47B8FC16; Wed, 22 Apr 2009 13:36:34 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA17407; Wed, 22 Apr 2009 16:36:32 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <49EF1D5F.7050907@icyb.net.ua> Date: Wed, 22 Apr 2009 16:36:31 +0300 From: Andriy Gapon User-Agent: Thunderbird 2.0.0.21 (X11/20090406) MIME-Version: 1.0 To: Ivan Voras References: <49EDCA21.70908@icyb.net.ua> <49EDF80F.3070105@icyb.net.ua> <49EF1645.70704@icyb.net.ua> <9bbcef730904220608y73cbf2d2s6921b05c1978a121@mail.gmail.com> <49EF1766.7030401@icyb.net.ua> <9bbcef730904220612s3ff4308fpc1d18e216a5c7773@mail.gmail.com> In-Reply-To: <9bbcef730904220612s3ff4308fpc1d18e216a5c7773@mail.gmail.com> X-Enigmail-Version: 0.95.7 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 13:36:36 -0000 on 22/04/2009 16:12 Ivan Voras said the following: > 2009/4/22 Andriy Gapon : >> on 22/04/2009 16:08 Ivan Voras said the following: >>> 2009/4/22 Andriy Gapon : >>>> on 21/04/2009 21:43 Ivan Voras said the following: >>>>> Andriy Gapon wrote: >>>>>> I don't see why it should and - no, it actually does not. >>>>>> fsck checks only filesystem's internal consistency, it doesn't check media size, etc. >>>>> Well yes, if the number of blocks is really incorrect it should be >>>>> visible from the arrangement of the metadata but still - that makes the >>>>> field almost useless doesn't it? >>>> How do you mean? >>>> The field tells the filesystem size, how it can be useless? >>> If nothing checks it and everything works, I'd say it's usefulness is >>> a bit limited... >> ufs driver doesn't check it, the driver *uses* it, so... :-) > > But as you said, fsck will not fix an invalid value? It won't, because it can not know the correct value and it is probably not able to safely derive it from anything. Filesystem size is supposed to always stay immutable (modulo growfs), so if this type of corruption happens to superblock, then one has quite a big problem and possibly some fun time with disk editor. 
-- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 13:37:24 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A36E2106566B for ; Wed, 22 Apr 2009 13:37:24 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-ew0-f171.google.com (mail-ew0-f171.google.com [209.85.219.171]) by mx1.freebsd.org (Postfix) with ESMTP id 31BCA8FC1B for ; Wed, 22 Apr 2009 13:37:23 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: by ewy19 with SMTP id 19so2658008ewy.43 for ; Wed, 22 Apr 2009 06:37:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:from:date:x-google-sender-auth:message-id:subject:to:cc :content-type:content-transfer-encoding; bh=N8YqfnNXQ97DJ/eZqjuYnqYae7zadwZMBZBaIJVgDzU=; b=AgyU8CCrw09IveH9A2n+fFS971L8IhigCzc7airWFn3yI045Qa8qXGJfOR9Jc/Wk16 54VepV+gk4kz+FSu76V79XNE82D6ajqvNkG/IsW10moNmBRRU4IbfnrOUvNBQu1L+jTX r0DIMBXo56aai+g7UkfOWgr4nKCs89PfYSwfA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; b=LU/vaE+QbhHv7LYq+MxgGWbRB6womKtqLxX8jj9LTaIAOQXbA0AnyG+QRo+vSTLu7y 9Jw50lupF7+kfTDfn2rOi1ZESRQCfQX+XOdT8rQaNTqZK71OoximEGcXxPzH9NZiDj5O Decc70Tq4Z870JClo/yKCva/OQiGhWDO8dcwU= MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.210.13.9 with SMTP id 9mr7360995ebm.88.1240405728103; Wed, 22 Apr 2009 06:08:48 -0700 (PDT) In-Reply-To: <49EF1645.70704@icyb.net.ua> References: <49EDCA21.70908@icyb.net.ua> <49EDF80F.3070105@icyb.net.ua> <49EF1645.70704@icyb.net.ua> From: Ivan Voras Date: Wed, 22 Apr 2009 15:08:33 +0200 X-Google-Sender-Auth: 514ee885efb0dc06 Message-ID: <9bbcef730904220608y73cbf2d2s6921b05c1978a121@mail.gmail.com> To: Andriy Gapon Content-Type: 
text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 13:37:24 -0000 2009/4/22 Andriy Gapon : > on 21/04/2009 21:43 Ivan Voras said the following: >> Andriy Gapon wrote: >>> I don't see why it should and - no, it actually does not. >>> fsck checks only filesystem's internal consistency, it doesn't check media size, etc. >> >> Well yes, if the number of blocks is really incorrect it should be >> visible from the arrangement of the metadata but still - that makes the >> field almost useless doesn't it? > > How do you mean? > The field tells the filesystem size, how it can be useless? If nothing checks it and everything works, I'd say it's usefulness is a bit limited... From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 13:42:07 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E3A611065670 for ; Wed, 22 Apr 2009 13:42:07 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from ey-out-2122.google.com (ey-out-2122.google.com [74.125.78.24]) by mx1.freebsd.org (Postfix) with ESMTP id 70BA68FC0C for ; Wed, 22 Apr 2009 13:42:07 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: by ey-out-2122.google.com with SMTP id 9so303319eyd.7 for ; Wed, 22 Apr 2009 06:42:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:from:date:x-google-sender-auth:message-id:subject:to:cc :content-type:content-transfer-encoding; bh=xpPRKSITVOHFx5Gg9bDt58FNZRWGTKOTE3c8klchZL8=; b=A6ulE6GMtWGhxHivqBd6ESsSjqP2zFIFwcOgw2vrm7jyeDXgWhSXxFJOjKWvxsfWiu 
meB6XS7bag9bduS3QQ1qzCC9TkLxAqRKF72fldMa8gDJ1ZTpwNKVF271FYmp6Xc87vIi K5YbVRiAJZ8DLDcByVFEClbL/Mp4A9cGbL+Wk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; b=HvzJpXRLMNNX4I4ToswzW+aRUPFYDZZH7edtZnrl0aTtvqQ2xewG6mq0/E3ZbXB1HS ht5lvKUFBDKSz3hWZZEGHOxeFGHGIED5TWdexwiYLWAhH5ygRjYa/JT7co0Syop8cep1 at45R+wCO1iJgfT0NHZsFT2v7/4w3dXDitGWc= MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.210.41.14 with SMTP id o14mr6284413ebo.8.1240405958172; Wed, 22 Apr 2009 06:12:38 -0700 (PDT) In-Reply-To: <49EF1766.7030401@icyb.net.ua> References: <49EDCA21.70908@icyb.net.ua> <49EDF80F.3070105@icyb.net.ua> <49EF1645.70704@icyb.net.ua> <9bbcef730904220608y73cbf2d2s6921b05c1978a121@mail.gmail.com> <49EF1766.7030401@icyb.net.ua> From: Ivan Voras Date: Wed, 22 Apr 2009 15:12:23 +0200 X-Google-Sender-Auth: 577018fd55468fc8 Message-ID: <9bbcef730904220612s3ff4308fpc1d18e216a5c7773@mail.gmail.com> To: Andriy Gapon Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Subject: Re: glabel for ufs: size check is overzealous? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 13:42:08 -0000 2009/4/22 Andriy Gapon : > on 22/04/2009 16:08 Ivan Voras said the following: >> 2009/4/22 Andriy Gapon : >>> on 21/04/2009 21:43 Ivan Voras said the following: >>>> Andriy Gapon wrote: >>>>> I don't see why it should and - no, it actually does not. >>>>> fsck checks only filesystem's internal consistency, it doesn't check media size, etc. 
>>>> Well yes, if the number of blocks is really incorrect it should be >>>> visible from the arrangement of the metadata but still - that makes the >>>> field almost useless doesn't it? >>> How do you mean? >>> The field tells the filesystem size, how it can be useless? >> >> If nothing checks it and everything works, I'd say it's usefulness is >> a bit limited... > > ufs driver doesn't check it, the driver *uses* it, so... :-) But as you said, fsck will not fix an invalid value? From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 13:56:29 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 89149106566B for ; Wed, 22 Apr 2009 13:56:29 +0000 (UTC) (envelope-from gary.jennejohn@freenet.de) Received: from mout2.freenet.de (mout2.freenet.de [IPv6:2001:748:100:40::2:4]) by mx1.freebsd.org (Postfix) with ESMTP id 2388E8FC12 for ; Wed, 22 Apr 2009 13:56:29 +0000 (UTC) (envelope-from gary.jennejohn@freenet.de) Received: from [195.4.92.23] (helo=13.mx.freenet.de) by mout2.freenet.de with esmtpa (ID gary.jennejohn@freenet.de) (port 25) (Exim 4.69 #88) id 1Lwcw8-0007r0-2D for freebsd-fs@freebsd.org; Wed, 22 Apr 2009 15:56:28 +0200 Received: from tb821.t.pppool.de ([89.55.184.33]:30960 helo=ernst.jennejohn.org) by 13.mx.freenet.de with esmtpa (ID gary.jennejohn@freenet.de) (port 25) (Exim 4.69 #79) id 1Lwcw7-0008OF-Ql for freebsd-fs@freebsd.org; Wed, 22 Apr 2009 15:56:28 +0200 Date: Wed, 22 Apr 2009 15:56:27 +0200 From: Gary Jennejohn To: freebsd-fs@freebsd.org Message-ID: <20090422155627.7b6e127d@ernst.jennejohn.org> In-Reply-To: References: <49EE49D8.7000902@free.de> <20090422123020.42b756c1@ernst.jennejohn.org> X-Mailer: Claws Mail 3.7.1 (GTK+ 2.14.7; amd64-portbld-freebsd8.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: Re: FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small" X-BeenThere: 
freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: gary.jennejohn@freenet.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 13:56:29 -0000 On Wed, 22 Apr 2009 14:16:06 +0200 Ivan Voras wrote: > Gary Jennejohn wrote: > > On my 8-current amd64 machine with only 4GB of RAM I see larger values > > than you see with 16GB: > > > > sysctl vm.kmem_size_max > > vm.kmem_size_max: 4509713203 > > sysctl vm.kmem_size > > vm.kmem_size: 1335824384 > > Ok, but remember that ZFS in -CURRENT is very different from ZFS in -STABLE. > True, but the kmem_size stuff has nothing to do with ZFS. It's VM. --- Gary Jennejohn From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 14:40:04 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9AD621065711 for ; Wed, 22 Apr 2009 14:40:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6ED558FC1D for ; Wed, 22 Apr 2009 14:40:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3MEe4T5001655 for ; Wed, 22 Apr 2009 14:40:04 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3MEe4ip001654; Wed, 22 Apr 2009 14:40:04 GMT (envelope-from gnats) Date: Wed, 22 Apr 2009 14:40:04 GMT Message-Id: <200904221440.n3MEe4ip001654@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Jaakko Heinonen Cc: Subject: Re: kern/132068: [zfs] page fault when using ZFS over NFS on 7.1-RELEASE/amd64 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Jaakko Heinonen List-Id: Filesystems 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 14:40:05 -0000 The following reply was made to PR kern/132068; it has been noted by GNATS. From: Jaakko Heinonen To: Edward Fisk <7ogcg7g02@sneakemail.com> Cc: bug-followup@FreeBSD.org, Weldon Godfrey Subject: Re: kern/132068: [zfs] page fault when using ZFS over NFS on 7.1-RELEASE/amd64 Date: Wed, 22 Apr 2009 17:38:57 +0300 On 2009-04-10, Jaakko Heinonen wrote: > OK, I have now put together a patch which should avoid the original > panic you reported. Have you had a chance to test the patch? http://www.freebsd.org/cgi/query-pr.cgi?pr=132068 -- Jaakko From owner-freebsd-fs@FreeBSD.ORG Wed Apr 22 14:40:33 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9ED9910657F9 for ; Wed, 22 Apr 2009 14:40:28 +0000 (UTC) (envelope-from mcdouga9@egr.msu.edu) Received: from mx.egr.msu.edu (surfnturf.egr.msu.edu [35.9.37.164]) by mx1.freebsd.org (Postfix) with ESMTP id 8EE818FC15 for ; Wed, 22 Apr 2009 14:40:21 +0000 (UTC) (envelope-from mcdouga9@egr.msu.edu) Received: from localhost (localhost [127.0.0.1]) by mx.egr.msu.edu (Postfix) with ESMTP id 9AA3C71F08C for ; Wed, 22 Apr 2009 10:20:43 -0400 (EDT) X-Virus-Scanned: amavisd-new at egr.msu.edu Received: from mx.egr.msu.edu ([127.0.0.1]) by localhost (surfnturf.egr.msu.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xml+4qgDebpE for ; Wed, 22 Apr 2009 10:20:43 -0400 (EDT) Received: from [35.9.44.65] (daemon.egr.msu.edu [35.9.44.65]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: mcdouga9) by mx.egr.msu.edu (Postfix) with ESMTPSA id 77B2571F06E for ; Wed, 22 Apr 2009 10:20:43 -0400 (EDT) Message-ID: <49EF27BB.1060100@egr.msu.edu> Date: Wed, 22 Apr 2009 10:20:43 -0400 From: Adam McDougall User-Agent: Thunderbird 2.0.0.21 
(X11/20090419) MIME-Version: 1.0 CC: freebsd-fs@freebsd.org References: <49EE49D8.7000902@free.de> <20090422123020.42b756c1@ernst.jennejohn.org> In-Reply-To: <20090422123020.42b756c1@ernst.jennejohn.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: FreeBSD 7.2-RC1 - ZFS related kernel panic "kmem_map too small" X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 22 Apr 2009 14:40:54 -0000 Gary Jennejohn wrote: > On Wed, 22 Apr 2009 00:34:00 +0200 > Kai Gallasch wrote: > > [snip a lot of stuff] > >> In /usr/src/UPDATING I read: >> >> [..] >> >> 20090207: >> ZFS users on amd64 machines with 4GB or more of RAM should >> reevaluate their need for setting vm.kmem_size_max and >> vm.kmem_size manually. In fact, after recent changes to the >> kernel, the default value of vm.kmem_size is larger than the >> suggested manual setting in most ZFS/FreeBSD tuning guides. >> >> So I understood this as "vm.kmem_size is set unnecessary large by >> default. You should think about decreasing it to save some RAM" >> >> On my amd64 server the default values of kmem_size are >> >> vm.kmem_size_scale: 3 >> vm.kmem_size_max: 3865468109 >> vm.kmem_size_min: 0 >> vm.kmem_size: 1201446912 >> >> Can someone give me a hint how to debug this problem further, or how to >> find some reasonable values for setting vm.kmem_size_max and >> vm.kmem_size with 16G of RAM? >> >> > > Hmm, I wonder whether this applies to 7.2-RC1. I don't know whether > the kernel changes have been committed to 7.2 or whether they were > already present when we started work on 7.2 because I haven't been > paying much attention. 
>
> On my 8-current amd64 machine with only 4GB of RAM I see larger values
> than you see with 16GB:
>
> sysctl vm.kmem_size_max
> vm.kmem_size_max: 4509713203
> sysctl vm.kmem_size
> vm.kmem_size: 1335824384
>
> ---
> Gary Jennejohn

In my experience, after the kmem maximums were raised a number of months ago to allow more than approximately 1.6G of kmem, I still had to explicitly raise vm.kmem_size above the default on some systems; otherwise I still got out-of-kmem panics far below the max. I suspect there was pressure for kmem and it was unable to "raise" the limit fast enough, or maybe there was a fragmentation problem?

Additionally, I've found that how high I can set the kmem settings on "recent" builds of 7 and 8 amd64 varies by host. For example, I have one 7.2 system with 4G of RAM where the kernel would panic if I booted with kmem=2G (1G works fine), but I have an 8.0 system with 2G of RAM where kmem=2G works fine. Another 8.0 system has 6G of RAM, but we could only boot it successfully with kmem=3G, not 4G.
From owner-freebsd-fs@FreeBSD.ORG Thu Apr 23 11:14:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E3EE3106564A for ; Thu, 23 Apr 2009 11:14:39 +0000 (UTC) (envelope-from scott@bqinternet.com) Received: from mail.bqinternet.com (mail.bqinternet.com [69.9.32.203]) by mx1.freebsd.org (Postfix) with ESMTP id BB1688FC13 for ; Thu, 23 Apr 2009 11:14:39 +0000 (UTC) (envelope-from scott@bqinternet.com) Received: from localhost (mail [69.9.32.203]) by mail.bqinternet.com (Postfix) with ESMTP id 22E3C409A24 for ; Thu, 23 Apr 2009 10:54:55 +0000 (GMT) Received: from mail.bqinternet.com ([69.9.32.203]) by localhost (mail.bqinternet.com [69.9.32.203]) (amavisd-new, port 10024) with ESMTP id 32x7jVu+XNkg for ; Thu, 23 Apr 2009 10:54:54 +0000 (GMT) Received: from scott-burnss-macbook-air.local (mail [69.9.32.203]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.bqinternet.com (Postfix) with ESMTP id 1C6BD409A23 for ; Thu, 23 Apr 2009 10:54:54 +0000 (GMT) Message-ID: <49F048FB.6000401@bqinternet.com> Date: Thu, 23 Apr 2009 06:54:51 -0400 From: Scott Burns User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302) MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Apr 2009 11:14:40 -0000 Hi guys, I have spent some time writing a kernel module which calculates a checksum of a UFS2 dinode structure and stores it in the reserved space of the inode when writing it to disk. It is then verified when the inode is read from disk. 
If the checksum verification fails, the read returns an error (currently EIO). I believe that protecting metadata integrity is important, especially as storage capacity grows. Bitrot is a fact of life, and bad things can happen if the kernel acts on a corrupted inode. Not only does this module improve the stability of a server, but it also helps to prevent additional damage to the filesystem that can be caused by metadata corruption. I'm aware that data integrity issues are addressed with ZFS, but unfortunately ZFS is still not yet suitable for many workloads. I'm also aware that integrity checking can be done by using GELI between the filesystem and the disk, but at a noticeable cost in performance and space utilization. The method this module uses is fast and does not use any additional space. Most importantly, it builds on mature code that has worked well for decades. Before I spend much more time on it, I have some questions: 1) Has anyone else done any work in this area? 2) Is there a demand for this in FreeBSD? 
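A minimal sketch of the scheme described above, assuming a simplified stand-in structure rather than the real struct ufs2_dinode. The names toy_dinode, dinode_seal and dinode_verify are hypothetical, and CRC-32 is only an assumed choice of checksum; the message does not say which algorithm the module uses.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an on-disk inode. This is NOT the real
 * struct ufs2_dinode; the fields are ordered so there is no padding.
 */
struct toy_dinode {
    uint16_t mode;
    uint16_t nlink;
    uint32_t uid;
    uint32_t gid;
    uint32_t cksum;      /* stored in otherwise reserved space */
    uint64_t size;
    uint64_t blocks[4];
};

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320); an assumed algorithm. */
static uint32_t crc32_buf(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Checksum the inode with the checksum field itself treated as zero. */
static uint32_t dinode_cksum(const struct toy_dinode *dp)
{
    struct toy_dinode tmp = *dp;

    tmp.cksum = 0;
    return crc32_buf(&tmp, sizeof(tmp));
}

/* Write path: fill in the checksum before the inode goes to disk. */
void dinode_seal(struct toy_dinode *dp)
{
    dp->cksum = dinode_cksum(dp);
}

/* Read path: return 0 if the inode is intact, EIO otherwise. */
int dinode_verify(const struct toy_dinode *dp)
{
    return dinode_cksum(dp) == dp->cksum ? 0 : EIO;
}
```

In this sketch the write path would call dinode_seal() just before the inode is written, and the read path would call dinode_verify() and fail the read with EIO on a mismatch, as the message describes.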
-- Scott Burns System Administrator BQ Internet Corporation From owner-freebsd-fs@FreeBSD.ORG Thu Apr 23 12:20:04 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 030A71065673 for ; Thu, 23 Apr 2009 12:20:04 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id AF0CB8FC26 for ; Thu, 23 Apr 2009 12:20:03 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from root by ciao.gmane.org with local (Exim 4.43) id 1LwxuN-0000Xt-4G for freebsd-fs@freebsd.org; Thu, 23 Apr 2009 12:20:03 +0000 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 23 Apr 2009 12:20:03 +0000 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 23 Apr 2009 12:20:03 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Thu, 23 Apr 2009 14:19:47 +0200 Lines: 28 Message-ID: References: <49F048FB.6000401@bqinternet.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig530C126CCC1B3BE297E6AEBA" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 2.0.0.21 (X11/20090318) In-Reply-To: <49F048FB.6000401@bqinternet.com> X-Enigmail-Version: 0.95.0 Sender: news Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Apr 2009 12:20:04 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig530C126CCC1B3BE297E6AEBA Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: quoted-printable Scott Burns wrote: > 2) Is there a demand for this in FreeBSD? Speaking for myself, I'd like it on the systems I maintain. (I'd also like a sysctl to ignore the errors, just in case :) ). --------------enig530C126CCC1B3BE297E6AEBA Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFJ8FzjldnAQVacBcgRAjQXAJ0S0gyJMCuf21A3nqFmCSjZIc/NhgCfaNlf O+V1rmziSqVxbNjkkGvdtnU= =swaT -----END PGP SIGNATURE----- --------------enig530C126CCC1B3BE297E6AEBA-- From owner-freebsd-fs@FreeBSD.ORG Thu Apr 23 13:00:05 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CAFF4106566B for ; Thu, 23 Apr 2009 13:00:05 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id AA47E8FC1D for ; Thu, 23 Apr 2009 13:00:05 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3ND05ok048446 for ; Thu, 23 Apr 2009 13:00:05 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3ND05uq048445; Thu, 23 Apr 2009 13:00:05 GMT (envelope-from gnats) Date: Thu, 23 Apr 2009 13:00:05 GMT Message-Id: <200904231300.n3ND05uq048445@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: "Weldon Godfrey" Cc: Subject: RE: kern/132068: [zfs] page fault when using ZFS over NFS on7.1-RELEASE/amd64 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Weldon Godfrey List-Id: 
Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Apr 2009 13:00:06 -0000 The following reply was made to PR kern/132068; it has been noted by GNATS. From: "Weldon Godfrey" To: "Jaakko Heinonen" , "Edward Fisk" <7ogcg7g02@sneakemail.com> Cc: Subject: RE: kern/132068: [zfs] page fault when using ZFS over NFS on7.1-RELEASE/amd64 Date: Thu, 23 Apr 2009 07:40:10 -0500 Sorry. Around Dec 12 I switched to head. Increasing kmem to 4GB and using NFS v2 reduced the panics to a few times a month. The server is in production. I'll need to acquire some additional drives so I can install the OS on different drives (in case I need to back out) and wait until summer to attempt to upgrade. Weldon -----Original Message----- From: Jaakko Heinonen [mailto:jh@saunalahti.fi] Sent: Wednesday, April 22, 2009 9:39 AM To: Edward Fisk Cc: bug-followup@FreeBSD.org; Weldon Godfrey Subject: Re: kern/132068: [zfs] page fault when using ZFS over NFS on7.1-RELEASE/amd64 On 2009-04-10, Jaakko Heinonen wrote: > OK, I have now put together a patch which should avoid the original > panic you reported. Have you had a chance to test the patch?
http://www.freebsd.org/cgi/query-pr.cgi?pr=132068 -- Jaakko From owner-freebsd-fs@FreeBSD.ORG Thu Apr 23 22:19:14 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6A60A1065677; Thu, 23 Apr 2009 22:19:14 +0000 (UTC) (envelope-from morganw@chemikals.org) Received: from warped.bluecherry.net (unknown [IPv6:2001:440:eeee:fffb::2]) by mx1.freebsd.org (Postfix) with ESMTP id 094CD8FC08; Thu, 23 Apr 2009 22:19:14 +0000 (UTC) (envelope-from morganw@chemikals.org) Received: from volatile.chemikals.org (adsl-67-215-2.shv.bellsouth.net [98.67.215.2]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by warped.bluecherry.net (Postfix) with ESMTPSA id 7034D8006B37; Thu, 23 Apr 2009 17:19:12 -0500 (CDT) Received: from localhost (morganw@localhost [127.0.0.1]) by volatile.chemikals.org (8.14.3/8.14.3) with ESMTP id n3NMItGP026717; Thu, 23 Apr 2009 17:19:07 -0500 (CDT) (envelope-from morganw@chemikals.org) Date: Thu, 23 Apr 2009 17:18:55 -0500 (CDT) From: Wes Morgan To: Ivan Voras In-Reply-To: Message-ID: References: <49F048FB.6000401@bqinternet.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Apr 2009 22:19:14 -0000 On Thu, 23 Apr 2009, Ivan Voras wrote: > Scott Burns wrote: > >> 2) Is there a demand for this in FreeBSD? > > Speaking for myself, I'd like it on the systems I maintain. (I'd also > like a sysctl to ignore the errors, just in case :) ). That's actually something ZFS could use if you ask me.
In one instance I had some bad ram that was causing checksum errors (zfs is better than memtest for finding bad ram!), and I had to comment out the ECHKSUM error from the kernel to recover the pieces of the file that were reported corrupt. From owner-freebsd-fs@FreeBSD.ORG Thu Apr 23 22:28:55 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C44CE1065677 for ; Thu, 23 Apr 2009 22:28:55 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-ew0-f171.google.com (mail-ew0-f171.google.com [209.85.219.171]) by mx1.freebsd.org (Postfix) with ESMTP id 344EB8FC19 for ; Thu, 23 Apr 2009 22:28:54 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: by ewy19 with SMTP id 19so754417ewy.43 for ; Thu, 23 Apr 2009 15:28:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:from:date:x-google-sender-auth:message-id:subject:to:cc :content-type:content-transfer-encoding; bh=d4mZ20WuIaxUHoDqIJG/fCq3cFQYG3Xu2K/vO3YmhLk=; b=QyTARrWXlv+NIWcw9rXPQrPUMxfJgAhNxMlH17BWYbDU1DSrfsT344mtYXJaBkBAId jw/SMLouClCNJAKFgO6aAOevZRSBngfBt7h/sko4Wjzoqx+62vIKIJpxC+YiOTXshADc tQWlRgYFhWGs8WZwWEN/bjqxzvM+QNKBSN16c= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; b=CoB3RZldysF3nSO7YRnqNS3daylieIU+zNKjRPglLuapsNvaz2D+TTJIbcCEpdDUHK ck1A0m+eZkpf2n4u//NzXx60B59QZ7NWwyOCYoYHcATFeZcXldPYUC+yJPfK3RIxi12m HkLMUD6s0sc6uo8ob4++HtK+W/wJ5kXdAyOeA= MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.210.57.12 with SMTP id f12mr1563337eba.41.1240525734083; Thu, 23 Apr 2009 15:28:54 -0700 (PDT) In-Reply-To: References: <49F048FB.6000401@bqinternet.com> From: Ivan Voras Date: Fri, 24 Apr 2009 00:28:39 +0200 
X-Google-Sender-Auth: bc00eeb4b7c54e91 Message-ID: <9bbcef730904231528v6badb9d1u27d89fb0e1cb1cb9@mail.gmail.com> To: Wes Morgan Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 23 Apr 2009 22:28:56 -0000 2009/4/24 Wes Morgan : > On Thu, 23 Apr 2009, Ivan Voras wrote: > >> Scott Burns wrote: >> >>> 2) Is there a demand for this in FreeBSD? >> >> Speaking for myself, I'd like it on the systems I maintain. (I'd also >> like a sysctl to ignore the errors, just in case :) ). > > That's actually something ZFS could use if you ask me. In one instance I had > some bad ram that was causing checksum errors (zfs is better than memtest > for finding bad ram!), and I had to comment out the ECHKSUM error from the > kernel to recover the pieces of the file that were reported corrupt. 
Yes, this is my inspiration :) From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 00:16:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4A2631065678 for ; Fri, 24 Apr 2009 00:16:39 +0000 (UTC) (envelope-from kabaev@gmail.com) Received: from an-out-0708.google.com (an-out-0708.google.com [209.85.132.244]) by mx1.freebsd.org (Postfix) with ESMTP id F00478FC12 for ; Fri, 24 Apr 2009 00:16:38 +0000 (UTC) (envelope-from kabaev@gmail.com) Received: by an-out-0708.google.com with SMTP id c3so505885ana.13 for ; Thu, 23 Apr 2009 17:16:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:in-reply-to:references:x-mailer:mime-version :content-type; bh=nzMxPN5X4vtkXBUE5WJRU1JYzmuQbZh4r4mDVTgMtQU=; b=b2CJb+/R+TuJiNZ/Vh5IWk8VsKcH2TK+Q/+lhVRw0n2OCXT2I2hkm6TEZOoQmZ/gV8 9sh7VLWuw2e13GREVATDV4WlKIDU3tkErmJjXIsTbmqdG+G74zY7TMLp2UOK5uqLdHdm VSeW6fNgYzaFRHXxkNHRKGRjTc/gMqVQkapxo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:in-reply-to:references:x-mailer :mime-version:content-type; b=KIm+O7EO+G70o74FscFan0UiL1whhgAzW54FVFHzJVKM7Vdcxl8JcNxz6hr5NfbMba VEHr5kttcmzLguOUFi18tKor4UvF6T9r99jwZ9awPfyS53WXisbFEWbaSTktcHesTqAX J7kIn8A1L6/E/YYNAXEvlYHQyiEapnSllaO8g= Received: by 10.100.43.10 with SMTP id q10mr2209867anq.113.1240530825575; Thu, 23 Apr 2009 16:53:45 -0700 (PDT) Received: from kan.dnsalias.net (c-98-217-224-113.hsd1.ma.comcast.net [98.217.224.113]) by mx.google.com with ESMTPS id c9sm1454514ana.19.2009.04.23.16.53.42 (version=SSLv3 cipher=RC4-MD5); Thu, 23 Apr 2009 16:53:43 -0700 (PDT) Date: Thu, 23 Apr 2009 19:53:35 -0400 From: Alexander Kabaev To: Scott Burns Message-ID: <20090423195335.521db0a7@kan.dnsalias.net> In-Reply-To: <49F048FB.6000401@bqinternet.com> References: 
<49F048FB.6000401@bqinternet.com> X-Mailer: Claws Mail 3.7.1 (GTK+ 2.14.7; i386-portbld-freebsd8.0) Mime-Version: 1.0 Content-Type: multipart/signed; boundary="Sig_/vUEWaOp4GWXxjRiu=mqaDuN"; protocol="application/pgp-signature"; micalg=PGP-SHA1 Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 00:16:39 -0000 --Sig_/vUEWaOp4GWXxjRiu=mqaDuN Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable On Thu, 23 Apr 2009 06:54:51 -0400 Scott Burns wrote: > Hi guys, > > I have spent some time writing a kernel module which calculates a > checksum of a UFS2 dinode structure and stores it in the reserved > space of the inode when writing it to disk. It is then verified when > the inode is read from disk. If the checksum verification fails, the > read returns an error (currently EIO). > > I believe that protecting metadata integrity is important, especially > as storage capacity grows. Bitrot is a fact of life, and bad things > can happen if the kernel acts on a corrupted inode. Not only does > this module improve the stability of a server, but it also helps to > prevent additional damage to the filesystem that can be caused by > metadata corruption. > > I'm aware that data integrity issues are addressed with ZFS, but > unfortunately ZFS is still not yet suitable for many workloads. I'm > also aware that integrity checking can be done by using GELI between > the filesystem and the disk, but at a noticeable cost in performance > and space utilization. The method this module uses is fast and does > not use any additional space. Most importantly, it builds on mature > code that has worked well for decades.
> > Before I spend much more time on it, I have some questions: > > 1) Has anyone else done any work in this area? > > 2) Is there a demand for this in FreeBSD? > This is actually something I would love to have in the base system, but inodes are not the only structures that need integrity protection. Pretty much every other metadata block, from cylinder group blocks to indirect blocks for files, needs similar protection for this to be of real use. -- Alexander Kabaev --Sig_/vUEWaOp4GWXxjRiu=mqaDuN Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.11 (FreeBSD) iD8DBQFJ8P+EQ6z1jMm+XZYRAlf7AKDsiq2qamcMl6ZoRrBMM+by6xf3tACffWL3 wU6B/Po61UtBOiAZ3NSQfF0= =bZaN -----END PGP SIGNATURE----- --Sig_/vUEWaOp4GWXxjRiu=mqaDuN-- From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 00:23:13 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 190F21065670 for ; Fri, 24 Apr 2009 00:23:13 +0000 (UTC) (envelope-from andrew@modulus.org) Received: from email.octopus.com.au (email.octopus.com.au [122.100.2.232]) by mx1.freebsd.org (Postfix) with ESMTP id CD6878FC17 for ; Fri, 24 Apr 2009 00:23:12 +0000 (UTC) (envelope-from andrew@modulus.org) Received: by email.octopus.com.au (Postfix, from userid 1002) id ADACC17E80; Fri, 24 Apr 2009 10:23:26 +1000 (EST) X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on email.octopus.com.au X-Spam-Level: X-Spam-Status: No, score=-1.4 required=10.0 tests=ALL_TRUSTED autolearn=failed version=3.2.3 Received: from [10.1.50.60] (ppp121-44-5-163.lns10.syd7.internode.on.net [121.44.5.163]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: admin@email.octopus.com.au) by email.octopus.com.au (Postfix) with ESMTP id 7D1ED17E3D; Fri, 24 Apr 2009
10:23:22 +1000 (EST) Message-ID: <49F10660.201@modulus.org> Date: Fri, 24 Apr 2009 10:22:56 +1000 From: Andrew Snow User-Agent: Thunderbird 2.0.0.14 (X11/20080523) MIME-Version: 1.0 To: Alexander Kabaev References: <49F048FB.6000401@bqinternet.com> <20090423195335.521db0a7@kan.dnsalias.net> In-Reply-To: <20090423195335.521db0a7@kan.dnsalias.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 00:23:13 -0000 Ideally you would implement complete disk checksumming as a GEOM device. Then you could layer geom_mirror on top of it, so that if the checksum fails and returns EIO, geom_mirror can try the alternate device and rebuild the one with the bad checksums. That will then complete the feature set implemented by ZFS, but for any filesystem on top of GEOM. 
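The layering suggested above can be sketched as follows. This is a toy in-memory model only, not GEOM code: toy_disk, disk_write, disk_read and mirror_read are hypothetical names, the sector size is arbitrary, and the Adler-32 style sum is an assumed choice of checksum.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 16   /* arbitrary toy value */
#define NSECTORS    4

/* Hypothetical checksumming layer: data plus one stored sum per sector. */
struct toy_disk {
    uint8_t  data[NSECTORS][SECTOR_SIZE];
    uint32_t sum[NSECTORS];
};

/* Adler-32 style checksum; an assumed algorithm for illustration. */
static uint32_t sector_sum(const uint8_t *buf, size_t len)
{
    uint32_t a = 1, b = 0;

    for (size_t i = 0; i < len; i++) {
        a = (a + buf[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

/* Store a sector together with its checksum. */
void disk_write(struct toy_disk *d, unsigned s, const uint8_t *buf)
{
    memcpy(d->data[s], buf, SECTOR_SIZE);
    d->sum[s] = sector_sum(buf, SECTOR_SIZE);
}

/* Return 0 and fill buf if the sector verifies, EIO otherwise. */
int disk_read(const struct toy_disk *d, unsigned s, uint8_t *buf)
{
    if (sector_sum(d->data[s], SECTOR_SIZE) != d->sum[s])
        return EIO;
    memcpy(buf, d->data[s], SECTOR_SIZE);
    return 0;
}

/*
 * Mirror layer: try each copy in order. When a good copy is found,
 * rewrite the copies that failed before it, then return the data.
 * Copies after the first good one are not checked.
 */
int mirror_read(struct toy_disk *disks, unsigned ncopies, unsigned s,
    uint8_t *buf)
{
    for (unsigned i = 0; i < ncopies; i++) {
        if (disk_read(&disks[i], s, buf) != 0)
            continue;
        for (unsigned j = 0; j < i; j++)
            disk_write(&disks[j], s, buf);
        return 0;
    }
    return EIO;          /* every copy failed verification */
}
```

The point of the split is that the checksumming layer only has to report EIO; the retry and rebuild policy lives entirely in the mirror layer, so it would work for any filesystem placed on top.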
- Andrew From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 06:45:32 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 85861106564A for ; Fri, 24 Apr 2009 06:45:32 +0000 (UTC) (envelope-from scott@bqinternet.com) Received: from mail.bqinternet.com (mail.bqinternet.com [69.9.32.203]) by mx1.freebsd.org (Postfix) with ESMTP id 5BDDC8FC24 for ; Fri, 24 Apr 2009 06:45:32 +0000 (UTC) (envelope-from scott@bqinternet.com) Received: from localhost (mail [69.9.32.203]) by mail.bqinternet.com (Postfix) with ESMTP id 71249409A04; Fri, 24 Apr 2009 06:45:32 +0000 (GMT) Received: from mail.bqinternet.com ([69.9.32.203]) by localhost (mail.bqinternet.com [69.9.32.203]) (amavisd-new, port 10024) with ESMTP id Vz7gJ+9uzqgO; Fri, 24 Apr 2009 06:45:31 +0000 (GMT) Received: from scott-burnss-macbook-air.local (mail [69.9.32.203]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.bqinternet.com (Postfix) with ESMTP id 1F54A409A61; Fri, 24 Apr 2009 06:45:31 +0000 (GMT) Message-ID: <49F16009.3080206@bqinternet.com> Date: Fri, 24 Apr 2009 02:45:29 -0400 From: Scott Burns User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302) MIME-Version: 1.0 To: Alexander Kabaev References: <49F048FB.6000401@bqinternet.com> <20090423195335.521db0a7@kan.dnsalias.net> In-Reply-To: <20090423195335.521db0a7@kan.dnsalias.net> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 06:45:32 -0000 Alexander Kabaev wrote: > On Thu, 23 Apr 2009 06:54:51 -0400 > Scott Burns wrote: > >> Hi guys, >> >> I have spent some time writing a kernel 
module which calculates a >> checksum of a UFS2 dinode structure and stores it in the reserved >> space of the inode when writing it to disk. It is then verified when >> the inode is read from disk. If the checksum verification fails, the >> read returns an error (currently EIO). >> >> I believe that protecting metadata integrity is important, especially >> as storage capacity grows. Bitrot is a fact of life, and bad things >> can happen if the kernel acts on a corrupted inode. Not only does >> this module improve the stability of a server, but it also helps to >> prevent additional damage to the filesystem that can be caused by >> metadata corruption. >> >> I'm aware that data integrity issues are addressed with ZFS, but >> unfortunately ZFS is still not yet suitable for many workloads. I'm >> also aware that integrity checking can be done by using GELI between >> the filesystem and the disk, but at a noticeable cost in performance >> and space utilization. The method this module uses is fast and does >> not use any additional space. Most importantly, it builds on mature >> code that has worked well for decades. >> >> Before I spend much more time on it, I have some questions: >> >> 1) Has anyone else done any work in this area? >> >> 2) Is there a demand for this in FreeBSD? >> > > This is actually something I would love to have in the base system, > but inodes are not the only structures that need the integrity > protection. Pretty much every other metadata block, from cylinder group > blocks to indirect blocks for files need similar protection for > this to be of real use. > > -- > Alexander Kabaev As long as there is some interest in this kind of functionality, I will continue working on it. The next step is to protect metadata structures beyond inodes. I am hoping to have some results to post in the next few weeks. 
-- Scott Burns System Administrator BQ Internet Corporation From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 06:52:01 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 142831065678 for ; Fri, 24 Apr 2009 06:52:01 +0000 (UTC) (envelope-from scott@bqinternet.com) Received: from mail.bqinternet.com (mail.bqinternet.com [69.9.32.203]) by mx1.freebsd.org (Postfix) with ESMTP id DE69B8FC1A for ; Fri, 24 Apr 2009 06:52:00 +0000 (UTC) (envelope-from scott@bqinternet.com) Received: from localhost (mail [69.9.32.203]) by mail.bqinternet.com (Postfix) with ESMTP id AF16E409990; Fri, 24 Apr 2009 06:52:01 +0000 (GMT) Received: from mail.bqinternet.com ([69.9.32.203]) by localhost (mail.bqinternet.com [69.9.32.203]) (amavisd-new, port 10024) with ESMTP id ytgzG+E7142L; Fri, 24 Apr 2009 06:52:01 +0000 (GMT) Received: from scott-burnss-macbook-air.local (mail [69.9.32.203]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.bqinternet.com (Postfix) with ESMTP id 81A2D40998B; Fri, 24 Apr 2009 06:52:00 +0000 (GMT) Message-ID: <49F1618E.3080208@bqinternet.com> Date: Fri, 24 Apr 2009 02:51:58 -0400 From: Scott Burns User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302) MIME-Version: 1.0 To: Andrew Snow References: <49F048FB.6000401@bqinternet.com> <20090423195335.521db0a7@kan.dnsalias.net> <49F10660.201@modulus.org> In-Reply-To: <49F10660.201@modulus.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 06:52:01 -0000 Andrew Snow wrote: > > Ideally you would implement complete disk checksumming as a GEOM 
device. > > Then you could layer geom_mirror on top of it, so that if the checksum > fails and returns EIO, geom_mirror can try the alternate device and > rebuild the one with the bad checksums. > > That will then complete the feature set implemented by ZFS, but for any > filesystem on top of GEOM. > > - Andrew > The geli(8) GEOM class is able to verify sectors (and I believe it returns EINVAL on ones that fail), but with a noticeable performance impact. I could certainly see the use for a GEOM class that just does simple checksumming. If gmirror can then be aware of it, that does provide functionality similar to a ZFS mirror. -- Scott Burns System Administrator BQ Internet Corporation From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 10:20:08 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ADCC71065670 for ; Fri, 24 Apr 2009 10:20:08 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8218F8FC1F for ; Fri, 24 Apr 2009 10:20:08 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n3OAK8UZ090161 for ; Fri, 24 Apr 2009 10:20:08 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n3OAK8ma090160; Fri, 24 Apr 2009 10:20:08 GMT (envelope-from gnats) Date: Fri, 24 Apr 2009 10:20:08 GMT Message-Id: <200904241020.n3OAK8ma090160@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Jaakko Heinonen Cc: Subject: Re: kern/132068: [zfs] page fault when using ZFS over NFS on7.1-RELEASE/amd64 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Jaakko Heinonen List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 10:20:09 -0000 The following reply was made to PR kern/132068; it has been noted by GNATS. From: Jaakko Heinonen To: Weldon Godfrey Cc: Edward Fisk <7ogcg7g02@sneakemail.com>, bug-followup@FreeBSD.org Subject: Re: kern/132068: [zfs] page fault when using ZFS over NFS on7.1-RELEASE/amd64 Date: Fri, 24 Apr 2009 13:14:09 +0300 On 2009-04-23, Weldon Godfrey wrote: > Around Dec 12 I switched to head. ... > I'll need to acquire some additional drives so I can install the OS on > different drives (in case I need to backout) and wait until summer to > attempt to upgrade. FYI, the patch is against head. -- Jaakko From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 10:51:55 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 66852106564A for ; Fri, 24 Apr 2009 10:51:55 +0000 (UTC) (envelope-from p.dawidek@wheel.pl) Received: from mx3.wheel.pl (grom.wheel.pl [91.121.70.66]) by mx1.freebsd.org (Postfix) with ESMTP id 184D28FC24 for ; Fri, 24 Apr 2009 10:51:54 +0000 (UTC) (envelope-from p.dawidek@wheel.pl) Received: from localhost (unknown [10.10.2.1]) by mx3.wheel.pl (Postfix) with ESMTP id 53236142EA; Fri, 24 Apr 2009 12:32:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at mx3.wheel.pl Received: from mx3.wheel.pl ([10.10.2.1]) by localhost (mx3.wheel.pl [10.10.2.1]) (amavisd-new, port 10024) with ESMTP id 5a6T1YyrJllT; Fri, 24 Apr 2009 12:32:46 +0200 (CEST) Received: from mail.wheel.pl (ghf58.internetdsl.tpnet.pl [83.12.187.58]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx3.wheel.pl (Postfix) with ESMTPS id 78CD9142E3; Fri, 24 Apr 2009 12:32:45 +0200 (CEST) Received: from localhost (unknown [10.0.2.3]) by mail.wheel.pl (Postfix) with ESMTP id 4DD172D92B; Fri, 24 Apr 2009 12:32:44 +0200 (CEST) Received: from mail.wheel.pl ([10.0.2.3]) by 
localhost (mail.wheel.pl [10.0.2.3]) (amavisd-new, port 10024) with ESMTP id l-zFju0syyvd; Fri, 24 Apr 2009 12:32:43 +0200 (CEST) Received: from localhost (pjd.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wheel.pl (Postfix) with ESMTP id 8FA6D2D924; Fri, 24 Apr 2009 12:32:42 +0200 (CEST) Date: Fri, 24 Apr 2009 12:32:52 +0200 From: Pawel Jakub Dawidek To: Scott Burns Message-ID: <20090424103252.GC1494@garage.freebsd.pl> References: <49F048FB.6000401@bqinternet.com> <20090423195335.521db0a7@kan.dnsalias.net> <49F10660.201@modulus.org> <49F1618E.3080208@bqinternet.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="lEGEL1/lMxI0MVQ2" Content-Disposition: inline In-Reply-To: <49F1618E.3080208@bqinternet.com> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 8.0-CURRENT i386 Cc: freebsd-fs@freebsd.org Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 10:51:55 -0000 --lEGEL1/lMxI0MVQ2 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Apr 24, 2009 at 02:51:58AM -0400, Scott Burns wrote: > > Andrew Snow wrote: > > > >Ideally you would implement complete disk checksumming as a GEOM device. > > > >Then you could layer geom_mirror on top of it, so that if the checksum > >fails and returns EIO, geom_mirror can try the alternate device and > >rebuild the one with the bad checksums. > > > >That will then complete the feature set implemented by ZFS, but for any > >filesystem on top of GEOM.
> > > >- Andrew > > > > The geli(8) GEOM class is able to verify sectors (and I believe it > returns EINVAL on ones that fail), but with a noticeable performance > impact. I could certainly see the use for a GEOM class that just does > simple checksumming. If gmirror can then be aware of it, that does > provide functionality similar to a ZFS mirror. Geli uses strong cryptography for integrity verification, which is not needed in this case. The class that does that still needs to use the method I implemented in geli to provide atomicity. Gmirror is already "aware" of that - in case of an error on one half, it will use the other half. What gmirror doesn't do (and ZFS does) is self-healing. All in all, in my opinion a GEOM class is much better for this - it will protect everything (metadata and data) and will be file system independent. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --lEGEL1/lMxI0MVQ2 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFJ8ZVUForvXbEpPzQRAij3AJ9TtqMOwFYQgBDpQsiz+L/sIhmsBgCeIt+t IN3c9RoSNHFNyOXKM+/DtYE= =Br66 -----END PGP SIGNATURE----- --lEGEL1/lMxI0MVQ2-- From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 13:57:12 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5AFB01065672 for ; Fri, 24 Apr 2009 13:57:12 +0000 (UTC) (envelope-from gtodd@bellanet.org) Received: from smtp100.rog.mail.re2.yahoo.com (smtp100.rog.mail.re2.yahoo.com [206.190.36.78]) by mx1.freebsd.org (Postfix) with SMTP id 06E718FC0A for ; Fri, 24 Apr 2009 13:57:11 +0000 (UTC) (envelope-from gtodd@bellanet.org) Received: (qmail 78033 invoked from network); 24 Apr 2009 13:30:31 -0000 Received: from unknown (HELO wawanesa.iciti.ca) (gtodd@99.246.61.82 with login) by
smtp100.rog.mail.re2.yahoo.com with SMTP; 24 Apr 2009 13:30:31 -0000 X-YMail-OSG: 7AoG_NcVM1lhzf1CV_2PZmTu9e8gcYfmGKQvh2Z8G1e6WWrvRI9M2h7eRbg8SwbF0g-- X-Yahoo-Newman-Property: ymail-3 Received: from wawanesa.iciti.ca (wawanesa.iciti.ca [192.168.2.4]) by wawanesa.iciti.ca (Postfix) with ESMTP id C79EB5B for ; Fri, 24 Apr 2009 09:32:17 -0400 (EDT) Message-ID: <49F1BF60.10106@bellanet.org> Date: Fri, 24 Apr 2009 09:32:16 -0400 From: Graham Todd User-Agent: Thunderbird 2.0.0.19 (X11/20090116) MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <49F048FB.6000401@bqinternet.com> <20090423195335.521db0a7@kan.dnsalias.net> <49F10660.201@modulus.org> <49F1618E.3080208@bqinternet.com> <20090424103252.GC1494@garage.freebsd.pl> In-Reply-To: <20090424103252.GC1494@garage.freebsd.pl> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 13:57:12 -0000 Pawel Jakub Dawidek wrote: > On Fri, Apr 24, 2009 at 02:51:58AM -0400, Scott Burns wrote: >> Andrew Snow wrote: >>> Ideally you would implement complete disk checksumming as a GEOM device. >>> >>> Then you could layer geom_mirror on top of it, so that if the checksum >>> fails and returns EIO, geom_mirror can try the alternate device and >>> rebuild the one with the bad checksums. >>> >>> That will then complete the feature set implemented by ZFS, but for any >>> filesystem on top of GEOM. >>> >>> - Andrew >>> >> The geli(8) GEOM class is able to verify sectors (and I believe it >> returns EINVAL on ones that fail), but with a noticeable performance >> impact. I could certainly see the use for a GEOM class that just does >> simple checksumming. If gmirror can then be aware of it, that does >> provide functionality similar to a ZFS mirror. 
> > Geli uses strong cryptography for integrity verification, which is not > needed in this case. The class that does that still needs to use > the method I implemented in geli to provide atomicity. > > Gmirror is already "aware" of that - in case of an error on one half, it > will use the other half. What gmirror doesn't do (and ZFS does) is > self-healing. > > All in all, in my opinion a GEOM class is much better for this - it will > protect everything (metadata and data) and will be file system > independent. As a sysadmin one could imagine some useful monitoring, auditing, security, reporting and "ITIL compliant" scripts/utilities that could be built around a geom_checksum class and "/sbin/gchecksum status". From a performance perspective how "fine grained" could the checksum detail on a filesystem be for it to be practical to use in that way? Could such a geom class function like a builtin "tripwire" layer? From owner-freebsd-fs@FreeBSD.ORG Fri Apr 24 19:03:08 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 753271065694 for ; Fri, 24 Apr 2009 19:03:08 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id 2B1998FC1F for ; Fri, 24 Apr 2009 19:03:07 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1LxQfx-00034z-LN for freebsd-fs@freebsd.org; Fri, 24 Apr 2009 19:03:06 +0000 Received: from 78-1-171-208.adsl.net.t-com.hr ([78.1.171.208]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 24 Apr 2009 19:03:05 +0000 Received: from ivoras by 78-1-171-208.adsl.net.t-com.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Fri, 24 Apr 2009 19:03:05 +0000 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Fri,
24 Apr 2009 21:02:36 +0200 Lines: 34 Message-ID: References: <49F048FB.6000401@bqinternet.com> <20090423195335.521db0a7@kan.dnsalias.net> <49F16009.3080206@bqinternet.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig4C0C324370F7AD5FBD754791" X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: 78-1-171-208.adsl.net.t-com.hr User-Agent: Thunderbird 2.0.0.21 (Windows/20090302) In-Reply-To: <49F16009.3080206@bqinternet.com> X-Enigmail-Version: 0.95.7 Sender: news Subject: Re: UFS2 metadata checksums X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 24 Apr 2009 19:03:09 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig4C0C324370F7AD5FBD754791 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Scott Burns wrote: > As long as there is some interest in this kind of functionality, I will > continue working on it. The next step is to protect metadata structures > beyond inodes. I am hoping to have some results to post in the next few > weeks. Btw. what checksum do you use?
--------------enig4C0C324370F7AD5FBD754791 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (MingW32) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAknyDMwACgkQldnAQVacBcj6iQCgzgwembzeSb4Pne9SBFyz4NbQ vkkAnj1G3WaXkRiXCMCBojOJvKOfO7TO =iTKA -----END PGP SIGNATURE----- --------------enig4C0C324370F7AD5FBD754791-- From owner-freebsd-fs@FreeBSD.ORG Sat Apr 25 07:45:29 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C5351065674 for ; Sat, 25 Apr 2009 07:45:29 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (adsl-70-243-84-13.dsl.austtx.swbell.net [70.243.84.13]) by mx1.freebsd.org (Postfix) with ESMTP id 5311B8FC20 for ; Sat, 25 Apr 2009 07:45:29 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id n3P7jFgm090957 for ; Sat, 25 Apr 2009 02:45:15 -0500 (CDT) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:subject: content-type:content-transfer-encoding; b=fFhGQlTA8PYvNCJE0M7nZGwXONC2MJglkrNdODr2toPuY3eq+4SFZT5Undwq8HQYO MZ5SDv26Wp7vRjpisvNl/UxT/tCC1f/GinBKsUaQM1GGREkVeJvsw7wfblCbKiZyHQE fvjoSalYGFExqODAV60h1k0nuYzTj+xGPTwmIf4= Message-ID: <49F2BF8B.3060603@jrv.org> Date: Sat, 25 Apr 2009 02:45:15 -0500 From: "James R. 
Van Artsdalen" User-Agent: Thunderbird 2.0.0.21 (Macintosh/20090302) MIME-Version: 1.0 To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: zfs recv core dump X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 25 Apr 2009 07:45:29 -0000

zfs recv dumps core for me with this command:

# zfs send -R -I @snap1 bigtex@snap2 | ssh back zfs recv -vFd bigtex

The problem is in libzfs_sendrecv.c here:

        /* check for rename */
        if ((stream_parent_fromsnap_guid != 0 &&
            stream_parent_fromsnap_guid != parent_fromsnap_guid) ||
            strcmp(strrchr(fsname, '/'),
            strrchr(stream_fsname, '/')) != 0) {

fsname and stream_fsname are both "bigtex", no slash, so both strrchr calls return 0, and strcmp (0, 0) segfaults.

Any ideas? Is anyone trying to use zfs send/recv to replicate pools?
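
A minimal sketch of one way to avoid the NULL dereference described above, assuming the intent of the check is to compare the last component of each dataset name. The helpers last_component() and last_component_differs() are hypothetical names used only for illustration; they are not functions in libzfs, and this is not necessarily how the library itself should be fixed:

```c
#include <string.h>

/*
 * Hypothetical helper (not part of libzfs): return the last path
 * component of a dataset name.  For a name without a slash, such as
 * the pool's root dataset "bigtex", return the whole name instead of
 * the NULL that strrchr() would give.
 */
static const char *
last_component(const char *name)
{
	const char *slash = strrchr(name, '/');

	return (slash != NULL ? slash + 1 : name);
}

/*
 * Hypothetical helper: returns 1 when the last components of the two
 * names differ and 0 when they match.  It never passes NULL to
 * strcmp(), so slash-free names are safe.
 */
static int
last_component_differs(const char *fsname, const char *stream_fsname)
{
	return (strcmp(last_component(fsname),
	    last_component(stream_fsname)) != 0);
}
```

With helpers like these, the strcmp(strrchr(...), strrchr(...)) != 0 term in the rename check could be written as last_component_differs(fsname, stream_fsname). For two names that are both "bigtex" it evaluates to 0 rather than crashing.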