From owner-freebsd-hackers@FreeBSD.ORG Tue Nov 23 04:11:24 2004
Date: Mon, 22 Nov 2004 22:11:08 -0600
From: Dan Nelson
To: David Gilbert
cc: freebsd-stable@freebsd.org
cc: Julian Elischer
cc: freebsd-hackers@freebsd.org
Subject: Re: Snapshot corruption.
Message-ID: <20041123041107.GC48882@dan.emsphone.com>
In-Reply-To: <16802.46241.989001.30646@canoe.dclg.ca>
References: <16795.43373.413946.559615@canoe.dclg.ca>
	<419BB453.70205@elischer.org>
	<16795.46413.508033.379777@canoe.dclg.ca>
	<20041122232817.GA1473@green.homeunix.org>
	<16802.46241.989001.30646@canoe.dclg.ca>
X-OS: FreeBSD 5.3-STABLE

In the last episode (Nov 22), David Gilbert said:
> >>>>> "Brian" == Brian Fundakowski Feldman writes:
> Brian> Long strings of NUL bytes?  Missing data?  Spam (from the same
> Brian> file, or from other files)?
>
> Well... I don't really know db file formats.  Most of the corruption
> I found in Berkeley DB files.  mailgraph uses rrd.  mailman uses some
> form of Berkeley DB, too.
> I don't know what the corruption "looked"
> like other than the db library would no longer accept it.

db files are very fragile when it comes to OS or process crashes.
There is no logging, and writes are cached until the process exits or
a db->sync() is called, so a crash in between virtually guarantees
corruption.  Ideally, db files should only cache data and be
rebuildable from other data, or the application should call db->sync()
after every write.  db 2+ databases can do logging, but I don't know
how many applications actually request it.

-- 
	Dan Nelson
	dnelson@allantgroup.com
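
The two recommendations above (sync after every write, and treat the
db file as a cache that can be rebuilt from other data) can be
sketched roughly as follows.  This is only an illustration under
stated assumptions: it uses Python's standard dbm.dumb module as a
stand-in for the C db library, and the names put_synced, rebuild,
load and SOURCE are hypothetical, not taken from mailman, mailgraph
or any other application mentioned in this thread.

```python
import dbm.dumb
import os
import tempfile


def put_synced(db, key, value):
    """Store one record and flush right away instead of waiting for close()."""
    db[key] = value
    db.sync()  # stand-in for calling db->sync() after every write


def rebuild(path, source):
    """Recreate the db file from authoritative data, discarding old contents."""
    with dbm.dumb.open(path, "n") as db:  # "n" always starts from an empty db
        for key, value in source.items():
            put_synced(db, key, value)


def load(path):
    """Read every record back as a plain dict of bytes -> bytes."""
    with dbm.dumb.open(path, "r") as db:
        return {key: db[key] for key in db.keys()}


# Hypothetical authoritative data; a real application would keep this
# somewhere other than the db file (config files, a mail spool, ...).
SOURCE = {b"alice": b"alice@example.org", b"bob": b"bob@example.org"}

DB_PATH = os.path.join(tempfile.mkdtemp(), "cache")
rebuild(DB_PATH, SOURCE)   # safe to repeat whenever the db looks damaged
RESULT = load(DB_PATH)
```

Syncing on every write trades throughput for a smaller window in
which a crash can leave the file inconsistent; it does not replace
logging, so keeping the file rebuildable is still the safer half of
the advice.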