Date: Fri, 26 Mar 2004 19:28:53 -0300
From: Joao Carlos Mendes Luis
To: grog@freebsd.org
Cc: bugs@freebsd.org
Cc: jonny@jonny.eng.br
Subject: Serious bug in vinum?
Message-ID: <20040326222853.GA93269@zeus.faperj.br>

Hi Greg,

I've been a big fan of vinum since its beginning. I use it for RAID0 and RAID1 setups on lots of servers.

In some RAID0 (stripe) configurations, though, I've had some serious problems. If an underlying disk fails, the respective plex and volume do not fail, as they should. This leads to full corruption of data and, worse than that, to a system which believes the data is safe. On one occasion, for example, the backup ran and overwrote good data with bad data, full of zeros.

I am not fully aware of vinum programming details, but I took a quick look at 4.9-STABLE, file vinumstate.c, dated Jul 7, 2000. At line 588, function update_volume_state() sets the volume state to up if the plex state is corrupt or better for at least one plex:

    for (plexno = 0; plexno < vol->plexes; plexno++) {
        struct plex *plex = &PLEX[vol->plex[plexno]];   /* point to the plex */
        if (plex->state >= plex_corrupt) {              /* something accessible, */
            vol->state = volume_up;
            break;
        }
    }

I think the test should instead be:

        if (plex->state > plex_corrupt) {               /* something accessible, */

Or, in other words, the volume state is up only if the plex state is degraded or better.

I did not test this, since the situation is not easy to reproduce, but I think it depends only on the real meaning of the "corrupt" state.

Thanks in advance for your attention,

Jonny
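
P.S. To make the difference between the two comparisons concrete, here is a minimal standalone sketch. It is not the vinum code: the enum is a simplified, made-up ordering (the real plex state list has more members), and the names sketch_plex_state, volume_up_current() and volume_up_proposed() are hypothetical. The only thing assumed is that "corrupt" sorts just below the states in which data is actually readable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified plex state ordering (worst to best).
 * The real enum in the vinum sources has more members. */
enum sketch_plex_state {
    SK_PLEX_DOWN,
    SK_PLEX_CORRUPT,
    SK_PLEX_DEGRADED,
    SK_PLEX_UP
};

/* Mirrors the existing test: the volume counts as up if any plex
 * is in the corrupt state or better. */
int volume_up_current(const enum sketch_plex_state *plex, size_t n)
{
    size_t i;

    assert(plex != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (plex[i] >= SK_PLEX_CORRUPT)
            return 1;
    }
    return 0;
}

/* Mirrors the suggested test: the volume counts as up only if any
 * plex is strictly better than corrupt. */
int volume_up_proposed(const enum sketch_plex_state *plex, size_t n)
{
    size_t i;

    assert(plex != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (plex[i] > SK_PLEX_CORRUPT)
            return 1;
    }
    return 0;
}
```

For a volume whose only plex is corrupt, volume_up_current() returns 1 and volume_up_proposed() returns 0, which is exactly the case I am worried about. Whether the second behaviour is the right one still depends on what "corrupt" is meant to signify.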