From: Rui Paulo
To: Zachary Loafman
Cc: arch@freebsd.org
Subject: Re: FAIL: kernel fault injection
Date: Tue, 12 May 2009 15:56:51 +0100
In-Reply-To: <20090511162928.GD17203@isilon.com>
References: <20090511162928.GD17203@isilon.com>
List-Id: Discussion related to FreeBSD architecture

On 11 May 2009, at 17:29, Zachary Loafman wrote:

> Arch -
>
> I'd like to contribute the kernel fault injection system that Isilon
> uses. Before contributing it, I'd like to get approval for the APIs
> involved.
>
> Testing errors is hard. Let's say you have:
>
> int foo(void) {
>     [...]
>     error = bar();
>     if (error) {
>         /* do stuff */
>     }
> }
>
> .. but bar() can't reliably be made to fail. How do you test it?
> We added error injection macros that look like this:
>
> int foo(void) {
>     [...]
>     error = bar();
>     KFAIL_POINT_CODE(FP_KERN, bar_fails_foo, error = RETURN_VALUE);
>     if (error) {
>         /* do stuff */
>     }
> }
>
> The KFAIL_POINT_CODE macro adds a sysctl MIB that allows
> you to inject errors into the above code. For example:
>
> # sysctl fail_point.kern.bar_fails_foo=".1%return(5)"
>
> This says, ".1% of the time, evaluate the fail point code with 5 as
> the RETURN_VALUE". If this were a standard errno, you could read the
> above setting as "1/1000th of the time, pretend bar() returned EIO".
>
> We also have a few wrappers around KFAIL_POINT_CODE that essentially
> wrap common uses. For example, the above use can be shorthanded to:
>     KFAIL_POINT_ERROR(FP_KERN, bar_fails_foo, error)
>
> Currently, the sysctl parser accepts the following variants:
>     return(x)   - triggers the code with RETURN_VALUE set to x
>     sleep(t)    - sleep t milliseconds
>     panic/break - panic or break into the debugger
>     print       - print that the fail point was hit
>
> In addition to the commands, we have a syntax to express
> when to evaluate those commands:
>     p% - evaluate command p% of the time (example above)
>     5* - evaluate command 5 times, then disable the expression
>
> And you can compound with expr1->expr2, so, e.g.:
>     5%return(5)->1%return(22):
>         5% of the time, return 5, 1% of the remaining time, return 22
>     5*return(0)->10*return(5)->1%return(19):
>         return 0 for 5 times, then 5 for 10 times, and after those,
>         return 19 1% of the time.
>     1%5*return(22):
>         1/100th of the time, return 22, but only do it 5 times total.
>
> I've also attached an ascii rendering of a (rough draft) man page that
> goes into more detail.
>
> Comments?

This is great and I would like to see this go in. I just have two
minor modifications to suggest (possible bikeshed, but whatever):

* What about kern.fail_point instead of fail_point.kern? This
  framework seems to be only for the kernel.
* On the man page, you don't explain the 'sleep' type. Is that on
  purpose?

About the CAVEAT section on the man page (second paragraph), do you
have any ideas for checking whether msleep is being called in a valid
context?

Thanks.

--
Rui Paulo
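
The "p%", "n*" and "->" semantics described in the quoted proposal can be sketched as a small userland model. This is only an illustration of the behaviour as worded in the examples above, not the code being contributed: struct fp_term, fp_rand() and fp_eval() are hypothetical names, only return(x) is modelled, and the random source is passed in as a parameter so the behaviour can be checked deterministically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userland model of one fail point expression.
 * None of these names come from the proposed kernel API.
 */
struct fp_term {
    double prob;   /* chance of firing per evaluation, 0.0..1.0 ("p%") */
    int    count;  /* firings left ("n*"); -1 means no limit */
    int    value;  /* argument of return(x) */
};

/* Uniform draw in [0, 1) from the C library generator. */
double fp_rand(void)
{
    return rand() / ((double)RAND_MAX + 1.0);
}

/*
 * Walk a chain expr1->expr2->... from left to right. The first term
 * that is not used up and whose probability check passes fires.
 * Returns 1 and stores the injected value in *out if a term fired,
 * or 0 if nothing fired on this evaluation.
 */
int fp_eval(struct fp_term *terms, size_t n, double (*rnd)(void), int *out)
{
    assert(terms != NULL && rnd != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        struct fp_term *t = &terms[i];

        if (t->count == 0)
            continue;           /* used up: fall through to next term */
        if (t->prob < 1.0 && rnd() >= t->prob)
            continue;           /* probability check failed */
        if (t->count > 0)
            t->count--;         /* consume one of the "n*" firings */
        *out = t->value;
        return 1;
    }
    return 0;
}
```

Under these assumptions, "1%5*return(22)" corresponds to the single term { 0.01, 5, 22 }, and "5*return(0)->10*return(5)" to the two-term chain { 1.0, 5, 0 }, { 1.0, 10, 5 }. A caller would pass fp_rand, or a fixed-sequence function when testing.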