Skip site navigation (1)Skip section navigation (2)
Date:      Wed, 27 May 2009 13:05:44 -0400
From:      Ben Kaduk <minimarmot@gmail.com>
To:        Zachary Loafman <zml@freebsd.org>
Cc:        svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org
Subject:   Re: svn commit: r192908 - in head: share/man/man9 sys/conf sys/kern sys/sys
Message-ID:  <47d0403c0905271005t7f57f9b9h3d0721bbb1fbedc2@mail.gmail.com>
In-Reply-To: <200905271636.n4RGasNe003922@svn.freebsd.org>
References:  <200905271636.n4RGasNe003922@svn.freebsd.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Wed, May 27, 2009 at 12:36 PM, Zachary Loafman <zml@freebsd.org> wrote:
> Author: zml
> Date: Wed May 27 16:36:54 2009
> New Revision: 192908
> URL: http://svn.freebsd.org/changeset/base/192908
>
> Log:
>  fail(9) support:
>
>  Add support for kernel fault injection using KFAIL_POINT_* macros and
>  fail_point_* infrastructure. Add example fail point in vfs_bio.c to
>  simulate VM buf pressure.

Very cool!  A few (doc and grammar) style points below.


>
>  Approved by:        dfr (mentor)
>
> Added:
>  head/share/man/man9/fail.9   (contents, props changed)
>  head/sys/kern/kern_fail.c   (contents, props changed)
>  head/sys/sys/fail.h   (contents, props changed)
> Modified:
>  head/share/man/man9/Makefile
>  head/sys/conf/files
>  head/sys/kern/vfs_bio.c
>  head/sys/sys/queue.h
>
> Modified: head/share/man/man9/Makefile
> ==============================================================================
> --- head/share/man/man9/Makefile        Wed May 27 16:34:08 2009        (r192907)
> +++ head/share/man/man9/Makefile        Wed May 27 16:36:54 2009        (r192908)
> @@ -99,6 +99,7 @@ MAN=  accept_filter.9 \
>        DRIVER_MODULE.9 \
>        EVENTHANDLER.9 \
>        extattr.9 \
> +       fail.9 \
>        fetch.9 \
>        firmware.9 \
>        g_access.9 \
>
> Added: head/share/man/man9/fail.9
> ==============================================================================
> --- /dev/null   00:00:00 1970   (empty, because file is newly added)
> +++ head/share/man/man9/fail.9  Wed May 27 16:36:54 2009        (r192908)
> @@ -0,0 +1,197 @@
> +.\"
> +.\" Copyright (c) 2009 Isilon Inc http://www.isilon.com/
> +.\"
> +.\" Redistribution and use in source and binary forms, with or without
> +.\" modification, are permitted provided that the following conditions
> +.\" are met:
> +.\" 1. Redistributions of source code must retain the above copyright
> +.\"    notice(s), this list of conditions and the following disclaimer as
> +.\"    the first lines of this file unmodified other than the possible
> +.\"    addition of one or more copyright notices.
> +.\" 2. Redistributions in binary form must reproduce the above copyright
> +.\"    notice(s), this list of conditions and the following disclaimer in the
> +.\"    documentation and/or other materials provided with the distribution.
> +.\"
> +.\" THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
> +.\" EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
> +.\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
> +.\" DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE LIABLE FOR ANY
> +.\" DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
> +.\" (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
> +.\" SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
> +.\" CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
> +.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
> +.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
> +.\" DAMAGE.
> +.\"
> +.\" $FreeBSD$
> +.\"
> +.Dd May 10, 2009
> +.Dt FAIL 9
> +.Os
> +.Sh NAME
> +.Nm KFAIL_POINT_CODE ,
> +.Nm KFAIL_POINT_RETURN ,
> +.Nm KFAIL_POINT_RETURN_VOID ,
> +.Nm KFAIL_POINT_ERROR ,
> +.Nm KFAIL_POINT_GOTO ,
> +.Nm fail_point ,
> +.Nm DEBUG_FP
> +.
> +.Nd fail points
> +.Sh SYNOPSIS
> +.In sys/fail.h
> +.Fn KFAIL_POINT_CODE "parent" "name" "code"
> +.Fn KFAIL_POINT_RETURN "parent" "name"
> +.Fn KFAIL_POINT_RETURN_VOID "parent" "name"
> +.Fn KFAIL_POINT_ERROR "parent" "name" "error_var"
> +.Fn KFAIL_POINT_GOTO "parent" "name" "error_var" "label"
> +.Sh DESCRIPTION
> +Fail points are used to add code points where errors may be injected
> +in a user controlled fashion. Fail points provide a convenient wrapper
> +around user provided error injection code, providing a

The pedant in me would prefer the hyphenated "user-provided"


> +.Xr sysctl 9 MIB , and a parser for that MIB that describes how the error

I think it is traditional to put the macro and its arguments on its own
line, starting a new one for the rest of the text.

> +injection code should fire.
> +.Pp
> +The base fail point macro is
> +.Fn KFAIL_POINT_CODE
> +where
> +.Fa parent
> +is a sysctl tree (frequently
> +.Sy DEBUG_FP
> +for kernel fail points, but various subsystems may wish to provide
> +their own fail point trees), and
> +.Fa name
> +is the name of the MIB in that tree, and
> +.Fa code
> +is the error injection code. The

Our man page style says that new sentences should start on new lines.
(Since the '.' character is used for defining macros, and sometimes
the processor can get confused.)

> +.Fa code
> +argument does not require braces, but it is considered good style to
> +use braces for any multi-line code arguments. Inside the

Likewise here.

> +.Fa code
> +argument, the evaluation of
> +.Sy RETURN_VALUE
> +is derived from the
> +.Fn return
> +value set in the sysctl MIB. See

And here.

> +.Sx SYSCTL SETTINGS
> +below.
> +.Pp
> +The remaining
> +.Fn KFAIL_POINT_*
> +macros are wrappers around common error injection paths:
> +.Bl -tag -width 8
> +.It Fn KFAIL_POINT_RETURN parent name
> +is the equivalent of
> +.Sy KFAIL_POINT_CODE(..., return RETURN_VALUE)
> +.It Fn KFAIL_POINT_RETURN_VOID parent name
> +is the equivalent of
> +.Sy KFAIL_POINT_CODE(..., return)
> +.It Fn KFAIL_POINT_ERROR parent name error_var
> +is the equivalent of
> +.Sy KFAIL_POINT_CODE(..., error_var = RETURN_VALUE)
> +.It Fn KFAIL_POINT_GOTO parent name error_var label
> +is the equivalent of
> +.Sy KFAIL_POINT_CODE(...,
> +  { error_var = RETURN_VALUE; goto label;})
> +.El
> +.Pp
> +.Sh SYSCTL VARIABLES
> +The
> +.Fn KFAIL_POINT_*
> +macros add sysctl MIBs where specified. Many base kernel MIBs can be

Ditto.

> +found in the
> +.Sy debug.fail_point
> +tree (referenced in code by
> +.Sy DEBUG_FP
> +).
> +.Pp
> +The sysctl setting recognizes the following grammar:

I'm not entirely sure that I understand this sentence correctly.
The following is describing the ways that I can set the sysctl
variable and have it do something useful, right?  It might be
better to s/setting/variable/ .


> +.Pp
> +  <fail_point> ::
> +      <term> ( "->" <term> )*
> +.Pp
> +  <term> ::
> +      ( (<float> "%") | (<integer> "*" ) )*
> +      <type>
> +      [ "(" <integer> ")" ]
> +.Pp
> +  <float> ::
> +      <integer> [ "." <integer> ] |
> +      "." <integer>
> +.Pp
> +  <type> ::
> +      "off" | "return" | "sleep" | "panic" | "break" | "print"
> +.Pp
> +The <type>
> +argument specifies which action to take:
> +.Bl -tag -width ".Dv return"
> +.It Sy off
> +Take no action (does not trigger fail point code)
> +.It Sy return
> +Trigger fail point code with specified argument
> +.It Sy sleep
> +Sleep the specified number of milliseconds
> +.It Sy panic
> +Panic
> +.It Sy break
> +Break into the debugger.

What happens if there is no debugger support compiled into the kernel?


> +.It Sy print
> +Print that the fail point executed
> +.El
> +.Pp
> +The <float>% and <integer>* modifiers prior to <type> control when
> +<type> is executed. The <float>% form (e.g. "1.2%") can be used to
> +specify a probability that <type> will execute. The <integer>* form
> +(e.g. "5*") can be used to specify the number of times <type> should
> +be executed before this <term> is disabled. Only the last probability
> +and the last count are used if multiple are specified, i.e. "1.2%2%"
> +is the same as "2%". When both a probability and a count are
> +specified, the probability is evaluated before the count, i.e. "2%5*"
> +means "2% of the time, but only execute it 5 times total".

There is a disparity in these two clauses.  I think:
"2% of the time, but only 5 times total".
would be better.

> +.Pp
> +The operator -> can be used to express cascading terms. If you specify
> +<term1>-><term2>, it means that if <term1> doesn't 'execute', <term2>
> +is evaluated. For the purpose of this operator, the return() and
> +print() operators are the only types that cascade. A return() term
> +only cascades if the code executes, and a print() term only cascades
> +when passed a non-zero argument.

Quite a few new sentences here; new lines for all

> +.Pp
> +.Sh EXAMPLES
> +.Bl -tag
> +.It Sy sysctl debug.fail_point.foobar="2.1%return(5)"
> +21/1000ths of the time, execute
> +.Fa code
> +with RETURN_VALUE set to 5.
> +.It Sy sysctl debug.fail_point.foobar="2%return(5)->5%return(22)"
> +2/100th of the time, execute
> +.Fa code
> +with RETURN_VALUE set to 5. If that doesn't happen, 5% of the time

And here.

> +execute
> +.Fa code
> +with RETURN_VALUE set to 22.
> +.It Sy sysctl debug.fail_point.foobar="5*return(5)->0.1%return(22)"
> +For 5 times, return 5. After that, 1/1000ths of the time, return 22.

ditto.  I guess 1/1000th is singular, too.

> +.It Sy sysctl debug.fail_point.foobar="0.1%5*return(5)"
> +Return 5 for 1 in 1000 executions, but only execute 5 times total.
> +.It Sy sysctl debug.fail_point.foobar="1%*sleep(50)"
> +1/100ths of the time, sleep 50ms.
> +.El
> +.Pp
> +.Sh CAVEATS
> +It's easy to shoot yourself in the foot by setting fail points too
> +aggressively or setting too many in combination. For example, forcing

new sentence, new line

> +.Fn malloc
> +to fail consistently is potentially harmful to uptime.
> +.Pp
> +The
> +.Fn sleep
> +sysctl setting may not be appropriate in all situations. Currently,

ditto

> +.Fn fail_point_eval
> +does not verify whether the context is appropriate for calling
> +.Fn msleep .
> +.Pp
> +.Sh AUTHORS
> +.An -nosplit
> +This manual page was written by
> +.An Zach Loafman Aq zml@FreeBSD.org .
>


Many thanks for getting this into the tree!

-Ben Kaduk



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?47d0403c0905271005t7f57f9b9h3d0721bbb1fbedc2>