From owner-svn-src-stable-8@FreeBSD.ORG Mon Jan 13 15:22:41 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id ED9B4E1D; Mon, 13 Jan 2014 15:22:41 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id D682218F0; Mon, 13 Jan 2014 15:22:41 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0DFMfaP032512; Mon, 13 Jan 2014 15:22:41 GMT (envelope-from pfg@svn.freebsd.org) Received: (from pfg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0DFMc3r032490; Mon, 13 Jan 2014 15:22:38 GMT (envelope-from pfg@svn.freebsd.org) Message-Id: <201401131522.s0DFMc3r032490@svn.freebsd.org> From: "Pedro F. 
Giffuni" Date: Mon, 13 Jan 2014 15:22:38 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260590 - in stable/8/contrib: gcc gcc/config/rs6000 gcc/cp gcc/doc gcclibs/libcpp X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Jan 2014 15:22:42 -0000 Author: pfg Date: Mon Jan 13 15:22:37 2014 New Revision: 260590 URL: http://svnweb.freebsd.org/changeset/base/260590 Log: MFC r258081, r258138, r258143, r258179, r258157, r258204, r258205, r258206, r258207, r258321. This is a series of commits inspired by Google's gcc-4.2.1 for Android that were taken from pre-4.3 gcc under the GPLv2. gcc: Backport fixes for -Wparentheses in C++. This fixes GCC PR 19564. gcc: merge rs6000 change from FSF pre-gcc43. Don't set MASK_PPC_GFXOPT for 8540 or 8548. Merge vrp-tree fix from gcc-4.3. Fix missed conversion from / to >> (GCC PR32521). Merge in GCC r120505 to include definition of TREE_OVERFLOW_P. gcc: warn about integer overflow in constant expressions in the C++ frontend. gcc: Add a new option -Wvla to warn about variable length arrays. libcpp: preprocessor speedup patches from upstream gcc. gcc: add -femit-struct-debug support to reduce DWARF debug size. gcc: Fix postreload-gcse treatment of call-clobbered registers. gcc: Record some previous commits in the ChangeLog.gcc43 file. 
Tested by: danfe Modified: stable/8/contrib/gcc/ChangeLog.gcc43 stable/8/contrib/gcc/c-common.c stable/8/contrib/gcc/c-common.h stable/8/contrib/gcc/c-decl.c stable/8/contrib/gcc/c-opts.c stable/8/contrib/gcc/c-typeck.c stable/8/contrib/gcc/c.opt stable/8/contrib/gcc/config/rs6000/rs6000.c stable/8/contrib/gcc/cp/cp-lang.c stable/8/contrib/gcc/cp/cp-tree.h stable/8/contrib/gcc/cp/decl.c stable/8/contrib/gcc/cp/parser.c stable/8/contrib/gcc/cp/pt.c stable/8/contrib/gcc/cp/semantics.c stable/8/contrib/gcc/cp/tree.c stable/8/contrib/gcc/cp/typeck.c stable/8/contrib/gcc/doc/invoke.texi stable/8/contrib/gcc/dwarf2out.c stable/8/contrib/gcc/flags.h stable/8/contrib/gcc/langhooks-def.h stable/8/contrib/gcc/langhooks.h stable/8/contrib/gcc/opts.c stable/8/contrib/gcc/postreload-gcse.c stable/8/contrib/gcc/regs.h stable/8/contrib/gcc/rtlanal.c stable/8/contrib/gcc/tree-vrp.c stable/8/contrib/gcc/tree.h stable/8/contrib/gcclibs/libcpp/files.c stable/8/contrib/gcclibs/libcpp/internal.h stable/8/contrib/gcclibs/libcpp/lex.c Directory Properties: stable/8/ (props changed) stable/8/contrib/ (props changed) stable/8/contrib/gcc/ (props changed) stable/8/contrib/gcclibs/ (props changed) Modified: stable/8/contrib/gcc/ChangeLog.gcc43 ============================================================================== --- stable/8/contrib/gcc/ChangeLog.gcc43 Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/ChangeLog.gcc43 Mon Jan 13 15:22:37 2014 (r260590) @@ -11,6 +11,57 @@ with SSE3 instruction set support. * doc/invoke.texi: Likewise. +2007-04-16 Lawrence Crowl + + * doc/invoke.texi (Debugging Options): Add documentation for the + -femit-struct-debug options -femit-struct-debug-baseonly, + -femit-struct-debug-reduced, and + -femit-struct-debug-detailed[=...]. + + * c-opts.c (c_common_handle_option): Add + OPT_femit_struct_debug_baseonly, OPT_femit_struct_debug_reduced, + and OPT_femit_struct_debug_detailed_. 
+ * c.opt: Add specifications for + -femit-struct-debug-baseonly, -femit-struct-debug-reduced, + and -femit-struct-debug-detailed[=...]. + * opts.c (set_struct_debug_option): Parse the + -femit-struct-debug-... options. + * opts.c (matches_main_base, main_input_basename, + main_input_baselength, base_of_path, matches_main_base): Add + variables and functions to compare header base name to compilation + unit base name. + * opts.c (should_emit_struct_debug): Add to determine to emit a + structure based on the option. + (dump_struct_debug) Also disabled function to debug this + function. + * opts.c (handle_options): Save the base name of the + compilation unit. + + * langhooks-def.h (LANG_HOOKS_GENERIC_TYPE_P): Define. + (LANG_HOOKS_FOR_TYPES_INITIALIZER): Add. + This hook indicates if a type is generic. Set it by default + to "never generic". + * langhooks.h (struct lang_hooks_for_types): Add a new hook + to determine if a struct type is generic or not. + * cp/cp-tree.h (class_tmpl_impl_spec_p): Declare a C++ hook. + * cp/tree.c (class_tmpl_impl_spec_p): Implement the C++ hook. + * cp/cp-lang.c (LANG_HOOKS_GENERIC_TYPE_P): Override null C hook + with live C++ hook. + + * flags.h (enum debug_info_usage): Add an enumeration to describe + a program's use of a structure type. + * dwarf2out.c (gen_struct_or_union_type_die): Add a new parameter + to indicate the program's usage of the type. Filter structs based + on the -femit-struct-debug-... specification. + (gen_type_die): Split into two routines, gen_type_die and + gen_type_die_with_usage. gen_type_die is now a wrapper + that assumes direct usage. + (gen_type_die_with_usage): Replace calls to gen_type_die + with gen_type_die_with_usage adding the program usage of + the referenced type. + (dwarf2out_imported_module_or_decl): Suppress struct debug + information using should_emit_struct_debug when appropriate. 
+ 2007-04-12 Richard Guenther (r123736) PR tree-optimization/24689 @@ -48,6 +99,17 @@ * config.gcc: Support core2 processor. +2006-12-13 Ian Lance Taylor (r119855) + + PR c++/19564 + PR c++/19756 + * c-typeck.c (parser_build_binary_op): Move parentheses warnings + to warn_about_parentheses in c-common.c. + * c-common.c (warn_about_parentheses): New function. + * c-common.h (warn_about_parentheses): Declare. + * doc/invoke.texi (Warning Options): Update -Wparentheses + description. + 2006-12-02 H.J. Lu (r119454 - partial) PR target/30040 @@ -81,6 +143,35 @@ (override_options): Add entries for Core2. (ix86_issue_rate): Add case for Core2. +2006-10-31 Geoffrey Keating (r118356) + + * c-decl.c (grokdeclarator): Don't set DECL_EXTERNAL on + inline static functions in c99 mode. + + PR 16622 + * doc/extend.texi (Inline): Update. + * c-tree.h (struct language_function): Remove field 'extern_inline'. + * c-decl.c (current_extern_inline): Delete. + (pop_scope): Adjust test for an undefined nested function. + Add warning about undeclared inline function. + (diagnose_mismatched_decls): Update comments. Disallow overriding + of inline functions in a translation unit in C99. Allow inline + declarations in C99 at any time. + (merge_decls): Boolize variables. Handle C99 'extern inline' + semantics. + (grokdeclarator): Set DECL_EXTERNAL here for functions. Handle + C99 inline semantics. + (start_function): Don't clear current_extern_inline. Don't set + DECL_EXTERNAL. + (c_push_function_context): Don't push current_extern_inline. + (c_pop_function_context): Don't restore current_extern_inline. + + PR 11377 + * c-typeck.c (build_external_ref): Warn about static variables + used in extern inline functions. + * c-decl.c (start_decl): Warn about static variables declared + in extern inline functions. 
+ 2006-10-27 Vladimir Makarov (r118090) * config/i386/i386.h (TARGET_GEODE): Modified: stable/8/contrib/gcc/c-common.c ============================================================================== --- stable/8/contrib/gcc/c-common.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/c-common.c Mon Jan 13 15:22:37 2014 (r260590) @@ -2585,9 +2585,13 @@ c_common_truthvalue_conversion (tree exp break; case MODIFY_EXPR: - if (!TREE_NO_WARNING (expr)) - warning (OPT_Wparentheses, - "suggest parentheses around assignment used as truth value"); + if (!TREE_NO_WARNING (expr) + && warn_parentheses) + { + warning (OPT_Wparentheses, + "suggest parentheses around assignment used as truth value"); + TREE_NO_WARNING (expr) = 1; + } break; default: @@ -6471,5 +6475,87 @@ warn_array_subscript_with_type_char (tre warning (OPT_Wchar_subscripts, "array subscript has type %"); } +/* Implement -Wparentheses for the unexpected C precedence rules, to + cover cases like x + y << z which readers are likely to + misinterpret. We have seen an expression in which CODE is a binary + operator used to combine expressions headed by CODE_LEFT and + CODE_RIGHT. CODE_LEFT and CODE_RIGHT may be ERROR_MARK, which + means that that side of the expression was not formed using a + binary operator, or it was enclosed in parentheses. 
*/ + +void +warn_about_parentheses (enum tree_code code, enum tree_code code_left, + enum tree_code code_right) +{ + if (!warn_parentheses) + return; + + if (code == LSHIFT_EXPR || code == RSHIFT_EXPR) + { + if (code_left == PLUS_EXPR || code_left == MINUS_EXPR + || code_right == PLUS_EXPR || code_right == MINUS_EXPR) + warning (OPT_Wparentheses, + "suggest parentheses around + or - inside shift"); + } + + if (code == TRUTH_ORIF_EXPR) + { + if (code_left == TRUTH_ANDIF_EXPR + || code_right == TRUTH_ANDIF_EXPR) + warning (OPT_Wparentheses, + "suggest parentheses around && within ||"); + } + + if (code == BIT_IOR_EXPR) + { + if (code_left == BIT_AND_EXPR || code_left == BIT_XOR_EXPR + || code_left == PLUS_EXPR || code_left == MINUS_EXPR + || code_right == BIT_AND_EXPR || code_right == BIT_XOR_EXPR + || code_right == PLUS_EXPR || code_right == MINUS_EXPR) + warning (OPT_Wparentheses, + "suggest parentheses around arithmetic in operand of |"); + /* Check cases like x|y==z */ + if (TREE_CODE_CLASS (code_left) == tcc_comparison + || TREE_CODE_CLASS (code_right) == tcc_comparison) + warning (OPT_Wparentheses, + "suggest parentheses around comparison in operand of |"); + } + + if (code == BIT_XOR_EXPR) + { + if (code_left == BIT_AND_EXPR + || code_left == PLUS_EXPR || code_left == MINUS_EXPR + || code_right == BIT_AND_EXPR + || code_right == PLUS_EXPR || code_right == MINUS_EXPR) + warning (OPT_Wparentheses, + "suggest parentheses around arithmetic in operand of ^"); + /* Check cases like x^y==z */ + if (TREE_CODE_CLASS (code_left) == tcc_comparison + || TREE_CODE_CLASS (code_right) == tcc_comparison) + warning (OPT_Wparentheses, + "suggest parentheses around comparison in operand of ^"); + } + + if (code == BIT_AND_EXPR) + { + if (code_left == PLUS_EXPR || code_left == MINUS_EXPR + || code_right == PLUS_EXPR || code_right == MINUS_EXPR) + warning (OPT_Wparentheses, + "suggest parentheses around + or - in operand of &"); + /* Check cases like x&y==z */ + if 
(TREE_CODE_CLASS (code_left) == tcc_comparison + || TREE_CODE_CLASS (code_right) == tcc_comparison) + warning (OPT_Wparentheses, + "suggest parentheses around comparison in operand of &"); + } + + /* Similarly, check for cases like 1<=i<=10 that are probably errors. */ + if (TREE_CODE_CLASS (code) == tcc_comparison + && (TREE_CODE_CLASS (code_left) == tcc_comparison + || TREE_CODE_CLASS (code_right) == tcc_comparison)) + warning (OPT_Wparentheses, "comparisons like X<=Y<=Z do not " + "have their mathematical meaning"); +} + #include "gt-c-common.h" Modified: stable/8/contrib/gcc/c-common.h ============================================================================== --- stable/8/contrib/gcc/c-common.h Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/c-common.h Mon Jan 13 15:22:37 2014 (r260590) @@ -850,6 +850,9 @@ extern int complete_array_type (tree *, extern tree builtin_type_for_size (int, bool); extern void warn_array_subscript_with_type_char (tree); +extern void warn_about_parentheses (enum tree_code, enum tree_code, + enum tree_code); + /* In c-gimplify.c */ extern void c_genericize (tree); Modified: stable/8/contrib/gcc/c-decl.c ============================================================================== --- stable/8/contrib/gcc/c-decl.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/c-decl.c Mon Jan 13 15:22:37 2014 (r260590) @@ -3931,6 +3931,61 @@ check_bitfield_type_and_width (tree *typ } + +/* Print warning about variable length array if necessary. 
*/ + +static void +warn_variable_length_array (const char *name, tree size) +{ + int ped = !flag_isoc99 && pedantic && warn_vla != 0; + int const_size = TREE_CONSTANT (size); + + if (ped) + { + if (const_size) + { + if (name) + pedwarn ("ISO C90 forbids array %qs whose size " + "can%'t be evaluated", + name); + else + pedwarn ("ISO C90 forbids array whose size " + "can%'t be evaluated"); + } + else + { + if (name) + pedwarn ("ISO C90 forbids variable length array %qs", + name); + else + pedwarn ("ISO C90 forbids variable length array"); + } + } + else if (warn_vla > 0) + { + if (const_size) + { + if (name) + warning (OPT_Wvla, + "the size of array %qs can" + "%'t be evaluated", name); + else + warning (OPT_Wvla, + "the size of array can %'t be evaluated"); + } + else + { + if (name) + warning (OPT_Wvla, + "variable length array %qs is used", + name); + else + warning (OPT_Wvla, + "variable length array is used"); + } + } +} + /* Given declspecs and a declarator, determine the name and type of the object declared and construct a ..._DECL node for it. @@ -4329,17 +4384,7 @@ grokdeclarator (const struct c_declarato nonconstant even if it is (eg) a const variable with known value. 
*/ size_varies = 1; - - if (!flag_isoc99 && pedantic) - { - if (TREE_CONSTANT (size)) - pedwarn ("ISO C90 forbids array %qs whose size " - "can%'t be evaluated", - name); - else - pedwarn ("ISO C90 forbids variable-size array %qs", - name); - } + warn_variable_length_array (orig_name, size); if (warn_variable_decl) warning (0, "variable-sized array %qs", name); } Modified: stable/8/contrib/gcc/c-opts.c ============================================================================== --- stable/8/contrib/gcc/c-opts.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/c-opts.c Mon Jan 13 15:22:37 2014 (r260590) @@ -818,6 +818,18 @@ c_common_handle_option (size_t scode, co flag_gen_declaration = 1; break; + case OPT_femit_struct_debug_baseonly: + set_struct_debug_option ("base"); + break; + + case OPT_femit_struct_debug_reduced: + set_struct_debug_option ("dir:ord:sys,dir:gen:any,ind:base"); + break; + + case OPT_femit_struct_debug_detailed_: + set_struct_debug_option (arg); + break; + case OPT_idirafter: add_path (xstrdup (arg), AFTER, 0, true); break; Modified: stable/8/contrib/gcc/c-typeck.c ============================================================================== --- stable/8/contrib/gcc/c-typeck.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/c-typeck.c Mon Jan 13 15:22:37 2014 (r260590) @@ -2631,73 +2631,7 @@ parser_build_binary_op (enum tree_code c /* Check for cases such as x+y< Detailed reduced debug info for structs + idirafter C ObjC C++ ObjC++ Joined Separate -idirafter Add to the end of the system include path Modified: stable/8/contrib/gcc/config/rs6000/rs6000.c ============================================================================== --- stable/8/contrib/gcc/config/rs6000/rs6000.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/config/rs6000/rs6000.c Mon Jan 13 15:22:37 2014 (r260590) @@ -1171,11 +1171,9 @@ rs6000_override_options (const char *def {"801", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, 
{"821", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"823", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, - {"8540", PROCESSOR_PPC8540, - POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_STRICT_ALIGN}, + {"8540", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN}, /* 8548 has a dummy entry for now. */ - {"8548", PROCESSOR_PPC8540, - POWERPC_BASE_MASK | MASK_PPC_GFXOPT | MASK_STRICT_ALIGN}, + {"8548", PROCESSOR_PPC8540, POWERPC_BASE_MASK | MASK_STRICT_ALIGN}, {"860", PROCESSOR_MPCCORE, POWERPC_BASE_MASK | MASK_SOFT_FLOAT}, {"970", PROCESSOR_POWER4, POWERPC_7400_MASK | MASK_PPC_GPOPT | MASK_MFCRF | MASK_POWERPC64}, Modified: stable/8/contrib/gcc/cp/cp-lang.c ============================================================================== --- stable/8/contrib/gcc/cp/cp-lang.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/cp-lang.c Mon Jan 13 15:22:37 2014 (r260590) @@ -44,6 +44,8 @@ static void cp_init_ts (void); #define LANG_HOOKS_NAME "GNU C++" #undef LANG_HOOKS_INIT #define LANG_HOOKS_INIT cxx_init +#undef LANG_HOOKS_GENERIC_TYPE_P +#define LANG_HOOKS_GENERIC_TYPE_P class_tmpl_impl_spec_p #undef LANG_HOOKS_DECL_PRINTABLE_NAME #define LANG_HOOKS_DECL_PRINTABLE_NAME cxx_printable_name #undef LANG_HOOKS_FOLD_OBJ_TYPE_REF Modified: stable/8/contrib/gcc/cp/cp-tree.h ============================================================================== --- stable/8/contrib/gcc/cp/cp-tree.h Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/cp-tree.h Mon Jan 13 15:22:37 2014 (r260590) @@ -4373,6 +4373,7 @@ extern tree add_stmt_to_compound (tree, extern tree cxx_maybe_build_cleanup (tree); extern void init_tree (void); extern int pod_type_p (tree); +extern bool class_tmpl_impl_spec_p (tree); extern int zero_init_p (tree); extern tree canonical_type_variant (tree); extern tree copy_binfo (tree, tree, tree, @@ -4460,8 +4461,9 @@ extern tree build_x_indirect_ref (tree, extern tree build_indirect_ref (tree, const char *); extern tree 
build_array_ref (tree, tree); extern tree get_member_function_from_ptrfunc (tree *, tree); -extern tree build_x_binary_op (enum tree_code, tree, tree, - bool *); +extern tree build_x_binary_op (enum tree_code, tree, + enum tree_code, tree, + enum tree_code, bool *); extern tree build_x_unary_op (enum tree_code, tree); extern tree unary_complex_lvalue (enum tree_code, tree); extern tree build_x_conditional_expr (tree, tree, tree); Modified: stable/8/contrib/gcc/cp/decl.c ============================================================================== --- stable/8/contrib/gcc/cp/decl.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/decl.c Mon Jan 13 15:22:37 2014 (r260590) @@ -6702,12 +6702,21 @@ compute_array_index_type (tree name, tre error ("size of array is not an integral constant-expression"); size = integer_one_node; } - else if (pedantic) + else if (pedantic && warn_vla != 0) { if (name) - pedwarn ("ISO C++ forbids variable-size array %qD", name); + pedwarn ("ISO C++ forbids variable length array %qD", name); else - pedwarn ("ISO C++ forbids variable-size array"); + pedwarn ("ISO C++ forbids variable length array"); + } + else if (warn_vla > 0) + { + if (name) + warning (OPT_Wvla, + "variable length array %qD is used", name); + else + warning (OPT_Wvla, + "variable length array is used"); } if (processing_template_decl && !TREE_CONSTANT (size)) Modified: stable/8/contrib/gcc/cp/parser.c ============================================================================== --- stable/8/contrib/gcc/cp/parser.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/parser.c Mon Jan 13 15:22:37 2014 (r260590) @@ -1177,8 +1177,15 @@ typedef enum cp_parser_status_kind typedef struct cp_parser_expression_stack_entry { + /* Left hand side of the binary operation we are currently + parsing. */ tree lhs; + /* Original tree code for left hand side, if it was a binary + expression itself (used for -Wparentheses). 
*/ + enum tree_code lhs_type; + /* Tree code for the binary operation we are parsing. */ enum tree_code tree_type; + /* Precedence of the binary operation we are parsing. */ int prec; } cp_parser_expression_stack_entry; @@ -1536,7 +1543,7 @@ static tree cp_parser_builtin_offsetof /* Statements [gram.stmt.stmt] */ static void cp_parser_statement - (cp_parser *, tree, bool); + (cp_parser *, tree, bool, bool *); static void cp_parser_label_for_labeled_statement (cp_parser *); static tree cp_parser_expression_statement @@ -1546,7 +1553,7 @@ static tree cp_parser_compound_statement static void cp_parser_statement_seq_opt (cp_parser *, tree); static tree cp_parser_selection_statement - (cp_parser *); + (cp_parser *, bool *); static tree cp_parser_condition (cp_parser *); static tree cp_parser_iteration_statement @@ -1559,7 +1566,7 @@ static void cp_parser_declaration_statem (cp_parser *); static tree cp_parser_implicitly_scoped_statement - (cp_parser *); + (cp_parser *, bool *); static void cp_parser_already_scoped_statement (cp_parser *); @@ -5730,12 +5737,13 @@ cp_parser_binary_expression (cp_parser* cp_parser_expression_stack_entry *sp = &stack[0]; tree lhs, rhs; cp_token *token; - enum tree_code tree_type; + enum tree_code tree_type, lhs_type, rhs_type; enum cp_parser_prec prec = PREC_NOT_OPERATOR, new_prec, lookahead_prec; bool overloaded_p; /* Parse the first expression. */ lhs = cp_parser_cast_expression (parser, /*address_p=*/false, cast_p); + lhs_type = ERROR_MARK; for (;;) { @@ -5768,6 +5776,7 @@ cp_parser_binary_expression (cp_parser* /* Extract another operand. It may be the RHS of this expression or the LHS of a new, higher priority expression. */ rhs = cp_parser_simple_cast_expression (parser); + rhs_type = ERROR_MARK; /* Get another operator token. 
Look up its precedence to avoid building a useless (immediately popped) stack entry for common @@ -5783,8 +5792,10 @@ cp_parser_binary_expression (cp_parser* sp->prec = prec; sp->tree_type = tree_type; sp->lhs = lhs; + sp->lhs_type = lhs_type; sp++; lhs = rhs; + lhs_type = rhs_type; prec = new_prec; new_prec = lookahead_prec; goto get_rhs; @@ -5801,11 +5812,15 @@ cp_parser_binary_expression (cp_parser* prec = sp->prec; tree_type = sp->tree_type; rhs = lhs; + rhs_type = lhs_type; lhs = sp->lhs; + lhs_type = sp->lhs_type; } overloaded_p = false; - lhs = build_x_binary_op (tree_type, lhs, rhs, &overloaded_p); + lhs = build_x_binary_op (tree_type, lhs, lhs_type, rhs, rhs_type, + &overloaded_p); + lhs_type = tree_type; /* If the binary operator required the use of an overloaded operator, then this expression cannot be an integral constant-expression. @@ -6222,17 +6237,23 @@ cp_parser_builtin_offsetof (cp_parser *p try-block IN_COMPOUND is true when the statement is nested inside a - cp_parser_compound_statement; this matters for certain pragmas. */ + cp_parser_compound_statement; this matters for certain pragmas. + + If IF_P is not NULL, *IF_P is set to indicate whether the statement + is a (possibly labeled) if statement which is not enclosed in braces + and has an else clause. This is used to implement -Wparentheses. */ static void cp_parser_statement (cp_parser* parser, tree in_statement_expr, - bool in_compound) + bool in_compound, bool *if_p) { tree statement; cp_token *token; location_t statement_location; restart: + if (if_p != NULL) + *if_p = false; /* There is no statement yet. */ statement = NULL_TREE; /* Peek at the next token. */ @@ -6257,7 +6278,7 @@ cp_parser_statement (cp_parser* parser, case RID_IF: case RID_SWITCH: - statement = cp_parser_selection_statement (parser); + statement = cp_parser_selection_statement (parser, if_p); break; case RID_WHILE: @@ -6522,7 +6543,7 @@ cp_parser_statement_seq_opt (cp_parser* break; /* Parse the statement. 
*/ - cp_parser_statement (parser, in_statement_expr, true); + cp_parser_statement (parser, in_statement_expr, true, NULL); } } @@ -6533,14 +6554,22 @@ cp_parser_statement_seq_opt (cp_parser* if ( condition ) statement else statement switch ( condition ) statement - Returns the new IF_STMT or SWITCH_STMT. */ + Returns the new IF_STMT or SWITCH_STMT. + + If IF_P is not NULL, *IF_P is set to indicate whether the statement + is a (possibly labeled) if statement which is not enclosed in + braces and has an else clause. This is used to implement + -Wparentheses. */ static tree -cp_parser_selection_statement (cp_parser* parser) +cp_parser_selection_statement (cp_parser* parser, bool *if_p) { cp_token *token; enum rid keyword; + if (if_p != NULL) + *if_p = false; + /* Peek at the next token. */ token = cp_parser_require (parser, CPP_KEYWORD, "selection-statement"); @@ -6576,11 +6605,13 @@ cp_parser_selection_statement (cp_parser if (keyword == RID_IF) { + bool nested_if; + /* Add the condition. */ finish_if_stmt_cond (condition, statement); /* Parse the then-clause. */ - cp_parser_implicitly_scoped_statement (parser); + cp_parser_implicitly_scoped_statement (parser, &nested_if); finish_then_clause (statement); /* If the next token is `else', parse the else-clause. */ @@ -6591,8 +6622,28 @@ cp_parser_selection_statement (cp_parser cp_lexer_consume_token (parser->lexer); begin_else_clause (statement); /* Parse the else-clause. */ - cp_parser_implicitly_scoped_statement (parser); + cp_parser_implicitly_scoped_statement (parser, NULL); finish_else_clause (statement); + + /* If we are currently parsing a then-clause, then + IF_P will not be NULL. We set it to true to + indicate that this if statement has an else clause. + This may trigger the Wparentheses warning below + when we get back up to the parent if statement. */ + if (if_p != NULL) + *if_p = true; + } + else + { + /* This if statement does not have an else clause. 
If + NESTED_IF is true, then the then-clause is an if + statement which does have an else clause. We warn + about the potential ambiguity. */ + if (nested_if) + warning (OPT_Wparentheses, + ("%Hsuggest explicit braces " + "to avoid ambiguous %"), + EXPR_LOCUS (statement)); } /* Now we're all done with the if-statement. */ @@ -6611,7 +6662,7 @@ cp_parser_selection_statement (cp_parser in_statement = parser->in_statement; parser->in_switch_statement_p = true; parser->in_statement |= IN_SWITCH_STMT; - cp_parser_implicitly_scoped_statement (parser); + cp_parser_implicitly_scoped_statement (parser, NULL); parser->in_switch_statement_p = in_switch_statement_p; parser->in_statement = in_statement; @@ -6789,7 +6840,7 @@ cp_parser_iteration_statement (cp_parser statement = begin_do_stmt (); /* Parse the body of the do-statement. */ parser->in_statement = IN_ITERATION_STMT; - cp_parser_implicitly_scoped_statement (parser); + cp_parser_implicitly_scoped_statement (parser, NULL); parser->in_statement = in_statement; finish_do_body (statement); /* Look for the `while' keyword. */ @@ -7031,13 +7082,21 @@ cp_parser_declaration_statement (cp_pars but ensures that is in its own scope, even if it is not a compound-statement. + If IF_P is not NULL, *IF_P is set to indicate whether the statement + is a (possibly labeled) if statement which is not enclosed in + braces and has an else clause. This is used to implement + -Wparentheses. + Returns the new statement. */ static tree -cp_parser_implicitly_scoped_statement (cp_parser* parser) +cp_parser_implicitly_scoped_statement (cp_parser* parser, bool *if_p) { tree statement; + if (if_p != NULL) + *if_p = false; + /* Mark if () ; with a special NOP_EXPR. */ if (cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)) { @@ -7053,7 +7112,7 @@ cp_parser_implicitly_scoped_statement (c /* Create a compound-statement. */ statement = begin_compound_stmt (0); /* Parse the dependent-statement. 
*/ - cp_parser_statement (parser, NULL_TREE, false); + cp_parser_statement (parser, NULL_TREE, false, if_p); /* Finish the dummy compound-statement. */ finish_compound_stmt (statement); } @@ -7072,7 +7131,7 @@ cp_parser_already_scoped_statement (cp_p { /* If the token is a `{', then we must take special action. */ if (cp_lexer_next_token_is_not (parser->lexer, CPP_OPEN_BRACE)) - cp_parser_statement (parser, NULL_TREE, false); + cp_parser_statement (parser, NULL_TREE, false, NULL); else { /* Avoid calling cp_parser_compound_statement, so that we @@ -18645,7 +18704,7 @@ cp_parser_omp_structured_block (cp_parse tree stmt = begin_omp_structured_block (); unsigned int save = cp_parser_begin_omp_structured_block (parser); - cp_parser_statement (parser, NULL_TREE, false); + cp_parser_statement (parser, NULL_TREE, false, NULL); cp_parser_end_omp_structured_block (parser, save); return finish_omp_structured_block (stmt); @@ -18890,7 +18949,7 @@ cp_parser_omp_for_loop (cp_parser *parse /* Note that the grammar doesn't call for a structured block here, though the loop as a whole is a structured block. 
*/ body = push_stmt_list (); - cp_parser_statement (parser, NULL_TREE, false); + cp_parser_statement (parser, NULL_TREE, false, NULL); body = pop_stmt_list (body); return finish_omp_for (loc, decl, init, cond, incr, body, pre_body); @@ -18983,7 +19042,7 @@ cp_parser_omp_sections_scope (cp_parser while (1) { - cp_parser_statement (parser, NULL_TREE, false); + cp_parser_statement (parser, NULL_TREE, false, NULL); tok = cp_lexer_peek_token (parser->lexer); if (tok->pragma_kind == PRAGMA_OMP_SECTION) Modified: stable/8/contrib/gcc/cp/pt.c ============================================================================== --- stable/8/contrib/gcc/cp/pt.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/pt.c Mon Jan 13 15:22:37 2014 (r260590) @@ -9078,7 +9078,13 @@ tsubst_copy_and_build (tree t, return build_x_binary_op (TREE_CODE (t), RECUR (TREE_OPERAND (t, 0)), + (TREE_NO_WARNING (TREE_OPERAND (t, 0)) + ? ERROR_MARK + : TREE_CODE (TREE_OPERAND (t, 0))), RECUR (TREE_OPERAND (t, 1)), + (TREE_NO_WARNING (TREE_OPERAND (t, 1)) + ? ERROR_MARK + : TREE_CODE (TREE_OPERAND (t, 1))), /*overloaded_p=*/NULL); case SCOPE_REF: @@ -9087,7 +9093,14 @@ tsubst_copy_and_build (tree t, case ARRAY_REF: op1 = tsubst_non_call_postfix_expression (TREE_OPERAND (t, 0), args, complain, in_decl); - return build_x_binary_op (ARRAY_REF, op1, RECUR (TREE_OPERAND (t, 1)), + return build_x_binary_op (ARRAY_REF, op1, + (TREE_NO_WARNING (TREE_OPERAND (t, 0)) + ? ERROR_MARK + : TREE_CODE (TREE_OPERAND (t, 0))), + RECUR (TREE_OPERAND (t, 1)), + (TREE_NO_WARNING (TREE_OPERAND (t, 1)) + ? 
ERROR_MARK + : TREE_CODE (TREE_OPERAND (t, 1))), /*overloaded_p=*/NULL); case SIZEOF_EXPR: Modified: stable/8/contrib/gcc/cp/semantics.c ============================================================================== --- stable/8/contrib/gcc/cp/semantics.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/semantics.c Mon Jan 13 15:22:37 2014 (r260590) @@ -587,6 +587,16 @@ maybe_convert_cond (tree cond) /* Do the conversion. */ cond = convert_from_reference (cond); + + if (TREE_CODE (cond) == MODIFY_EXPR + && !TREE_NO_WARNING (cond) + && warn_parentheses) + { + warning (OPT_Wparentheses, + "suggest parentheses around assignment used as truth value"); + TREE_NO_WARNING (cond) = 1; + } + return condition_conversion (cond); } Modified: stable/8/contrib/gcc/cp/tree.c ============================================================================== --- stable/8/contrib/gcc/cp/tree.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/tree.c Mon Jan 13 15:22:37 2014 (r260590) @@ -1762,6 +1762,14 @@ pod_type_p (tree t) return 1; } +/* Nonzero iff type T is a class template implicit specialization. */ + +bool +class_tmpl_impl_spec_p (tree t) +{ + return CLASS_TYPE_P (t) && CLASSTYPE_TEMPLATE_INSTANTIATION (t); +} + /* Returns 1 iff zero initialization of type T means actually storing zeros in it. */ Modified: stable/8/contrib/gcc/cp/typeck.c ============================================================================== --- stable/8/contrib/gcc/cp/typeck.c Mon Jan 13 15:21:11 2014 (r260589) +++ stable/8/contrib/gcc/cp/typeck.c Mon Jan 13 15:22:37 2014 (r260590) @@ -1690,17 +1690,20 @@ rationalize_conditional_expr (enum tree_ are equal, so we know what conditional expression this used to be. */ if (TREE_CODE (t) == MIN_EXPR || TREE_CODE (t) == MAX_EXPR) { + tree op0 = TREE_OPERAND (t, 0); + tree op1 = TREE_OPERAND (t, 1); + /* The following code is incorrect if either operand side-effects. 
*/ - gcc_assert (!TREE_SIDE_EFFECTS (TREE_OPERAND (t, 0)) - && !TREE_SIDE_EFFECTS (TREE_OPERAND (t, 1))); + gcc_assert (!TREE_SIDE_EFFECTS (op0) + && !TREE_SIDE_EFFECTS (op1)); return build_conditional_expr (build_x_binary_op ((TREE_CODE (t) == MIN_EXPR ? LE_EXPR : GE_EXPR), - TREE_OPERAND (t, 0), - TREE_OPERAND (t, 1), + op0, TREE_CODE (op0), + op1, TREE_CODE (op1), /*overloaded_p=*/NULL), - build_unary_op (code, TREE_OPERAND (t, 0), 0), - build_unary_op (code, TREE_OPERAND (t, 1), 0)); + build_unary_op (code, op0, 0), + build_unary_op (code, op1, 0)); } return @@ -2908,8 +2911,8 @@ convert_arguments (tree typelist, tree v conversions on the operands. CODE is the kind of expression to build. */ tree -build_x_binary_op (enum tree_code code, tree arg1, tree arg2, - bool *overloaded_p) +build_x_binary_op (enum tree_code code, tree arg1, enum tree_code arg1_code, + tree arg2, enum tree_code arg2_code, bool *overloaded_p) { tree orig_arg1; tree orig_arg2; @@ -2933,6 +2936,17 @@ build_x_binary_op (enum tree_code code, expr = build_new_op (code, LOOKUP_NORMAL, arg1, arg2, NULL_TREE, overloaded_p); + /* Check for cases such as x+y< Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 22A0DFC9; Mon, 13 Jan 2014 15:28:40 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E7929192B; Mon, 13 Jan 2014 15:28:39 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0DFSdJM033142; Mon, 13 Jan 2014 15:28:39 GMT (envelope-from pfg@svn.freebsd.org) Received: (from pfg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0DFSdSJ033141; 
Mon, 13 Jan 2014 15:28:39 GMT (envelope-from pfg@svn.freebsd.org) Message-Id: <201401131528.s0DFSdSJ033141@svn.freebsd.org> From: "Pedro F. Giffuni" Date: Mon, 13 Jan 2014 15:28:39 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260591 - stable/8/contrib/tcpdump X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Jan 2014 15:28:40 -0000 Author: pfg Date: Mon Jan 13 15:28:39 2014 New Revision: 260591 URL: http://svnweb.freebsd.org/changeset/base/260591 Log: MFC r258573: Removes strict-aliasing warnings from newer GCC in tcpdump. Modified: stable/8/contrib/tcpdump/extract.h Directory Properties: stable/8/contrib/ (props changed) stable/8/contrib/tcpdump/ (props changed) Modified: stable/8/contrib/tcpdump/extract.h ============================================================================== --- stable/8/contrib/tcpdump/extract.h Mon Jan 13 15:22:37 2014 (r260590) +++ stable/8/contrib/tcpdump/extract.h Mon Jan 13 15:28:39 2014 (r260591) @@ -51,13 +51,25 @@ typedef struct { u_int32_t val; } __attribute__((packed)) unaligned_u_int32_t; -#define EXTRACT_16BITS(p) \ - ((u_int16_t)ntohs(((const unaligned_u_int16_t *)(p))->val)) -#define EXTRACT_32BITS(p) \ - ((u_int32_t)ntohl(((const unaligned_u_int32_t *)(p))->val)) -#define EXTRACT_64BITS(p) \ - ((u_int64_t)(((u_int64_t)ntohl(((const unaligned_u_int32_t *)(p) + 0)->val)) << 32 | \ - ((u_int64_t)ntohl(((const unaligned_u_int32_t *)(p) + 1)->val)) << 0)) +static inline u_int16_t +EXTRACT_16BITS(const void *p) +{ + return ((u_int16_t)ntohs(((const unaligned_u_int16_t *)(p))->val)); +} + +static inline u_int32_t 
+EXTRACT_32BITS(const void *p) +{ + return ((u_int32_t)ntohl(((const unaligned_u_int32_t *)(p))->val)); +} + +static inline u_int64_t +EXTRACT_64BITS(const void *p) +{ + return ((u_int64_t)(((u_int64_t)ntohl(((const unaligned_u_int32_t *)(p) + 0)->val)) << 32 | \ + ((u_int64_t)ntohl(((const unaligned_u_int32_t *)(p) + 1)->val)) << 0)); + +} #else /* HAVE___ATTRIBUTE__ */ /* @@ -88,13 +100,26 @@ typedef struct { * The processor natively handles unaligned loads, so we can just * cast the pointer and fetch through it. */ -#define EXTRACT_16BITS(p) \ - ((u_int16_t)ntohs(*(const u_int16_t *)(p))) -#define EXTRACT_32BITS(p) \ - ((u_int32_t)ntohl(*(const u_int32_t *)(p))) -#define EXTRACT_64BITS(p) \ - ((u_int64_t)(((u_int64_t)ntohl(*((const u_int32_t *)(p) + 0))) << 32 | \ - ((u_int64_t)ntohl(*((const u_int32_t *)(p) + 1))) << 0)) +static inline u_int16_t +EXTRACT_16BITS(const void *p) +{ + return ((u_int16_t)ntohs(*(const u_int16_t *)(p))); +} + +static inline u_int32_t +EXTRACT_32BITS(const void *p) +{ + return ((u_int32_t)ntohl(*(const u_int32_t *)(p))); +} + +static inline u_int64_t +EXTRACT_64BITS(const void *p) +{ + return ((u_int64_t)(((u_int64_t)ntohl(*((const u_int32_t *)(p) + 0))) << 32 | \ + ((u_int64_t)ntohl(*((const u_int32_t *)(p) + 1))) << 0)); + +} + #endif /* LBL_ALIGN */ #define EXTRACT_24BITS(p) \ From owner-svn-src-stable-8@FreeBSD.ORG Mon Jan 13 15:32:38 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 87E2520E; Mon, 13 Jan 2014 15:32:38 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 67EA719B3; Mon, 13 Jan 2014 15:32:38 +0000 (UTC) Received: from 
svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0DFWcSj036196; Mon, 13 Jan 2014 15:32:38 GMT (envelope-from pfg@svn.freebsd.org) Received: (from pfg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0DFWbwh036193; Mon, 13 Jan 2014 15:32:37 GMT (envelope-from pfg@svn.freebsd.org) Message-Id: <201401131532.s0DFWbwh036193@svn.freebsd.org> From: "Pedro F. Giffuni" Date: Mon, 13 Jan 2014 15:32:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260592 - in stable/8/contrib/gcc: . doc X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Jan 2014 15:32:38 -0000 Author: pfg Date: Mon Jan 13 15:32:37 2014 New Revision: 260592 URL: http://svnweb.freebsd.org/changeset/base/260592 Log: MFC r259920: gcc: Implement -Wmost for compatibility with clang. This is equivalent to -Wall -Wno-parentheses. 
Obtained from: Apple GCC 4.2 - 5531 Modified: stable/8/contrib/gcc/c-opts.c stable/8/contrib/gcc/c.opt stable/8/contrib/gcc/doc/invoke.texi Directory Properties: stable/8/ (props changed) stable/8/contrib/ (props changed) stable/8/contrib/gcc/ (props changed) Modified: stable/8/contrib/gcc/c-opts.c ============================================================================== --- stable/8/contrib/gcc/c-opts.c Mon Jan 13 15:28:39 2014 (r260591) +++ stable/8/contrib/gcc/c-opts.c Mon Jan 13 15:32:37 2014 (r260592) @@ -385,12 +385,17 @@ c_common_handle_option (size_t scode, co break; case OPT_Wall: + /* APPLE LOCAL -Wmost */ + case OPT_Wmost: set_Wunused (value); set_Wformat (value); set_Wimplicit (value); warn_char_subscripts = value; warn_missing_braces = value; - warn_parentheses = value; + /* APPLE LOCAL begin -Wmost --dpatel */ + if (code != OPT_Wmost) + warn_parentheses = value; + /* APPLE LOCAL end -Wmost --dpatel */ warn_return_type = value; warn_sequence_point = value; /* Was C only. */ if (c_dialect_cxx ()) Modified: stable/8/contrib/gcc/c.opt ============================================================================== --- stable/8/contrib/gcc/c.opt Mon Jan 13 15:28:39 2014 (r260591) +++ stable/8/contrib/gcc/c.opt Mon Jan 13 15:32:37 2014 (r260592) @@ -284,6 +284,12 @@ Wmissing-prototypes C ObjC Var(warn_missing_prototypes) Warn about global functions without prototypes +; APPLE LOCAL begin -Wmost +Wmost +C ObjC C++ ObjC++ +Like -Wall but without -Wparentheses +; APPLE LOCAL end -Wmost + Wmultichar C ObjC C++ ObjC++ Warn about use of multi-character character constants Modified: stable/8/contrib/gcc/doc/invoke.texi ============================================================================== --- stable/8/contrib/gcc/doc/invoke.texi Mon Jan 13 15:28:39 2014 (r260591) +++ stable/8/contrib/gcc/doc/invoke.texi Mon Jan 13 15:32:37 2014 (r260592) @@ -238,6 +238,8 @@ Objective-C and Objective-C++ Dialects}. 
-Wmain -Wmissing-braces -Wmissing-field-initializers @gol -Wmissing-format-attribute -Wmissing-include-dirs @gol -Wmissing-noreturn @gol +@c APPLE LOCAL -Wmost +-Wmost (APPLE ONLY) @gol -Wno-multichar -Wnonnull -Wno-overflow @gol -Woverlength-strings -Wpacked -Wpadded @gol -Wparentheses -Wpointer-arith -Wno-pointer-to-int-cast @gol @@ -2897,7 +2899,12 @@ that are easy to avoid (or modify to pre conjunction with macros. This also enables some language-specific warnings described in @ref{C++ Dialect Options} and @ref{Objective-C and Objective-C++ Dialect Options}. +@c APPLE LOCAL begin -Wmost +@item -Wmost +@opindex Wmost +This is equivalent to -Wall -Wno-parentheses. (Apple compatible) @end table +@c APPLE LOCAL end -Wmost The following @option{-W@dots{}} options are not implied by @option{-Wall}. Some of them warn about constructions that users generally do not From owner-svn-src-stable-8@FreeBSD.ORG Mon Jan 13 16:05:22 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 2DF46903; Mon, 13 Jan 2014 16:05:22 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 17D811D4B; Mon, 13 Jan 2014 16:05:22 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0DG5LMh047802; Mon, 13 Jan 2014 16:05:21 GMT (envelope-from pfg@svn.freebsd.org) Received: (from pfg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0DG5Jlu047784; Mon, 13 Jan 2014 16:05:19 GMT (envelope-from pfg@svn.freebsd.org) Message-Id: <201401131605.s0DG5Jlu047784@svn.freebsd.org> From: "Pedro F. 
Giffuni" Date: Mon, 13 Jan 2014 16:05:19 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260593 - in stable/8/contrib/gcc: . cp X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Jan 2014 16:05:22 -0000 Author: pfg Date: Mon Jan 13 16:05:18 2014 New Revision: 260593 URL: http://svnweb.freebsd.org/changeset/base/260593 Log: MFC r259666, r259696: gcc: warnings from -Wformat-security Obtained from: Apple GCC 4.2 - 5646 (Radar 5764921) Modified: stable/8/contrib/gcc/c-common.c stable/8/contrib/gcc/c-convert.c stable/8/contrib/gcc/c-incpath.c stable/8/contrib/gcc/c-typeck.c stable/8/contrib/gcc/cfg.c stable/8/contrib/gcc/collect2.c stable/8/contrib/gcc/cp/cvt.c stable/8/contrib/gcc/cp/pt.c stable/8/contrib/gcc/cp/typeck.c stable/8/contrib/gcc/fold-const.c stable/8/contrib/gcc/gcc.c stable/8/contrib/gcc/gcov.c stable/8/contrib/gcc/tlink.c Directory Properties: stable/8/ (props changed) stable/8/contrib/ (props changed) stable/8/contrib/gcc/ (props changed) Modified: stable/8/contrib/gcc/c-common.c ============================================================================== --- stable/8/contrib/gcc/c-common.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/c-common.c Mon Jan 13 16:05:18 2014 (r260593) @@ -5922,11 +5922,11 @@ c_parse_error (const char *gmsgid, enum message = NULL; } else - error (gmsgid); + error (gmsgid, ""); if (message) { - error (message); + error (message, ""); free (message); } #undef catenate_messages Modified: stable/8/contrib/gcc/c-convert.c ============================================================================== --- 
stable/8/contrib/gcc/c-convert.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/c-convert.c Mon Jan 13 16:05:18 2014 (r260593) @@ -80,7 +80,7 @@ convert (tree type, tree expr) if ((invalid_conv_diag = targetm.invalid_conversion (TREE_TYPE (expr), type))) { - error (invalid_conv_diag); + error (invalid_conv_diag, ""); return error_mark_node; } Modified: stable/8/contrib/gcc/c-incpath.c ============================================================================== --- stable/8/contrib/gcc/c-incpath.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/c-incpath.c Mon Jan 13 16:05:18 2014 (r260593) @@ -72,7 +72,7 @@ free_path (struct cpp_dir *path, int rea case REASON_DUP_SYS: fprintf (stderr, _("ignoring duplicate directory \"%s\"\n"), path->name); if (reason == REASON_DUP_SYS) - fprintf (stderr, + fprintf (stderr, "%s", _(" as it is a non-system directory that duplicates a system directory\n")); break; @@ -292,16 +292,16 @@ merge_include_chains (cpp_reader *pfile, { struct cpp_dir *p; - fprintf (stderr, _("#include \"...\" search starts here:\n")); + fprintf (stderr, "%s", _("#include \"...\" search starts here:\n")); for (p = heads[QUOTE];; p = p->next) { if (p == heads[BRACKET]) - fprintf (stderr, _("#include <...> search starts here:\n")); + fprintf (stderr, "%s", _("#include <...> search starts here:\n")); if (!p) break; fprintf (stderr, " %s\n", p->name); } - fprintf (stderr, _("End of search list.\n")); + fprintf (stderr, "%s", _("End of search list.\n")); } } Modified: stable/8/contrib/gcc/c-typeck.c ============================================================================== --- stable/8/contrib/gcc/c-typeck.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/c-typeck.c Mon Jan 13 16:05:18 2014 (r260593) @@ -2571,7 +2571,7 @@ convert_arguments (tree typelist, tree v else if ((invalid_func_diag = targetm.calls.invalid_arg_for_unprototyped_fn (typelist, fundecl, val))) { - error (invalid_func_diag); + error (invalid_func_diag, 
""); return error_mark_node; } else @@ -2762,7 +2762,7 @@ build_unary_op (enum tree_code code, tre if ((invalid_op_diag = targetm.invalid_unary_op (code, TREE_TYPE (xarg)))) { - error (invalid_op_diag); + error (invalid_op_diag, ""); return error_mark_node; } @@ -7802,7 +7802,7 @@ build_binary_op (enum tree_code code, tr if ((invalid_op_diag = targetm.invalid_binary_op (code, type0, type1))) { - error (invalid_op_diag); + error (invalid_op_diag, ""); return error_mark_node; } Modified: stable/8/contrib/gcc/cfg.c ============================================================================== --- stable/8/contrib/gcc/cfg.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/cfg.c Mon Jan 13 16:05:18 2014 (r260593) @@ -830,7 +830,7 @@ dump_cfg_bb_info (FILE *file, basic_bloc else fprintf (file, ", "); first = false; - fprintf (file, bb_bitnames[i]); + fprintf (file, "%s", bb_bitnames[i]); } if (!first) fprintf (file, ")"); Modified: stable/8/contrib/gcc/collect2.c ============================================================================== --- stable/8/contrib/gcc/collect2.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/collect2.c Mon Jan 13 16:05:18 2014 (r260593) @@ -1562,10 +1562,10 @@ collect_execute (const char *prog, char if (err != 0) { errno = err; - fatal_perror (errmsg); + fatal_perror ("%s", errmsg); } else - fatal (errmsg); + fatal ("%s", errmsg); } return pex; @@ -2050,10 +2050,10 @@ scan_prog_file (const char *prog_name, e if (err != 0) { errno = err; - fatal_perror (errmsg); + fatal_perror ("%s", errmsg); } else - fatal (errmsg); + fatal ("%s", errmsg); } int_handler = (void (*) (int)) signal (SIGINT, SIG_IGN); Modified: stable/8/contrib/gcc/cp/cvt.c ============================================================================== --- stable/8/contrib/gcc/cp/cvt.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/cp/cvt.c Mon Jan 13 16:05:18 2014 (r260593) @@ -615,7 +615,7 @@ ocp_convert (tree type, tree expr, int c if 
((invalid_conv_diag = targetm.invalid_conversion (TREE_TYPE (expr), type))) { - error (invalid_conv_diag); + error (invalid_conv_diag, ""); return error_mark_node; } Modified: stable/8/contrib/gcc/cp/pt.c ============================================================================== --- stable/8/contrib/gcc/cp/pt.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/cp/pt.c Mon Jan 13 16:05:18 2014 (r260593) @@ -8925,7 +8925,7 @@ tsubst_copy_and_build (tree t, /*template_arg_p=*/false, &error_msg); if (error_msg) - error (error_msg); + error ("%s", error_msg); if (!function_p && TREE_CODE (decl) == IDENTIFIER_NODE) decl = unqualified_name_lookup_error (decl); return decl; Modified: stable/8/contrib/gcc/cp/typeck.c ============================================================================== --- stable/8/contrib/gcc/cp/typeck.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/cp/typeck.c Mon Jan 13 16:05:18 2014 (r260593) @@ -3091,7 +3091,7 @@ build_binary_op (enum tree_code code, tr if ((invalid_op_diag = targetm.invalid_binary_op (code, type0, type1))) { - error (invalid_op_diag); + error (invalid_op_diag, ""); return error_mark_node; } @@ -4018,7 +4018,7 @@ build_unary_op (enum tree_code code, tre : code), TREE_TYPE (xarg)))) { - error (invalid_op_diag); + error (invalid_op_diag, ""); return error_mark_node; } Modified: stable/8/contrib/gcc/fold-const.c ============================================================================== --- stable/8/contrib/gcc/fold-const.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/fold-const.c Mon Jan 13 16:05:18 2014 (r260593) @@ -992,7 +992,7 @@ fold_overflow_warning (const char* gmsgi } } else if (issue_strict_overflow_warning (wc)) - warning (OPT_Wstrict_overflow, gmsgid); + warning (OPT_Wstrict_overflow, "%s", gmsgid); } /* Return true if the built-in mathematical function specified by CODE Modified: stable/8/contrib/gcc/gcc.c 
============================================================================== --- stable/8/contrib/gcc/gcc.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/gcc.c Mon Jan 13 16:05:18 2014 (r260593) @@ -2963,7 +2963,7 @@ execute (void) if (errmsg != NULL) { if (err == 0) - fatal (errmsg); + fatal ("%s", errmsg); else { errno = err; @@ -6514,7 +6514,7 @@ main (int argc, char **argv) if (! verbose_flag) { - printf (_("\nFor bug reporting instructions, please see:\n")); + printf ("%s", _("\nFor bug reporting instructions, please see:\n")); printf ("%s.\n", bug_report_url); return (0); Modified: stable/8/contrib/gcc/gcov.c ============================================================================== --- stable/8/contrib/gcc/gcov.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/gcov.c Mon Jan 13 16:05:18 2014 (r260593) @@ -417,7 +417,7 @@ print_version (void) fnotice (stdout, "gcov (GCC) %s\n", version_string); fprintf (stdout, "Copyright %s 2006 Free Software Foundation, Inc.\n", _("(C)")); - fnotice (stdout, + fnotice (stdout, "%s", _("This is free software; see the source for copying conditions.\n" "There is NO warranty; not even for MERCHANTABILITY or \n" "FITNESS FOR A PARTICULAR PURPOSE.\n\n")); Modified: stable/8/contrib/gcc/tlink.c ============================================================================== --- stable/8/contrib/gcc/tlink.c Mon Jan 13 15:32:37 2014 (r260592) +++ stable/8/contrib/gcc/tlink.c Mon Jan 13 16:05:18 2014 (r260593) @@ -381,7 +381,7 @@ read_repo_file (file *f) FILE *stream = fopen (f->key, "r"); if (tlink_verbose >= 2) - fprintf (stderr, _("collect: reading %s\n"), f->key); + fprintf (stderr, "%s", _("collect: reading %s\n"), f->key); while (fscanf (stream, "%c ", &c) == 1) { From owner-svn-src-stable-8@FreeBSD.ORG Mon Jan 13 19:14:30 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 
bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 4139AD9; Mon, 13 Jan 2014 19:14:30 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 2CBEB1DFC; Mon, 13 Jan 2014 19:14:30 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0DJEURD025178; Mon, 13 Jan 2014 19:14:30 GMT (envelope-from mav@svn.freebsd.org) Received: (from mav@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0DJEUHl025177; Mon, 13 Jan 2014 19:14:30 GMT (envelope-from mav@svn.freebsd.org) Message-Id: <201401131914.s0DJEUHl025177@svn.freebsd.org> From: Alexander Motin Date: Mon, 13 Jan 2014 19:14:29 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260602 - stable/8/sys/x86/cpufreq X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Jan 2014 19:14:30 -0000 Author: mav Date: Mon Jan 13 19:14:29 2014 New Revision: 260602 URL: http://svnweb.freebsd.org/changeset/base/260602 Log: MFC r259197: Do not DELAY() for P-state transition unless we want to see the result. Intel manual says: "If a transition is already in progress, transition to a new value will subsequently take effect. Reads of IA32_PERF_CTL determine the last targeted operating point." So seems it should be fine to just trigger wanted transition and go. Linux does the same. 
Modified: stable/8/sys/x86/cpufreq/est.c Directory Properties: stable/8/ (props changed) stable/8/sys/ (props changed) stable/8/sys/x86/ (props changed) Modified: stable/8/sys/x86/cpufreq/est.c ============================================================================== --- stable/8/sys/x86/cpufreq/est.c Mon Jan 13 19:08:25 2014 (r260601) +++ stable/8/sys/x86/cpufreq/est.c Mon Jan 13 19:14:29 2014 (r260602) @@ -1288,10 +1288,9 @@ est_set_id16(device_t dev, uint16_t id16 msr = (msr & ~0xffff) | id16; wrmsr(MSR_PERF_CTL, msr); - /* Wait a short while for the new setting. XXX Is this necessary? */ - DELAY(EST_TRANS_LAT); - if (need_check) { + /* Wait a short while and read the new status. */ + DELAY(EST_TRANS_LAT); est_get_id16(&new_id16); if (new_id16 != id16) { if (bootverbose) From owner-svn-src-stable-8@FreeBSD.ORG Mon Jan 13 21:29:36 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 5C3C8670; Mon, 13 Jan 2014 21:29:36 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 428A719C1; Mon, 13 Jan 2014 21:29:36 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0DLTaPE095209; Mon, 13 Jan 2014 21:29:36 GMT (envelope-from jhb@svn.freebsd.org) Received: (from jhb@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0DLTZ5e095202; Mon, 13 Jan 2014 21:29:35 GMT (envelope-from jhb@svn.freebsd.org) Message-Id: <201401132129.s0DLTZ5e095202@svn.freebsd.org> From: John Baldwin Date: Mon, 13 Jan 2014 21:29:35 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, 
svn-src-stable-8@freebsd.org Subject: svn commit: r260606 - in stable/8: share/man/man4 sys/kern tools/regression/sockets/unix_seqpacket tools/regression/sockets/unix_seqpacket_exercise usr.bin/netstat X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Jan 2014 21:29:36 -0000 Author: jhb Date: Mon Jan 13 21:29:34 2014 New Revision: 260606 URL: http://svnweb.freebsd.org/changeset/base/260606 Log: MFC 197775,197777-197779,197781,197794,243152,243313,255478: First cut at implementing SOCK_SEQPACKET support for UNIX (local) domain sockets. This allows for reliable bi-directional datagram communication over UNIX domain sockets, in contrast to SOCK_DGRAM (M:N, unreliable) or SOCK_STREAM (bi-directional bytestream). Largely, this reuses existing UNIX domain socket code. This allows applications that require record-oriented semantics to get them reliably via local IPC. 
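The record-oriented behaviour described in the log can be illustrated from userland with a small sketch. It is not part of the commit: the helper name seqpacket_roundtrip and the two payloads are made up for illustration, and it assumes a system where socketpair(2) accepts SOCK_SEQPACKET in the local domain. Two records are sent back to back and each recv(2) call is expected to return exactly one of them, even though the receive buffer could hold both.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the commit): send two records over a
 * connected SOCK_SEQPACKET pair and read them back one at a time.
 * The length of each received record is stored in lens[0] and lens[1].
 * Returns 0 on success, -1 on any socket error.
 */
int
seqpacket_roundtrip(ssize_t lens[2])
{
	static const char first[] = "hello";	/* 5 bytes, NUL not sent */
	static const char second[] = "world!!";	/* 7 bytes, NUL not sent */
	char buf[64];
	int sv[2];
	int rc = -1;

	/* A connected pair; no bind/listen/accept is needed for the demo. */
	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) == -1)
		return (-1);

	if (send(sv[0], first, strlen(first), 0) != (ssize_t)strlen(first) ||
	    send(sv[0], second, strlen(second), 0) != (ssize_t)strlen(second))
		goto out;

	/*
	 * buf is large enough for both payloads, but with SOCK_SEQPACKET
	 * each recv() is expected to return a single record.
	 */
	lens[0] = recv(sv[1], buf, sizeof(buf), 0);
	lens[1] = recv(sv[1], buf, sizeof(buf), 0);
	if (lens[0] == -1 || lens[1] == -1)
		goto out;
	rc = 0;
out:
	close(sv[0]);
	close(sv[1]);
	return (rc);
}
```

A caller would pass a two-element ssize_t array and compare the results with the sent lengths (5 and 7 here). With SOCK_STREAM in place of SOCK_SEQPACKET the first recv() could return both payloads merged, which is the distinction the log draws.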
Added: stable/8/tools/regression/sockets/unix_seqpacket/ - copied from r197781, head/tools/regression/sockets/unix_seqpacket/ - copied from r197781, head/tools/regression/sockets/unix_seqpacket_exercise/ Directory Properties: stable/8/tools/regression/sockets/unix_seqpacket_exercise/ (props changed) Modified: stable/8/share/man/man4/unix.4 stable/8/sys/kern/uipc_usrreq.c stable/8/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c stable/8/usr.bin/netstat/main.c (contents, props changed) stable/8/usr.bin/netstat/netstat.h (contents, props changed) stable/8/usr.bin/netstat/unix.c (contents, props changed) Directory Properties: stable/8/share/man/man4/ (props changed) stable/8/sys/ (props changed) stable/8/sys/kern/ (props changed) stable/8/tools/regression/sockets/ (props changed) stable/8/usr.bin/netstat/ (props changed) Modified: stable/8/share/man/man4/unix.4 ============================================================================== --- stable/8/share/man/man4/unix.4 Mon Jan 13 20:55:15 2014 (r260605) +++ stable/8/share/man/man4/unix.4 Mon Jan 13 21:29:34 2014 (r260606) @@ -32,7 +32,7 @@ .\" @(#)unix.4 8.1 (Berkeley) 6/9/93 .\" $FreeBSD$ .\" -.Dd July 15, 2001 +.Dd October 5, 2009 .Dt UNIX 4 .Os .Sh NAME @@ -52,7 +52,8 @@ mechanisms. The .Ux Ns -domain family supports the -.Dv SOCK_STREAM +.Dv SOCK_STREAM , +.Dv SOCK_SEQPACKET , and .Dv SOCK_DGRAM socket types and uses @@ -127,11 +128,14 @@ The .Ux Ns -domain protocol family is comprised of simple transport protocols that support the -.Dv SOCK_STREAM +.Dv SOCK_STREAM , +.Dv SOCK_SEQPACKET , and .Dv SOCK_DGRAM abstractions. 
.Dv SOCK_STREAM +and +.Dv SOCK_SEQPACKET sockets also support the communication of .Ux file descriptors through the use of the @@ -206,8 +210,9 @@ and tested with .Xr getsockopt 2 : .Bl -tag -width ".Dv LOCAL_CONNWAIT" .It Dv LOCAL_CREDS -This option may be enabled on a -.Dv SOCK_DGRAM +This option may be enabled on +.Dv SOCK_DGRAM , +.Dv SOCK_SEQPACKET , or a .Dv SOCK_STREAM socket. Modified: stable/8/sys/kern/uipc_usrreq.c ============================================================================== --- stable/8/sys/kern/uipc_usrreq.c Mon Jan 13 20:55:15 2014 (r260605) +++ stable/8/sys/kern/uipc_usrreq.c Mon Jan 13 21:29:34 2014 (r260606) @@ -50,7 +50,8 @@ * garbage collector to find and tear down cycles of disconnected sockets. * * TODO: - * SEQPACKET, RDM + * RDM + * distinguish datagram size limits from flow control limits in SEQPACKET * rethink name space problems * need a proper out-of-band */ @@ -113,6 +114,7 @@ static ino_t unp_ino; /* Prototype for static int unp_rights; /* (g) File descriptors in flight. */ static struct unp_head unp_shead; /* (l) List of stream sockets. */ static struct unp_head unp_dhead; /* (l) List of datagram sockets. */ +static struct unp_head unp_sphead; /* (l) List of seqpacket sockets. 
*/ struct unp_defer { SLIST_ENTRY(unp_defer) ud_link; @@ -154,10 +156,14 @@ static u_long unpst_sendspace = PIPSIZ; static u_long unpst_recvspace = PIPSIZ; static u_long unpdg_sendspace = 2*1024; /* really max datagram size */ static u_long unpdg_recvspace = 4*1024; +static u_long unpsp_sendspace = PIPSIZ; /* really max datagram size */ +static u_long unpsp_recvspace = PIPSIZ; SYSCTL_NODE(_net, PF_LOCAL, local, CTLFLAG_RW, 0, "Local domain"); SYSCTL_NODE(_net_local, SOCK_STREAM, stream, CTLFLAG_RW, 0, "SOCK_STREAM"); SYSCTL_NODE(_net_local, SOCK_DGRAM, dgram, CTLFLAG_RW, 0, "SOCK_DGRAM"); +SYSCTL_NODE(_net_local, SOCK_SEQPACKET, seqpacket, CTLFLAG_RW, 0, + "SOCK_SEQPACKET"); SYSCTL_ULONG(_net_local_stream, OID_AUTO, sendspace, CTLFLAG_RW, &unpst_sendspace, 0, "Default stream send space."); @@ -167,6 +173,10 @@ SYSCTL_ULONG(_net_local_dgram, OID_AUTO, &unpdg_sendspace, 0, "Default datagram send space."); SYSCTL_ULONG(_net_local_dgram, OID_AUTO, recvspace, CTLFLAG_RW, &unpdg_recvspace, 0, "Default datagram receive space."); +SYSCTL_ULONG(_net_local_seqpacket, OID_AUTO, maxseqpacket, CTLFLAG_RW, + &unpsp_sendspace, 0, "Default seqpacket send space."); +SYSCTL_ULONG(_net_local_seqpacket, OID_AUTO, recvspace, CTLFLAG_RW, + &unpsp_recvspace, 0, "Default seqpacket receive space."); SYSCTL_INT(_net_local, OID_AUTO, inflight, CTLFLAG_RD, &unp_rights, 0, "File descriptors in flight."); SYSCTL_INT(_net_local, OID_AUTO, deferred, CTLFLAG_RD, @@ -282,6 +292,7 @@ static void unp_process_defers(void * __ */ static struct domain localdomain; static struct pr_usrreqs uipc_usrreqs_dgram, uipc_usrreqs_stream; +static struct pr_usrreqs uipc_usrreqs_seqpacket; static struct protosw localsw[] = { { .pr_type = SOCK_STREAM, @@ -296,6 +307,20 @@ static struct protosw localsw[] = { .pr_flags = PR_ATOMIC|PR_ADDR|PR_RIGHTS, .pr_usrreqs = &uipc_usrreqs_dgram }, +{ + .pr_type = SOCK_SEQPACKET, + .pr_domain = &localdomain, + + /* + * XXXRW: For now, PR_ADDR because soreceive will bump into them 
+ * due to our use of sbappendaddr. A new sbappend variants is needed + * that supports both atomic record writes and control data. + */ + .pr_flags = PR_ADDR|PR_ATOMIC|PR_CONNREQUIRED|PR_WANTRCVD| + PR_RIGHTS, + .pr_ctloutput = &uipc_ctloutput, + .pr_usrreqs = &uipc_usrreqs_seqpacket, +}, }; static struct domain localdomain = { @@ -378,6 +403,11 @@ uipc_attach(struct socket *so, int proto recvspace = unpdg_recvspace; break; + case SOCK_SEQPACKET: + sendspace = unpsp_sendspace; + recvspace = unpsp_recvspace; + break; + default: panic("uipc_attach"); } @@ -397,8 +427,22 @@ uipc_attach(struct socket *so, int proto UNP_LIST_LOCK(); unp->unp_gencnt = ++unp_gencnt; unp_count++; - LIST_INSERT_HEAD(so->so_type == SOCK_DGRAM ? &unp_dhead : &unp_shead, - unp, unp_link); + switch (so->so_type) { + case SOCK_STREAM: + LIST_INSERT_HEAD(&unp_shead, unp, unp_link); + break; + + case SOCK_DGRAM: + LIST_INSERT_HEAD(&unp_dhead, unp, unp_link); + break; + + case SOCK_SEQPACKET: + LIST_INSERT_HEAD(&unp_sphead, unp, unp_link); + break; + + default: + panic("uipc_attach"); + } UNP_LIST_UNLOCK(); return (0); @@ -732,11 +776,8 @@ uipc_rcvd(struct socket *so, int flags) unp = sotounpcb(so); KASSERT(unp != NULL, ("uipc_rcvd: unp == NULL")); - if (so->so_type == SOCK_DGRAM) - panic("uipc_rcvd DGRAM?"); - - if (so->so_type != SOCK_STREAM) - panic("uipc_rcvd unknown socktype"); + if (so->so_type != SOCK_STREAM && so->so_type != SOCK_SEQPACKET) + panic("uipc_rcvd socktype %d", so->so_type); /* * Adjust backpressure on sender and wakeup any waiting to write. @@ -851,6 +892,7 @@ uipc_send(struct socket *so, int flags, break; } + case SOCK_SEQPACKET: case SOCK_STREAM: if ((so->so_state & SS_ISCONNECTED) == 0) { if (nam != NULL) { @@ -893,7 +935,8 @@ uipc_send(struct socket *so, int flags, SOCKBUF_LOCK(&so2->so_rcv); if (unp2->unp_flags & UNP_WANTCRED) { /* - * Credentials are passed only once on SOCK_STREAM. + * Credentials are passed only once on SOCK_STREAM + * and SOCK_SEQPACKET. 
*/ unp2->unp_flags &= ~UNP_WANTCRED; control = unp_addsockcred(td, control); @@ -902,11 +945,33 @@ uipc_send(struct socket *so, int flags, * Send to paired receive port, and then reduce send buffer * hiwater marks to maintain backpressure. Wake up readers. */ - if (control != NULL) { - if (sbappendcontrol_locked(&so2->so_rcv, m, control)) + switch (so->so_type) { + case SOCK_STREAM: + if (control != NULL) { + if (sbappendcontrol_locked(&so2->so_rcv, m, + control)) + control = NULL; + } else + sbappend_locked(&so2->so_rcv, m); + break; + + case SOCK_SEQPACKET: { + const struct sockaddr *from; + + from = &sun_noname; + if (sbappendaddr_locked(&so2->so_rcv, from, m, + control)) control = NULL; - } else - sbappend_locked(&so2->so_rcv, m); + break; + } + } + + /* + * XXXRW: While fine for SOCK_STREAM, this conflates maximum + * datagram size and back-pressure for SOCK_SEQPACKET, which + * can lead to undesired return of EMSGSIZE on send instead + * of more desirable blocking. + */ mbcnt_delta = so2->so_rcv.sb_mbcnt - unp2->unp_mbcnt; unp2->unp_mbcnt = so2->so_rcv.sb_mbcnt; sbcc = so2->so_rcv.sb_cc; @@ -969,7 +1034,8 @@ uipc_sense(struct socket *so, struct sta UNP_LINK_RLOCK(); UNP_PCB_LOCK(unp); unp2 = unp->unp_conn; - if (so->so_type == SOCK_STREAM && unp2 != NULL) { + if ((so->so_type == SOCK_STREAM || so->so_type == SOCK_SEQPACKET) && + unp2 != NULL) { so2 = unp2->unp_socket; sb->st_blksize += so2->so_rcv.sb_cc; } @@ -1039,6 +1105,26 @@ static struct pr_usrreqs uipc_usrreqs_dg .pru_close = uipc_close, }; +static struct pr_usrreqs uipc_usrreqs_seqpacket = { + .pru_abort = uipc_abort, + .pru_accept = uipc_accept, + .pru_attach = uipc_attach, + .pru_bind = uipc_bind, + .pru_connect = uipc_connect, + .pru_connect2 = uipc_connect2, + .pru_detach = uipc_detach, + .pru_disconnect = uipc_disconnect, + .pru_listen = uipc_listen, + .pru_peeraddr = uipc_peeraddr, + .pru_rcvd = uipc_rcvd, + .pru_send = uipc_send, + .pru_sense = uipc_sense, + .pru_shutdown = uipc_shutdown, + 
.pru_sockaddr = uipc_sockaddr, + .pru_soreceive = soreceive_generic, /* XXX: or...? */ + .pru_close = uipc_close, +}; + static struct pr_usrreqs uipc_usrreqs_stream = { .pru_abort = uipc_abort, .pru_accept = uipc_accept, @@ -1340,6 +1426,7 @@ unp_connect2(struct socket *so, struct s break; case SOCK_STREAM: + case SOCK_SEQPACKET: unp2->unp_conn = unp; if (req == PRU_CONNECT && ((unp->unp_flags | unp2->unp_flags) & UNP_CONNWAIT)) @@ -1377,6 +1464,7 @@ unp_disconnect(struct unpcb *unp, struct break; case SOCK_STREAM: + case SOCK_SEQPACKET: soisdisconnected(unp->unp_socket); unp2->unp_conn = NULL; soisdisconnected(unp2->unp_socket); @@ -1402,7 +1490,22 @@ unp_pcblist(SYSCTL_HANDLER_ARGS) struct unp_head *head; struct xunpcb *xu; - head = ((intptr_t)arg1 == SOCK_DGRAM ? &unp_dhead : &unp_shead); + switch ((intptr_t)arg1) { + case SOCK_STREAM: + head = &unp_shead; + break; + + case SOCK_DGRAM: + head = &unp_dhead; + break; + + case SOCK_SEQPACKET: + head = &unp_sphead; + break; + + default: + panic("unp_pcblist: arg1 %d", (int)(intptr_t)arg1); + } /* * The process of preparing the PCB list is too time-consuming and @@ -1515,6 +1618,9 @@ SYSCTL_PROC(_net_local_dgram, OID_AUTO, SYSCTL_PROC(_net_local_stream, OID_AUTO, pcblist, CTLFLAG_RD, (caddr_t)(long)SOCK_STREAM, 0, unp_pcblist, "S,xunpcb", "List of active local stream sockets"); +SYSCTL_PROC(_net_local_seqpacket, OID_AUTO, pcblist, CTLFLAG_RD, + (caddr_t)(long)SOCK_SEQPACKET, 0, unp_pcblist, "S,xunpcb", + "List of active local seqpacket sockets"); static void unp_shutdown(struct unpcb *unp) @@ -1526,7 +1632,8 @@ unp_shutdown(struct unpcb *unp) UNP_PCB_LOCK_ASSERT(unp); unp2 = unp->unp_conn; - if (unp->unp_socket->so_type == SOCK_STREAM && unp2 != NULL) { + if ((unp->unp_socket->so_type == SOCK_STREAM || + (unp->unp_socket->so_type == SOCK_SEQPACKET)) && unp2 != NULL) { so = unp2->unp_socket; if (so != NULL) socantrcvmore(so); @@ -1692,6 +1799,7 @@ unp_init(void) NULL, EVENTHANDLER_PRI_ANY); LIST_INIT(&unp_dhead); 
LIST_INIT(&unp_shead); + LIST_INIT(&unp_sphead); SLIST_INIT(&unp_defers); TASK_INIT(&unp_gc_task, 0, unp_gc, NULL); TASK_INIT(&unp_defer_task, 0, unp_process_defers, NULL); @@ -2065,7 +2173,8 @@ SYSCTL_INT(_net_local, OID_AUTO, taskcou static void unp_gc(__unused void *arg, int pending) { - struct unp_head *heads[] = { &unp_dhead, &unp_shead, NULL }; + struct unp_head *heads[] = { &unp_dhead, &unp_shead, &unp_sphead, + NULL }; struct unp_head **head; struct file *f, **unref; struct unpcb *unp; Modified: stable/8/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c ============================================================================== --- head/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c Mon Oct 5 15:27:01 2009 (r197781) +++ stable/8/tools/regression/sockets/unix_seqpacket_exercise/unix_seqpacket_exercise.c Mon Jan 13 21:29:34 2014 (r260606) @@ -50,21 +50,21 @@ __FBSDID("$FreeBSD$"); #define SEQPACKET_SNDBUF (131072-16) #define FAILERR(str) err(-1, "%s: %s", __func__, str) -#define FAILNERR(str, n) err(-1, "%s %d: %s", __func__, n, str) -#define FAILNMERR(str, n, m) err(-1, "%s %d %d: %s", __func__, n, m, str) +#define FAILNERR(str, n) err(-1, "%s %zd: %s", __func__, n, str) +#define FAILNMERR(str, n, m) err(-1, "%s %zd %d: %s", __func__, n, m, str) #define FAILERRX(str) errx(-1, "%s: %s", __func__, str) -#define FAILNERRX(str, n) errx(-1, "%s %d: %s", __func__, n, str) -#define FAILNMERRX(str, n, m) errx(-1, "%s %d %d: %s", __func__, n, m, str) +#define FAILNERRX(str, n) errx(-1, "%s %zd: %s", __func__, n, str) +#define FAILNMERRX(str, n, m) errx(-1, "%s %zd %d: %s", __func__, n, m, str) static int ann = 0; #define ANN() (ann ? warnx("%s: start", __func__) : 0) -#define ANNN(n) (ann ? warnx("%s %d: start", __func__, (n)) : 0) -#define ANNNM(n, m) (ann ? warnx("%s %d %d: start", __func__, (n), (m)) : 0) +#define ANNN(n) (ann ? warnx("%s %zd: start", __func__, (n)) : 0) +#define ANNNM(n, m) (ann ? 
warnx("%s %zd %d: start", __func__, (n), (m)):0) #define OK() warnx("%s: ok", __func__) -#define OKN(n) warnx("%s %d: ok", __func__, (n)) -#define OKNM(n, m) warnx("%s %d %d: ok", __func__, (n), (m)) +#define OKN(n) warnx("%s %zd: ok", __func__, (n)) +#define OKNM(n, m) warnx("%s %zd %d: ok", __func__, (n), (m)) #ifdef SO_NOSIGPIPE #define NEW_SOCKET(s) do { \ @@ -168,7 +168,7 @@ server(int s_listen) break; } if (ssize_send != ssize_recv) - warnx("server: recv %d sent %d", + warnx("server: recv %zd sent %zd", ssize_recv, ssize_send); } while (1); close(s_accept); Modified: stable/8/usr.bin/netstat/main.c ============================================================================== --- stable/8/usr.bin/netstat/main.c Mon Jan 13 20:55:15 2014 (r260605) +++ stable/8/usr.bin/netstat/main.c Mon Jan 13 21:29:34 2014 (r260606) @@ -186,6 +186,8 @@ static struct nlist nl[] = { { .n_name = "_mfctablesize" }, #define N_ARPSTAT 55 { .n_name = "_arpstat" }, +#define N_UNP_SPHEAD 56 + { .n_name = "unp_sphead" }, { .n_name = NULL }, }; @@ -627,7 +629,8 @@ main(int argc, char *argv[]) #endif /* NETGRAPH */ if ((af == AF_UNIX || af == AF_UNSPEC) && !sflag) unixpr(nl[N_UNP_COUNT].n_value, nl[N_UNP_GENCNT].n_value, - nl[N_UNP_DHEAD].n_value, nl[N_UNP_SHEAD].n_value); + nl[N_UNP_DHEAD].n_value, nl[N_UNP_SHEAD].n_value, + nl[N_UNP_SPHEAD].n_value); exit(0); } Modified: stable/8/usr.bin/netstat/netstat.h ============================================================================== --- stable/8/usr.bin/netstat/netstat.h Mon Jan 13 20:55:15 2014 (r260605) +++ stable/8/usr.bin/netstat/netstat.h Mon Jan 13 21:29:34 2014 (r260606) @@ -158,7 +158,7 @@ void ddp_stats(u_long, const char *, int void netgraphprotopr(u_long, const char *, int, int); #endif -void unixpr(u_long, u_long, u_long, u_long); +void unixpr(u_long, u_long, u_long, u_long, u_long); void esis_stats(u_long, const char *, int, int); void clnp_stats(u_long, const char *, int, int); Modified: stable/8/usr.bin/netstat/unix.c 
============================================================================== --- stable/8/usr.bin/netstat/unix.c Mon Jan 13 20:55:15 2014 (r260605) +++ stable/8/usr.bin/netstat/unix.c Mon Jan 13 21:29:34 2014 (r260606) @@ -193,21 +193,37 @@ fail: } void -unixpr(u_long count_off, u_long gencnt_off, u_long dhead_off, u_long shead_off) +unixpr(u_long count_off, u_long gencnt_off, u_long dhead_off, u_long shead_off, + u_long sphead_off) { char *buf; int ret, type; struct xsocket *so; struct xunpgen *xug, *oxug; struct xunpcb *xunp; + u_long head_off; for (type = SOCK_STREAM; type <= SOCK_SEQPACKET; type++) { if (live) ret = pcblist_sysctl(type, &buf); - else - ret = pcblist_kvm(count_off, gencnt_off, - type == SOCK_STREAM ? shead_off : - (type == SOCK_DGRAM ? dhead_off : 0), &buf); + else { + head_off = 0; + switch (type) { + case SOCK_STREAM: + head_off = shead_off; + break; + + case SOCK_DGRAM: + head_off = dhead_off; + break; + + case SOCK_SEQPACKET: + head_off = sphead_off; + break; + } + ret = pcblist_kvm(count_off, gencnt_off, head_off, + &buf); + } if (ret == -1) continue; if (ret < 0)
From owner-svn-src-stable-8@FreeBSD.ORG Tue Jan 14 10:03:32 2014 Message-Id: <201401141003.s0EA3Whf094040@svn.freebsd.org> From: Sergey Kandaurov Date: Tue, 14 Jan 2014 10:03:32 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260625 - stable/8/lib/libc/sys Author: pluknet Date: Tue Jan 14 10:03:31 2014 New Revision: 260625 URL: http://svnweb.freebsd.org/changeset/base/260625 Log: MFC r259921,259950: Provide the manual page for aio_fsync(2). Added: stable/8/lib/libc/sys/aio_fsync.2 - copied, changed from r259921, head/lib/libc/sys/aio_fsync.2 Modified: stable/8/lib/libc/sys/Makefile.inc Directory Properties: stable/8/lib/libc/ (props changed) stable/8/lib/libc/sys/ (props changed) Modified: stable/8/lib/libc/sys/Makefile.inc ============================================================================== --- stable/8/lib/libc/sys/Makefile.inc Tue Jan 14 09:58:33 2014 (r260624) +++ stable/8/lib/libc/sys/Makefile.inc Tue Jan 14 10:03:31 2014 (r260625) @@ -65,7 +65,7 @@ ${SPSEUDO}: >> ${.TARGET} MAN+= abort2.2 accept.2 access.2 acct.2 adjtime.2 \ - aio_cancel.2 aio_error.2 aio_read.2 aio_return.2 \ + aio_cancel.2 aio_error.2 aio_fsync.2 aio_read.2 aio_return.2 \ aio_suspend.2 aio_waitcomplete.2 aio_write.2 \ bind.2 brk.2 chdir.2 chflags.2 \ chmod.2 chown.2 chroot.2 clock_gettime.2 close.2 closefrom.2 \ Copied and modified: stable/8/lib/libc/sys/aio_fsync.2 (from r259921, head/lib/libc/sys/aio_fsync.2)
============================================================================== --- head/lib/libc/sys/aio_fsync.2 Thu Dec 26 19:16:30 2013 (r259921, copy source) +++ stable/8/lib/libc/sys/aio_fsync.2 Tue Jan 14 10:03:31 2014 (r260625) @@ -24,7 +24,7 @@ .\" .\" $FreeBSD$ .\" -.Dd May 4, 2013 +.Dd December 27, 2013 .Dt AIO_FSYNC 2 .Os .Sh NAME @@ -49,7 +49,7 @@ completed at the time the call returns. .Pp The .Fa op -argument could be set only to +argument can only be set to .Dv O_SYNC to cause all currently queued I/O operations to be completed as if by a call to @@ -109,7 +109,8 @@ returned in .It Bq Er EBADF The .Fa iocb->aio_fildes -is invalid for writing. +argument +is not a valid descriptor. .It Bq Er EINVAL This implementation does not support synchronized I/O for this file. .El
From owner-svn-src-stable-8@FreeBSD.ORG Tue Jan 14 19:17:21 2014 Message-Id: <201401141917.s0EJHLHU008676@svn.freebsd.org> From: Xin LI Date: Tue, 14 Jan 2014 19:17:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260642 - in stable: 8/contrib/bsnmp/lib 9/contrib/bsnmp/lib Author: delphij Date: Tue Jan 14 19:17:20 2014 New Revision: 260642 URL: http://svnweb.freebsd.org/changeset/base/260642 Log: MFC r260636: Fix bsnmpd remote denial of service vulnerability. Reported by: dinoex Submitted by: harti Security: FreeBSD-SA-14:01.bsnmpd Security: CVE-2014-1452 Modified: stable/8/contrib/bsnmp/lib/snmpagent.c Directory Properties: stable/8/contrib/bsnmp/ (props changed) Changes in other areas also in this revision: Modified: stable/9/contrib/bsnmp/lib/snmpagent.c Directory Properties: stable/9/contrib/bsnmp/ (props changed) Modified: stable/8/contrib/bsnmp/lib/snmpagent.c ============================================================================== --- stable/8/contrib/bsnmp/lib/snmpagent.c Tue Jan 14 19:12:40 2014 (r260641) +++ stable/8/contrib/bsnmp/lib/snmpagent.c Tue Jan 14 19:17:20 2014 (r260642) @@ -488,6 +488,11 @@ snmp_getbulk(struct snmp_pdu *pdu, struc for (cnt = 0; cnt < pdu->error_index; cnt++) { eomib = 1; for (i = non_rep; i < pdu->nbindings; i++) { + + if (resp->nbindings == SNMP_MAX_BINDINGS) + /* PDU is full */ + goto done; + if (cnt == 0) result = do_getnext(&context, &pdu->bindings[i], &resp->bindings[resp->nbindings], pdu);
From owner-svn-src-stable-8@FreeBSD.ORG Tue Jan 14 19:20:42 2014 Message-Id: <201401141920.s0EJKgTc009557@svn.freebsd.org> From: Xin LI Date: Tue, 14 Jan 2014 19:20:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260643 - in stable: 8/contrib/ntp/ntpd 9/contrib/ntp/ntpd Author: delphij Date: Tue Jan 14 19:20:41 2014 New Revision: 260643 URL: http://svnweb.freebsd.org/changeset/base/260643 Log: MFC r260637: Disable 'monitor' feature in ntpd by default.
Security: FreeBSD-SA-14:02.ntpd Approved by: so Modified: stable/8/contrib/ntp/ntpd/ntp_config.c Directory Properties: stable/8/contrib/ntp/ (props changed) Changes in other areas also in this revision: Modified: stable/9/contrib/ntp/ntpd/ntp_config.c Directory Properties: stable/9/contrib/ntp/ (props changed) Modified: stable/8/contrib/ntp/ntpd/ntp_config.c ============================================================================== --- stable/8/contrib/ntp/ntpd/ntp_config.c Tue Jan 14 19:17:20 2014 (r260642) +++ stable/8/contrib/ntp/ntpd/ntp_config.c Tue Jan 14 19:20:41 2014 (r260643) @@ -597,6 +597,8 @@ getconfig( #endif /* not SYS_WINNT */ } + proto_config(PROTO_MONITOR, 0, 0., NULL); + for (;;) { if (tok == CONFIG_END) break;
From owner-svn-src-stable-8@FreeBSD.ORG Tue Jan 14 19:27:43 2014 Message-Id: <201401141927.s0EJRgY8012591@svn.freebsd.org> From: Xin LI Date: Tue, 14 Jan 2014 19:27:42 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260644 - in stable: 8/sys/dev/random 9/sys/dev/random Author: delphij Date: Tue Jan 14 19:27:42 2014 New Revision: 260644 URL: http://svnweb.freebsd.org/changeset/base/260644 Log: On stable/8 and stable/9, disable hardware random number generators by default. This is a direct commit to stable/ branches because HEAD and stable/10 have superior implementation of random device. Approved by: so Modified: stable/8/sys/dev/random/probe.c Changes in other areas also in this revision: Modified: stable/9/sys/dev/random/probe.c Modified: stable/8/sys/dev/random/probe.c ============================================================================== --- stable/8/sys/dev/random/probe.c Tue Jan 14 19:20:41 2014 (r260643) +++ stable/8/sys/dev/random/probe.c Tue Jan 14 19:27:42 2014 (r260644) @@ -73,7 +73,7 @@ random_ident_hardware(struct random_syst if (via_feature_rng & VIA_HAS_RNG) { int enable; - enable = 1; + enable = 0; TUNABLE_INT_FETCH("hw.nehemiah_rng_enable", &enable); if (enable) *systat = random_nehemiah; @@ -83,7 +83,7 @@ random_ident_hardware(struct random_syst if (cpu_feature2 & CPUID2_RDRAND) { int enable; - enable = 1; + enable = 0; TUNABLE_INT_FETCH("hw.ivy_rng_enable", &enable); if (enable) *systat = random_ivy;
From owner-svn-src-stable-8@FreeBSD.ORG Tue Jan 14 19:33:29 2014 Message-Id: <201401141933.s0EJXSlq015911@svn.freebsd.org> From: Xin LI Date: Tue, 14 Jan 2014 19:33:28 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260645 - stable/8/sys/vm Author: delphij Date: Tue Jan 14 19:33:28 2014 New Revision: 260645 URL: http://svnweb.freebsd.org/changeset/base/260645 Log: MFC r259951 (kib): Do not coalesce stack entry.
Pass MAP_STACK_GROWS_DOWN and MAP_STACK_GROWS_UP flags to vm_map_insert() from vm_map_stack() Modified: stable/8/sys/vm/vm_map.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/vm/ (props changed) Modified: stable/8/sys/vm/vm_map.c ============================================================================== --- stable/8/sys/vm/vm_map.c Tue Jan 14 19:27:42 2014 (r260644) +++ stable/8/sys/vm/vm_map.c Tue Jan 14 19:33:28 2014 (r260645) @@ -1217,6 +1217,7 @@ charged: } else if ((prev_entry != &map->header) && (prev_entry->eflags == protoeflags) && + (cow & (MAP_ENTRY_GROWS_DOWN | MAP_ENTRY_GROWS_UP)) == 0 && (prev_entry->end == start) && (prev_entry->wired_count == 0) && (prev_entry->uip == uip || @@ -3189,7 +3190,6 @@ vm_map_stack(vm_map_t map, vm_offset_t a * NOTE: We explicitly allow bi-directional stacks. */ orient = cow & (MAP_STACK_GROWS_DOWN|MAP_STACK_GROWS_UP); - cow &= ~orient; KASSERT(orient != 0, ("No stack grow direction")); if (addrbos < vm_map_min(map) ||
From owner-svn-src-stable-8@FreeBSD.ORG Tue Jan 14 19:38:38 2014 Message-Id: <201401141938.s0EJcboo016532@svn.freebsd.org> From: Xin LI Date: Tue, 14 Jan 2014 19:38:37 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260646 - in stable: 8/contrib/bind9/bin/named 9/contrib/bind9/bin/named Author: delphij Date: Tue Jan 14 19:38:37 2014 New Revision: 260646 URL: http://svnweb.freebsd.org/changeset/base/260646 Log: Fix BIND remote denial of service vulnerability. Security: FreeBSD-SA-14:04.bind Security: CVE-2014-0591 Modified: stable/8/contrib/bind9/bin/named/query.c Changes in other areas also in this revision: Modified: stable/9/contrib/bind9/bin/named/query.c Modified: stable/8/contrib/bind9/bin/named/query.c ============================================================================== --- stable/8/contrib/bind9/bin/named/query.c Tue Jan 14 19:33:28 2014 (r260645) +++ stable/8/contrib/bind9/bin/named/query.c Tue Jan 14 19:38:37 2014 (r260646) @@ -5088,8 +5088,7 @@ query_findclosestnsec3(dns_name_t *qname dns_fixedname_t fixed; dns_hash_t hash; dns_name_t name; - int order; - unsigned int count; + unsigned int skip = 0, labels; dns_rdata_nsec3_t nsec3; dns_rdata_t rdata = DNS_RDATA_INIT; isc_boolean_t optout; @@ -5102,6 +5101,7 @@ query_findclosestnsec3(dns_name_t *qname dns_name_init(&name, NULL); dns_name_clone(qname, &name); + labels = dns_name_countlabels(&name); /* * Map unknown algorithm to known value.
@@ -5133,13 +5133,14 @@ query_findclosestnsec3(dns_name_t *qname dns_rdata_reset(&rdata); optout = ISC_TF((nsec3.flags & DNS_NSEC3FLAG_OPTOUT) != 0); if (found != NULL && optout && - dns_name_fullcompare(&name, dns_db_origin(db), &order, - &count) == dns_namereln_subdomain) { + dns_name_issubdomain(&name, dns_db_origin(db))) + { dns_rdataset_disassociate(rdataset); if (dns_rdataset_isassociated(sigrdataset)) dns_rdataset_disassociate(sigrdataset); - count = dns_name_countlabels(&name) - 1; - dns_name_getlabelsequence(&name, 1, count, &name); + skip++; + dns_name_getlabelsequence(qname, skip, labels - skip, + &name); ns_client_log(client, DNS_LOGCATEGORY_DNSSEC, NS_LOGMODULE_QUERY, ISC_LOG_DEBUG(3), "looking for closest provable encloser"); @@ -5157,7 +5158,11 @@ query_findclosestnsec3(dns_name_t *qname ns_client_log(client, DNS_LOGCATEGORY_DNSSEC, NS_LOGMODULE_QUERY, ISC_LOG_WARNING, "expected covering NSEC3, got an exact match"); - if (found != NULL) + if (found == qname) { + if (skip != 0U) + dns_name_getlabelsequence(qname, skip, labels - skip, + found); + } else if (found != NULL) dns_name_copy(&name, found, NULL); return; }
From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:05:08 2014 Message-Id: <201401161405.s0GE56Vh024476@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 14:05:06 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260721 - in stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys Author: avg Date: Thu Jan 16 14:05:05 2014 New Revision: 260721 URL: http://svnweb.freebsd.org/changeset/base/260721 Log: MFC r253821,254753,256259 MFV r253783: 3834 incremental replication of 'holey' file systems is slow MFV r254747:4047 panic from dbuf_free_range() from dmu_free_object() while doing zfs receive MFV r255257: 4082 zfs receive gets EFBIG from dmu_tx_hold_free() Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props
changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Thu Jan 16 13:58:55 2014 (r260720) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Thu Jan 16 14:05:05 2014 (r260721) @@ -26,6 +26,7 @@ #include #include +#include #include #include #include @@ -38,6 +39,12 @@ #include #include +/* + * Number of times that zfs_free_range() took the slow path while doing + * a zfs receive. A nonzero value indicates a potential performance problem. + */ +uint64_t zfs_free_range_recv_miss; + static void dbuf_destroy(dmu_buf_impl_t *db); static boolean_t dbuf_undirty(dmu_buf_impl_t *db, dmu_tx_t *tx); static void dbuf_write(dbuf_dirty_record_t *dr, arc_buf_t *data, dmu_tx_t *tx); @@ -793,9 +800,12 @@ dbuf_unoverride(dbuf_dirty_record_t *dr) /* * Evict (if its unreferenced) or clear (if its referenced) any level-0 * data blocks in the free range, so that any future readers will find - * empty blocks. Also, if we happen accross any level-1 dbufs in the + * empty blocks. Also, if we happen across any level-1 dbufs in the * range that have not already been marked dirty, mark them dirty so * they stay in memory. + * + * This is a no-op if the dataset is in the middle of an incremental + * receive; see comment below for details. */ void dbuf_free_range(dnode_t *dn, uint64_t start, uint64_t end, dmu_tx_t *tx) @@ -811,7 +821,23 @@ dbuf_free_range(dnode_t *dn, uint64_t st last_l1 = end >> epbs; } dprintf_dnode(dn, "start=%llu end=%llu\n", start, end); + mutex_enter(&dn->dn_dbufs_mtx); + if (start >= dn->dn_unlisted_l0_blkid * dn->dn_datablksz) { + /* There can't be any dbufs in this range; no need to search. 
*/ + mutex_exit(&dn->dn_dbufs_mtx); + return; + } else if (dmu_objset_is_receiving(dn->dn_objset)) { + /* + * If we are receiving, we expect there to be no dbufs in + * the range to be freed, because receive modifies each + * block at most once, and in offset order. If this is + * not the case, it can lead to performance problems, + * so note that we unexpectedly took the slow path. + */ + atomic_inc_64(&zfs_free_range_recv_miss); + } + for (db = list_head(&dn->dn_dbufs); db; db = db_next) { db_next = list_next(&dn->dn_dbufs, db); ASSERT(db->db_blkid != DMU_BONUS_BLKID); @@ -1699,6 +1725,9 @@ dbuf_create(dnode_t *dn, uint8_t level, return (odb); } list_insert_head(&dn->dn_dbufs, db); + if (db->db_level == 0 && db->db_blkid >= + dn->dn_unlisted_l0_blkid) + dn->dn_unlisted_l0_blkid = db->db_blkid + 1; db->db_state = DB_UNCACHED; mutex_exit(&dn->dn_dbufs_mtx); arc_space_consume(sizeof (dmu_buf_impl_t), ARC_SPACE_OTHER); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Thu Jan 16 13:58:55 2014 (r260720) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Thu Jan 16 14:05:05 2014 (r260721) @@ -573,98 +573,93 @@ dmu_prefetch(objset_t *os, uint64_t obje * the end so that the file gets shorter over time (if we crashes in the * middle, this will leave us in a better state). We find allocated file * data by simply searching the allocated level 1 indirects. + * + * On input, *start should be the first offset that does not need to be + * freed (e.g. "offset + length"). On return, *start will be the first + * offset that should be freed. 
  */
 static int
-get_next_chunk(dnode_t *dn, uint64_t *start, uint64_t limit)
+get_next_chunk(dnode_t *dn, uint64_t *start, uint64_t minimum)
 {
-	uint64_t len = *start - limit;
-	uint64_t blkcnt = 0;
-	uint64_t maxblks = DMU_MAX_ACCESS / (1ULL << (dn->dn_indblkshift + 1));
+	uint64_t maxblks = DMU_MAX_ACCESS >> (dn->dn_indblkshift + 1);
+	/* bytes of data covered by a level-1 indirect block */
 	uint64_t iblkrange =
 	    dn->dn_datablksz * EPB(dn->dn_indblkshift, SPA_BLKPTRSHIFT);

-	ASSERT(limit <= *start);
+	ASSERT3U(minimum, <=, *start);

-	if (len <= iblkrange * maxblks) {
-		*start = limit;
+	if (*start - minimum <= iblkrange * maxblks) {
+		*start = minimum;
 		return (0);
 	}
 	ASSERT(ISP2(iblkrange));

-	while (*start > limit && blkcnt < maxblks) {
+	for (uint64_t blks = 0; *start > minimum && blks < maxblks; blks++) {
 		int err;

-		/* find next allocated L1 indirect */
+		/*
+		 * dnode_next_offset(BACKWARDS) will find an allocated L1
+		 * indirect block at or before the input offset. We must
+		 * decrement *start so that it is at the end of the region
+		 * to search.
+		 */
+		(*start)--;
 		err = dnode_next_offset(dn,
 		    DNODE_FIND_BACKWARDS, start, 2, 1, 0);

-		/* if there are no more, then we are done */
+		/* if there are no indirect blocks before start, we are done */
 		if (err == ESRCH) {
-			*start = limit;
-			return (0);
-		} else if (err) {
+			*start = minimum;
+			break;
+		} else if (err != 0) {
 			return (err);
 		}
-		blkcnt += 1;

-		/* reset offset to end of "next" block back */
+		/* set start to the beginning of this L1 indirect */
 		*start = P2ALIGN(*start, iblkrange);
-		if (*start <= limit)
-			*start = limit;
-		else
-			*start -= 1;
 	}
+	if (*start < minimum)
+		*start = minimum;
 	return (0);
 }

 static int
 dmu_free_long_range_impl(objset_t *os, dnode_t *dn, uint64_t offset,
-    uint64_t length, boolean_t free_dnode)
+    uint64_t length)
 {
-	dmu_tx_t *tx;
-	uint64_t object_size, start, end, len;
-	boolean_t trunc = (length == DMU_OBJECT_END);
-	int align, err;
-
-	align = 1 << dn->dn_datablkshift;
-	ASSERT(align > 0);
-	object_size = align == 1 ? dn->dn_datablksz :
-	    (dn->dn_maxblkid + 1) << dn->dn_datablkshift;
-
-	end = offset + length;
-	if (trunc || end > object_size)
-		end = object_size;
-	if (end <= offset)
+	uint64_t object_size = (dn->dn_maxblkid + 1) * dn->dn_datablksz;
+	int err;
+
+	if (offset >= object_size)
 		return (0);
-	length = end - offset;

-	while (length) {
-		start = end;
-		/* assert(offset <= start) */
-		err = get_next_chunk(dn, &start, offset);
+	if (length == DMU_OBJECT_END || offset + length > object_size)
+		length = object_size - offset;
+
+	while (length != 0) {
+		uint64_t chunk_end, chunk_begin;
+
+		chunk_end = chunk_begin = offset + length;
+
+		/* move chunk_begin backwards to the beginning of this chunk */
+		err = get_next_chunk(dn, &chunk_begin, offset);
 		if (err)
 			return (err);
-		len = trunc ? DMU_OBJECT_END : end - start;
+		ASSERT3U(chunk_begin, >=, offset);
+		ASSERT3U(chunk_begin, <=, chunk_end);

-		tx = dmu_tx_create(os);
-		dmu_tx_hold_free(tx, dn->dn_object, start, len);
+		dmu_tx_t *tx = dmu_tx_create(os);
+		dmu_tx_hold_free(tx, dn->dn_object,
+		    chunk_begin, chunk_end - chunk_begin);
 		err = dmu_tx_assign(tx, TXG_WAIT);
 		if (err) {
 			dmu_tx_abort(tx);
 			return (err);
 		}
-
-		dnode_free_range(dn, start, trunc ? -1 : len, tx);
-
-		if (start == 0 && free_dnode) {
-			ASSERT(trunc);
-			dnode_free(dn, tx);
-		}
-
-		length -= end - start;
-
+		dnode_free_range(dn, chunk_begin, chunk_end - chunk_begin, tx);
 		dmu_tx_commit(tx);
-		end = start;
+
+		length -= chunk_end - chunk_begin;
 	}
 	return (0);
 }
@@ -679,38 +674,42 @@ dmu_free_long_range(objset_t *os, uint64
 	err = dnode_hold(os, object, FTAG, &dn);
 	if (err != 0)
 		return (err);
-	err = dmu_free_long_range_impl(os, dn, offset, length, FALSE);
+	err = dmu_free_long_range_impl(os, dn, offset, length);
+
+	/*
+	 * It is important to zero out the maxblkid when freeing the entire
+	 * file, so that (a) subsequent calls to dmu_free_long_range_impl()
+	 * will take the fast path, and (b) dnode_reallocate() can verify
+	 * that the entire file has been freed.
+	 */
+	if (offset == 0 && length == DMU_OBJECT_END)
+		dn->dn_maxblkid = 0;
+
 	dnode_rele(dn, FTAG);
 	return (err);
 }

 int
-dmu_free_object(objset_t *os, uint64_t object)
+dmu_free_long_object(objset_t *os, uint64_t object)
 {
-	dnode_t *dn;
 	dmu_tx_t *tx;
 	int err;

-	err = dnode_hold_impl(os, object, DNODE_MUST_BE_ALLOCATED,
-	    FTAG, &dn);
+	err = dmu_free_long_range(os, object, 0, DMU_OBJECT_END);
 	if (err != 0)
 		return (err);
-	if (dn->dn_nlevels == 1) {
-		tx = dmu_tx_create(os);
-		dmu_tx_hold_bonus(tx, object);
-		dmu_tx_hold_free(tx, dn->dn_object, 0, DMU_OBJECT_END);
-		err = dmu_tx_assign(tx, TXG_WAIT);
-		if (err == 0) {
-			dnode_free_range(dn, 0, DMU_OBJECT_END, tx);
-			dnode_free(dn, tx);
-			dmu_tx_commit(tx);
-		} else {
-			dmu_tx_abort(tx);
-		}
+
+	tx = dmu_tx_create(os);
+	dmu_tx_hold_bonus(tx, object);
+	dmu_tx_hold_free(tx, object, 0, DMU_OBJECT_END);
+	err = dmu_tx_assign(tx, TXG_WAIT);
+	if (err == 0) {
+		err = dmu_object_free(os, object, tx);
+		dmu_tx_commit(tx);
 	} else {
-		err = dmu_free_long_range_impl(os, dn, 0, DMU_OBJECT_END, TRUE);
+		dmu_tx_abort(tx);
 	}
-	dnode_rele(dn, FTAG);
+
 	return (err);
 }


Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c	Thu Jan 16 14:05:05 2014	(r260721)
@@ -96,6 +96,32 @@ dump_free(dmu_sendarg_t *dsp, uint64_t o
 {
 	struct drr_free *drrf = &(dsp->dsa_drr->drr_u.drr_free);

+	/*
+	 * When we receive a free record, dbuf_free_range() assumes
+	 * that the receiving system doesn't have any dbufs in the range
+	 * being freed. This is always true because there is a one-record
+	 * constraint: we only send one WRITE record for any given
+	 * object+offset. We know that the one-record constraint is
+	 * true because we always send data in increasing order by
+	 * object,offset.
+	 *
+	 * If the increasing-order constraint ever changes, we should find
+	 * another way to assert that the one-record constraint is still
+	 * satisfied.
+	 */
+	ASSERT(object > dsp->dsa_last_data_object ||
+	    (object == dsp->dsa_last_data_object &&
+	    offset > dsp->dsa_last_data_offset));
+
+	/*
+	 * If we are doing a non-incremental send, then there can't
+	 * be any data in the dataset we're receiving into. Therefore
+	 * a free record would simply be a no-op. Save space by not
+	 * sending it to begin with.
+	 */
+	if (!dsp->dsa_incremental)
+		return (0);
+
 	if (length != -1ULL && offset + length < offset)
 		length = -1ULL;

@@ -162,6 +188,15 @@ dump_data(dmu_sendarg_t *dsp, dmu_object
 {
 	struct drr_write *drrw = &(dsp->dsa_drr->drr_u.drr_write);

+	/*
+	 * We send data in increasing object, offset order.
+	 * See comment in dump_free() for details.
+	 */
+	ASSERT(object > dsp->dsa_last_data_object ||
+	    (object == dsp->dsa_last_data_object &&
+	    offset > dsp->dsa_last_data_offset));
+	dsp->dsa_last_data_object = object;
+	dsp->dsa_last_data_offset = offset + blksz - 1;

 	/*
 	 * If there is any kind of pending aggregation (currently either
@@ -229,6 +264,10 @@ dump_freeobjects(dmu_sendarg_t *dsp, uin
 {
 	struct drr_freeobjects *drrfo = &(dsp->dsa_drr->drr_u.drr_freeobjects);

+	/* See comment in dump_free(). */
+	if (!dsp->dsa_incremental)
+		return (0);
+
 	/*
 	 * If there is a pending op, but it's not PENDING_FREEOBJECTS,
 	 * push it out, since free block aggregation can only be done for
@@ -305,9 +344,9 @@ dump_dnode(dmu_sendarg_t *dsp, uint64_t
 	if (dump_bytes(dsp, DN_BONUS(dnp), P2ROUNDUP(dnp->dn_bonuslen, 8)) != 0)
 		return (SET_ERROR(EINTR));

-	/* free anything past the end of the file */
+	/* Free anything past the end of the file. */
 	if (dump_free(dsp, object, (dnp->dn_maxblkid + 1) *
-	    (dnp->dn_datablkszsec << SPA_MINBLOCKSHIFT), -1ULL))
+	    (dnp->dn_datablkszsec << SPA_MINBLOCKSHIFT), -1ULL) != 0)
 		return (SET_ERROR(EINTR));
 	if (dsp->dsa_err != 0)
 		return (SET_ERROR(EINTR));
@@ -495,6 +534,7 @@ dmu_send_impl(void *tag, dsl_pool_t *dp,
 	dsp->dsa_toguid = ds->ds_phys->ds_guid;
 	ZIO_SET_CHECKSUM(&dsp->dsa_zc, 0, 0, 0, 0);
 	dsp->dsa_pending_op = PENDING_NONE;
+	dsp->dsa_incremental = (fromtxg != 0);

 	mutex_enter(&ds->ds_sendstream_lock);
 	list_insert_head(&ds->ds_sendstreams, dsp);
@@ -1238,7 +1278,7 @@ restore_freeobjects(struct restorearg *r
 		if (dmu_object_info(os, obj, NULL) != 0)
 			continue;

-		err = dmu_free_object(os, obj);
+		err = dmu_free_long_object(os, obj);
 		if (err != 0)
 			return (err);
 	}
@@ -1764,3 +1804,13 @@ dmu_recv_end(dmu_recv_cookie_t *drc, voi
 	else
 		return (dmu_recv_existing_end(drc));
 }
+
+/*
+ * Return TRUE if this objset is currently being received into.
+ */
+boolean_t
+dmu_objset_is_receiving(objset_t *os)
+{
+	return (os->os_dsl_dataset != NULL &&
+	    os->os_dsl_dataset->ds_owner == dmu_recv_tag);
+}

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c	Thu Jan 16 14:05:05 2014	(r260721)
@@ -587,8 +587,7 @@ dmu_tx_hold_free(dmu_tx_t *tx, uint64_t
 {
 	dmu_tx_hold_t *txh;
 	dnode_t *dn;
-	uint64_t start, end, i;
-	int err, shift;
+	int err;
 	zio_t *zio;

 	ASSERT(tx->tx_txg == 0);
@@ -599,34 +598,49 @@ dmu_tx_hold_free(dmu_tx_t *tx, uint64_t
 		return;
 	dn = txh->txh_dnode;

-	/* first block */
-	if (off != 0)
-		dmu_tx_count_write(txh, off, 1);
-	/* last block */
-	if (len != DMU_OBJECT_END)
-		dmu_tx_count_write(txh, off+len, 1);
-
-	dmu_tx_count_dnode(txh);
-
 	if (off >= (dn->dn_maxblkid+1) * dn->dn_datablksz)
 		return;
 	if (len == DMU_OBJECT_END)
 		len = (dn->dn_maxblkid+1) * dn->dn_datablksz - off;

+	dmu_tx_count_dnode(txh);
+
 	/*
-	 * For i/o error checking, read the first and last level-0
-	 * blocks, and all the level-1 blocks. The above count_write's
-	 * have already taken care of the level-0 blocks.
+	 * For i/o error checking, we read the first and last level-0
+	 * blocks if they are not aligned, and all the level-1 blocks.
+	 *
+	 * Note: dbuf_free_range() assumes that we have not instantiated
+	 * any level-0 dbufs that will be completely freed. Therefore we must
+	 * exercise care to not read or count the first and last blocks
+	 * if they are blocksize-aligned.
+	 */
+	if (dn->dn_datablkshift == 0) {
+		if (off != 0 || len < dn->dn_datablksz)
+			dmu_tx_count_write(txh, 0, dn->dn_datablksz);
+	} else {
+		/* first block will be modified if it is not aligned */
+		if (!IS_P2ALIGNED(off, 1 << dn->dn_datablkshift))
+			dmu_tx_count_write(txh, off, 1);
+		/* last block will be modified if it is not aligned */
+		if (!IS_P2ALIGNED(off + len, 1 << dn->dn_datablkshift))
+			dmu_tx_count_write(txh, off+len, 1);
+	}
+
+	/*
+	 * Check level-1 blocks.
 	 */
 	if (dn->dn_nlevels > 1) {
-		shift = dn->dn_datablkshift + dn->dn_indblkshift -
+		int shift = dn->dn_datablkshift + dn->dn_indblkshift -
 		    SPA_BLKPTRSHIFT;
-		start = off >> shift;
-		end = dn->dn_datablkshift ? ((off+len) >> shift) : 0;
+		uint64_t start = off >> shift;
+		uint64_t end = (off + len) >> shift;
+
+		ASSERT(dn->dn_datablkshift != 0);
+		ASSERT(dn->dn_indblkshift != 0);

 		zio = zio_root(tx->tx_pool->dp_spa,
 		    NULL, NULL, ZIO_FLAG_CANFAIL);
-		for (i = start; i <= end; i++) {
+		for (uint64_t i = start; i <= end; i++) {
 			uint64_t ibyte = i << shift;
 			err = dnode_next_offset(dn, 0, &ibyte, 2, 1, 0);
 			i = ibyte >> shift;

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c	Thu Jan 16 14:05:05 2014	(r260721)
@@ -117,6 +117,7 @@ dnode_cons(void *arg, void *unused, int
 	dn->dn_id_flags = 0;

 	dn->dn_dbufs_count = 0;
+	dn->dn_unlisted_l0_blkid = 0;
 	list_create(&dn->dn_dbufs, sizeof (dmu_buf_impl_t),
 	    offsetof(dmu_buf_impl_t, db_link));

@@ -170,6 +171,7 @@ dnode_dest(void *arg, void *unused)
 	ASSERT0(dn->dn_id_flags);

 	ASSERT0(dn->dn_dbufs_count);
+	ASSERT0(dn->dn_unlisted_l0_blkid);
 	list_destroy(&dn->dn_dbufs);
 }

@@ -475,6 +477,7 @@ dnode_destroy(dnode_t *dn)
 	dn->dn_newuid = 0;
 	dn->dn_newgid = 0;
 	dn->dn_id_flags = 0;
+	dn->dn_unlisted_l0_blkid = 0;

 	dmu_zfetch_rele(&dn->dn_zfetch);
 	kmem_cache_free(dnode_cache, dn);
@@ -705,6 +708,7 @@ dnode_move_impl(dnode_t *odn, dnode_t *n
 	ASSERT(list_is_empty(&ndn->dn_dbufs));
 	list_move_tail(&ndn->dn_dbufs, &odn->dn_dbufs);
 	ndn->dn_dbufs_count = odn->dn_dbufs_count;
+	ndn->dn_unlisted_l0_blkid = odn->dn_unlisted_l0_blkid;
 	ndn->dn_bonus = odn->dn_bonus;
 	ndn->dn_have_spill = odn->dn_have_spill;
 	ndn->dn_zio = odn->dn_zio;
@@ -739,6 +743,7 @@ dnode_move_impl(dnode_t *odn, dnode_t *n
 	list_create(&odn->dn_dbufs, sizeof (dmu_buf_impl_t),
 	    offsetof(dmu_buf_impl_t, db_link));
 	odn->dn_dbufs_count = 0;
+	odn->dn_unlisted_l0_blkid = 0;
 	odn->dn_bonus = NULL;
 	odn->dn_zfetch.zf_dnode = NULL;

@@ -1528,7 +1533,7 @@ dnode_free_range(dnode_t *dn, uint64_t o
 	blkshift = dn->dn_datablkshift;
 	epbs = dn->dn_indblkshift - SPA_BLKPTRSHIFT;

-	if (len == -1ULL) {
+	if (len == DMU_OBJECT_END) {
 		len = UINT64_MAX - off;
 		trunc = TRUE;
 	}

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_destroy.c	Thu Jan 16 14:05:05 2014	(r260721)
@@ -899,7 +899,7 @@ dsl_destroy_head(const char *name)
 		for (uint64_t obj = 0; error == 0;
 		    error = dmu_object_next(os, &obj, FALSE,
 		    prev_snap_txg))
-			(void) dmu_free_object(os, obj);
+			(void) dmu_free_long_object(os, obj);
 		/* sync out all frees */
 		txg_wait_synced(dmu_objset_pool(os), 0);
 		dmu_objset_disown(os, FTAG);

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h	Thu Jan 16 14:05:05 2014	(r260721)
@@ -21,7 +21,7 @@

 /*
  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
- * Copyright (c) 2012 by Delphix. All rights reserved.
+ * Copyright (c) 2013 by Delphix. All rights reserved.
  * Copyright 2011 Nexenta Systems, Inc. All rights reserved.
  * Copyright (c) 2012, Joyent, Inc. All rights reserved.
  */
@@ -583,7 +583,7 @@ int dmu_free_range(objset_t *os, uint64_
 	uint64_t size, dmu_tx_t *tx);
 int dmu_free_long_range(objset_t *os, uint64_t object, uint64_t offset,
 	uint64_t size);
-int dmu_free_object(objset_t *os, uint64_t object);
+int dmu_free_long_object(objset_t *os, uint64_t object);

 /*
  * Convenience functions.

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_impl.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_impl.h	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_impl.h	Thu Jan 16 14:05:05 2014	(r260721)
@@ -21,8 +21,11 @@
 /*
  * Copyright 2010 Sun Microsystems, Inc. All rights reserved.
  * Use is subject to license terms.
+ */
+/*
  * Copyright (c) 2012, Joyent, Inc. All rights reserved.
  * Copyright (c) 2012, Martin Matuska . All rights reserved.
+ * Copyright (c) 2013 by Delphix. All rights reserved.
  */

 #ifndef _SYS_DMU_IMPL_H
@@ -293,6 +296,9 @@ typedef struct dmu_sendarg {
 	uint64_t dsa_toguid;
 	int dsa_err;
 	dmu_pendop_t dsa_pending_op;
+	boolean_t dsa_incremental;
+	uint64_t dsa_last_data_object;
+	uint64_t dsa_last_data_offset;
 } dmu_sendarg_t;


Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_send.h	Thu Jan 16 14:05:05 2014	(r260721)
@@ -21,7 +21,7 @@

 /*
  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
- * Copyright (c) 2012 by Delphix. All rights reserved.
+ * Copyright (c) 2013 by Delphix. All rights reserved.
  * Copyright 2011 Nexenta Systems, Inc. All rights reserved.
  * Copyright (c) 2012, Joyent, Inc. All rights reserved.
  */
@@ -74,5 +74,6 @@ int dmu_recv_stream(dmu_recv_cookie_t *d
 #endif
 	int cleanup_fd, uint64_t *action_handlep);
 int dmu_recv_end(dmu_recv_cookie_t *drc, void *owner);
+boolean_t dmu_objset_is_receiving(objset_t *os);

 #endif /* _DMU_SEND_H */

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h	Thu Jan 16 13:58:55 2014	(r260720)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dnode.h	Thu Jan 16 14:05:05 2014	(r260721)
@@ -20,7 +20,7 @@
  */
 /*
  * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved.
- * Copyright (c) 2012 by Delphix. All rights reserved.
+ * Copyright (c) 2013 by Delphix. All rights reserved.
  */

 #ifndef _SYS_DNODE_H
@@ -188,6 +188,8 @@ typedef struct dnode {

 	/* protected by dn_dbufs_mtx; declared here to fill 32-bit hole */
 	uint32_t dn_dbufs_count; /* count of dn_dbufs */
+	/* There are no level-0 blocks of this blkid or higher in dn_dbufs */
+	uint64_t dn_unlisted_l0_blkid;

 	/* protected by os_lock: */
 	list_node_t dn_dirty_link[TXG_SIZE]; /* next on dataset's dirty */

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:15:14 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id D8C514A9;
	Thu, 16 Jan 2014 14:15:14 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.freebsd.org (Postfix) with ESMTPS id AB56510E0;
	Thu, 16 Jan 2014 14:15:14 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
	by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEFE6l028952;
	Thu, 16 Jan 2014 14:15:14 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
	by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEFEuR028951;
	Thu, 16 Jan 2014 14:15:14 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161415.s0GEFEuR028951@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 14:15:14 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260727 - stable/8/tools/regression/fsx
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 14:15:14 -0000

Author: avg
Date: Thu Jan 16 14:15:14 2014
New Revision: 260727
URL: http://svnweb.freebsd.org/changeset/base/260727

Log:
  MFC r258351: fsx: new option to disable msync(MS_SYNC) after each write

Modified:
  stable/8/tools/regression/fsx/fsx.c
Directory Properties:
  stable/8/tools/regression/fsx/   (props changed)

Modified: stable/8/tools/regression/fsx/fsx.c
==============================================================================
--- stable/8/tools/regression/fsx/fsx.c	Thu Jan 16 14:15:04 2014	(r260726)
+++ stable/8/tools/regression/fsx/fsx.c	Thu Jan 16 14:15:14 2014	(r260727)
@@ -126,6 +126,7 @@ int randomoplen = 1; /* -O flag disable
 int seed = 1; /* -S flag */
 int mapped_writes = 1; /* -W flag disables */
 int mapped_reads = 1; /* -R flag disables it */
+int mapped_msync = 1; /* -U flag disables */
 int fsxgoodfd = 0;
 FILE * fsxlogf = NULL;
 int badoff = -1;
@@ -679,12 +680,12 @@ domapwrite(unsigned offset, unsigned siz

 	if ((p = (char *)mmap(0, map_size, PROT_READ | PROT_WRITE,
 			      MAP_FILE | MAP_SHARED, fd,
-			      (off_t)(offset - pg_offset))) == (char *)-1) {
+			      (off_t)(offset - pg_offset))) == MAP_FAILED) {
 		prterr("domapwrite: mmap");
 		report_failure(202);
 	}
 	memcpy(p + pg_offset, good_buf + offset, size);
-	if (msync(p, map_size, 0) != 0) {
+	if (mapped_msync && msync(p, map_size, MS_SYNC) != 0) {
 		prterr("domapwrite: msync");
 		report_failure(203);
 	}
@@ -886,6 +887,7 @@ usage(void)
 	-S seed: for random # generator (default 1) 0 gets timestamp\n\
 	-W: mapped write operations DISabled\n\
 	-R: mapped read operations DISabled)\n\
+	-U: msync after mapped write operations DISabled\n\
 	fname: this filename is REQUIRED (no default)\n");
 	exit(90);
 }
@@ -941,8 +943,8 @@ main(int argc, char **argv)

 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */

-	while ((ch = getopt(argc, argv, "b:c:dl:m:no:p:qr:s:t:w:D:LN:OP:RS:W"))
-	       != -1)
+	while ((ch = getopt(argc, argv,
+	    "b:c:dl:m:no:p:qr:s:t:w:D:LN:OP:RS:UW")) != -1)
 		switch (ch) {
 		case 'b':
 			simulatedopcount = getnum(optarg, &endp);
@@ -1057,6 +1059,11 @@ main(int argc, char **argv)
 			if (!quiet)
 				fprintf(stdout, "mapped writes DISABLED\n");
 			break;
+		case 'U':
+			mapped_msync = 0;
+			if (!quiet)
+				fprintf(stdout, "mapped msync DISABLED\n");
+			break;

 		default:
 			usage();

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:17:56 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 9465D893;
	Thu, 16 Jan 2014 14:17:56 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.freebsd.org (Postfix) with ESMTPS id 743011109;
	Thu, 16 Jan 2014 14:17:56 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
	by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEHu6V029368;
	Thu, 16 Jan 2014 14:17:56 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost) by
	svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEHuCS029367;
	Thu, 16 Jan 2014 14:17:56 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161417.s0GEHuCS029367@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 14:17:56 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260730 - stable/8/tools/regression/fsx
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 14:17:56 -0000

Author: avg
Date: Thu Jan 16 14:17:55 2014
New Revision: 260730
URL: http://svnweb.freebsd.org/changeset/base/260730

Log:
  MFC r258352: fsx: add an option to randomly call msync(MS_INVALIDATE)

Modified:
  stable/8/tools/regression/fsx/fsx.c
Directory Properties:
  stable/8/tools/regression/fsx/   (props changed)

Modified: stable/8/tools/regression/fsx/fsx.c
==============================================================================
--- stable/8/tools/regression/fsx/fsx.c	Thu Jan 16 14:17:44 2014	(r260729)
+++ stable/8/tools/regression/fsx/fsx.c	Thu Jan 16 14:17:55 2014	(r260730)
@@ -90,6 +90,7 @@ int logcount = 0; /* total ops */
 #define OP_MAPREAD 5
 #define OP_MAPWRITE 6
 #define OP_SKIPPED 7
+#define OP_INVALIDATE 8

 int page_size;
 int page_mask;
@@ -107,6 +108,7 @@ unsigned long testcalls = 0; /* calls t

 unsigned long simulatedopcount = 0; /* -b flag */
 int closeprob = 0; /* -c flag */
+int invlprob = 0; /* -i flag */
 int debug = 0; /* -d flag */
 unsigned long debugstart = 0; /* -D flag */
 unsigned long maxfilelen = 256 * 1024; /* -l flag */
@@ -131,6 +133,7 @@ int fsxgoodfd = 0;
 FILE * fsxlogf = NULL;
 int badoff = -1;
 int closeopen = 0;
+int invl = 0;


 void
@@ -182,14 +185,12 @@ prterr(char *prefix)


 void
-log4(int operation, int arg0, int arg1, int arg2)
+do_log4(int operation, int arg0, int arg1, int arg2)
 {
 	struct log_entry *le;

 	le = &oplog[logptr];
 	le->operation = operation;
-	if (closeopen)
-		le->operation = ~ le->operation;
 	le->args[0] = arg0;
 	le->args[1] = arg1;
 	le->args[2] = arg2;
@@ -201,10 +202,21 @@ log4(int operation, int arg0, int arg1,


 void
+log4(int operation, int arg0, int arg1, int arg2)
+{
+	do_log4(operation, arg0, arg1, arg2);
+	if (closeopen)
+		do_log4(OP_CLOSEOPEN, 0, 0, 0);
+	if (invl)
+		do_log4(OP_INVALIDATE, 0, 0, 0);
+}
+
+
+void
 logdump(void)
 {
-	int i, count, down;
 	struct log_entry *lp;
+	int i, count, down, opnum;

 	prt("LOG DUMP (%d total operations):\n", logcount);
 	if (logcount < LOGSIZE) {
@@ -214,15 +226,28 @@ logdump(void)
 		i = logptr;
 		count = LOGSIZE;
 	}
+
+	opnum = i + 1 + (logcount/LOGSIZE)*LOGSIZE;
 	for ( ; count > 0; count--) {
-		int opnum;
+		lp = &oplog[i];
+
+		if (lp->operation == OP_CLOSEOPEN ||
+		    lp->operation == OP_INVALIDATE) {
+			switch (lp->operation) {
+			case OP_CLOSEOPEN:
+				prt("\t\tCLOSE/OPEN\n");
+				break;
+			case OP_INVALIDATE:
+				prt("\t\tMS_INVALIDATE\n");
+				break;
+			}
+			i++;
+			if (i == LOGSIZE)
+				i = 0;
+			continue;
+		}

-		opnum = i+1 + (logcount/LOGSIZE)*LOGSIZE;
 		prt("%d(%d mod 256): ", opnum, opnum%256);
-		lp = &oplog[i];
-		if ((closeopen = lp->operation < 0))
-			lp->operation = ~ lp->operation;
-
 		switch (lp->operation) {
 		case OP_MAPREAD:
 			prt("MAPREAD\t0x%x thru 0x%x\t(0x%x bytes)",
@@ -275,9 +300,8 @@ logdump(void)
 			prt("BOGUS LOG ENTRY (operation code = %d)!",
 			    lp->operation);
 		}
-		if (closeopen)
-			prt("\n\t\tCLOSE/OPEN");
 		prt("\n");
+		opnum++;
 		i++;
 		if (i == LOGSIZE)
 			i = 0;
@@ -779,6 +803,36 @@ docloseopen(void)


 void
+doinvl(void)
+{
+	char *p;
+
+	if (file_size == 0)
+		return;
+	if (testcalls <= simulatedopcount)
+		return;
+	if (debug)
+		prt("%lu msync(MS_INVALIDATE)\n", testcalls);
+
+	if ((p = (char *)mmap(0, file_size, PROT_READ | PROT_WRITE,
+			      MAP_FILE | MAP_SHARED, fd, 0)) == MAP_FAILED) {
+		prterr("doinvl: mmap");
+		report_failure(205);
+	}
+
+	if (msync(p, 0, MS_SYNC | MS_INVALIDATE) != 0) {
+		prterr("doinvl: msync");
+		report_failure(206);
+	}
+
+	if (munmap(p, file_size) != 0) {
+		prterr("doinvl: munmap");
+		report_failure(207);
+	}
+}
+
+
+void
 test(void)
 {
 	unsigned long offset;
@@ -798,6 +852,8 @@ test(void)

 	if (closeprob)
 		closeopen = (rv >> 3) < (1 << 28) / closeprob;
+	if (invlprob)
+		invl = (rv >> 3) < (1 << 28) / invlprob;

 	if (debugstart > 0 && testcalls >= debugstart)
 		debug = 1;
@@ -845,6 +901,8 @@ test(void)
 	}
 	if (sizechecks && testcalls > simulatedopcount)
 		check_size();
+	if (invl)
+		doinvl();
 	if (closeopen)
 		docloseopen();
 }
@@ -869,6 +927,7 @@ usage(void)
 	-b opnum: beginning operation number (default 1)\n\
 	-c P: 1 in P chance of file close+open at each op (default infinity)\n\
 	-d: debug output for all operations\n\
+	-i P: 1 in P chance of calling msync(MS_INVALIDATE) (default infinity)\n\
 	-l flen: the upper bound on file size (default 262144)\n\
 	-m startop:endop: monitor (print debug output) specified byte range (default 0:infinity)\n\
 	-n: no verifications of file size\n\
@@ -944,7 +1003,7 @@ main(int argc, char **argv)
 	setvbuf(stdout, (char *)0, _IOLBF, 0); /* line buffered stdout */

 	while ((ch = getopt(argc, argv,
-	    "b:c:dl:m:no:p:qr:s:t:w:D:LN:OP:RS:UW")) != -1)
+	    "b:c:di:l:m:no:p:qr:s:t:w:D:LN:OP:RS:UW")) != -1)
 		switch (ch) {
 		case 'b':
 			simulatedopcount = getnum(optarg, &endp);
@@ -967,6 +1026,15 @@ main(int argc, char **argv)
 		case 'd':
 			debug = 1;
 			break;
+		case 'i':
+			invlprob = getnum(optarg, &endp);
+			if (!quiet)
+				fprintf(stdout,
+					"Chance of MS_INVALIDATE is 1 in %d\n",
+					invlprob);
+			if (invlprob <= 0)
+				usage();
+			break;
 		case 'l':
 			maxfilelen = getnum(optarg, &endp);
 			if (maxfilelen <= 0)

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:22:03 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	(using TLSv1 with cipher ADH-AES256-SHA
	(256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 93422D91;
	Thu, 16 Jan 2014 14:22:03 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.freebsd.org (Postfix) with ESMTPS id 7F3AF11A6;
	Thu, 16 Jan 2014 14:22:03 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
	by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEM31b031992;
	Thu, 16 Jan 2014 14:22:03 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
	by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEM3GT031991;
	Thu, 16 Jan 2014 14:22:03 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161422.s0GEM3GT031991@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 14:22:03 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260733 - stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 14:22:03 -0000

Author: avg
Date: Thu Jan 16 14:22:03 2014
New Revision: 260733
URL: http://svnweb.freebsd.org/changeset/base/260733

Log:
  MFC r258638,258642: expose zfs_flags as debug.zfs_flags r/w tunable and sysctl

  Sponsored by: HybridCluster

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/cddl/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c	Thu Jan 16 14:21:41 2014	(r260732)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c	Thu Jan 16 14:22:03 2014	(r260733)
@@ -243,6 +243,10 @@ int zfs_flags = ~(ZFS_DEBUG_DPRINTF | ZF
 #else
 int zfs_flags = 0;
 #endif
+SYSCTL_DECL(_debug);
+TUNABLE_INT("debug.zfs_flags", &zfs_flags);
+SYSCTL_INT(_debug, OID_AUTO, zfs_flags, CTLFLAG_RWTUN, &zfs_flags, 0,
+    "ZFS debug flags.");

 /*
  * zfs_recover can be set to nonzero to attempt to recover from

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:24:45 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
	(using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by hub.freebsd.org (Postfix) with ESMTPS id 8733D308;
	Thu, 16 Jan 2014 14:24:45 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by mx1.freebsd.org (Postfix) with ESMTPS id 737F911CC;
	Thu, 16 Jan 2014 14:24:45 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
	by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEOjSO032549;
	Thu, 16 Jan 2014 14:24:45 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
	by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEOjx5032548;
	Thu, 16 Jan 2014 14:24:45 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161424.s0GEOjx5032548@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 14:24:45 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260736 - stable/8/tools/tools/zfsboottest
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 14:24:45 -0000 Author: avg Date: Thu Jan 16 14:24:44 2014 New Revision: 260736 URL: http://svnweb.freebsd.org/changeset/base/260736 Log: MFC r258647: zfsboottest: properly specify a library dependency Modified: stable/8/tools/tools/zfsboottest/Makefile Directory Properties: stable/8/tools/tools/zfsboottest/ (props changed) Modified: stable/8/tools/tools/zfsboottest/Makefile ============================================================================== --- stable/8/tools/tools/zfsboottest/Makefile Thu Jan 16 14:24:33 2014 (r260735) +++ stable/8/tools/tools/zfsboottest/Makefile Thu Jan 16 14:24:44 2014 (r260736) @@ -16,7 +16,7 @@ CFLAGS= -O1 \ -I. \ -fdiagnostics-show-option \ -W -Wextra -Wno-sign-compare -Wno-unused-parameter -LDFLAGS+=-lmd +LDADD+= -lmd .if ${MACHINE_ARCH} == "amd64" beforedepend zfsboottest.o: machine From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:30:47 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 46FE099D; Thu, 16 Jan 2014 14:30:47 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 275E0121C; Thu, 16 Jan 2014 14:30:47 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEUlOQ036045; Thu, 16 Jan 2014 14:30:47 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by 
svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEUkiI036043; Thu, 16 Jan 2014 14:30:46 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161430.s0GEUkiI036043@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 14:30:46 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260741 - in stable/8/sys/cddl: compat/opensolaris/kern contrib/opensolaris/uts/common/sys X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 14:30:47 -0000 Author: avg Date: Thu Jan 16 14:30:46 2014 New Revision: 260741 URL: http://svnweb.freebsd.org/changeset/base/260741 Log: MFC r258628: opensolaris taskq: some cosmetic changes Modified: stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c Thu Jan 16 14:30:35 2014 (r260740) +++ stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c Thu Jan 16 14:30:46 2014 (r260741) @@ -121,7 +121,7 @@ taskq_dispatch(taskq_t *tq, task_func_t mflag = M_WAITOK; else mflag = M_NOWAIT; - /* + /* * If TQ_FRONT is given, we want higher priority for this task, so it * can go at the front of the queue. 
 	 */
@@ -140,8 +140,6 @@ taskq_dispatch(taskq_t *tq, task_func_t
 	return ((taskqid_t)(void *)task);
 }

-#define TASKQ_MAGIC 0x74541c
-
 static void
 taskq_run_safe(void *arg, int pending __unused)
 {
@@ -156,7 +154,7 @@ taskq_dispatch_safe(taskq_t *tq, task_fu
 {
 	int prio;

-	/*
+	/*
 	 * If TQ_FRONT is given, we want higher priority for this task, so it
 	 * can go at the front of the queue.
 	 */

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h Thu Jan 16 14:30:35 2014 (r260740)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h Thu Jan 16 14:30:46 2014 (r260741)
@@ -70,24 +70,23 @@ struct proc;

 extern taskq_t *system_taskq;

-extern void taskq_init(void);
-extern void taskq_mp_init(void);
+void taskq_init(void);
+void taskq_mp_init(void);

-extern taskq_t *taskq_create(const char *, int, pri_t, int, int, uint_t);
-extern taskq_t *taskq_create_instance(const char *, int, int, pri_t, int,
-    int, uint_t);
-extern taskq_t *taskq_create_proc(const char *, int, pri_t, int, int,
+taskq_t *taskq_create(const char *, int, pri_t, int, int, uint_t);
+taskq_t *taskq_create_instance(const char *, int, int, pri_t, int, int, uint_t);
+taskq_t *taskq_create_proc(const char *, int, pri_t, int, int,
     struct proc *, uint_t);
-extern taskq_t *taskq_create_sysdc(const char *, int, int, int,
+taskq_t *taskq_create_sysdc(const char *, int, int, int,
     struct proc *, uint_t, uint_t);
-extern taskqid_t taskq_dispatch(taskq_t *, task_func_t, void *, uint_t);
-extern void nulltask(void *);
-extern void taskq_destroy(taskq_t *);
-extern void taskq_wait(taskq_t *);
-extern void taskq_suspend(taskq_t *);
-extern int taskq_suspended(taskq_t *);
-extern void taskq_resume(taskq_t *);
-extern int taskq_member(taskq_t *, kthread_t *);
+taskqid_t taskq_dispatch(taskq_t *, task_func_t, void *, uint_t);
+void nulltask(void *);
+void
taskq_destroy(taskq_t *); +void taskq_wait(taskq_t *); +void taskq_suspend(taskq_t *); +int taskq_suspended(taskq_t *); +void taskq_resume(taskq_t *); +int taskq_member(taskq_t *, kthread_t *); #endif /* _KERNEL */ From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:35:22 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 61897EAB; Thu, 16 Jan 2014 14:35:22 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 42CA712B2; Thu, 16 Jan 2014 14:35:22 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEZMYN036792; Thu, 16 Jan 2014 14:35:22 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEZLZO036787; Thu, 16 Jan 2014 14:35:21 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161435.s0GEZLZO036787@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 14:35:21 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260744 - in stable/8/sys/cddl: compat/opensolaris/kern compat/opensolaris/sys contrib/opensolaris/uts/common/fs/zfs contrib/opensolaris/uts/common/fs/zfs/sys contrib/opensolaris/uts/co... 
X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 14:35:22 -0000 Author: avg Date: Thu Jan 16 14:35:20 2014 New Revision: 260744 URL: http://svnweb.freebsd.org/changeset/base/260744 Log: MFC r258630: 734 taskq_dispatch_prealloc() desired Deleted: stable/8/sys/cddl/compat/opensolaris/sys/taskq.h Modified: stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c Thu Jan 16 14:35:06 2014 (r260743) +++ stable/8/sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c Thu Jan 16 14:35:20 2014 (r260744) @@ -46,7 +46,7 @@ static void system_taskq_init(void *arg) { - taskq_zone = uma_zcreate("taskq_zone", sizeof(struct ostask), + taskq_zone = uma_zcreate("taskq_zone", sizeof(taskq_ent_t), NULL, NULL, NULL, NULL, 0, 0); system_taskq = taskq_create("system_taskq", mp_ncpus, 0, 0, 0, 0); } @@ -104,9 +104,9 @@ taskq_member(taskq_t *tq, kthread_t *thr static void taskq_run(void *arg, int pending __unused) { - struct ostask *task = arg; + taskq_ent_t *task = arg; - task->ost_func(task->ost_arg); + task->tqent_func(task->tqent_arg); uma_zfree(taskq_zone, task); } @@ 
-114,7 +114,7 @@ taskq_run(void *arg, int pending __unuse
 taskqid_t
 taskq_dispatch(taskq_t *tq, task_func_t func, void *arg, uint_t flags)
 {
-	struct ostask *task;
+	taskq_ent_t *task;
 	int mflag, prio;

 	if ((flags & (TQ_SLEEP | TQ_NOQUEUE)) == TQ_SLEEP)
@@ -131,26 +131,26 @@ taskq_dispatch(taskq_t *tq, task_func_t
 	if (task == NULL)
 		return (0);

-	task->ost_func = func;
-	task->ost_arg = arg;
+	task->tqent_func = func;
+	task->tqent_arg = arg;

-	TASK_INIT(&task->ost_task, prio, taskq_run, task);
-	taskqueue_enqueue(tq->tq_queue, &task->ost_task);
+	TASK_INIT(&task->tqent_task, prio, taskq_run, task);
+	taskqueue_enqueue(tq->tq_queue, &task->tqent_task);

 	return ((taskqid_t)(void *)task);
 }

 static void
-taskq_run_safe(void *arg, int pending __unused)
+taskq_run_ent(void *arg, int pending __unused)
 {
-	struct ostask *task = arg;
+	taskq_ent_t *task = arg;

-	task->ost_func(task->ost_arg);
+	task->tqent_func(task->tqent_arg);
 }

-taskqid_t
-taskq_dispatch_safe(taskq_t *tq, task_func_t func, void *arg, u_int flags,
-    struct ostask *task)
+void
+taskq_dispatch_ent(taskq_t *tq, task_func_t func, void *arg, u_int flags,
+    taskq_ent_t *task)
 {
 	int prio;

@@ -160,11 +160,9 @@ taskq_dispatch_safe(taskq_t *tq, task_fu
 	 */
 	prio = !!(flags & TQ_FRONT);

-	task->ost_func = func;
-	task->ost_arg = arg;
+	task->tqent_func = func;
+	task->tqent_arg = arg;

-	TASK_INIT(&task->ost_task, prio, taskq_run_safe, task);
-	taskqueue_enqueue(tq->tq_queue, &task->ost_task);
-
-	return ((taskqid_t)(void *)task);
+	TASK_INIT(&task->tqent_task, prio, taskq_run_ent, task);
+	taskqueue_enqueue(tq->tq_queue, &task->tqent_task);
 }

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Thu Jan 16 14:35:06 2014 (r260743)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Thu Jan 16 14:35:20 2014 (r260744)
@@ -823,7 +823,7 @@ static
taskq_t * spa_taskq_create(spa_t *spa, const char *name, enum zti_modes mode, uint_t value) { - uint_t flags = TASKQ_PREPOPULATE; + uint_t flags = 0; boolean_t batch = B_FALSE; switch (mode) { Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Thu Jan 16 14:35:06 2014 (r260743) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h Thu Jan 16 14:35:20 2014 (r260744) @@ -23,6 +23,7 @@ * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. + * Copyright 2011 Nexenta Systems, Inc. All rights reserved. */ #ifndef _ZIO_H @@ -472,10 +473,9 @@ struct zio { zio_cksum_report_t *io_cksum_report; uint64_t io_ena; -#ifdef _KERNEL - /* FreeBSD only. */ - struct ostask io_task; -#endif + /* Taskq dispatching state */ + taskq_ent_t io_tqent; + avl_node_t io_trim_node; list_node_t io_trim_link; }; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jan 16 14:35:06 2014 (r260743) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jan 16 14:35:20 2014 (r260744) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2011 Nexenta Systems, Inc. All rights reserved. */ #include @@ -1183,7 +1184,7 @@ zio_taskq_dispatch(zio_t *zio, enum zio_ { spa_t *spa = zio->io_spa; zio_type_t t = zio->io_type; - int flags = TQ_SLEEP | (cutinline ? TQ_FRONT : 0); + int flags = (cutinline ? 
TQ_FRONT : 0);

 	ASSERT(q == ZIO_TASKQ_ISSUE || q == ZIO_TASKQ_INTERRUPT);

@@ -1209,13 +1210,19 @@ zio_taskq_dispatch(zio_t *zio, enum zio_
 		q++;

 	ASSERT3U(q, <, ZIO_TASKQ_TYPES);
-#ifdef _KERNEL
-	(void) taskq_dispatch_safe(spa->spa_zio_taskq[t][q],
-	    (task_func_t *)zio_execute, zio, flags, &zio->io_task);
+
+	/*
+	 * NB: We are assuming that the zio can only be dispatched
+	 * to a single taskq at a time. It would be a grievous error
+	 * to dispatch the zio to another taskq at the same time.
+	 */
+#if defined(illumos) || !defined(_KERNEL)
+	ASSERT(zio->io_tqent.tqent_next == NULL);
 #else
-	(void) taskq_dispatch(spa->spa_zio_taskq[t][q],
-	    (task_func_t *)zio_execute, zio, flags);
+	ASSERT(zio->io_tqent.tqent_task.ta_pending == 0);
 #endif
+	taskq_dispatch_ent(spa->spa_zio_taskq[t][q],
+	    (task_func_t *)zio_execute, zio, flags, &zio->io_tqent);
 }

 static boolean_t
@@ -3133,16 +3140,15 @@ zio_done(zio_t *zio)
 			 * Reexecution is potentially a huge amount of work.
 			 * Hand it off to the otherwise-unused claim taskq.
*/ -#ifdef _KERNEL - (void) taskq_dispatch_safe( - spa->spa_zio_taskq[ZIO_TYPE_CLAIM][ZIO_TASKQ_ISSUE], - (task_func_t *)zio_reexecute, zio, TQ_SLEEP, - &zio->io_task); +#if defined(illumos) || !defined(_KERNEL) + ASSERT(zio->io_tqent.tqent_next == NULL); #else - (void) taskq_dispatch( - spa->spa_zio_taskq[ZIO_TYPE_CLAIM][ZIO_TASKQ_ISSUE], - (task_func_t *)zio_reexecute, zio, TQ_SLEEP); + ASSERT(zio->io_tqent.tqent_task.ta_pending == 0); #endif + (void) taskq_dispatch_ent( + spa->spa_zio_taskq[ZIO_TYPE_CLAIM][ZIO_TASKQ_ISSUE], + (task_func_t *)zio_reexecute, zio, 0, + &zio->io_tqent); } return (ZIO_PIPELINE_STOP); } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h Thu Jan 16 14:35:06 2014 (r260743) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h Thu Jan 16 14:35:20 2014 (r260744) @@ -45,6 +45,12 @@ typedef struct taskq taskq_t; typedef uintptr_t taskqid_t; typedef void (task_func_t)(void *); +typedef struct taskq_ent { + struct task tqent_task; + task_func_t *tqent_func; + void *tqent_arg; +} taskq_ent_t; + struct proc; /* @@ -80,6 +86,8 @@ taskq_t *taskq_create_proc(const char *, taskq_t *taskq_create_sysdc(const char *, int, int, int, struct proc *, uint_t, uint_t); taskqid_t taskq_dispatch(taskq_t *, task_func_t, void *, uint_t); +void taskq_dispatch_ent(taskq_t *, task_func_t, void *, uint_t, + taskq_ent_t *); void nulltask(void *); void taskq_destroy(taskq_t *); void taskq_wait(taskq_t *); From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:37:50 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id D22D42FE; Thu, 16 Jan 2014 14:37:50 +0000 (UTC) Received: 
from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id BE7E712D3; Thu, 16 Jan 2014 14:37:50 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEbo8X037187; Thu, 16 Jan 2014 14:37:50 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEbotl037186; Thu, 16 Jan 2014 14:37:50 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161437.s0GEbotl037186@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 14:37:50 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260747 - stable/8/sys/cddl/contrib/opensolaris/uts/common X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 14:37:50 -0000 Author: avg Date: Thu Jan 16 14:37:50 2014 New Revision: 260747 URL: http://svnweb.freebsd.org/changeset/base/260747 Log: MFC r258743: drop ZUT_OBJ Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/Makefile.files Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/Makefile.files ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/Makefile.files Thu Jan 16 14:37:41 2014 (r260746) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/Makefile.files Thu 
Jan 16 14:37:50 2014 (r260747) @@ -126,6 +126,3 @@ ZFS_OBJS += \ zfs_vfsops.o \ zfs_vnops.o \ zvol.o - -ZUT_OBJS += \ - zut.o From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:42:23 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 289D76B5; Thu, 16 Jan 2014 14:42:23 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 09297135B; Thu, 16 Jan 2014 14:42:23 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEgMgZ040396; Thu, 16 Jan 2014 14:42:22 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEgMva040394; Thu, 16 Jan 2014 14:42:22 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161442.s0GEgMva040394@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 14:42:22 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260749 - in stable/8/cddl/contrib/opensolaris/lib/libzpool/common: . 
sys X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 14:42:23 -0000 Author: avg Date: Thu Jan 16 14:42:22 2014 New Revision: 260749 URL: http://svnweb.freebsd.org/changeset/base/260749 Log: MFC r258630: 734 taskq_dispatch_prealloc() desired Modified: stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h stable/8/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c Directory Properties: stable/8/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Thu Jan 16 14:42:08 2014 (r260748) +++ stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Thu Jan 16 14:42:22 2014 (r260749) @@ -23,6 +23,9 @@ * Copyright (c) 2013 by Delphix. All rights reserved. * Copyright (c) 2012, Joyent, Inc. All rights reserved. */ +/* + * Copyright 2011 Nexenta Systems, Inc. All rights reserved. 
+ */ #ifndef _SYS_ZFS_CONTEXT_H #define _SYS_ZFS_CONTEXT_H @@ -364,6 +367,16 @@ typedef struct taskq taskq_t; typedef uintptr_t taskqid_t; typedef void (task_func_t)(void *); +typedef struct taskq_ent { + struct taskq_ent *tqent_next; + struct taskq_ent *tqent_prev; + task_func_t *tqent_func; + void *tqent_arg; + uintptr_t tqent_flags; +} taskq_ent_t; + +#define TQENT_FLAG_PREALLOC 0x1 /* taskq_dispatch_ent used */ + #define TASKQ_PREPOPULATE 0x0001 #define TASKQ_CPR_SAFE 0x0002 /* Use CPR safe protocol */ #define TASKQ_DYNAMIC 0x0004 /* Use dynamic thread scheduling */ @@ -375,6 +388,7 @@ typedef void (task_func_t)(void *); #define TQ_NOQUEUE 0x02 /* Do not enqueue if can't dispatch */ #define TQ_FRONT 0x08 /* Queue in front */ + extern taskq_t *system_taskq; extern taskq_t *taskq_create(const char *, int, pri_t, int, int, uint_t); @@ -383,6 +397,8 @@ extern taskq_t *taskq_create(const char #define taskq_create_sysdc(a, b, d, e, p, dc, f) \ (taskq_create(a, b, maxclsyspri, d, e, f)) extern taskqid_t taskq_dispatch(taskq_t *, task_func_t, void *, uint_t); +extern void taskq_dispatch_ent(taskq_t *, task_func_t, void *, uint_t, + taskq_ent_t *); extern void taskq_destroy(taskq_t *); extern void taskq_wait(taskq_t *); extern int taskq_member(taskq_t *, void *); Modified: stable/8/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c Thu Jan 16 14:42:08 2014 (r260748) +++ stable/8/cddl/contrib/opensolaris/lib/libzpool/common/taskq.c Thu Jan 16 14:42:22 2014 (r260749) @@ -22,19 +22,15 @@ * Copyright 2010 Sun Microsystems, Inc. All rights reserved. * Use is subject to license terms. */ +/* + * Copyright 2011 Nexenta Systems, Inc. All rights reserved. 
+ */ #include int taskq_now; taskq_t *system_taskq; -typedef struct task { - struct task *task_next; - struct task *task_prev; - task_func_t *task_func; - void *task_arg; -} task_t; - #define TASKQ_ACTIVE 0x00010000 struct taskq { @@ -51,18 +47,18 @@ struct taskq { int tq_maxalloc; kcondvar_t tq_maxalloc_cv; int tq_maxalloc_wait; - task_t *tq_freelist; - task_t tq_task; + taskq_ent_t *tq_freelist; + taskq_ent_t tq_task; }; -static task_t * +static taskq_ent_t * task_alloc(taskq_t *tq, int tqflags) { - task_t *t; + taskq_ent_t *t; int rv; again: if ((t = tq->tq_freelist) != NULL && tq->tq_nalloc >= tq->tq_minalloc) { - tq->tq_freelist = t->task_next; + tq->tq_freelist = t->tqent_next; } else { if (tq->tq_nalloc >= tq->tq_maxalloc) { if (!(tqflags & KM_SLEEP)) @@ -87,7 +83,7 @@ again: if ((t = tq->tq_freelist) != NULL } mutex_exit(&tq->tq_lock); - t = kmem_alloc(sizeof (task_t), tqflags & KM_SLEEP); + t = kmem_alloc(sizeof (taskq_ent_t), tqflags & KM_SLEEP); mutex_enter(&tq->tq_lock); if (t != NULL) @@ -97,15 +93,15 @@ again: if ((t = tq->tq_freelist) != NULL } static void -task_free(taskq_t *tq, task_t *t) +task_free(taskq_t *tq, taskq_ent_t *t) { if (tq->tq_nalloc <= tq->tq_minalloc) { - t->task_next = tq->tq_freelist; + t->tqent_next = tq->tq_freelist; tq->tq_freelist = t; } else { tq->tq_nalloc--; mutex_exit(&tq->tq_lock); - kmem_free(t, sizeof (task_t)); + kmem_free(t, sizeof (taskq_ent_t)); mutex_enter(&tq->tq_lock); } @@ -116,7 +112,7 @@ task_free(taskq_t *tq, task_t *t) taskqid_t taskq_dispatch(taskq_t *tq, task_func_t func, void *arg, uint_t tqflags) { - task_t *t; + taskq_ent_t *t; if (taskq_now) { func(arg); @@ -130,26 +126,58 @@ taskq_dispatch(taskq_t *tq, task_func_t return (0); } if (tqflags & TQ_FRONT) { - t->task_next = tq->tq_task.task_next; - t->task_prev = &tq->tq_task; + t->tqent_next = tq->tq_task.tqent_next; + t->tqent_prev = &tq->tq_task; } else { - t->task_next = &tq->tq_task; - t->task_prev = tq->tq_task.task_prev; + t->tqent_next = 
&tq->tq_task;
+		t->tqent_prev = tq->tq_task.tqent_prev;
 	}
-	t->task_next->task_prev = t;
-	t->task_prev->task_next = t;
-	t->task_func = func;
-	t->task_arg = arg;
+	t->tqent_next->tqent_prev = t;
+	t->tqent_prev->tqent_next = t;
+	t->tqent_func = func;
+	t->tqent_arg = arg;
 	cv_signal(&tq->tq_dispatch_cv);
 	mutex_exit(&tq->tq_lock);
 	return (1);
 }

 void
+taskq_dispatch_ent(taskq_t *tq, task_func_t func, void *arg, uint_t flags,
+    taskq_ent_t *t)
+{
+	ASSERT(func != NULL);
+	ASSERT(!(tq->tq_flags & TASKQ_DYNAMIC));
+
+	/*
+	 * Mark it as a prealloc'd task. This is important
+	 * to ensure that we don't free it later.
+	 */
+	t->tqent_flags |= TQENT_FLAG_PREALLOC;
+	/*
+	 * Enqueue the task to the underlying queue.
+	 */
+	mutex_enter(&tq->tq_lock);
+
+	if (flags & TQ_FRONT) {
+		t->tqent_next = tq->tq_task.tqent_next;
+		t->tqent_prev = &tq->tq_task;
+	} else {
+		t->tqent_next = &tq->tq_task;
+		t->tqent_prev = tq->tq_task.tqent_prev;
+	}
+	t->tqent_next->tqent_prev = t;
+	t->tqent_prev->tqent_next = t;
+	t->tqent_func = func;
+	t->tqent_arg = arg;
+	cv_signal(&tq->tq_dispatch_cv);
+	mutex_exit(&tq->tq_lock);
+}
+
+void
 taskq_wait(taskq_t *tq)
 {
 	mutex_enter(&tq->tq_lock);
-	while (tq->tq_task.task_next != &tq->tq_task || tq->tq_active != 0)
+	while (tq->tq_task.tqent_next != &tq->tq_task || tq->tq_active != 0)
 		cv_wait(&tq->tq_wait_cv, &tq->tq_lock);
 	mutex_exit(&tq->tq_lock);
 }
@@ -158,27 +186,32 @@ static void *
 taskq_thread(void *arg)
 {
 	taskq_t *tq = arg;
-	task_t *t;
+	taskq_ent_t *t;
+	boolean_t prealloc;

 	mutex_enter(&tq->tq_lock);
 	while (tq->tq_flags & TASKQ_ACTIVE) {
-		if ((t = tq->tq_task.task_next) == &tq->tq_task) {
+		if ((t = tq->tq_task.tqent_next) == &tq->tq_task) {
 			if (--tq->tq_active == 0)
 				cv_broadcast(&tq->tq_wait_cv);
 			cv_wait(&tq->tq_dispatch_cv, &tq->tq_lock);
 			tq->tq_active++;
 			continue;
 		}
-		t->task_prev->task_next = t->task_next;
-		t->task_next->task_prev = t->task_prev;
+		t->tqent_prev->tqent_next = t->tqent_next;
+		t->tqent_next->tqent_prev = t->tqent_prev;
+
t->tqent_next = NULL; + t->tqent_prev = NULL; + prealloc = t->tqent_flags & TQENT_FLAG_PREALLOC; mutex_exit(&tq->tq_lock); rw_enter(&tq->tq_threadlock, RW_READER); - t->task_func(t->task_arg); + t->tqent_func(t->tqent_arg); rw_exit(&tq->tq_threadlock); mutex_enter(&tq->tq_lock); - task_free(tq, t); + if (!prealloc) + task_free(tq, t); } tq->tq_nthreads--; cv_broadcast(&tq->tq_wait_cv); @@ -217,8 +250,8 @@ taskq_create(const char *name, int nthre tq->tq_nthreads = nthreads; tq->tq_minalloc = minalloc; tq->tq_maxalloc = maxalloc; - tq->tq_task.task_next = &tq->tq_task; - tq->tq_task.task_prev = &tq->tq_task; + tq->tq_task.tqent_next = &tq->tq_task; + tq->tq_task.tqent_prev = &tq->tq_task; tq->tq_threadlist = kmem_alloc(nthreads * sizeof (thread_t), KM_SLEEP); if (flags & TASKQ_PREPOPULATE) { From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 14:48:27 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 45441BBD; Thu, 16 Jan 2014 14:48:27 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 257CA13A2; Thu, 16 Jan 2014 14:48:27 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GEmRRY041301; Thu, 16 Jan 2014 14:48:27 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GEmQCx041298; Thu, 16 Jan 2014 14:48:26 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161448.s0GEmQCx041298@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 14:48:26 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, 
svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260753 - in stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . sys X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 14:48:27 -0000 Author: avg Date: Thu Jan 16 14:48:26 2014 New Revision: 260753 URL: http://svnweb.freebsd.org/changeset/base/260753 Log: MFC r258631: MFV r247578 3581 spa_zio_taskq[ZIO_TYPE_FREE][ZIO_TASKQ_ISSUE]->tq_lock is piping hot Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Thu Jan 16 14:48:23 2014 (r260752) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Thu Jan 16 14:48:26 2014 (r260753) @@ -95,23 +95,25 @@ SYSCTL_INT(_vfs_zfs, OID_AUTO, check_hos static int zfs_ccw_retry_interval = 300; typedef enum zti_modes { - zti_mode_fixed, /* value is # of threads (min 1) */ - zti_mode_online_percent, /* value is % of online CPUs */ - zti_mode_batch, /* cpu-intensive; value is ignored */ - zti_mode_null, /* don't create a taskq */ - zti_nmodes + ZTI_MODE_FIXED, /* value is # of threads (min 1) */ + ZTI_MODE_ONLINE_PERCENT, /* value is % of online CPUs */ + ZTI_MODE_BATCH, /* cpu-intensive; value is ignored */ + ZTI_MODE_NULL, 
/* don't create a taskq */ + ZTI_NMODES } zti_modes_t; -#define ZTI_FIX(n) { zti_mode_fixed, (n) } -#define ZTI_PCT(n) { zti_mode_online_percent, (n) } -#define ZTI_BATCH { zti_mode_batch, 0 } -#define ZTI_NULL { zti_mode_null, 0 } +#define ZTI_P(n, q) { ZTI_MODE_FIXED, (n), (q) } +#define ZTI_PCT(n) { ZTI_MODE_ONLINE_PERCENT, (n), 1 } +#define ZTI_BATCH { ZTI_MODE_BATCH, 0, 1 } +#define ZTI_NULL { ZTI_MODE_NULL, 0, 0 } -#define ZTI_ONE ZTI_FIX(1) +#define ZTI_N(n) ZTI_P(n, 1) +#define ZTI_ONE ZTI_N(1) typedef struct zio_taskq_info { - enum zti_modes zti_mode; + zti_modes_t zti_mode; uint_t zti_value; + uint_t zti_count; } zio_taskq_info_t; static const char *const zio_taskq_types[ZIO_TASKQ_TYPES] = { @@ -119,17 +121,30 @@ static const char *const zio_taskq_types }; /* - * Define the taskq threads for the following I/O types: - * NULL, READ, WRITE, FREE, CLAIM, and IOCTL + * This table defines the taskq settings for each ZFS I/O type. When + * initializing a pool, we use this table to create an appropriately sized + * taskq. Some operations are low volume and therefore have a small, static + * number of threads assigned to their taskqs using the ZTI_N(#) or ZTI_ONE + * macros. Other operations process a large amount of data; the ZTI_BATCH + * macro causes us to create a taskq oriented for throughput. Some operations + * are so high frequency and short-lived that the taskq itself can become a a + * point of lock contention. The ZTI_P(#, #) macro indicates that we need an + * additional degree of parallelism specified by the number of threads per- + * taskq and the number of taskqs; when dispatching an event in this case, the + * particular taskq is chosen at random. + * + * The different taskq priorities are to handle the different contexts (issue + * and interrupt) and then to reserve threads for ZIO_PRIORITY_NOW I/Os that + * need to be handled with minimum delay. 
*/ const zio_taskq_info_t zio_taskqs[ZIO_TYPES][ZIO_TASKQ_TYPES] = { /* ISSUE ISSUE_HIGH INTR INTR_HIGH */ - { ZTI_ONE, ZTI_NULL, ZTI_ONE, ZTI_NULL }, - { ZTI_FIX(8), ZTI_NULL, ZTI_BATCH, ZTI_NULL }, - { ZTI_BATCH, ZTI_FIX(5), ZTI_FIX(8), ZTI_FIX(5) }, - { ZTI_FIX(100), ZTI_NULL, ZTI_ONE, ZTI_NULL }, - { ZTI_ONE, ZTI_NULL, ZTI_ONE, ZTI_NULL }, - { ZTI_ONE, ZTI_NULL, ZTI_ONE, ZTI_NULL }, + { ZTI_ONE, ZTI_NULL, ZTI_ONE, ZTI_NULL }, /* NULL */ + { ZTI_N(8), ZTI_NULL, ZTI_BATCH, ZTI_NULL }, /* READ */ + { ZTI_BATCH, ZTI_N(5), ZTI_N(8), ZTI_N(5) }, /* WRITE */ + { ZTI_P(12, 8), ZTI_NULL, ZTI_ONE, ZTI_NULL }, /* FREE */ + { ZTI_ONE, ZTI_NULL, ZTI_ONE, ZTI_NULL }, /* CLAIM */ + { ZTI_ONE, ZTI_NULL, ZTI_ONE, ZTI_NULL }, /* IOCTL */ }; static void spa_sync_version(void *arg, dmu_tx_t *tx); @@ -819,50 +834,124 @@ spa_get_errlists(spa_t *spa, avl_tree_t offsetof(spa_error_entry_t, se_avl)); } -static taskq_t * -spa_taskq_create(spa_t *spa, const char *name, enum zti_modes mode, - uint_t value) +static void +spa_taskqs_init(spa_t *spa, zio_type_t t, zio_taskq_type_t q) { + const zio_taskq_info_t *ztip = &zio_taskqs[t][q]; + enum zti_modes mode = ztip->zti_mode; + uint_t value = ztip->zti_value; + uint_t count = ztip->zti_count; + spa_taskqs_t *tqs = &spa->spa_zio_taskq[t][q]; + char name[32]; uint_t flags = 0; boolean_t batch = B_FALSE; - switch (mode) { - case zti_mode_null: - return (NULL); /* no taskq needed */ - - case zti_mode_fixed: - ASSERT3U(value, >=, 1); - value = MAX(value, 1); - break; + if (mode == ZTI_MODE_NULL) { + tqs->stqs_count = 0; + tqs->stqs_taskq = NULL; + return; + } - case zti_mode_batch: - batch = B_TRUE; - flags |= TASKQ_THREADS_CPU_PCT; - value = zio_taskq_batch_pct; - break; + ASSERT3U(count, >, 0); - case zti_mode_online_percent: - flags |= TASKQ_THREADS_CPU_PCT; - break; + tqs->stqs_count = count; + tqs->stqs_taskq = kmem_alloc(count * sizeof (taskq_t *), KM_SLEEP); - default: - panic("unrecognized mode for %s taskq (%u:%u) in " - 
"spa_activate()", - name, mode, value); - break; - } + for (uint_t i = 0; i < count; i++) { + taskq_t *tq; + + switch (mode) { + case ZTI_MODE_FIXED: + ASSERT3U(value, >=, 1); + value = MAX(value, 1); + break; + + case ZTI_MODE_BATCH: + batch = B_TRUE; + flags |= TASKQ_THREADS_CPU_PCT; + value = zio_taskq_batch_pct; + break; + + case ZTI_MODE_ONLINE_PERCENT: + flags |= TASKQ_THREADS_CPU_PCT; + break; + + default: + panic("unrecognized mode for %s_%s taskq (%u:%u) in " + "spa_activate()", + zio_type_name[t], zio_taskq_types[q], mode, value); + break; + } + + if (count > 1) { + (void) snprintf(name, sizeof (name), "%s_%s_%u", + zio_type_name[t], zio_taskq_types[q], i); + } else { + (void) snprintf(name, sizeof (name), "%s_%s", + zio_type_name[t], zio_taskq_types[q]); + } #ifdef SYSDC - if (zio_taskq_sysdc && spa->spa_proc != &p0) { - if (batch) - flags |= TASKQ_DC_BATCH; + if (zio_taskq_sysdc && spa->spa_proc != &p0) { + if (batch) + flags |= TASKQ_DC_BATCH; - return (taskq_create_sysdc(name, value, 50, INT_MAX, - spa->spa_proc, zio_taskq_basedc, flags)); - } + tq = taskq_create_sysdc(name, value, 50, INT_MAX, + spa->spa_proc, zio_taskq_basedc, flags); + } else { +#endif + tq = taskq_create_proc(name, value, maxclsyspri, 50, + INT_MAX, spa->spa_proc, flags); +#ifdef SYSDC + } #endif - return (taskq_create_proc(name, value, maxclsyspri, 50, INT_MAX, - spa->spa_proc, flags)); + + tqs->stqs_taskq[i] = tq; + } +} + +static void +spa_taskqs_fini(spa_t *spa, zio_type_t t, zio_taskq_type_t q) +{ + spa_taskqs_t *tqs = &spa->spa_zio_taskq[t][q]; + + if (tqs->stqs_taskq == NULL) { + ASSERT0(tqs->stqs_count); + return; + } + + for (uint_t i = 0; i < tqs->stqs_count; i++) { + ASSERT3P(tqs->stqs_taskq[i], !=, NULL); + taskq_destroy(tqs->stqs_taskq[i]); + } + + kmem_free(tqs->stqs_taskq, tqs->stqs_count * sizeof (taskq_t *)); + tqs->stqs_taskq = NULL; +} + +/* + * Dispatch a task to the appropriate taskq for the ZFS I/O type and priority. 
+ * Note that a type may have multiple discrete taskqs to avoid lock contention + * on the taskq itself. In that case we choose which taskq at random by using + * the low bits of gethrtime(). + */ +void +spa_taskq_dispatch_ent(spa_t *spa, zio_type_t t, zio_taskq_type_t q, + task_func_t *func, void *arg, uint_t flags, taskq_ent_t *ent) +{ + spa_taskqs_t *tqs = &spa->spa_zio_taskq[t][q]; + taskq_t *tq; + + ASSERT3P(tqs->stqs_taskq, !=, NULL); + ASSERT3U(tqs->stqs_count, !=, 0); + + if (tqs->stqs_count == 1) { + tq = tqs->stqs_taskq[0]; + } else { + tq = tqs->stqs_taskq[gethrtime() % tqs->stqs_count]; + } + + taskq_dispatch_ent(tq, func, arg, flags, ent); } static void @@ -870,16 +959,7 @@ spa_create_zio_taskqs(spa_t *spa) { for (int t = 0; t < ZIO_TYPES; t++) { for (int q = 0; q < ZIO_TASKQ_TYPES; q++) { - const zio_taskq_info_t *ztip = &zio_taskqs[t][q]; - enum zti_modes mode = ztip->zti_mode; - uint_t value = ztip->zti_value; - char name[32]; - - (void) snprintf(name, sizeof (name), - "%s_%s", zio_type_name[t], zio_taskq_types[q]); - - spa->spa_zio_taskq[t][q] = - spa_taskq_create(spa, name, mode, value); + spa_taskqs_init(spa, t, q); } } } @@ -1056,9 +1136,7 @@ spa_deactivate(spa_t *spa) for (int t = 0; t < ZIO_TYPES; t++) { for (int q = 0; q < ZIO_TASKQ_TYPES; q++) { - if (spa->spa_zio_taskq[t][q] != NULL) - taskq_destroy(spa->spa_zio_taskq[t][q]); - spa->spa_zio_taskq[t][q] = NULL; + spa_taskqs_fini(spa, t, q); } } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Thu Jan 16 14:48:23 2014 (r260752) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h Thu Jan 16 14:48:26 2014 (r260753) @@ -81,16 +81,16 @@ typedef struct spa_config_dirent { char *scd_path; } spa_config_dirent_t; -enum zio_taskq_type { +typedef enum zio_taskq_type { ZIO_TASKQ_ISSUE = 0, 
ZIO_TASKQ_ISSUE_HIGH, ZIO_TASKQ_INTERRUPT, ZIO_TASKQ_INTERRUPT_HIGH, ZIO_TASKQ_TYPES -}; +} zio_taskq_type_t; /* - * State machine for the zpool-pooname process. The states transitions + * State machine for the zpool-poolname process. The states transitions * are done as follows: * * From To Routine @@ -108,6 +108,11 @@ typedef enum spa_proc_state { SPA_PROC_GONE /* spa_thread() is exiting, spa_proc = &p0 */ } spa_proc_state_t; +typedef struct spa_taskqs { + uint_t stqs_count; + taskq_t **stqs_taskq; +} spa_taskqs_t; + struct spa { /* * Fields protected by spa_namespace_lock. @@ -126,7 +131,7 @@ struct spa { uint8_t spa_sync_on; /* sync threads are running */ spa_load_state_t spa_load_state; /* current load operation */ uint64_t spa_import_flags; /* import specific flags */ - taskq_t *spa_zio_taskq[ZIO_TYPES][ZIO_TASKQ_TYPES]; + spa_taskqs_t spa_zio_taskq[ZIO_TYPES][ZIO_TASKQ_TYPES]; dsl_pool_t *spa_dsl_pool; boolean_t spa_is_initializing; /* true while opening pool */ metaslab_class_t *spa_normal_class; /* normal data class */ @@ -257,6 +262,9 @@ struct spa { extern const char *spa_config_path; +extern void spa_taskq_dispatch_ent(spa_t *spa, zio_type_t t, zio_taskq_type_t q, + task_func_t *func, void *arg, uint_t flags, taskq_ent_t *ent); + #ifdef __cplusplus } #endif Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jan 16 14:48:23 2014 (r260752) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jan 16 14:48:26 2014 (r260753) @@ -1180,7 +1180,7 @@ zio_free_bp_init(zio_t *zio) */ static void -zio_taskq_dispatch(zio_t *zio, enum zio_taskq_type q, boolean_t cutinline) +zio_taskq_dispatch(zio_t *zio, zio_taskq_type_t q, boolean_t cutinline) { spa_t *spa = zio->io_spa; zio_type_t t = zio->io_type; @@ -1203,10 +1203,11 @@ zio_taskq_dispatch(zio_t *zio, enum zio_ t = ZIO_TYPE_NULL; 
/* - * If this is a high priority I/O, then use the high priority taskq. + * If this is a high priority I/O, then use the high priority taskq if + * available. */ if (zio->io_priority == ZIO_PRIORITY_NOW && - spa->spa_zio_taskq[t][q + 1] != NULL) + spa->spa_zio_taskq[t][q + 1].stqs_count != 0) q++; ASSERT3U(q, <, ZIO_TASKQ_TYPES); @@ -1221,19 +1222,24 @@ zio_taskq_dispatch(zio_t *zio, enum zio_ #else ASSERT(zio->io_tqent.tqent_task.ta_pending == 0); #endif - taskq_dispatch_ent(spa->spa_zio_taskq[t][q], - (task_func_t *)zio_execute, zio, flags, &zio->io_tqent); + spa_taskq_dispatch_ent(spa, t, q, (task_func_t *)zio_execute, zio, + flags, &zio->io_tqent); } static boolean_t -zio_taskq_member(zio_t *zio, enum zio_taskq_type q) +zio_taskq_member(zio_t *zio, zio_taskq_type_t q) { kthread_t *executor = zio->io_executor; spa_t *spa = zio->io_spa; - for (zio_type_t t = 0; t < ZIO_TYPES; t++) - if (taskq_member(spa->spa_zio_taskq[t][q], executor)) - return (B_TRUE); + for (zio_type_t t = 0; t < ZIO_TYPES; t++) { + spa_taskqs_t *tqs = &spa->spa_zio_taskq[t][q]; + uint_t i; + for (i = 0; i < tqs->stqs_count; i++) { + if (taskq_member(tqs->stqs_taskq[i], executor)) + return (B_TRUE); + } + } return (B_FALSE); } @@ -3145,10 +3151,9 @@ zio_done(zio_t *zio) #else ASSERT(zio->io_tqent.tqent_task.ta_pending == 0); #endif - (void) taskq_dispatch_ent( - spa->spa_zio_taskq[ZIO_TYPE_CLAIM][ZIO_TASKQ_ISSUE], - (task_func_t *)zio_reexecute, zio, 0, - &zio->io_tqent); + spa_taskq_dispatch_ent(spa, ZIO_TYPE_CLAIM, + ZIO_TASKQ_ISSUE, (task_func_t *)zio_reexecute, zio, + 0, &zio->io_tqent); } return (ZIO_PIPELINE_STOP); } From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 15:11:51 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 47921BE9; Thu, 16 Jan 2014 15:11:51 +0000 (UTC) Received: 
from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 3024E15E1; Thu, 16 Jan 2014 15:11:51 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GFBpaU053070; Thu, 16 Jan 2014 15:11:51 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GFBm6U053058; Thu, 16 Jan 2014 15:11:48 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161511.s0GFBm6U053058@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 15:11:48 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260755 - in stable/8: cddl/contrib/opensolaris/lib/libzpool/common cddl/contrib/opensolaris/lib/libzpool/common/sys sys/cddl/compat/opensolaris/sys sys/cddl/contrib/opensolaris/uts/com... X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 15:11:51 -0000 Author: avg Date: Thu Jan 16 15:11:48 2014 New Revision: 260755 URL: http://svnweb.freebsd.org/changeset/base/260755 Log: MFC r255437: MFV r247844 (illumos-gate 13975:ef6409bc370f) Note that a different kind of cv_timedwait_hires shim is provided in this branch because cv_timedwait_sbt is not available for better emulation. 
Sponsored by: HybridCluster [merge] Modified: stable/8/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h stable/8/sys/cddl/compat/opensolaris/sys/kcondvar.h stable/8/sys/cddl/compat/opensolaris/sys/time.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Directory Properties: stable/8/cddl/contrib/opensolaris/ (props changed) stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/cddl/contrib/opensolaris/lib/libzpool/common/kernel.c Thu Jan 16 15:11:48 2014 (r260755) @@ -346,6 +346,41 @@ top: return (1); } +/*ARGSUSED*/ +clock_t +cv_timedwait_hires(kcondvar_t *cv, kmutex_t *mp, hrtime_t tim, hrtime_t res, + int flag) +{ + int error; + timestruc_t ts; + hrtime_t delta; + + ASSERT(flag == 0); + +top: + delta = tim - gethrtime(); + if (delta <= 0) + return (-1); + + ts.tv_sec = delta / NANOSEC; + ts.tv_nsec = delta % NANOSEC; + + ASSERT(mutex_owner(mp) == curthread); + mp->m_owner = NULL; + error = pthread_cond_timedwait(cv, &mp->m_lock, &ts); + mp->m_owner = curthread; + + if (error == ETIMEDOUT) + return (-1); + + if (error == EINTR) + goto top; + + ASSERT(error == 0); + + return (1); +} + void cv_signal(kcondvar_t *cv) { Modified: 
stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Thu Jan 16 15:11:48 2014 (r260755) @@ -316,6 +316,8 @@ extern void cv_init(kcondvar_t *cv, char extern void cv_destroy(kcondvar_t *cv); extern void cv_wait(kcondvar_t *cv, kmutex_t *mp); extern clock_t cv_timedwait(kcondvar_t *cv, kmutex_t *mp, clock_t abstime); +extern clock_t cv_timedwait_hires(kcondvar_t *cvp, kmutex_t *mp, hrtime_t tim, + hrtime_t res, int flag); extern void cv_signal(kcondvar_t *cv); extern void cv_broadcast(kcondvar_t *cv); Modified: stable/8/sys/cddl/compat/opensolaris/sys/kcondvar.h ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/sys/kcondvar.h Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/compat/opensolaris/sys/kcondvar.h Thu Jan 16 15:11:48 2014 (r260755) @@ -1,5 +1,6 @@ /*- * Copyright (c) 2007 Pawel Jakub Dawidek + * Copyright (c) 2013 iXsystems, Inc. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -36,6 +37,8 @@ #include #include +#include +#include typedef struct cv kcondvar_t; @@ -57,6 +60,18 @@ typedef enum { } while (0) #define cv_init(cv, name, type, arg) zfs_cv_init((cv), (name), (type), (arg)) +static clock_t +cv_timedwait_hires(kcondvar_t *cvp, kmutex_t *mp, hrtime_t tim, hrtime_t res, + int flag) +{ + /* XXX real hires is not available. */ + + /* We do not attempt to support any flags yet. 
*/ + ASSERT(flag == 0); + + return (cv_timedwait(cvp, mp, NSEC_TO_TICK(tim))); +} + #endif /* _KERNEL */ #endif /* _OPENSOLARIS_SYS_CONDVAR_H_ */ Modified: stable/8/sys/cddl/compat/opensolaris/sys/time.h ============================================================================== --- stable/8/sys/cddl/compat/opensolaris/sys/time.h Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/compat/opensolaris/sys/time.h Thu Jan 16 15:11:48 2014 (r260755) @@ -37,6 +37,9 @@ #define NANOSEC 1000000000 #define TIME_MAX LLONG_MAX +#define MSEC2NSEC(m) ((hrtime_t)(m) * (NANOSEC / MILLISEC)) +#define NSEC2MSEC(n) ((n) / (NANOSEC / MILLISEC)) + typedef longlong_t hrtime_t; #if defined(__i386__) || defined(__powerpc__) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c Thu Jan 16 15:11:48 2014 (r260755) @@ -744,7 +744,8 @@ dsl_dir_tempreserve_space(dsl_dir_t *dd, err = dsl_pool_tempreserve_space(dd->dd_pool, asize, tx); } else { if (err == EAGAIN) { - txg_delay(dd->dd_pool, tx->tx_txg, 1); + txg_delay(dd->dd_pool, tx->tx_txg, + MSEC2NSEC(10), MSEC2NSEC(10)); err = SET_ERROR(ERESTART); } dsl_pool_memory_pressure(dd->dd_pool); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c Thu Jan 16 15:11:48 2014 (r260755) @@ -85,6 +85,9 @@ SYSCTL_QUAD(_vfs_zfs, OID_AUTO, write_li &zfs_write_limit_override, 0, "Force a txg if dirty buffers exceed this value (bytes)"); +hrtime_t zfs_throttle_delay = MSEC2NSEC(10); +hrtime_t 
zfs_throttle_resolution = MSEC2NSEC(10); + int dsl_pool_open_special_dir(dsl_pool_t *dp, const char *name, dsl_dir_t **ddp) { @@ -538,12 +541,13 @@ dsl_pool_sync(dsl_pool_t *dp, uint64_t t * Weight the throughput calculation towards the current value: * thru = 3/4 old_thru + 1/4 new_thru * - * Note: write_time is in nanosecs, so write_time/MICROSEC - * yields millisecs + * Note: write_time is in nanosecs while dp_throughput is expressed in + * bytes per millisecond. */ ASSERT(zfs_write_limit_min > 0); - if (data_written > zfs_write_limit_min / 8 && write_time > MICROSEC) { - uint64_t throughput = data_written / (write_time / MICROSEC); + if (data_written > zfs_write_limit_min / 8 && + write_time > MSEC2NSEC(1)) { + uint64_t throughput = data_written / NSEC2MSEC(write_time); if (dp->dp_throughput) dp->dp_throughput = throughput / 4 + @@ -641,8 +645,10 @@ dsl_pool_tempreserve_space(dsl_pool_t *d * the caller 1 clock tick. This will slow down the "fill" * rate until the sync process can catch up with us. 
*/ - if (reserved && reserved > (write_limit - (write_limit >> 3))) - txg_delay(dp, tx->tx_txg, 1); + if (reserved && reserved > (write_limit - (write_limit >> 3))) { + txg_delay(dp, tx->tx_txg, zfs_throttle_delay, + zfs_throttle_resolution); + } return (0); } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c Thu Jan 16 15:11:48 2014 (r260755) @@ -443,7 +443,7 @@ dsl_scan_check_pause(dsl_scan_t *scn, co zfs_resilver_min_time_ms : zfs_scan_min_time_ms; elapsed_nanosecs = gethrtime() - scn->scn_sync_start_time; if (elapsed_nanosecs / NANOSEC > zfs_txg_timeout || - (elapsed_nanosecs / MICROSEC > mintime && + (NSEC2MSEC(elapsed_nanosecs) > mintime && txg_sync_waiting(scn->scn_dp)) || spa_shutting_down(scn->scn_dp->dp_spa)) { if (zb) { @@ -1348,7 +1348,7 @@ dsl_scan_free_should_pause(dsl_scan_t *s elapsed_nanosecs = gethrtime() - scn->scn_sync_start_time; return (elapsed_nanosecs / NANOSEC > zfs_txg_timeout || - (elapsed_nanosecs / MICROSEC > zfs_free_min_time_ms && + (NSEC2MSEC(elapsed_nanosecs) > zfs_free_min_time_ms && txg_sync_waiting(scn->scn_dp)) || spa_shutting_down(scn->scn_dp->dp_spa)); } @@ -1472,7 +1472,7 @@ dsl_scan_sync(dsl_pool_t *dp, dmu_tx_t * "free_bpobj/bptree txg %llu", (longlong_t)scn->scn_visited_this_txg, (longlong_t) - (gethrtime() - scn->scn_sync_start_time) / MICROSEC, + NSEC2MSEC(gethrtime() - scn->scn_sync_start_time), (longlong_t)tx->tx_txg); scn->scn_visited_this_txg = 0; /* @@ -1520,7 +1520,7 @@ dsl_scan_sync(dsl_pool_t *dp, dmu_tx_t * zfs_dbgmsg("visited %llu blocks in %llums", (longlong_t)scn->scn_visited_this_txg, - (longlong_t)(gethrtime() - scn->scn_sync_start_time) / MICROSEC); + (longlong_t)NSEC2MSEC(gethrtime() - scn->scn_sync_start_time)); if 
(!scn->scn_pausing) { /* finished with scan. */ Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Thu Jan 16 15:11:48 2014 (r260755) @@ -538,8 +538,8 @@ spa_add(const char *name, nvlist_t *conf hdlr.cyh_level = CY_LOW_LEVEL; #endif - spa->spa_deadman_synctime = zfs_deadman_synctime * - zfs_txg_synctime_ms * MICROSEC; + spa->spa_deadman_synctime = MSEC2NSEC(zfs_deadman_synctime * + zfs_txg_synctime_ms); #ifdef illumos /* @@ -548,7 +548,7 @@ spa_add(const char *name, nvlist_t *conf * an expensive operation we don't want to check too frequently. * Instead wait for 5 synctimes before checking again. */ - when.cyt_interval = 5ULL * zfs_txg_synctime_ms * MICROSEC; + when.cyt_interval = MSEC2NSEC(5 * zfs_txg_synctime_ms); when.cyt_when = CY_INFINITY; mutex_enter(&cpu_lock); spa->spa_deadman_cycid = cyclic_add(&hdlr, &when); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h Thu Jan 16 15:11:48 2014 (r260755) @@ -74,13 +74,8 @@ extern void txg_rele_to_quiesce(txg_hand extern void txg_rele_to_sync(txg_handle_t *txghp); extern void txg_register_callbacks(txg_handle_t *txghp, list_t *tx_callbacks); -/* - * Delay the caller by the specified number of ticks or until - * the txg closes (whichever comes first). This is intended - * to be used to throttle writers when the system nears its - * capacity. 
- */ -extern void txg_delay(struct dsl_pool *dp, uint64_t txg, int ticks); +extern void txg_delay(struct dsl_pool *dp, uint64_t txg, hrtime_t delta, + hrtime_t resolution); /* * Wait until the given transaction group has finished syncing. Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h Thu Jan 16 15:11:48 2014 (r260755) @@ -70,7 +70,7 @@ struct tx_cpu { kmutex_t tc_open_lock; /* protects tx_open_txg */ kmutex_t tc_lock; /* protects the rest of this struct */ kcondvar_t tc_cv[TXG_SIZE]; - uint64_t tc_count[TXG_SIZE]; + uint64_t tc_count[TXG_SIZE]; /* tx hold count on each txg */ list_t tc_callbacks[TXG_SIZE]; /* commit cb list */ char tc_pad[8]; /* pad to fill 3 cache lines */ }; @@ -87,8 +87,8 @@ struct tx_cpu { * every cpu (see txg_quiesce()). 
*/ typedef struct tx_state { - tx_cpu_t *tx_cpu; /* protects right to enter txg */ - kmutex_t tx_sync_lock; /* protects tx_state_t */ + tx_cpu_t *tx_cpu; /* protects access to tx_open_txg */ + kmutex_t tx_sync_lock; /* protects the rest of this struct */ uint64_t tx_open_txg; /* currently open txg id */ uint64_t tx_quiesced_txg; /* quiesced txg waiting for sync */ uint64_t tx_syncing_txg; /* currently syncing txg id */ Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Thu Jan 16 15:10:29 2014 (r260754) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Thu Jan 16 15:11:48 2014 (r260755) @@ -172,7 +172,7 @@ txg_thread_exit(tx_state_t *tx, callb_cp } static void -txg_thread_wait(tx_state_t *tx, callb_cpr_t *cpr, kcondvar_t *cv, uint64_t time) +txg_thread_wait(tx_state_t *tx, callb_cpr_t *cpr, kcondvar_t *cv, clock_t time) { CALLB_CPR_SAFE_BEGIN(cpr); @@ -301,6 +301,9 @@ txg_quiesce(dsl_pool_t *dp, uint64_t txg ASSERT(txg == tx->tx_open_txg); tx->tx_open_txg++; + DTRACE_PROBE2(txg__quiescing, dsl_pool_t *, dp, uint64_t, txg); + DTRACE_PROBE2(txg__opened, dsl_pool_t *, dp, uint64_t, tx->tx_open_txg); + /* * Now that we've incremented tx_open_txg, we can let threads * enter the next transaction group. 
@@ -432,6 +435,7 @@ txg_sync_thread(void *arg) txg = tx->tx_quiesced_txg; tx->tx_quiesced_txg = 0; tx->tx_syncing_txg = txg; + DTRACE_PROBE2(txg__syncing, dsl_pool_t *, dp, uint64_t, txg); cv_broadcast(&tx->tx_quiesce_more_cv); dprintf("txg=%llu quiesce_txg=%llu sync_txg=%llu\n", @@ -445,6 +449,7 @@ txg_sync_thread(void *arg) mutex_enter(&tx->tx_sync_lock); tx->tx_synced_txg = txg; tx->tx_syncing_txg = 0; + DTRACE_PROBE2(txg__synced, dsl_pool_t *, dp, uint64_t, txg); cv_broadcast(&tx->tx_sync_done_cv); /* @@ -494,21 +499,22 @@ txg_quiesce_thread(void *arg) */ dprintf("quiesce done, handing off txg %llu\n", txg); tx->tx_quiesced_txg = txg; + DTRACE_PROBE2(txg__quiesced, dsl_pool_t *, dp, uint64_t, txg); cv_broadcast(&tx->tx_sync_more_cv); cv_broadcast(&tx->tx_quiesce_done_cv); } } /* - * Delay this thread by 'ticks' if we are still in the open transaction - * group and there is already a waiting txg quiescing or quiesced. - * Abort the delay if this txg stalls or enters the quiescing state. + * Delay this thread by delay nanoseconds if we are still in the open + * transaction group and there is already a waiting txg quiesing or quiesced. + * Abort the delay if this txg stalls or enters the quiesing state. 
*/ void -txg_delay(dsl_pool_t *dp, uint64_t txg, int ticks) +txg_delay(dsl_pool_t *dp, uint64_t txg, hrtime_t delay, hrtime_t resolution) { tx_state_t *tx = &dp->dp_tx; - clock_t timeout = ddi_get_lbolt() + ticks; + hrtime_t start = gethrtime(); /* don't delay if this txg could transition to quiescing immediately */ if (tx->tx_open_txg > txg || @@ -521,10 +527,11 @@ txg_delay(dsl_pool_t *dp, uint64_t txg, return; } - while (ddi_get_lbolt() < timeout && - tx->tx_syncing_txg < txg-1 && !txg_stalled(dp)) - (void) cv_timedwait(&tx->tx_quiesce_more_cv, &tx->tx_sync_lock, - timeout - ddi_get_lbolt()); + while (gethrtime() - start < delay && + tx->tx_syncing_txg < txg-1 && !txg_stalled(dp)) { + (void) cv_timedwait_hires(&tx->tx_quiesce_more_cv, + &tx->tx_sync_lock, delay, resolution, 0); + } mutex_exit(&tx->tx_sync_lock); } From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 15:22:50 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 23C88F7D; Thu, 16 Jan 2014 15:22:50 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 0EFA916A9; Thu, 16 Jan 2014 15:22:50 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GFMn2v057082; Thu, 16 Jan 2014 15:22:49 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GFMnsB057081; Thu, 16 Jan 2014 15:22:49 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161522.s0GFMnsB057081@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 15:22:49 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, 
svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260757 - stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 15:22:50 -0000 Author: avg Date: Thu Jan 16 15:22:49 2014 New Revision: 260757 URL: http://svnweb.freebsd.org/changeset/base/260757 Log: MFC r248426: Fix typo in sysctl description MFC slacker: mm Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Thu Jan 16 15:22:26 2014 (r260756) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c Thu Jan 16 15:22:49 2014 (r260757) @@ -275,7 +275,7 @@ uint64_t zfs_deadman_synctime = 1000ULL; TUNABLE_QUAD("vfs.zfs.deadman_synctime", &zfs_deadman_synctime); SYSCTL_QUAD(_vfs_zfs, OID_AUTO, deadman_synctime, CTLFLAG_RDTUN, &zfs_deadman_synctime, 0, - "Stalled ZFS I/O expiration time in units of vfs.zfs.txg_synctime_ms"); + "Stalled ZFS I/O expiration time in units of vfs.zfs.txg.synctime_ms"); /* * Default value of -1 for zfs_deadman_enabled is resolved in From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 15:29:46 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client 
certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id C084B2E4; Thu, 16 Jan 2014 15:29:46 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id AA43F16EB; Thu, 16 Jan 2014 15:29:46 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GFTkrP058002; Thu, 16 Jan 2014 15:29:46 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GFTjfs057990; Thu, 16 Jan 2014 15:29:45 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161529.s0GFTjfs057990@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 15:29:45 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260759 - in stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . 
sys X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 15:29:46 -0000 Author: avg Date: Thu Jan 16 15:29:44 2014 New Revision: 260759 URL: http://svnweb.freebsd.org/changeset/base/260759 Log: MFC r251478: MFV r251474: 3137 L2ARC compression MFC slacker: delphij Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Thu Jan 16 15:26:16 2014 (r260758) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Thu Jan 16 15:29:44 2014 (r260759) @@ -22,6 +22,7 @@ * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. 
*/ /* @@ -120,6 +121,7 @@ #include #include +#include #include #include #include @@ -333,7 +335,11 @@ typedef struct arc_stats { kstat_named_t arcstat_l2_cksum_bad; kstat_named_t arcstat_l2_io_error; kstat_named_t arcstat_l2_size; + kstat_named_t arcstat_l2_asize; kstat_named_t arcstat_l2_hdr_size; + kstat_named_t arcstat_l2_compress_successes; + kstat_named_t arcstat_l2_compress_zeros; + kstat_named_t arcstat_l2_compress_failures; kstat_named_t arcstat_l2_write_trylock_fail; kstat_named_t arcstat_l2_write_passed_headroom; kstat_named_t arcstat_l2_write_spa_mismatch; @@ -406,7 +412,11 @@ static arc_stats_t arc_stats = { { "l2_cksum_bad", KSTAT_DATA_UINT64 }, { "l2_io_error", KSTAT_DATA_UINT64 }, { "l2_size", KSTAT_DATA_UINT64 }, + { "l2_asize", KSTAT_DATA_UINT64 }, { "l2_hdr_size", KSTAT_DATA_UINT64 }, + { "l2_compress_successes", KSTAT_DATA_UINT64 }, + { "l2_compress_zeros", KSTAT_DATA_UINT64 }, + { "l2_compress_failures", KSTAT_DATA_UINT64 }, { "l2_write_trylock_fail", KSTAT_DATA_UINT64 }, { "l2_write_passed_headroom", KSTAT_DATA_UINT64 }, { "l2_write_spa_mismatch", KSTAT_DATA_UINT64 }, @@ -485,6 +495,9 @@ static arc_state_t *arc_l2c_only; #define arc_c_min ARCSTAT(arcstat_c_min) /* min target cache size */ #define arc_c_max ARCSTAT(arcstat_c_max) /* max target cache size */ +#define L2ARC_IS_VALID_COMPRESS(_c_) \ + ((_c_) == ZIO_COMPRESS_LZ4 || (_c_) == ZIO_COMPRESS_EMPTY) + static int arc_no_grow; /* Don't try to grow cache size */ static uint64_t arc_tempreserve; static uint64_t arc_loaned_bytes; @@ -647,7 +660,12 @@ uint64_t zfs_crc64_table[256]; */ #define L2ARC_WRITE_SIZE (8 * 1024 * 1024) /* initial write max */ -#define L2ARC_HEADROOM 2 /* num of writes */ +#define L2ARC_HEADROOM 2 /* num of writes */ +/* + * If we discover during ARC scan any buffers to be compressed, we boost + * our headroom for the next scanning cycle by this percentage multiple. 
+ */ +#define L2ARC_HEADROOM_BOOST 200 #define L2ARC_FEED_SECS 1 /* caching interval secs */ #define L2ARC_FEED_MIN_MS 200 /* min caching interval ms */ @@ -658,6 +676,7 @@ uint64_t zfs_crc64_table[256]; uint64_t l2arc_write_max = L2ARC_WRITE_SIZE; /* default max write size */ uint64_t l2arc_write_boost = L2ARC_WRITE_SIZE; /* extra write during warmup */ uint64_t l2arc_headroom = L2ARC_HEADROOM; /* number of dev writes */ +uint64_t l2arc_headroom_boost = L2ARC_HEADROOM_BOOST; uint64_t l2arc_feed_secs = L2ARC_FEED_SECS; /* interval seconds */ uint64_t l2arc_feed_min_ms = L2ARC_FEED_MIN_MS; /* min interval milliseconds */ boolean_t l2arc_noprefetch = B_TRUE; /* don't cache prefetch bufs */ @@ -731,8 +750,6 @@ typedef struct l2arc_dev { vdev_t *l2ad_vdev; /* vdev */ spa_t *l2ad_spa; /* spa */ uint64_t l2ad_hand; /* next write location */ - uint64_t l2ad_write; /* desired write size, bytes */ - uint64_t l2ad_boost; /* warmup write boost, bytes */ uint64_t l2ad_start; /* first addr on device */ uint64_t l2ad_end; /* last addr on device */ uint64_t l2ad_evict; /* last addr eviction reached */ @@ -753,11 +770,12 @@ static kmutex_t l2arc_free_on_write_mtx; static uint64_t l2arc_ndev; /* number of devices */ typedef struct l2arc_read_callback { - arc_buf_t *l2rcb_buf; /* read buffer */ - spa_t *l2rcb_spa; /* spa */ - blkptr_t l2rcb_bp; /* original blkptr */ - zbookmark_t l2rcb_zb; /* original bookmark */ - int l2rcb_flags; /* original flags */ + arc_buf_t *l2rcb_buf; /* read buffer */ + spa_t *l2rcb_spa; /* spa */ + blkptr_t l2rcb_bp; /* original blkptr */ + zbookmark_t l2rcb_zb; /* original bookmark */ + int l2rcb_flags; /* original flags */ + enum zio_compress l2rcb_compress; /* applied compress */ } l2arc_read_callback_t; typedef struct l2arc_write_callback { @@ -767,8 +785,14 @@ typedef struct l2arc_write_callback { struct l2arc_buf_hdr { /* protected by arc_buf_hdr mutex */ - l2arc_dev_t *b_dev; /* L2ARC device */ - uint64_t b_daddr; /* disk address, offset byte */ + 
l2arc_dev_t *b_dev; /* L2ARC device */ + uint64_t b_daddr; /* disk address, offset byte */ + /* compression applied to buffer data */ + enum zio_compress b_compress; + /* real alloc'd buffer size depending on b_compress applied */ + int b_asize; + /* temporary buffer holder for in-flight compressed data */ + void *b_tmp_cdata; }; typedef struct l2arc_data_free { @@ -787,6 +811,11 @@ static void l2arc_read_done(zio_t *zio); static void l2arc_hdr_stat_add(void); static void l2arc_hdr_stat_remove(void); +static boolean_t l2arc_compress_buf(l2arc_buf_hdr_t *l2hdr); +static void l2arc_decompress_zio(zio_t *zio, arc_buf_hdr_t *hdr, + enum zio_compress c); +static void l2arc_release_cdata_buf(arc_buf_hdr_t *ab); + static uint64_t buf_hash(uint64_t spa, const dva_t *dva, uint64_t birth) { @@ -1705,6 +1734,7 @@ arc_hdr_destroy(arc_buf_hdr_t *hdr) hdr->b_size, 0); list_remove(l2hdr->b_dev->l2ad_buflist, hdr); ARCSTAT_INCR(arcstat_l2_size, -hdr->b_size); + ARCSTAT_INCR(arcstat_l2_asize, -l2hdr->b_asize); kmem_free(l2hdr, sizeof (l2arc_buf_hdr_t)); if (hdr->b_state == arc_l2c_only) l2arc_hdr_stat_remove(); @@ -3127,6 +3157,8 @@ top: arc_access(hdr, hash_lock); if (*arc_flags & ARC_L2CACHE) hdr->b_flags |= ARC_L2CACHE; + if (*arc_flags & ARC_L2COMPRESS) + hdr->b_flags |= ARC_L2COMPRESS; mutex_exit(hash_lock); ARCSTAT_BUMP(arcstat_hits); ARCSTAT_CONDSTAT(!(hdr->b_flags & ARC_PREFETCH), @@ -3167,6 +3199,8 @@ top: } if (*arc_flags & ARC_L2CACHE) hdr->b_flags |= ARC_L2CACHE; + if (*arc_flags & ARC_L2COMPRESS) + hdr->b_flags |= ARC_L2COMPRESS; if (BP_GET_LEVEL(bp) > 0) hdr->b_flags |= ARC_INDIRECT; } else { @@ -3183,6 +3217,8 @@ top: add_reference(hdr, hash_lock, private); if (*arc_flags & ARC_L2CACHE) hdr->b_flags |= ARC_L2CACHE; + if (*arc_flags & ARC_L2COMPRESS) + hdr->b_flags |= ARC_L2COMPRESS; buf = kmem_cache_alloc(buf_cache, KM_PUSHPAGE); buf->b_hdr = hdr; buf->b_data = NULL; @@ -3260,6 +3296,7 @@ top: cb->l2rcb_bp = *bp; cb->l2rcb_zb = *zb; cb->l2rcb_flags = zio_flags; + 
cb->l2rcb_compress = hdr->b_l2hdr->b_compress; ASSERT(addr >= VDEV_LABEL_START_SIZE && addr + size < vd->vdev_psize - @@ -3268,16 +3305,31 @@ top: /* * l2arc read. The SCL_L2ARC lock will be * released by l2arc_read_done(). + * Issue a null zio if the underlying buffer + * was squashed to zero size by compression. */ - rzio = zio_read_phys(pio, vd, addr, size, - buf->b_data, ZIO_CHECKSUM_OFF, - l2arc_read_done, cb, priority, zio_flags | - ZIO_FLAG_DONT_CACHE | ZIO_FLAG_CANFAIL | - ZIO_FLAG_DONT_PROPAGATE | - ZIO_FLAG_DONT_RETRY, B_FALSE); + if (hdr->b_l2hdr->b_compress == + ZIO_COMPRESS_EMPTY) { + rzio = zio_null(pio, spa, vd, + l2arc_read_done, cb, + zio_flags | ZIO_FLAG_DONT_CACHE | + ZIO_FLAG_CANFAIL | + ZIO_FLAG_DONT_PROPAGATE | + ZIO_FLAG_DONT_RETRY); + } else { + rzio = zio_read_phys(pio, vd, addr, + hdr->b_l2hdr->b_asize, + buf->b_data, ZIO_CHECKSUM_OFF, + l2arc_read_done, cb, priority, + zio_flags | ZIO_FLAG_DONT_CACHE | + ZIO_FLAG_CANFAIL | + ZIO_FLAG_DONT_PROPAGATE | + ZIO_FLAG_DONT_RETRY, B_FALSE); + } DTRACE_PROBE2(l2arc__read, vdev_t *, vd, zio_t *, rzio); - ARCSTAT_INCR(arcstat_l2_read_bytes, size); + ARCSTAT_INCR(arcstat_l2_read_bytes, + hdr->b_l2hdr->b_asize); if (*arc_flags & ARC_NOWAIT) { zio_nowait(rzio); @@ -3545,6 +3597,7 @@ arc_release(arc_buf_t *buf, void *tag) buf->b_private = NULL; if (l2hdr) { + ARCSTAT_INCR(arcstat_l2_asize, -l2hdr->b_asize); trim_map_free(l2hdr->b_dev->l2ad_vdev, l2hdr->b_daddr, hdr->b_size, 0); kmem_free(l2hdr, sizeof (l2arc_buf_hdr_t)); @@ -3695,9 +3748,9 @@ arc_write_done(zio_t *zio) zio_t * arc_write(zio_t *pio, spa_t *spa, uint64_t txg, - blkptr_t *bp, arc_buf_t *buf, boolean_t l2arc, const zio_prop_t *zp, - arc_done_func_t *ready, arc_done_func_t *done, void *private, - int priority, int zio_flags, const zbookmark_t *zb) + blkptr_t *bp, arc_buf_t *buf, boolean_t l2arc, boolean_t l2arc_compress, + const zio_prop_t *zp, arc_done_func_t *ready, arc_done_func_t *done, + void *private, int priority, int zio_flags, const 
zbookmark_t *zb) { arc_buf_hdr_t *hdr = buf->b_hdr; arc_write_callback_t *callback; @@ -3710,6 +3763,8 @@ arc_write(zio_t *pio, spa_t *spa, uint64 ASSERT(hdr->b_acb == NULL); if (l2arc) hdr->b_flags |= ARC_L2CACHE; + if (l2arc_compress) + hdr->b_flags |= ARC_L2COMPRESS; callback = kmem_zalloc(sizeof (arc_write_callback_t), KM_SLEEP); callback->awcb_ready = ready; callback->awcb_done = done; @@ -4160,8 +4215,12 @@ arc_fini(void) * 2. The L2ARC attempts to cache data from the ARC before it is evicted. * It does this by periodically scanning buffers from the eviction-end of * the MFU and MRU ARC lists, copying them to the L2ARC devices if they are - * not already there. It scans until a headroom of buffers is satisfied, - * which itself is a buffer for ARC eviction. The thread that does this is + * not already there. It scans until a headroom of buffers is satisfied, + * which itself is a buffer for ARC eviction. If a compressible buffer is + * found during scanning and selected for writing to an L2ARC device, we + * temporarily boost scanning headroom during the next scan cycle to make + * sure we adapt to compression effects (which might significantly reduce + * the data volume we write to L2ARC). 
The thread that does this is * l2arc_feed_thread(), illustrated below; example sizes are included to * provide a better sense of ratio than this diagram: * @@ -4226,6 +4285,11 @@ arc_fini(void) * l2arc_write_boost extra write bytes during device warmup * l2arc_noprefetch skip caching prefetched buffers * l2arc_headroom number of max device writes to precache + * l2arc_headroom_boost when we find compressed buffers during ARC + * scanning, we multiply headroom by this + * percentage factor for the next scan cycle, + * since more compressed buffers are likely to + * be present * l2arc_feed_secs seconds between L2ARC writing * * Tunables may be removed or added as future performance improvements are @@ -4272,14 +4336,24 @@ l2arc_write_eligible(uint64_t spa_guid, } static uint64_t -l2arc_write_size(l2arc_dev_t *dev) +l2arc_write_size(void) { uint64_t size; - size = dev->l2ad_write; + /* + * Make sure our globals have meaningful values in case the user + * altered them. + */ + size = l2arc_write_max; + if (size == 0) { + cmn_err(CE_NOTE, "Bad value for l2arc_write_max, value must " + "be greater than zero, resetting it to the default (%d)", + L2ARC_WRITE_SIZE); + size = l2arc_write_max = L2ARC_WRITE_SIZE; + } if (arc_warm == B_FALSE) - size += dev->l2ad_boost; + size += l2arc_write_boost; return (size); @@ -4453,12 +4527,20 @@ l2arc_write_done(zio_t *zio) continue; } + abl2 = ab->b_l2hdr; + + /* + * Release the temporary compressed buffer as soon as possible. + */ + if (abl2->b_compress != ZIO_COMPRESS_OFF) + l2arc_release_cdata_buf(ab); + if (zio->io_error != 0) { /* * Error - drop L2ARC entry. */ list_remove(buflist, ab); - abl2 = ab->b_l2hdr; + ARCSTAT_INCR(arcstat_l2_asize, -abl2->b_asize); ab->b_l2hdr = NULL; trim_map_free(abl2->b_dev->l2ad_vdev, abl2->b_daddr, ab->b_size, 0); @@ -4513,6 +4595,13 @@ l2arc_read_done(zio_t *zio) ASSERT3P(hash_lock, ==, HDR_LOCK(hdr)); /* + * If the buffer was compressed, decompress it first. 
+ */ + if (cb->l2rcb_compress != ZIO_COMPRESS_OFF) + l2arc_decompress_zio(zio, hdr, cb->l2rcb_compress); + ASSERT(zio->io_data != NULL); + + /* * Check this survived the L2ARC journey. */ equal = arc_cksum_equal(buf); @@ -4708,6 +4797,7 @@ top: */ if (ab->b_l2hdr != NULL) { abl2 = ab->b_l2hdr; + ARCSTAT_INCR(arcstat_l2_asize, -abl2->b_asize); ab->b_l2hdr = NULL; kmem_free(abl2, sizeof (l2arc_buf_hdr_t)); ARCSTAT_INCR(arcstat_l2_size, -ab->b_size); @@ -4733,38 +4823,55 @@ top: * * An ARC_L2_WRITING flag is set so that the L2ARC buffers are not valid * for reading until they have completed writing. + * The headroom_boost is an in-out parameter used to maintain headroom boost + * state between calls to this function. + * + * Returns the number of bytes actually written (which may be smaller than + * the delta by which the device hand has changed due to alignment). */ static uint64_t -l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz) +l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev, uint64_t target_sz, + boolean_t *headroom_boost) { arc_buf_hdr_t *ab, *ab_prev, *head; - l2arc_buf_hdr_t *hdrl2; list_t *list; - uint64_t passed_sz, write_sz, buf_sz, headroom; + uint64_t write_asize, write_psize, write_sz, headroom, + buf_compress_minsz; void *buf_data; - kmutex_t *hash_lock, *list_lock; - boolean_t have_lock, full; + kmutex_t *list_lock; + boolean_t full; l2arc_write_callback_t *cb; zio_t *pio, *wzio; uint64_t guid = spa_load_guid(spa); + const boolean_t do_headroom_boost = *headroom_boost; int try; ASSERT(dev->l2ad_vdev != NULL); + /* Lower the flag now, we might want to raise it again later. */ + *headroom_boost = B_FALSE; + pio = NULL; - write_sz = 0; + write_sz = write_asize = write_psize = 0; full = B_FALSE; head = kmem_cache_alloc(hdr_cache, KM_PUSHPAGE); head->b_flags |= ARC_L2_WRITE_HEAD; ARCSTAT_BUMP(arcstat_l2_write_buffer_iter); /* + * We will want to try to compress buffers that are at least 2x the + * device sector size. 
+ */ + buf_compress_minsz = 2 << dev->l2ad_vdev->vdev_ashift; + + /* * Copy buffers for L2ARC writing. */ mutex_enter(&l2arc_buflist_mtx); for (try = 0; try < 2 * ARC_BUFC_NUMLISTS; try++) { + uint64_t passed_sz = 0; + list = l2arc_list_locked(try, &list_lock); - passed_sz = 0; ARCSTAT_BUMP(arcstat_l2_write_buffer_list_iter); /* @@ -4773,7 +4880,6 @@ l2arc_write_buffers(spa_t *spa, l2arc_de * Until the ARC is warm and starts to evict, read from the * head of the ARC lists rather than the tail. */ - headroom = target_sz * l2arc_headroom; if (arc_warm == B_FALSE) ab = list_head(list); else @@ -4781,7 +4887,15 @@ l2arc_write_buffers(spa_t *spa, l2arc_de if (ab == NULL) ARCSTAT_BUMP(arcstat_l2_write_buffer_list_null_iter); + headroom = target_sz * l2arc_headroom; + if (do_headroom_boost) + headroom = (headroom * l2arc_headroom_boost) / 100; + for (; ab; ab = ab_prev) { + l2arc_buf_hdr_t *l2hdr; + kmutex_t *hash_lock; + uint64_t buf_sz; + if (arc_warm == B_FALSE) ab_prev = list_next(list, ab); else @@ -4789,8 +4903,7 @@ l2arc_write_buffers(spa_t *spa, l2arc_de ARCSTAT_INCR(arcstat_l2_write_buffer_bytes_scanned, ab->b_size); hash_lock = HDR_LOCK(ab); - have_lock = MUTEX_HELD(hash_lock); - if (!have_lock && !mutex_tryenter(hash_lock)) { + if (!mutex_tryenter(hash_lock)) { ARCSTAT_BUMP(arcstat_l2_write_trylock_fail); /* * Skip this buffer rather than waiting. @@ -4840,15 +4953,26 @@ l2arc_write_buffers(spa_t *spa, l2arc_de /* * Create and add a new L2ARC header. */ - hdrl2 = kmem_zalloc(sizeof (l2arc_buf_hdr_t), KM_SLEEP); - hdrl2->b_dev = dev; - hdrl2->b_daddr = dev->l2ad_hand; - + l2hdr = kmem_zalloc(sizeof (l2arc_buf_hdr_t), KM_SLEEP); + l2hdr->b_dev = dev; ab->b_flags |= ARC_L2_WRITING; - ab->b_l2hdr = hdrl2; - list_insert_head(dev->l2ad_buflist, ab); - buf_data = ab->b_buf->b_data; + + /* + * Temporarily stash the data buffer in b_tmp_cdata. + * The subsequent write step will pick it up from + * there. 
This is because can't access ab->b_buf + * without holding the hash_lock, which we in turn + * can't access without holding the ARC list locks + * (which we want to avoid during compression/writing). + */ + l2hdr->b_compress = ZIO_COMPRESS_OFF; + l2hdr->b_asize = ab->b_size; + l2hdr->b_tmp_cdata = ab->b_buf->b_data; + buf_sz = ab->b_size; + ab->b_l2hdr = l2hdr; + + list_insert_head(dev->l2ad_buflist, ab); /* * Compute and store the buffer cksum before @@ -4859,6 +4983,64 @@ l2arc_write_buffers(spa_t *spa, l2arc_de mutex_exit(hash_lock); + write_sz += buf_sz; + } + + mutex_exit(list_lock); + + if (full == B_TRUE) + break; + } + + /* No buffers selected for writing? */ + if (pio == NULL) { + ASSERT0(write_sz); + mutex_exit(&l2arc_buflist_mtx); + kmem_cache_free(hdr_cache, head); + return (0); + } + + /* + * Now start writing the buffers. We're starting at the write head + * and work backwards, retracing the course of the buffer selector + * loop above. + */ + for (ab = list_prev(dev->l2ad_buflist, head); ab; + ab = list_prev(dev->l2ad_buflist, ab)) { + l2arc_buf_hdr_t *l2hdr; + uint64_t buf_sz; + + /* + * We shouldn't need to lock the buffer here, since we flagged + * it as ARC_L2_WRITING in the previous step, but we must take + * care to only access its L2 cache parameters. In particular, + * ab->b_buf may be invalid by now due to ARC eviction. + */ + l2hdr = ab->b_l2hdr; + l2hdr->b_daddr = dev->l2ad_hand; + + if ((ab->b_flags & ARC_L2COMPRESS) && + l2hdr->b_asize >= buf_compress_minsz) { + if (l2arc_compress_buf(l2hdr)) { + /* + * If compression succeeded, enable headroom + * boost on the next scan cycle. + */ + *headroom_boost = B_TRUE; + } + } + + /* + * Pick up the buffer data we had previously stashed away + * (and now potentially also compressed). + */ + buf_data = l2hdr->b_tmp_cdata; + buf_sz = l2hdr->b_asize; + + /* Compression may have squashed the buffer to zero length. 
*/ + if (buf_sz != 0) { + uint64_t buf_p_sz; + wzio = zio_write_phys(pio, dev->l2ad_vdev, dev->l2ad_hand, buf_sz, buf_data, ZIO_CHECKSUM_OFF, NULL, NULL, ZIO_PRIORITY_ASYNC_WRITE, @@ -4868,33 +5050,24 @@ l2arc_write_buffers(spa_t *spa, l2arc_de zio_t *, wzio); (void) zio_nowait(wzio); + write_asize += buf_sz; /* * Keep the clock hand suitably device-aligned. */ - buf_sz = vdev_psize_to_asize(dev->l2ad_vdev, buf_sz); - - write_sz += buf_sz; - dev->l2ad_hand += buf_sz; + buf_p_sz = vdev_psize_to_asize(dev->l2ad_vdev, buf_sz); + write_psize += buf_p_sz; + dev->l2ad_hand += buf_p_sz; } - - mutex_exit(list_lock); - - if (full == B_TRUE) - break; } - mutex_exit(&l2arc_buflist_mtx); - if (pio == NULL) { - ASSERT0(write_sz); - kmem_cache_free(hdr_cache, head); - return (0); - } + mutex_exit(&l2arc_buflist_mtx); - ASSERT3U(write_sz, <=, target_sz); + ASSERT3U(write_asize, <=, target_sz); ARCSTAT_BUMP(arcstat_l2_writes_sent); - ARCSTAT_INCR(arcstat_l2_write_bytes, write_sz); + ARCSTAT_INCR(arcstat_l2_write_bytes, write_asize); ARCSTAT_INCR(arcstat_l2_size, write_sz); - vdev_space_update(dev->l2ad_vdev, write_sz, 0, 0); + ARCSTAT_INCR(arcstat_l2_asize, write_asize); + vdev_space_update(dev->l2ad_vdev, write_psize, 0, 0); /* * Bump device hand to the device start if it is approaching the end. @@ -4912,7 +5085,153 @@ l2arc_write_buffers(spa_t *spa, l2arc_de (void) zio_wait(pio); dev->l2ad_writing = B_FALSE; - return (write_sz); + return (write_asize); +} + +/* + * Compresses an L2ARC buffer. + * The data to be compressed must be prefilled in l2hdr->b_tmp_cdata and its + * size in l2hdr->b_asize. This routine tries to compress the data and + * depending on the compression result there are three possible outcomes: + * *) The buffer was incompressible. The original l2hdr contents were left + * untouched and are ready for writing to an L2 device. + * *) The buffer was all-zeros, so there is no need to write it to an L2 + * device. 
To indicate this situation b_tmp_cdata is NULL'ed, b_asize is + * set to zero and b_compress is set to ZIO_COMPRESS_EMPTY. + * *) Compression succeeded and b_tmp_cdata was replaced with a temporary + * data buffer which holds the compressed data to be written, and b_asize + * tells us how much data there is. b_compress is set to the appropriate + * compression algorithm. Once writing is done, invoke + * l2arc_release_cdata_buf on this l2hdr to free this temporary buffer. + * + * Returns B_TRUE if compression succeeded, or B_FALSE if it didn't (the + * buffer was incompressible). + */ +static boolean_t +l2arc_compress_buf(l2arc_buf_hdr_t *l2hdr) +{ + void *cdata; + size_t csize, len; + + ASSERT(l2hdr->b_compress == ZIO_COMPRESS_OFF); + ASSERT(l2hdr->b_tmp_cdata != NULL); + + len = l2hdr->b_asize; + cdata = zio_data_buf_alloc(len); + csize = zio_compress_data(ZIO_COMPRESS_LZ4, l2hdr->b_tmp_cdata, + cdata, l2hdr->b_asize); + + if (csize == 0) { + /* zero block, indicate that there's nothing to write */ + zio_data_buf_free(cdata, len); + l2hdr->b_compress = ZIO_COMPRESS_EMPTY; + l2hdr->b_asize = 0; + l2hdr->b_tmp_cdata = NULL; + ARCSTAT_BUMP(arcstat_l2_compress_zeros); + return (B_TRUE); + } else if (csize > 0 && csize < len) { + /* + * Compression succeeded, we'll keep the cdata around for + * writing and release it afterwards. + */ + l2hdr->b_compress = ZIO_COMPRESS_LZ4; + l2hdr->b_asize = csize; + l2hdr->b_tmp_cdata = cdata; + ARCSTAT_BUMP(arcstat_l2_compress_successes); + return (B_TRUE); + } else { + /* + * Compression failed, release the compressed buffer. + * l2hdr will be left unmodified. + */ + zio_data_buf_free(cdata, len); + ARCSTAT_BUMP(arcstat_l2_compress_failures); + return (B_FALSE); + } +} + +/* + * Decompresses a zio read back from an l2arc device. On success, the + * underlying zio's io_data buffer is overwritten by the uncompressed + * version. 
On decompression error (corrupt compressed stream), the + * zio->io_error value is set to signal an I/O error. + * + * Please note that the compressed data stream is not checksummed, so + * if the underlying device is experiencing data corruption, we may feed + * corrupt data to the decompressor, so the decompressor needs to be + * able to handle this situation (LZ4 does). + */ +static void +l2arc_decompress_zio(zio_t *zio, arc_buf_hdr_t *hdr, enum zio_compress c) +{ + ASSERT(L2ARC_IS_VALID_COMPRESS(c)); + + if (zio->io_error != 0) { + /* + * An io error has occured, just restore the original io + * size in preparation for a main pool read. + */ + zio->io_orig_size = zio->io_size = hdr->b_size; + return; + } + + if (c == ZIO_COMPRESS_EMPTY) { + /* + * An empty buffer results in a null zio, which means we + * need to fill its io_data after we're done restoring the + * buffer's contents. + */ + ASSERT(hdr->b_buf != NULL); + bzero(hdr->b_buf->b_data, hdr->b_size); + zio->io_data = zio->io_orig_data = hdr->b_buf->b_data; + } else { + ASSERT(zio->io_data != NULL); + /* + * We copy the compressed data from the start of the arc buffer + * (the zio_read will have pulled in only what we need, the + * rest is garbage which we will overwrite at decompression) + * and then decompress back to the ARC data buffer. This way we + * can minimize copying by simply decompressing back over the + * original compressed data (rather than decompressing to an + * aux buffer and then copying back the uncompressed buffer, + * which is likely to be much larger). + */ + uint64_t csize; + void *cdata; + + csize = zio->io_size; + cdata = zio_data_buf_alloc(csize); + bcopy(zio->io_data, cdata, csize); + if (zio_decompress_data(c, cdata, zio->io_data, csize, + hdr->b_size) != 0) + zio->io_error = EIO; + zio_data_buf_free(cdata, csize); + } + + /* Restore the expected uncompressed IO size. 
*/ + zio->io_orig_size = zio->io_size = hdr->b_size; +} + +/* + * Releases the temporary b_tmp_cdata buffer in an l2arc header structure. + * This buffer serves as a temporary holder of compressed data while + * the buffer entry is being written to an l2arc device. Once that is + * done, we can dispose of it. + */ +static void +l2arc_release_cdata_buf(arc_buf_hdr_t *ab) +{ + l2arc_buf_hdr_t *l2hdr = ab->b_l2hdr; + + if (l2hdr->b_compress == ZIO_COMPRESS_LZ4) { + /* + * If the data was compressed, then we've allocated a + * temporary buffer for it, so now we need to release it. + */ + ASSERT(l2hdr->b_tmp_cdata != NULL); + zio_data_buf_free(l2hdr->b_tmp_cdata, ab->b_size); + } + l2hdr->b_tmp_cdata = NULL; } /* @@ -4927,6 +5246,7 @@ l2arc_feed_thread(void *dummy __unused) spa_t *spa; uint64_t size, wrote; clock_t begin, next = ddi_get_lbolt(); + boolean_t headroom_boost = B_FALSE; CALLB_CPR_INIT(&cpr, &l2arc_feed_thr_lock, callb_generic_cpr, FTAG); @@ -4987,7 +5307,7 @@ l2arc_feed_thread(void *dummy __unused) ARCSTAT_BUMP(arcstat_l2_feeds); - size = l2arc_write_size(dev); + size = l2arc_write_size(); /* * Evict L2ARC buffers that will be overwritten. @@ -4997,7 +5317,7 @@ l2arc_feed_thread(void *dummy __unused) /* * Write ARC buffers. */ - wrote = l2arc_write_buffers(spa, dev, size); + wrote = l2arc_write_buffers(spa, dev, size, &headroom_boost); /* * Calculate interval between writes. 
@@ -5045,15 +5365,12 @@ l2arc_add_vdev(spa_t *spa, vdev_t *vd) adddev = kmem_zalloc(sizeof (l2arc_dev_t), KM_SLEEP); adddev->l2ad_spa = spa; adddev->l2ad_vdev = vd; - adddev->l2ad_write = l2arc_write_max; - adddev->l2ad_boost = l2arc_write_boost; adddev->l2ad_start = VDEV_LABEL_START_SIZE; adddev->l2ad_end = VDEV_LABEL_START_SIZE + vdev_get_min_asize(vd); adddev->l2ad_hand = adddev->l2ad_start; adddev->l2ad_evict = adddev->l2ad_start; adddev->l2ad_first = B_TRUE; adddev->l2ad_writing = B_FALSE; - ASSERT3U(adddev->l2ad_write, >, 0); /* * This is a list of all ARC buffers that are still valid on the Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Thu Jan 16 15:26:16 2014 (r260758) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Thu Jan 16 15:29:44 2014 (r260759) @@ -22,6 +22,7 @@ * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright 2011 Nexenta Systems, Inc. All rights reserved. * Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #include @@ -575,6 +576,8 @@ dbuf_read_impl(dmu_buf_impl_t *db, zio_t if (DBUF_IS_L2CACHEABLE(db)) aflags |= ARC_L2CACHE; + if (DBUF_IS_L2COMPRESSIBLE(db)) + aflags |= ARC_L2COMPRESS; SET_BOOKMARK(&zb, db->db_objset->os_dsl_dataset ? 
db->db_objset->os_dsl_dataset->ds_object : DMU_META_OBJSET, @@ -2761,8 +2764,9 @@ dbuf_write(dbuf_dirty_record_t *dr, arc_ } else { ASSERT(arc_released(data)); dr->dr_zio = arc_write(zio, os->os_spa, txg, - db->db_blkptr, data, DBUF_IS_L2CACHEABLE(db), &zp, - dbuf_write_ready, dbuf_write_done, db, - ZIO_PRIORITY_ASYNC_WRITE, ZIO_FLAG_MUSTSUCCEED, &zb); + db->db_blkptr, data, DBUF_IS_L2CACHEABLE(db), + DBUF_IS_L2COMPRESSIBLE(db), &zp, dbuf_write_ready, + dbuf_write_done, db, ZIO_PRIORITY_ASYNC_WRITE, + ZIO_FLAG_MUSTSUCCEED, &zb); } } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Thu Jan 16 15:26:16 2014 (r260758) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Thu Jan 16 15:29:44 2014 (r260759) @@ -23,6 +23,8 @@ * Copyright (c) 2013 by Delphix. All rights reserved. */ +/* Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ + #include #include #include @@ -1513,9 +1515,9 @@ dmu_sync(zio_t *pio, uint64_t txg, dmu_s dsa->dsa_tx = NULL; zio_nowait(arc_write(pio, os->os_spa, txg, - bp, dr->dt.dl.dr_data, DBUF_IS_L2CACHEABLE(db), &zp, - dmu_sync_ready, dmu_sync_done, dsa, - ZIO_PRIORITY_SYNC_WRITE, ZIO_FLAG_CANFAIL, &zb)); + bp, dr->dt.dl.dr_data, DBUF_IS_L2CACHEABLE(db), + DBUF_IS_L2COMPRESSIBLE(db), &zp, dmu_sync_ready, dmu_sync_done, + dsa, ZIO_PRIORITY_SYNC_WRITE, ZIO_FLAG_CANFAIL, &zb)); return (0); } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Thu Jan 16 15:26:16 2014 (r260758) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Thu Jan 16 15:29:44 2014 (r260759) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. 
* Copyright (c) 2013 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ /* Portions Copyright 2010 Robert Milkowski */ @@ -276,6 +277,8 @@ dmu_objset_open_impl(spa_t *spa, dsl_dat if (DMU_OS_IS_L2CACHEABLE(os)) aflags |= ARC_L2CACHE; + if (DMU_OS_IS_L2COMPRESSIBLE(os)) + aflags |= ARC_L2COMPRESS; dprintf_bp(os->os_rootbp, "reading %s", ""); err = arc_read(NULL, spa, os->os_rootbp, @@ -1023,9 +1026,10 @@ dmu_objset_sync(objset_t *os, zio_t *pio dmu_write_policy(os, NULL, 0, 0, &zp); zio = arc_write(pio, os->os_spa, tx->tx_txg, - os->os_rootbp, os->os_phys_buf, DMU_OS_IS_L2CACHEABLE(os), &zp, - dmu_objset_write_ready, dmu_objset_write_done, os, - ZIO_PRIORITY_ASYNC_WRITE, ZIO_FLAG_MUSTSUCCEED, &zb); + os->os_rootbp, os->os_phys_buf, DMU_OS_IS_L2CACHEABLE(os), + DMU_OS_IS_L2COMPRESSIBLE(os), &zp, dmu_objset_write_ready, + dmu_objset_write_done, os, ZIO_PRIORITY_ASYNC_WRITE, + ZIO_FLAG_MUSTSUCCEED, &zb); /* * Sync special dnodes - the parent IO for the sync is the root block Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Thu Jan 16 15:26:16 2014 (r260758) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h Thu Jan 16 15:29:44 2014 (r260759) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ #ifndef _SYS_ARC_H @@ -67,6 +68,7 @@ typedef enum arc_buf_contents { #define ARC_PREFETCH (1 << 3) /* I/O is a prefetch */ #define ARC_CACHED (1 << 4) /* I/O was already in cache */ #define ARC_L2CACHE (1 << 5) /* cache in L2ARC */ +#define ARC_L2COMPRESS (1 << 6) /* compress in L2ARC */ /* * The following breakdows of arc_size exist for kstat only. 
@@ -105,9 +107,9 @@ int arc_read(zio_t *pio, spa_t *spa, con arc_done_func_t *done, void *priv, int priority, int flags, uint32_t *arc_flags, const zbookmark_t *zb); zio_t *arc_write(zio_t *pio, spa_t *spa, uint64_t txg, - blkptr_t *bp, arc_buf_t *buf, boolean_t l2arc, const zio_prop_t *zp, - arc_done_func_t *ready, arc_done_func_t *done, void *priv, - int priority, int zio_flags, const zbookmark_t *zb); + blkptr_t *bp, arc_buf_t *buf, boolean_t l2arc, boolean_t l2arc_compress, + const zio_prop_t *zp, arc_done_func_t *ready, arc_done_func_t *done, + void *priv, int priority, int zio_flags, const zbookmark_t *zb); void arc_set_callback(arc_buf_t *buf, arc_evict_func_t *func, void *priv); int arc_buf_evict(arc_buf_t *buf); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h Thu Jan 16 15:26:16 2014 (r260758) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h Thu Jan 16 15:29:44 2014 (r260759) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. 
*/ #ifndef _SYS_DBUF_H @@ -324,6 +325,9 @@ boolean_t dbuf_is_metadata(dmu_buf_impl_ (dbuf_is_metadata(_db) && \ ((_db)->db_objset->os_secondary_cache == ZFS_CACHE_METADATA))) +#define DBUF_IS_L2COMPRESSIBLE(_db) \ + ((_db)->db_objset->os_compress != ZIO_COMPRESS_OFF) + #ifdef ZFS_DEBUG /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Thu Jan 16 15:26:16 2014 (r260758) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h Thu Jan 16 15:29:44 2014 (r260759) @@ -21,6 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Saso Kiselkov. All rights reserved. */ /* Portions Copyright 2010 Robert Milkowski */ @@ -129,6 +130,8 @@ struct objset { ((os)->os_secondary_cache == ZFS_CACHE_ALL || \ (os)->os_secondary_cache == ZFS_CACHE_METADATA) +#define DMU_OS_IS_L2COMPRESSIBLE(os) ((os)->os_compress != ZIO_COMPRESS_OFF) + /* called from zpl */ int dmu_objset_hold(const char *name, void *tag, objset_t **osp); int dmu_objset_own(const char *name, dmu_objset_type_t type, From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 15:45:05 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 6092B78E; Thu, 16 Jan 2014 15:45:05 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 40EFD1851; Thu, 16 Jan 2014 15:45:05 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by 
svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GFj5B2065228; Thu, 16 Jan 2014 15:45:05 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GFj47G065225; Thu, 16 Jan 2014 15:45:04 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161545.s0GFj47G065225@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 15:45:04 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260761 - in stable/8: cddl/contrib/opensolaris/cmd/ztest sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 15:45:05 -0000 Author: avg Date: Thu Jan 16 15:45:04 2014 New Revision: 260761 URL: http://svnweb.freebsd.org/changeset/base/260761 Log: MFC r254074: MFV r254070: Merge vendor bugfix for ZFS test suite that triggers false positives MFC slacker: delphij Modified: stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Directory Properties: stable/8/cddl/contrib/opensolaris/ (props changed) stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c Thu Jan 16 15:43:17 2014 (r260760) +++ stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c Thu Jan 16 15:45:04 2014 (r260761) @@ -186,6 +186,7 @@ static const ztest_shared_opts_t ztest_o extern uint64_t 
metaslab_gang_bang; extern uint64_t metaslab_df_alloc_threshold; +extern uint64_t zfs_deadman_synctime; static ztest_shared_opts_t *ztest_shared_opts; static ztest_shared_opts_t ztest_opts; @@ -365,7 +366,7 @@ ztest_info_t ztest_info[] = { { ztest_fault_inject, 1, &zopt_sometimes }, { ztest_ddt_repair, 1, &zopt_sometimes }, { ztest_dmu_snapshot_hold, 1, &zopt_sometimes }, - { ztest_reguid, 1, &zopt_sometimes }, + { ztest_reguid, 1, &zopt_rarely }, { ztest_spa_rename, 1, &zopt_rarely }, { ztest_scrub, 1, &zopt_rarely }, { ztest_spa_upgrade, 1, &zopt_rarely }, @@ -4756,6 +4757,14 @@ ztest_fault_inject(ztest_ds_t *zd, uint6 ASSERT(leaves >= 1); /* + * Grab the name lock as reader. There are some operations + * which don't like to have their vdevs changed while + * they are in progress (i.e. spa_change_guid). Those + * operations will have grabbed the name lock as writer. + */ + (void) rw_rdlock(&ztest_name_lock); + + /* * We need SCL_STATE here because we're going to look at vd0->vdev_tsd. */ spa_config_enter(spa, SCL_STATE, FTAG, RW_READER); @@ -4784,7 +4793,14 @@ ztest_fault_inject(ztest_ds_t *zd, uint6 if (vd0 != NULL && vd0->vdev_top->vdev_islog) islog = B_TRUE; - if (vd0 != NULL && maxfaults != 1) { + /* + * If the top-level vdev needs to be resilvered + * then we only allow faults on the device that is + * resilvering. 
+ */ + if (vd0 != NULL && maxfaults != 1 && + (!vdev_resilver_needed(vd0->vdev_top, NULL, NULL) || + vd0->vdev_resilvering)) { /* * Make vd0 explicitly claim to be unreadable, * or unwriteable, or reach behind its back @@ -4815,6 +4831,7 @@ ztest_fault_inject(ztest_ds_t *zd, uint6 if (sav->sav_count == 0) { spa_config_exit(spa, SCL_STATE, FTAG); + (void) rw_unlock(&ztest_name_lock); return; } vd0 = sav->sav_vdevs[ztest_random(sav->sav_count)]; @@ -4828,6 +4845,7 @@ ztest_fault_inject(ztest_ds_t *zd, uint6 } spa_config_exit(spa, SCL_STATE, FTAG); + (void) rw_unlock(&ztest_name_lock); /* * If we can tolerate two or more faults, or we're dealing @@ -5293,16 +5311,33 @@ static void * ztest_deadman_thread(void *arg) { ztest_shared_t *zs = arg; - int grace = 300; - hrtime_t delta; - - delta = (zs->zs_thread_stop - zs->zs_thread_start) / NANOSEC + grace; + spa_t *spa = ztest_spa; + hrtime_t delta, total = 0; - (void) poll(NULL, 0, (int)(1000 * delta)); + for (;;) { + delta = (zs->zs_thread_stop - zs->zs_thread_start) / + NANOSEC + zfs_deadman_synctime; - fatal(0, "failed to complete within %d seconds of deadline", grace); + (void) poll(NULL, 0, (int)(1000 * delta)); - return (NULL); + /* + * If the pool is suspended then fail immediately. Otherwise, + * check to see if the pool is making any progress. If + * vdev_deadman() discovers that there hasn't been any recent + * I/Os then it will end up aborting the tests. 
+ */ + if (spa_suspended(spa)) { + fatal(0, "aborting test after %llu seconds because " + "pool has transitioned to a suspended state.", + zfs_deadman_synctime); + return (NULL); + } + vdev_deadman(spa->spa_root_vdev); + + total += zfs_deadman_synctime; + (void) printf("ztest has been running for %lld seconds\n", + total); + } } static void @@ -6031,6 +6066,7 @@ main(int argc, char **argv) (void) setvbuf(stdout, NULL, _IOLBF, 0); dprintf_setup(&argc, argv); + zfs_deadman_synctime = 300; ztest_fd_rand = open("/dev/urandom", O_RDONLY); ASSERT3S(ztest_fd_rand, >=, 0); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Thu Jan 16 15:43:17 2014 (r260760) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c Thu Jan 16 15:45:04 2014 (r260761) @@ -774,6 +774,7 @@ spa_change_guid(spa_t *spa) int error; uint64_t guid; + mutex_enter(&spa->spa_vdev_top_lock); mutex_enter(&spa_namespace_lock); guid = spa_generate_guid(NULL); @@ -786,6 +787,7 @@ spa_change_guid(spa_t *spa) } mutex_exit(&spa_namespace_lock); + mutex_exit(&spa->spa_vdev_top_lock); return (error); } @@ -4934,7 +4936,6 @@ spa_vdev_detach(spa_t *spa, uint64_t gui if (pvd->vdev_ops == &vdev_spare_ops) cvd->vdev_unspare = B_FALSE; vdev_remove_parent(cvd); - cvd->vdev_resilvering = B_FALSE; } @@ -5569,6 +5570,13 @@ spa_vdev_resilver_done_hunt(vdev_t *vd) return (oldvd); } + if (vd->vdev_resilvering && vdev_dtl_empty(vd, DTL_MISSING) && + vdev_dtl_empty(vd, DTL_OUTAGE)) { + ASSERT(vd->vdev_ops->vdev_op_leaf); + vd->vdev_resilvering = B_FALSE; + vdev_config_dirty(vd->vdev_top); + } + /* * Check for a completed replacement. 
We always consider the first * vdev in the list to be the oldest vdev, and the last one to be From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 15:47:09 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id AB634A8B; Thu, 16 Jan 2014 15:47:09 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 8C58A1866; Thu, 16 Jan 2014 15:47:09 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GFl9tN065728; Thu, 16 Jan 2014 15:47:09 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GFl9Hc065727; Thu, 16 Jan 2014 15:47:09 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161547.s0GFl9Hc065727@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 15:47:09 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260762 - stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 15:47:09 -0000 Author: avg Date: Thu Jan 16 15:47:09 2014 New Revision: 260762 URL: http://svnweb.freebsd.org/changeset/base/260762 Log: MFC r245511: MFV r245510: improve the comment in txg.c MFC slacker: delphij 
Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Thu Jan 16 15:45:04 2014 (r260761) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c Thu Jan 16 15:47:09 2014 (r260762) @@ -21,7 +21,7 @@ /* * Copyright (c) 2005, 2010, Oracle and/or its affiliates. All rights reserved. * Portions Copyright 2011 Martin Matuska - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #include @@ -33,7 +33,76 @@ #include /* - * Pool-wide transaction groups. + * ZFS Transaction Groups + * ---------------------- + * + * ZFS transaction groups are, as the name implies, groups of transactions + * that act on persistent state. ZFS asserts consistency at the granularity of + * these transaction groups. Each successive transaction group (txg) is + * assigned a 64-bit consecutive identifier. There are three active + * transaction group states: open, quiescing, or syncing. At any given time, + * there may be an active txg associated with each state; each active txg may + * either be processing, or blocked waiting to enter the next state. There may + * be up to three active txgs, and there is always a txg in the open state + * (though it may be blocked waiting to enter the quiescing state). In broad + * strokes, transactions -- operations that change in-memory structures -- are + * accepted into the txg in the open state, and are completed while the txg is + * in the open or quiescing states. The accumulated changes are written to + * disk in the syncing state. + * + * Open + * + * When a new txg becomes active, it first enters the open state.
New + * transactions -- updates to in-memory structures -- are assigned to the + * currently open txg. There is always a txg in the open state so that ZFS can + * accept new changes (though the txg may refuse new changes if it has hit + * some limit). ZFS advances the open txg to the next state for a variety of + * reasons such as it hitting a time or size threshold, or the execution of an + * administrative action that must be completed in the syncing state. + * + * Quiescing + * + * After a txg exits the open state, it enters the quiescing state. The + * quiescing state is intended to provide a buffer between accepting new + * transactions in the open state and writing them out to stable storage in + * the syncing state. While quiescing, transactions can continue their + * operation without delaying either of the other states. Typically, a txg is + * in the quiescing state very briefly since the operations are bounded by + * software latencies rather than, say, slower I/O latencies. After all + * transactions complete, the txg is ready to enter the next state. + * + * Syncing + * + * In the syncing state, the in-memory state built up during the open and (to + * a lesser degree) the quiescing states is written to stable storage. The + * process of writing out modified data can, in turn modify more data. For + * example when we write new blocks, we need to allocate space for them; those + * allocations modify metadata (space maps)... which themselves must be + * written to stable storage. During the sync state, ZFS iterates, writing out + * data until it converges and all in-memory changes have been written out. + * The first such pass is the largest as it encompasses all the modified user + * data (as opposed to filesystem metadata). Subsequent passes typically have + * far less data to write as they consist exclusively of filesystem metadata.
+ * + * To ensure convergence, after a certain number of passes ZFS begins + * overwriting locations on stable storage that had been allocated earlier in + * the syncing state (and subsequently freed). ZFS usually allocates new + * blocks to optimize for large, continuous, writes. For the syncing state to + * converge however it must complete a pass where no new blocks are allocated + * since each allocation requires a modification of persistent metadata. + * Further, to hasten convergence, after a prescribed number of passes, ZFS + * also defers frees, and stops compressing. + * + * In addition to writing out user data, we must also execute synctasks during + * the syncing context. A synctask is the mechanism by which some + * administrative activities work such as creating and destroying snapshots or + * datasets. Note that when a synctask is initiated it enters the open txg, + * and ZFS then pushes that txg as quickly as possible to completion of the + * syncing state in order to reduce the latency of the administrative + * activity. To complete the syncing state, ZFS writes out a new uberblock, + * the root of the tree of blocks that comprise all state stored on the ZFS + * pool. Finally, if there is a quiesced txg waiting, we signal that it can + * now transition to the syncing state. 
*/ static void txg_sync_thread(void *arg); From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 16:00:08 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 0BCA037B; Thu, 16 Jan 2014 16:00:08 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id E782B1987; Thu, 16 Jan 2014 16:00:07 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GG071c070365; Thu, 16 Jan 2014 16:00:07 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GG06fs070350; Thu, 16 Jan 2014 16:00:06 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161600.s0GG06fs070350@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 16:00:06 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260765 - in stable/8: cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzpool/common/sys sys/cddl/compat/opensolaris/sys sys/cddl/contrib/opensolaris/uts/common/fs/zfs... 
X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 16:00:08 -0000 Author: avg Date: Thu Jan 16 16:00:05 2014 New Revision: 260765 URL: http://svnweb.freebsd.org/changeset/base/260765 Log: MFC r258632,258704: MFV r255255: 4045 zfs write throttle & i/o scheduler performance work Note a change in dmu_tx_delay: pause_sbt is not available in this branch. Sponsored by: HybridCluster [merge] Added: stable/8/sys/cddl/compat/opensolaris/sys/disp.h - copied unchanged from r258632, head/sys/cddl/compat/opensolaris/sys/disp.h Modified: stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_zfetch.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_pool.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/arc.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h 
stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_tx.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_dir.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dsl_pool.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/sa_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/spa_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/txg_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/vdev_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zfs_context.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/zio.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/txg.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_queue.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zil.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/8/cddl/contrib/opensolaris/ (props changed) stable/8/cddl/contrib/opensolaris/cmd/zfs/ (props changed) stable/8/cddl/contrib/opensolaris/lib/libzfs/ (props changed) stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c ============================================================================== --- stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c Thu Jan 16 15:59:08 2014 (r260764) +++ stable/8/cddl/contrib/opensolaris/cmd/ztest/ztest.c Thu Jan 16 16:00:05 2014 (r260765) @@ -186,7 +186,7 @@ static const ztest_shared_opts_t ztest_o extern 
uint64_t metaslab_gang_bang; extern uint64_t metaslab_df_alloc_threshold; -extern uint64_t zfs_deadman_synctime; +extern uint64_t zfs_deadman_synctime_ms; static ztest_shared_opts_t *ztest_shared_opts; static ztest_shared_opts_t ztest_opts; @@ -5315,10 +5315,10 @@ ztest_deadman_thread(void *arg) hrtime_t delta, total = 0; for (;;) { - delta = (zs->zs_thread_stop - zs->zs_thread_start) / - NANOSEC + zfs_deadman_synctime; + delta = zs->zs_thread_stop - zs->zs_thread_start + + MSEC2NSEC(zfs_deadman_synctime_ms); - (void) poll(NULL, 0, (int)(1000 * delta)); + (void) poll(NULL, 0, (int)NSEC2MSEC(delta)); /* * If the pool is suspended then fail immediately. Otherwise, @@ -5329,12 +5329,12 @@ ztest_deadman_thread(void *arg) if (spa_suspended(spa)) { fatal(0, "aborting test after %llu seconds because " "pool has transitioned to a suspended state.", - zfs_deadman_synctime); + zfs_deadman_synctime_ms / 1000); return (NULL); } vdev_deadman(spa->spa_root_vdev); - total += zfs_deadman_synctime; + total += zfs_deadman_synctime_ms/1000; (void) printf("ztest has been running for %lld seconds\n", total); } @@ -6066,7 +6066,7 @@ main(int argc, char **argv) (void) setvbuf(stdout, NULL, _IOLBF, 0); dprintf_setup(&argc, argv); - zfs_deadman_synctime = 300; + zfs_deadman_synctime_ms = 300000; ztest_fd_rand = open("/dev/urandom", O_RDONLY); ASSERT3S(ztest_fd_rand, >=, 0); Modified: stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h ============================================================================== --- stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Thu Jan 16 15:59:08 2014 (r260764) +++ stable/8/cddl/contrib/opensolaris/lib/libzpool/common/sys/zfs_context.h Thu Jan 16 16:00:05 2014 (r260765) @@ -65,6 +65,7 @@ extern "C" { #include #include #include +#include #include #include #include @@ -204,6 +205,8 @@ extern int aok; */ #define curthread ((void *)(uintptr_t)thr_self()) +#define kpreempt(x) sched_yield() + typedef struct kthread 
kthread_t; #define thread_create(stk, stksize, func, arg, len, pp, state, pri) \ Copied: stable/8/sys/cddl/compat/opensolaris/sys/disp.h (from r258632, head/sys/cddl/compat/opensolaris/sys/disp.h) ============================================================================== --- /dev/null 00:00:00 1970 (empty, because file is newly added) +++ stable/8/sys/cddl/compat/opensolaris/sys/disp.h Thu Jan 16 16:00:05 2014 (r260765, copy of r258632, head/sys/cddl/compat/opensolaris/sys/disp.h) @@ -0,0 +1,40 @@ +/*- + * Copyright (c) 2013 Andriy Gapon + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in the + * documentation and/or other materials provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. 
+ * + * $FreeBSD$ + */ + +#ifndef _OPENSOLARIS_SYS_DISP_H_ +#define _OPENSOLARIS_SYS_DISP_H_ + +#ifdef _KERNEL + +#include + +#define kpreempt(x) kern_yield(PRI_USER) + +#endif /* _KERNEL */ + +#endif /* _OPENSOLARIS_SYS_DISP_H_ */ Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Thu Jan 16 15:59:08 2014 (r260764) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c Thu Jan 16 16:00:05 2014 (r260765) @@ -127,6 +127,7 @@ #include #include #include +#include #ifdef _KERNEL #include #endif @@ -150,10 +151,6 @@ static kmutex_t arc_reclaim_thr_lock; static kcondvar_t arc_reclaim_thr_cv; /* used to signal reclaim thr */ static uint8_t arc_thread_exit; -extern int zfs_write_limit_shift; -extern uint64_t zfs_write_limit_max; -extern kmutex_t zfs_write_limit_lock; - #define ARC_REDUCE_DNLC_PERCENT 3 uint_t arc_reduce_dnlc_percent = ARC_REDUCE_DNLC_PERCENT; @@ -162,6 +159,12 @@ typedef enum arc_reclaim_strategy { ARC_RECLAIM_CONS /* Conservative reclaim strategy */ } arc_reclaim_strategy_t; +/* + * The number of iterations through arc_evict_*() before we + * drop & reacquire the lock. + */ +int arc_evict_iterations = 100; + /* number of seconds before growing cache again */ static int arc_grow_retry = 60; @@ -177,6 +180,11 @@ static int arc_shrink_shift = 5; */ static int arc_min_prefetch_lifespan; +/* + * If this percent of memory is free, don't throttle. 
+ */ +int arc_lotsfree_percent = 10; + static int arc_dead; extern int zfs_prefetch_disable; @@ -526,6 +534,7 @@ typedef struct arc_write_callback arc_wr struct arc_write_callback { void *awcb_private; arc_done_func_t *awcb_ready; + arc_done_func_t *awcb_physdone; arc_done_func_t *awcb_done; arc_buf_t *awcb_buf; }; @@ -1312,7 +1321,7 @@ arc_change_state(arc_state_t *new_state, kmutex_t *lock; ASSERT(MUTEX_HELD(hash_lock)); - ASSERT(new_state != old_state); + ASSERT3P(new_state, !=, old_state); ASSERT(refcnt == 0 || ab->b_datacnt > 0); ASSERT(ab->b_datacnt == 0 || !GHOST_STATE(new_state)); ASSERT(ab->b_datacnt <= 1 || old_state != arc_anon); @@ -1937,8 +1946,10 @@ arc_evict(arc_state_t *state, uint64_t s kmutex_t *hash_lock; boolean_t have_lock; void *stolen = NULL; + arc_buf_hdr_t marker = { 0 }; + int count = 0; static int evict_metadata_offset, evict_data_offset; - int i, idx, offset, list_count, count; + int i, idx, offset, list_count, lists; ASSERT(state == arc_mru || state == arc_mfu); @@ -1958,7 +1969,7 @@ arc_evict(arc_state_t *state, uint64_t s idx = evict_data_offset; } bytes_remaining = evicted_state->arcs_lsize[type]; - count = 0; + lists = 0; evict_start: list = &list_start[idx]; @@ -1985,6 +1996,33 @@ evict_start: if (recycle && ab->b_size != bytes && ab_prev && ab_prev->b_size == bytes) continue; + + /* ignore markers */ + if (ab->b_spa == 0) + continue; + + /* + * It may take a long time to evict all the bufs requested. + * To avoid blocking all arc activity, periodically drop + * the arcs_mtx and give other threads a chance to run + * before reacquiring the lock. + * + * If we are looking for a buffer to recycle, we are in + * the hot code path, so don't sleep. 
+ */ + if (!recycle && count++ > arc_evict_iterations) { + list_insert_after(list, ab, &marker); + mutex_exit(evicted_lock); + mutex_exit(lock); + kpreempt(KPREEMPT_SYNC); + mutex_enter(lock); + mutex_enter(evicted_lock); + ab_prev = list_prev(list, &marker); + list_remove(list, &marker); + count = 0; + continue; + } + hash_lock = HDR_LOCK(ab); have_lock = MUTEX_HELD(hash_lock); if (have_lock || mutex_tryenter(hash_lock)) { @@ -2051,7 +2089,7 @@ evict_start: mutex_exit(evicted_lock); mutex_exit(lock); idx = ((idx + 1) & (list_count - 1)); - count++; + lists++; goto evict_start; } } else { @@ -2063,10 +2101,10 @@ evict_start: mutex_exit(lock); idx = ((idx + 1) & (list_count - 1)); - count++; + lists++; if (bytes_evicted < bytes) { - if (count < list_count) + if (lists < list_count) goto evict_start; else dprintf("only evicted %lld bytes from %x", @@ -2084,28 +2122,14 @@ evict_start: ARCSTAT_INCR(arcstat_mutex_miss, missed); /* - * We have just evicted some data into the ghost state, make - * sure we also adjust the ghost state size if necessary. + * Note: we have just evicted some data into the ghost state, + * potentially putting the ghost size over the desired size. Rather + * that evicting from the ghost list in this hot code path, leave + * this chore to the arc_reclaim_thread(). 
*/ - if (arc_no_grow && - arc_mru_ghost->arcs_size + arc_mfu_ghost->arcs_size > arc_c) { - int64_t mru_over = arc_anon->arcs_size + arc_mru->arcs_size + - arc_mru_ghost->arcs_size - arc_c; - - if (mru_over > 0 && arc_mru_ghost->arcs_lsize[type] > 0) { - int64_t todelete = - MIN(arc_mru_ghost->arcs_lsize[type], mru_over); - arc_evict_ghost(arc_mru_ghost, 0, todelete); - } else if (arc_mfu_ghost->arcs_lsize[type] > 0) { - int64_t todelete = MIN(arc_mfu_ghost->arcs_lsize[type], - arc_mru_ghost->arcs_size + - arc_mfu_ghost->arcs_size - arc_c); - arc_evict_ghost(arc_mfu_ghost, 0, todelete); - } - } + if (stolen) ARCSTAT_BUMP(arcstat_stolen); - return (stolen); } @@ -2122,9 +2146,10 @@ arc_evict_ghost(arc_state_t *state, uint kmutex_t *hash_lock, *lock; uint64_t bytes_deleted = 0; uint64_t bufs_skipped = 0; + int count = 0; static int evict_offset; int list_count, idx = evict_offset; - int offset, count = 0; + int offset, lists = 0; ASSERT(GHOST_STATE(state)); @@ -2142,6 +2167,8 @@ evict_start: mutex_enter(lock); for (ab = list_tail(list); ab; ab = ab_prev) { ab_prev = list_prev(list, ab); + if (ab->b_type > ARC_BUFC_NUMTYPES) + panic("invalid ab=%p", (void *)ab); if (spa && ab->b_spa != spa) continue; @@ -2153,6 +2180,23 @@ evict_start: /* caller may be trying to modify this buffer, skip it */ if (MUTEX_HELD(hash_lock)) continue; + + /* + * It may take a long time to evict all the bufs requested. + * To avoid blocking all arc activity, periodically drop + * the arcs_mtx and give other threads a chance to run + * before reacquiring the lock. 
+ */ + if (count++ > arc_evict_iterations) { + list_insert_after(list, ab, &marker); + mutex_exit(lock); + kpreempt(KPREEMPT_SYNC); + mutex_enter(lock); + ab_prev = list_prev(list, &marker); + list_remove(list, &marker); + count = 0; + continue; + } if (mutex_tryenter(hash_lock)) { ASSERT(!HDR_IO_IN_PROGRESS(ab)); ASSERT(ab->b_buf == NULL); @@ -2188,14 +2232,16 @@ evict_start: mutex_enter(lock); ab_prev = list_prev(list, &marker); list_remove(list, &marker); - } else + } else { bufs_skipped += 1; + } + } mutex_exit(lock); idx = ((idx + 1) & (ARC_BUFC_NUMDATALISTS - 1)); - count++; + lists++; - if (count < list_count) + if (lists < list_count) goto evict_start; evict_offset = idx; @@ -2203,7 +2249,7 @@ evict_start: (bytes < 0 || bytes_deleted < bytes)) { list_start = &state->arcs_lists[0]; list_count = ARC_BUFC_NUMMETADATALISTS; - offset = count = 0; + offset = lists = 0; goto evict_start; } @@ -3083,7 +3129,7 @@ arc_read_done(zio_t *zio) */ int arc_read(zio_t *pio, spa_t *spa, const blkptr_t *bp, arc_done_func_t *done, - void *private, int priority, int zio_flags, uint32_t *arc_flags, + void *private, zio_priority_t priority, int zio_flags, uint32_t *arc_flags, const zbookmark_t *zb) { arc_buf_hdr_t *hdr; @@ -3669,6 +3715,18 @@ arc_write_ready(zio_t *zio) hdr->b_flags |= ARC_IO_IN_PROGRESS; } +/* + * The SPA calls this callback for each physical write that happens on behalf + * of a logical write. See the comment in dbuf_write_physdone() for details. 
+ */ +static void +arc_write_physdone(zio_t *zio) +{ + arc_write_callback_t *cb = zio->io_private; + if (cb->awcb_physdone != NULL) + cb->awcb_physdone(zio, cb->awcb_buf, cb->awcb_private); +} + static void arc_write_done(zio_t *zio) { @@ -3749,8 +3807,9 @@ arc_write_done(zio_t *zio) zio_t * arc_write(zio_t *pio, spa_t *spa, uint64_t txg, blkptr_t *bp, arc_buf_t *buf, boolean_t l2arc, boolean_t l2arc_compress, - const zio_prop_t *zp, arc_done_func_t *ready, arc_done_func_t *done, - void *private, int priority, int zio_flags, const zbookmark_t *zb) + const zio_prop_t *zp, arc_done_func_t *ready, arc_done_func_t *physdone, + arc_done_func_t *done, void *private, zio_priority_t priority, + int zio_flags, const zbookmark_t *zb) { arc_buf_hdr_t *hdr = buf->b_hdr; arc_write_callback_t *callback; @@ -3767,18 +3826,20 @@ arc_write(zio_t *pio, spa_t *spa, uint64 hdr->b_flags |= ARC_L2COMPRESS; callback = kmem_zalloc(sizeof (arc_write_callback_t), KM_SLEEP); callback->awcb_ready = ready; + callback->awcb_physdone = physdone; callback->awcb_done = done; callback->awcb_private = private; callback->awcb_buf = buf; zio = zio_write(pio, spa, txg, bp, buf->b_data, hdr->b_size, zp, - arc_write_ready, arc_write_done, callback, priority, zio_flags, zb); + arc_write_ready, arc_write_physdone, arc_write_done, callback, + priority, zio_flags, zb); return (zio); } static int -arc_memory_throttle(uint64_t reserve, uint64_t inflight_data, uint64_t txg) +arc_memory_throttle(uint64_t reserve, uint64_t txg) { #ifdef _KERNEL uint64_t available_memory = @@ -3792,7 +3853,9 @@ arc_memory_throttle(uint64_t reserve, ui MIN(available_memory, vmem_size(heap_arena, VMEM_FREE)); #endif #endif /* sun */ - if (available_memory >= zfs_write_limit_max) + + if (cnt.v_free_count + cnt.v_cache_count > + (uint64_t)physmem * arc_lotsfree_percent / 100) return (0); if (txg > last_txg) { @@ -3816,20 +3879,6 @@ arc_memory_throttle(uint64_t reserve, ui return (SET_ERROR(EAGAIN)); } page_load = 0; - - if (arc_size > 
arc_c_min) { - uint64_t evictable_memory = - arc_mru->arcs_lsize[ARC_BUFC_DATA] + - arc_mru->arcs_lsize[ARC_BUFC_METADATA] + - arc_mfu->arcs_lsize[ARC_BUFC_DATA] + - arc_mfu->arcs_lsize[ARC_BUFC_METADATA]; - available_memory += MIN(evictable_memory, arc_size - arc_c_min); - } - - if (inflight_data > available_memory / 4) { - ARCSTAT_INCR(arcstat_memory_throttle_count, 1); - return (SET_ERROR(ERESTART)); - } #endif return (0); } @@ -3847,15 +3896,6 @@ arc_tempreserve_space(uint64_t reserve, int error; uint64_t anon_size; -#ifdef ZFS_DEBUG - /* - * Once in a while, fail for no reason. Everything should cope. - */ - if (spa_get_random(10000) == 0) { - dprintf("forcing random failure\n"); - return (SET_ERROR(ERESTART)); - } -#endif if (reserve > arc_c/4 && !arc_no_grow) arc_c = MIN(arc_c_max, reserve * 4); if (reserve > arc_c) @@ -3873,7 +3913,8 @@ arc_tempreserve_space(uint64_t reserve, * in order to compress/encrypt/etc the data. We therefore need to * make sure that there is sufficient available memory for this. */ - if (error = arc_memory_throttle(reserve, anon_size, txg)) + error = arc_memory_throttle(reserve, txg); + if (error != 0) return (error); /* @@ -4064,11 +4105,20 @@ arc_init(void) arc_dead = FALSE; arc_warm = B_FALSE; - if (zfs_write_limit_max == 0) - zfs_write_limit_max = ptob(physmem) >> zfs_write_limit_shift; - else - zfs_write_limit_shift = 0; - mutex_init(&zfs_write_limit_lock, NULL, MUTEX_DEFAULT, NULL); + /* + * Calculate maximum amount of dirty data per pool. + * + * If it has been set by /etc/system, take that. + * Otherwise, use a percentage of physical memory defined by + * zfs_dirty_data_max_percent (default 10%) with a cap at + * zfs_dirty_data_max_max (default 4GB). 
+ */ + if (zfs_dirty_data_max == 0) { + zfs_dirty_data_max = ptob(physmem) * + zfs_dirty_data_max_percent / 100; + zfs_dirty_data_max = MIN(zfs_dirty_data_max, + zfs_dirty_data_max_max); + } #ifdef _KERNEL if (TUNABLE_INT_FETCH("vfs.zfs.prefetch_disable", &zfs_prefetch_disable)) @@ -4147,8 +4197,6 @@ arc_fini(void) mutex_destroy(&arc_l2c_only->arcs_locks[i].arcs_lock); } - mutex_destroy(&zfs_write_limit_lock); - buf_fini(); ASSERT(arc_loaned_bytes == 0); Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Thu Jan 16 15:59:08 2014 (r260764) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c Thu Jan 16 16:00:05 2014 (r260765) @@ -841,7 +841,7 @@ dbuf_free_range(dnode_t *dn, uint64_t st atomic_inc_64(&zfs_free_range_recv_miss); } - for (db = list_head(&dn->dn_dbufs); db; db = db_next) { + for (db = list_head(&dn->dn_dbufs); db != NULL; db = db_next) { db_next = list_next(&dn->dn_dbufs, db); ASSERT(db->db_blkid != DMU_BONUS_BLKID); @@ -1187,6 +1187,8 @@ dbuf_dirty(dmu_buf_impl_t *db, dmu_tx_t sizeof (dbuf_dirty_record_t), offsetof(dbuf_dirty_record_t, dr_dirty_node)); } + if (db->db_blkid != DMU_BONUS_BLKID && os->os_dsl_dataset != NULL) + dr->dr_accounted = db->db.db_size; dr->dr_dbuf = db; dr->dr_txg = tx->tx_txg; dr->dr_next = *drp; @@ -1270,7 +1272,10 @@ dbuf_dirty(dmu_buf_impl_t *db, dmu_tx_t dbuf_rele(parent, FTAG); mutex_enter(&db->db_mtx); - /* possible race with dbuf_undirty() */ + /* + * Since we've dropped the mutex, it's possible that + * dbuf_undirty() might have changed this out from under us. 
+ */ if (db->db_last_dirty == dr || dn->dn_object == DMU_META_DNODE_OBJECT) { mutex_enter(&di->dt.di.dr_mtx); @@ -1340,7 +1345,11 @@ dbuf_undirty(dmu_buf_impl_t *db, dmu_tx_ ASSERT(db->db.db_size != 0); - /* XXX would be nice to fix up dn_towrite_space[] */ + /* + * Any space we accounted for in dp_dirty_* will be cleaned up by + * dsl_pool_sync(). This is relatively rare so the discrepancy + * is not a big deal. + */ *drp = dr->dr_next; @@ -1520,7 +1529,7 @@ dbuf_assign_arcbuf(dmu_buf_impl_t *db, a /* * "Clear" the contents of this dbuf. This will mark the dbuf - * EVICTING and clear *most* of its references. Unfortunetely, + * EVICTING and clear *most* of its references. Unfortunately, * when we are not holding the dn_dbufs_mtx, we can't clear the * entry in the dn_dbufs list. We have to wait until dbuf_destroy() * in this case. For callers from the DMU we will usually see: @@ -1707,7 +1716,7 @@ dbuf_create(dnode_t *dn, uint8_t level, db->db.db_offset = 0; } else { int blocksize = - db->db_level ? 1<<dn->dn_indblkshift : dn->dn_datablksz; + db->db_level ? 1 << dn->dn_indblkshift : dn->dn_datablksz; db->db.db_size = blocksize; db->db.db_offset = db->db_blkid * blocksize; } @@ -1816,7 +1825,7 @@ dbuf_destroy(dmu_buf_impl_t *db) } void -dbuf_prefetch(dnode_t *dn, uint64_t blkid) +dbuf_prefetch(dnode_t *dn, uint64_t blkid, zio_priority_t prio) { dmu_buf_impl_t *db = NULL; blkptr_t *bp = NULL; @@ -1840,8 +1849,6 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki if (dbuf_findbp(dn, 0, blkid, TRUE, &db, &bp) == 0) { if (bp && !BP_IS_HOLE(bp)) { - int priority = dn->dn_type == DMU_OT_DDT_ZAP ?
- ZIO_PRIORITY_DDT_PREFETCH : ZIO_PRIORITY_ASYNC_READ; dsl_dataset_t *ds = dn->dn_objset->os_dsl_dataset; uint32_t aflags = ARC_NOWAIT | ARC_PREFETCH; zbookmark_t zb; @@ -1850,7 +1857,7 @@ dbuf_prefetch(dnode_t *dn, uint64_t blki dn->dn_object, 0, blkid); (void) arc_read(NULL, dn->dn_objset->os_spa, - bp, NULL, NULL, priority, + bp, NULL, NULL, prio, ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE, &aflags, &zb); } @@ -2535,6 +2542,38 @@ dbuf_write_ready(zio_t *zio, arc_buf_t * mutex_exit(&db->db_mtx); } +/* + * The SPA will call this callback several times for each zio - once + * for every physical child i/o (zio->io_phys_children times). This + * allows the DMU to monitor the progress of each logical i/o. For example, + * there may be 2 copies of an indirect block, or many fragments of a RAID-Z + * block. There may be a long delay before all copies/fragments are completed, + * so this callback allows us to retire dirty space gradually, as the physical + * i/os complete. + */ +/* ARGSUSED */ +static void +dbuf_write_physdone(zio_t *zio, arc_buf_t *buf, void *arg) +{ + dmu_buf_impl_t *db = arg; + objset_t *os = db->db_objset; + dsl_pool_t *dp = dmu_objset_pool(os); + dbuf_dirty_record_t *dr; + int delta = 0; + + dr = db->db_data_pending; + ASSERT3U(dr->dr_txg, ==, zio->io_txg); + + /* + * The callback will be called io_phys_children times. Retire one + * portion of our dirty space each time we are called. Any rounding + * error will be cleaned up by dsl_pool_sync()'s call to + * dsl_pool_undirty_space(). 
+ */ + delta = dr->dr_accounted / zio->io_phys_children; + dsl_pool_undirty_space(dp, delta, zio->io_txg); +} + /* ARGSUSED */ static void dbuf_write_done(zio_t *zio, arc_buf_t *buf, void *vdb) @@ -2629,6 +2668,7 @@ dbuf_write_done(zio_t *zio, arc_buf_t *b ASSERT(db->db_dirtycnt > 0); db->db_dirtycnt -= 1; db->db_data_pending = NULL; + dbuf_rele_and_unlock(db, (void *)(uintptr_t)txg); } @@ -2747,8 +2787,8 @@ dbuf_write(dbuf_dirty_record_t *dr, arc_ ASSERT(db->db_state != DB_NOFILL); dr->dr_zio = zio_write(zio, os->os_spa, txg, db->db_blkptr, data->b_data, arc_buf_size(data), &zp, - dbuf_write_override_ready, dbuf_write_override_done, dr, - ZIO_PRIORITY_ASYNC_WRITE, ZIO_FLAG_MUSTSUCCEED, &zb); + dbuf_write_override_ready, NULL, dbuf_write_override_done, + dr, ZIO_PRIORITY_ASYNC_WRITE, ZIO_FLAG_MUSTSUCCEED, &zb); mutex_enter(&db->db_mtx); dr->dt.dl.dr_override_state = DR_NOT_OVERRIDDEN; zio_write_override(dr->dr_zio, &dr->dt.dl.dr_overridden_by, @@ -2758,7 +2798,7 @@ dbuf_write(dbuf_dirty_record_t *dr, arc_ ASSERT(zp.zp_checksum == ZIO_CHECKSUM_OFF); dr->dr_zio = zio_write(zio, os->os_spa, txg, db->db_blkptr, NULL, db->db.db_size, &zp, - dbuf_write_nofill_ready, dbuf_write_nofill_done, db, + dbuf_write_nofill_ready, NULL, dbuf_write_nofill_done, db, ZIO_PRIORITY_ASYNC_WRITE, ZIO_FLAG_MUSTSUCCEED | ZIO_FLAG_NODATA, &zb); } else { @@ -2766,7 +2806,7 @@ dbuf_write(dbuf_dirty_record_t *dr, arc_ dr->dr_zio = arc_write(zio, os->os_spa, txg, db->db_blkptr, data, DBUF_IS_L2CACHEABLE(db), DBUF_IS_L2COMPRESSIBLE(db), &zp, dbuf_write_ready, - dbuf_write_done, db, ZIO_PRIORITY_ASYNC_WRITE, - ZIO_FLAG_MUSTSUCCEED, &zb); + dbuf_write_physdone, dbuf_write_done, db, + ZIO_PRIORITY_ASYNC_WRITE, ZIO_FLAG_MUSTSUCCEED, &zb); } } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Thu Jan 16 15:59:08 2014 (r260764) +++ 
stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c Thu Jan 16 16:00:05 2014 (r260765) @@ -374,13 +374,11 @@ static int dmu_buf_hold_array_by_dnode(dnode_t *dn, uint64_t offset, uint64_t length, int read, void *tag, int *numbufsp, dmu_buf_t ***dbpp, uint32_t flags) { - dsl_pool_t *dp = NULL; dmu_buf_t **dbp; uint64_t blkid, nblks, i; uint32_t dbuf_flags; int err; zio_t *zio; - hrtime_t start; ASSERT(length <= DMU_MAX_ACCESS); @@ -408,9 +406,6 @@ dmu_buf_hold_array_by_dnode(dnode_t *dn, } dbp = kmem_zalloc(sizeof (dmu_buf_t *) * nblks, KM_SLEEP); - if (dn->dn_objset->os_dsl_dataset) - dp = dn->dn_objset->os_dsl_dataset->ds_dir->dd_pool; - start = gethrtime(); zio = zio_root(dn->dn_objset->os_spa, NULL, NULL, ZIO_FLAG_CANFAIL); blkid = dbuf_whichblock(dn, offset); for (i = 0; i < nblks; i++) { @@ -434,9 +429,6 @@ dmu_buf_hold_array_by_dnode(dnode_t *dn, /* wait for async i/o */ err = zio_wait(zio); - /* track read overhead when we are in sync context */ - if (dp && dsl_pool_sync_context(dp)) - dp->dp_read_overhead += gethrtime() - start; if (err) { dmu_buf_rele_array(dbp, nblks, tag); return (err); @@ -518,12 +510,22 @@ dmu_buf_rele_array(dmu_buf_t **dbp_fake, kmem_free(dbp, sizeof (dmu_buf_t *) * numbufs); } +/* + * Issue prefetch i/os for the given blocks. + * + * Note: The assumption is that we *know* these blocks will be needed + * almost immediately. Therefore, the prefetch i/os will be issued at + * ZIO_PRIORITY_SYNC_READ + * + * Note: indirect blocks and other metadata will be read synchronously, + * causing this function to block if they are not already cached. 
+ */ void dmu_prefetch(objset_t *os, uint64_t object, uint64_t offset, uint64_t len) { dnode_t *dn; uint64_t blkid; - int nblks, i, err; + int nblks, err; if (zfs_prefetch_disable) return; @@ -536,7 +538,7 @@ dmu_prefetch(objset_t *os, uint64_t obje rw_enter(&dn->dn_struct_rwlock, RW_READER); blkid = dbuf_whichblock(dn, object * sizeof (dnode_phys_t)); - dbuf_prefetch(dn, blkid); + dbuf_prefetch(dn, blkid, ZIO_PRIORITY_SYNC_READ); rw_exit(&dn->dn_struct_rwlock); return; } @@ -553,16 +555,16 @@ dmu_prefetch(objset_t *os, uint64_t obje rw_enter(&dn->dn_struct_rwlock, RW_READER); if (dn->dn_datablkshift) { int blkshift = dn->dn_datablkshift; - nblks = (P2ROUNDUP(offset+len, 1<<blkshift) - - P2ALIGN(offset, 1<<blkshift)) >> blkshift; + nblks = (P2ROUNDUP(offset + len, 1 << blkshift) - + P2ALIGN(offset, 1 << blkshift)) >> blkshift; } else { nblks = (offset < dn->dn_datablksz); } if (nblks != 0) { blkid = dbuf_whichblock(dn, offset); - for (i = 0; i < nblks; i++) - dbuf_prefetch(dn, blkid+i); + for (int i = 0; i < nblks; i++) + dbuf_prefetch(dn, blkid + i, ZIO_PRIORITY_SYNC_READ); } rw_exit(&dn->dn_struct_rwlock); @@ -1376,7 +1378,7 @@ dmu_sync_late_arrival(zio_t *pio, objset zio_nowait(zio_write(pio, os->os_spa, dmu_tx_get_txg(tx), zgd->zgd_bp, zgd->zgd_db->db_data, zgd->zgd_db->db_size, zp, - dmu_sync_late_arrival_ready, dmu_sync_late_arrival_done, dsa, + dmu_sync_late_arrival_ready, NULL, dmu_sync_late_arrival_done, dsa, ZIO_PRIORITY_SYNC_WRITE, ZIO_FLAG_CANFAIL, zb)); return (0); @@ -1516,8 +1518,9 @@ dmu_sync(zio_t *pio, uint64_t txg, dmu_s zio_nowait(arc_write(pio, os->os_spa, txg, bp, dr->dt.dl.dr_data, DBUF_IS_L2CACHEABLE(db), - DBUF_IS_L2COMPRESSIBLE(db), &zp, dmu_sync_ready, dmu_sync_done, - dsa, ZIO_PRIORITY_SYNC_WRITE, ZIO_FLAG_CANFAIL, &zb)); + DBUF_IS_L2COMPRESSIBLE(db), &zp, dmu_sync_ready, + NULL, dmu_sync_done, dsa, ZIO_PRIORITY_SYNC_WRITE, + ZIO_FLAG_CANFAIL, &zb)); return (0); } Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c
============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Thu Jan 16 15:59:08 2014 (r260764) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c Thu Jan 16 16:00:05 2014 (r260765) @@ -1028,7 +1028,7 @@ dmu_objset_sync(objset_t *os, zio_t *pio zio = arc_write(pio, os->os_spa, tx->tx_txg, os->os_rootbp, os->os_phys_buf, DMU_OS_IS_L2CACHEABLE(os), DMU_OS_IS_L2COMPRESSIBLE(os), &zp, dmu_objset_write_ready, - dmu_objset_write_done, os, ZIO_PRIORITY_ASYNC_WRITE, + NULL, dmu_objset_write_done, os, ZIO_PRIORITY_ASYNC_WRITE, ZIO_FLAG_MUSTSUCCEED, &zb); /* Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Thu Jan 16 15:59:08 2014 (r260764) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_tx.c Thu Jan 16 16:00:05 2014 (r260765) @@ -54,6 +54,7 @@ dmu_tx_create_dd(dsl_dir_t *dd) offsetof(dmu_tx_hold_t, txh_node)); list_create(&tx->tx_callbacks, sizeof (dmu_tx_callback_t), offsetof(dmu_tx_callback_t, dcb_node)); + tx->tx_start = gethrtime(); #ifdef ZFS_DEBUG refcount_create(&tx->tx_space_written); refcount_create(&tx->tx_space_freed); @@ -597,13 +598,13 @@ dmu_tx_hold_free(dmu_tx_t *tx, uint64_t if (txh == NULL) return; dn = txh->txh_dnode; + dmu_tx_count_dnode(txh); if (off >= (dn->dn_maxblkid+1) * dn->dn_datablksz) return; if (len == DMU_OBJECT_END) len = (dn->dn_maxblkid+1) * dn->dn_datablksz - off; - dmu_tx_count_dnode(txh); /* * For i/o error checking, we read the first and last level-0 @@ -911,6 +912,162 @@ dmu_tx_dirty_buf(dmu_tx_t *tx, dmu_buf_i } #endif +/* + * If we can't do 10 iops, something is wrong. Let us go ahead + * and hit zfs_dirty_data_max. 
+ */ +hrtime_t zfs_delay_max_ns = MSEC2NSEC(100); +int zfs_delay_resolution_ns = 100 * 1000; /* 100 microseconds */ + +/* + * We delay transactions when we've determined that the backend storage + * isn't able to accommodate the rate of incoming writes. + * + * If there is already a transaction waiting, we delay relative to when + * that transaction finishes waiting. This way the calculated min_time + * is independent of the number of threads concurrently executing + * transactions. + * + * If we are the only waiter, wait relative to when the transaction + * started, rather than the current time. This credits the transaction for + * "time already served", e.g. reading indirect blocks. + * + * The minimum time for a transaction to take is calculated as: + * min_time = scale * (dirty - min) / (max - dirty) + * min_time is then capped at zfs_delay_max_ns. + * + * The delay has two degrees of freedom that can be adjusted via tunables. + * The percentage of dirty data at which we start to delay is defined by + * zfs_delay_min_dirty_percent. This should typically be at or above + * zfs_vdev_async_write_active_max_dirty_percent so that we only start to + * delay after writing at full speed has failed to keep up with the incoming + * write rate. The scale of the curve is defined by zfs_delay_scale. Roughly + * speaking, this variable determines the amount of delay at the midpoint of + * the curve. 
+ * + * delay + * 10ms +-------------------------------------------------------------*+ + * | *| + * 9ms + *+ + * | *| + * 8ms + *+ + * | * | + * 7ms + * + + * | * | + * 6ms + * + + * | * | + * 5ms + * + + * | * | + * 4ms + * + + * | * | + * 3ms + * + + * | * | + * 2ms + (midpoint) * + + * | | ** | + * 1ms + v *** + + * | zfs_delay_scale ----------> ******** | + * 0 +-------------------------------------*********----------------+ + * 0% <- zfs_dirty_data_max -> 100% + * + * Note that since the delay is added to the outstanding time remaining on the + * most recent transaction, the delay is effectively the inverse of IOPS. + * Here the midpoint of 500us translates to 2000 IOPS. The shape of the curve + * was chosen such that small changes in the amount of accumulated dirty data + * in the first 3/4 of the curve yield relatively small differences in the + * amount of delay. + * + * The effects can be easier to understand when the amount of delay is + * represented on a log scale: + * + * delay + * 100ms +-------------------------------------------------------------++ + * + + + * | | + * + *+ + * 10ms + *+ + * + ** + + * | (midpoint) ** | + * + | ** + + * 1ms + v **** + + * + zfs_delay_scale ----------> ***** + + * | **** | + * + **** + + * 100us + ** + + * + * + + * | * | + * + * + + * 10us + * + + * + + + * | | + * + + + * +--------------------------------------------------------------+ + * 0% <- zfs_dirty_data_max -> 100% + * + * Note here that only as the amount of dirty data approaches its limit does + * the delay start to increase rapidly. The goal of a properly tuned system + * should be to keep the amount of dirty data out of that range by first + * ensuring that the appropriate limits are set for the I/O scheduler to reach + * optimal throughput on the backend storage, and then by changing the value + * of zfs_delay_scale to increase the steepness of the curve. 
+ */ +static void +dmu_tx_delay(dmu_tx_t *tx, uint64_t dirty) +{ + dsl_pool_t *dp = tx->tx_pool; + uint64_t delay_min_bytes = + zfs_dirty_data_max * zfs_delay_min_dirty_percent / 100; + hrtime_t wakeup, min_tx_time, now; + + if (dirty <= delay_min_bytes) + return; + + /* + * The caller has already waited until we are under the max. + * We make them pass us the amount of dirty data so we don't + * have to handle the case of it being >= the max, which could + * cause a divide-by-zero if it's == the max. + */ + ASSERT3U(dirty, <, zfs_dirty_data_max); + + now = gethrtime(); + min_tx_time = zfs_delay_scale * + (dirty - delay_min_bytes) / (zfs_dirty_data_max - dirty); + if (now > tx->tx_start + min_tx_time) + return; + + min_tx_time = MIN(min_tx_time, zfs_delay_max_ns); + + DTRACE_PROBE3(delay__mintime, dmu_tx_t *, tx, uint64_t, dirty, + uint64_t, min_tx_time); + + mutex_enter(&dp->dp_lock); + wakeup = MAX(tx->tx_start + min_tx_time, + dp->dp_last_wakeup + min_tx_time); + dp->dp_last_wakeup = wakeup; + mutex_exit(&dp->dp_lock); + +#ifdef _KERNEL +#ifdef illumos + mutex_enter(&curthread->t_delay_lock); + while (cv_timedwait_hires(&curthread->t_delay_cv, + &curthread->t_delay_lock, wakeup, zfs_delay_resolution_ns, + CALLOUT_FLAG_ABSOLUTE | CALLOUT_FLAG_ROUNDUP) > 0) + continue; + mutex_exit(&curthread->t_delay_lock); +#else + /* XXX High resolution callouts are not available */ + ASSERT(wakeup >= now); + pause("dmu_tx_delay", NSEC_TO_TICK(wakeup - now)); +#endif +#else + hrtime_t delta = wakeup - gethrtime(); + struct timespec ts; + ts.tv_sec = delta / NANOSEC; + ts.tv_nsec = delta % NANOSEC; + (void) nanosleep(&ts, NULL); +#endif +} + static int dmu_tx_try_assign(dmu_tx_t *tx, txg_how_t txg_how) { @@ -941,6 +1098,12 @@ dmu_tx_try_assign(dmu_tx_t *tx, txg_how_ return (SET_ERROR(ERESTART)); } + if (!tx->tx_waited && + dsl_pool_need_dirty_delay(tx->tx_pool)) { + tx->tx_wait_dirty = B_TRUE; + return (SET_ERROR(ERESTART)); *** DIFF OUTPUT TRUNCATED AT 1000 LINES *** From 
owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 16:04:37 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 98A4F905; Thu, 16 Jan 2014 16:04:37 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id 8383F1A22; Thu, 16 Jan 2014 16:04:37 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GG4bSu073656; Thu, 16 Jan 2014 16:04:37 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GG4aoZ073651; Thu, 16 Jan 2014 16:04:36 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161604.s0GG4aoZ073651@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 16:04:36 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260767 - in stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs: . 
sys X-SVN-Group: stable-8 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable-8@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: SVN commit messages for only the 8-stable src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 16 Jan 2014 16:04:37 -0000 Author: avg Date: Thu Jan 16 16:04:36 2014 New Revision: 260767 URL: http://svnweb.freebsd.org/changeset/base/260767 Log: MFC r258633: MFV r255256: 3954 metaslabs continue to load even after hitting zfs_mg_alloc_failure limit Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Directory Properties: stable/8/sys/ (props changed) stable/8/sys/cddl/ (props changed) stable/8/sys/cddl/contrib/opensolaris/ (props changed) Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Thu Jan 16 16:04:20 2014 (r260766) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c Thu Jan 16 16:04:36 2014 (r260767) @@ -58,7 +58,8 @@ int zfs_condense_pct = 200; /* * This value defines the number of allowed allocation failures per vdev. * If a device reaches this threshold in a given txg then we consider skipping - * allocations on that device. + * allocations on that device. The value of zfs_mg_alloc_failures is computed + * in zio_init() unless it has been overridden in /etc/system. */ int zfs_mg_alloc_failures = 0; @@ -69,6 +70,21 @@ SYSCTL_INT(_vfs_zfs, OID_AUTO, mg_alloc_ TUNABLE_INT("vfs.zfs.mg_alloc_failures", &zfs_mg_alloc_failures); /* + * The zfs_mg_noalloc_threshold defines which metaslab groups should + * be eligible for allocation. 
The value is defined as a percentage of + * a free space. Metaslab groups that have more free space than + * zfs_mg_noalloc_threshold are always eligible for allocations. Once + * a metaslab group's free space is less than or equal to the + * zfs_mg_noalloc_threshold the allocator will avoid allocating to that + * group unless all groups in the pool have reached zfs_mg_noalloc_threshold. + * Once all groups in the pool reach zfs_mg_noalloc_threshold then all + * groups are allowed to accept allocations. Gang blocks are always + * eligible to allocate on any metaslab group. The default value of 0 means + * no metaslab group will be excluded based on this criterion. + */ +int zfs_mg_noalloc_threshold = 0; + +/* * Metaslab debugging: when set, keeps all space maps in core to verify frees. */ static int metaslab_debug = 0; @@ -234,6 +250,53 @@ metaslab_compare(const void *x1, const v return (0); } +/* + * Update the allocatable flag and the metaslab group's capacity. + * The allocatable flag is set to true if the capacity is below + * the zfs_mg_noalloc_threshold. If a metaslab group transitions + * from allocatable to non-allocatable or vice versa then the metaslab + * group's class is updated to reflect the transition. + */ +static void +metaslab_group_alloc_update(metaslab_group_t *mg) +{ + vdev_t *vd = mg->mg_vd; + metaslab_class_t *mc = mg->mg_class; + vdev_stat_t *vs = &vd->vdev_stat; + boolean_t was_allocatable; + + ASSERT(vd == vd->vdev_top); + + mutex_enter(&mg->mg_lock); + was_allocatable = mg->mg_allocatable; + + mg->mg_free_capacity = ((vs->vs_space - vs->vs_alloc) * 100) / + (vs->vs_space + 1); + + mg->mg_allocatable = (mg->mg_free_capacity > zfs_mg_noalloc_threshold); + + /* + * The mc_alloc_groups maintains a count of the number of + * groups in this metaslab class that are still above the + * zfs_mg_noalloc_threshold. This is used by the allocating + * threads to determine if they should avoid allocations to + * a given group. 
The allocator will avoid allocations to a group + * if that group has reached or is below the zfs_mg_noalloc_threshold + * and there are still other groups that are above the threshold. + * When a group transitions from allocatable to non-allocatable or + * vice versa we update the metaslab class to reflect that change. + * When the mc_alloc_groups value drops to 0 that means that all + * groups have reached the zfs_mg_noalloc_threshold making all groups + * eligible for allocations. This effectively means that all devices + * are balanced again. + */ + if (was_allocatable && !mg->mg_allocatable) + mc->mc_alloc_groups--; + else if (!was_allocatable && mg->mg_allocatable) + mc->mc_alloc_groups++; + mutex_exit(&mg->mg_lock); +} + metaslab_group_t * metaslab_group_create(metaslab_class_t *mc, vdev_t *vd) { @@ -284,6 +347,7 @@ metaslab_group_activate(metaslab_group_t return; mg->mg_aliquot = metaslab_aliquot * MAX(1, mg->mg_vd->vdev_children); + metaslab_group_alloc_update(mg); if ((mgprev = mc->mc_rotor) == NULL) { mg->mg_prev = mg; @@ -369,6 +433,29 @@ metaslab_group_sort(metaslab_group_t *mg } /* + * Determine if a given metaslab group should skip allocations. A metaslab + * group should avoid allocations if its used capacity has crossed the + * zfs_mg_noalloc_threshold and there is at least one metaslab group + * that can still handle allocations. + */ +static boolean_t +metaslab_group_allocatable(metaslab_group_t *mg) +{ + vdev_t *vd = mg->mg_vd; + spa_t *spa = vd->vdev_spa; + metaslab_class_t *mc = mg->mg_class; + + /* + * A metaslab group is considered allocatable if its free capacity + * is greater than the set value of zfs_mg_noalloc_threshold, it's + * associated with a slog, or there are no other metaslab groups + * with free capacity greater than zfs_mg_noalloc_threshold. 
+ */ + return (mg->mg_free_capacity > zfs_mg_noalloc_threshold || + mc != spa_normal_class(spa) || mc->mc_alloc_groups == 0); +} + +/* * ========================================================================== * Common allocator routines * ========================================================================== @@ -1317,6 +1404,8 @@ metaslab_sync_reassess(metaslab_group_t vdev_t *vd = mg->mg_vd; int64_t failures = mg->mg_alloc_failures; + metaslab_group_alloc_update(mg); + /* * Re-evaluate all metaslabs which have lower offsets than the * bonus area. @@ -1418,6 +1507,8 @@ metaslab_group_alloc(metaslab_group_t *m if (msp == NULL) return (-1ULL); + mutex_enter(&msp->ms_lock); + /* * If we've already reached the allowable number of failed * allocation attempts on this metaslab group then we @@ -1434,11 +1525,10 @@ metaslab_group_alloc(metaslab_group_t *m "asize %llu, failures %llu", spa_name(spa), mg->mg_vd->vdev_id, txg, mg, psize, asize, mg->mg_alloc_failures); + mutex_exit(&msp->ms_lock); return (-1ULL); } - mutex_enter(&msp->ms_lock); - /* * Ensure that the metaslab we have selected is still * capable of handling our request. It's possible that @@ -1591,6 +1681,21 @@ top: } else { allocatable = vdev_allocatable(vd); } + + /* + * Determine if the selected metaslab group is eligible + * for allocations. If we're ganging or have requested + * an allocation for the smallest gang block size + * then we don't want to avoid allocating to the this + * metaslab group. If we're in this condition we should + * try to allocate from any device possible so that we + * don't inadvertently return ENOSPC and suspend the pool + * even though space is still available. 
+ */ + if (allocatable && CAN_FASTGANG(flags) && + psize > SPA_GANGBLOCKSIZE) + allocatable = metaslab_group_allocatable(mg); + if (!allocatable) goto next; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h Thu Jan 16 16:04:20 2014 (r260766) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/metaslab_impl.h Thu Jan 16 16:04:36 2014 (r260767) @@ -24,7 +24,7 @@ */ /* - * Copyright (c) 2012 by Delphix. All rights reserved. + * Copyright (c) 2013 by Delphix. All rights reserved. */ #ifndef _SYS_METASLAB_IMPL_H @@ -45,6 +45,7 @@ struct metaslab_class { metaslab_group_t *mc_rotor; space_map_ops_t *mc_ops; uint64_t mc_aliquot; + uint64_t mc_alloc_groups; /* # of allocatable groups */ uint64_t mc_alloc; /* total allocated space */ uint64_t mc_deferred; /* total deferred frees */ uint64_t mc_space; /* total space (alloc + free) */ @@ -57,6 +58,8 @@ struct metaslab_group { uint64_t mg_aliquot; uint64_t mg_bonus_area; uint64_t mg_alloc_failures; + boolean_t mg_allocatable; /* can we allocate? 
*/ + uint64_t mg_free_capacity; /* percentage free */ int64_t mg_bias; int64_t mg_activation_count; metaslab_class_t *mg_class; Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c ============================================================================== --- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jan 16 16:04:20 2014 (r260766) +++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c Thu Jan 16 16:04:36 2014 (r260767) @@ -2417,7 +2417,7 @@ zio_alloc_zil(spa_t *spa, uint64_t txg, if (error) { error = metaslab_alloc(spa, spa_normal_class(spa), size, new_bp, 1, txg, old_bp, - METASLAB_HINTBP_AVOID | METASLAB_GANG_AVOID); + METASLAB_HINTBP_AVOID); } if (error == 0) { From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 16:08:27 2014 Return-Path: Delivered-To: svn-src-stable-8@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id CB893E00; Thu, 16 Jan 2014 16:08:27 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.freebsd.org (Postfix) with ESMTPS id ACE7A1A50; Thu, 16 Jan 2014 16:08:27 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GG8Rjs074449; Thu, 16 Jan 2014 16:08:27 GMT (envelope-from avg@svn.freebsd.org) Received: (from avg@localhost) by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GG8Ri1074445; Thu, 16 Jan 2014 16:08:27 GMT (envelope-from avg@svn.freebsd.org) Message-Id: <201401161608.s0GG8Ri1074445@svn.freebsd.org> From: Andriy Gapon Date: Thu, 16 Jan 2014 16:08:27 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org Subject: svn commit: r260771 - 
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 16:08:28 -0000

Author: avg
Date: Thu Jan 16 16:08:26 2014
New Revision: 260771
URL: http://svnweb.freebsd.org/changeset/base/260771

Log:
  MFC r258634: MFV r258376: 3964 L2ARC should always compress metadata buffers

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/cddl/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h	Thu Jan 16 16:08:14 2014	(r260770)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dbuf.h	Thu Jan 16 16:08:26 2014	(r260771)
@@ -329,7 +329,8 @@ boolean_t dbuf_is_metadata(dmu_buf_impl_
 	((_db)->db_objset->os_secondary_cache == ZFS_CACHE_METADATA)))
 
 #define	DBUF_IS_L2COMPRESSIBLE(_db)	\
-	((_db)->db_objset->os_compress != ZIO_COMPRESS_OFF)
+	((_db)->db_objset->os_compress != ZIO_COMPRESS_OFF || \
+	(dbuf_is_metadata(_db) && zfs_mdcomp_disable == B_FALSE))
 
 #ifdef ZFS_DEBUG
 

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h	Thu Jan 16 16:08:14 2014	(r260770)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu.h	Thu Jan 16 16:08:26 2014	(r260771)
@@ -24,6 +24,7 @@
  * Copyright (c) 2013 by Delphix. All rights reserved.
  * Copyright 2011 Nexenta Systems, Inc. All rights reserved.
  * Copyright (c) 2012, Joyent, Inc. All rights reserved.
+ * Copyright 2013 DEY Storage Systems, Inc.
  */
 
 /* Portions Copyright 2010 Robert Milkowski */
@@ -807,6 +808,8 @@ int dmu_diff(const char *tosnap_name, co
 #define	ZFS_CRC64_POLY	0xC96C5795D7870F42ULL	/* ECMA-182, reflected form */
 extern uint64_t zfs_crc64_table[256];
 
+extern int zfs_mdcomp_disable;
+
 #ifdef	__cplusplus
 }
 #endif

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h	Thu Jan 16 16:08:14 2014	(r260770)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/sys/dmu_objset.h	Thu Jan 16 16:08:26 2014	(r260771)
@@ -130,7 +130,7 @@ struct objset {
 	((os)->os_secondary_cache == ZFS_CACHE_ALL || \
 	(os)->os_secondary_cache == ZFS_CACHE_METADATA)
 
-#define	DMU_OS_IS_L2COMPRESSIBLE(os)	((os)->os_compress != ZIO_COMPRESS_OFF)
+#define	DMU_OS_IS_L2COMPRESSIBLE(os)	(zfs_mdcomp_disable == B_FALSE)
 
 /* called from zpl */
 int dmu_objset_hold(const char *name, void *tag, objset_t **osp);

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 16:13:44 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id F0AC759C;
 Thu, 16 Jan 2014 16:13:44 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id DBB4A1AF4;
 Thu, 16 Jan
 2014 16:13:44 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
 by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GGDi9J078163;
 Thu, 16 Jan 2014 16:13:44 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
 by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GGDib8078162;
 Thu, 16 Jan 2014 16:13:44 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161613.s0GGDib8078162@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 16:13:44 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
 svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260775 -
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 16:13:45 -0000

Author: avg
Date: Thu Jan 16 16:13:44 2014
New Revision: 260775
URL: http://svnweb.freebsd.org/changeset/base/260775

Log:
  MFC r258739: zfs mappedread_sf: assert that a page is never partially valid

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/cddl/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 16:13:32 2014	(r260774)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 16:13:44 2014	(r260775)
@@ -567,7 +567,8 @@ again:
 				pp->valid = VM_PAGE_BITS_ALL;
 				vm_page_activate(pp);
 			}
-			vm_page_unlock_queues();
+		} else {
+			ASSERT3U(pp->valid, ==, VM_PAGE_BITS_ALL);
 		}
 		if (error)
 			break;

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 16:15:58 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id E56A9B30;
 Thu, 16 Jan 2014 16:15:57 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id D15441B15;
 Thu, 16 Jan 2014 16:15:57 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
 by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GGFvBW078816;
 Thu, 16 Jan 2014 16:15:57 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
 by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GGFvPE078813;
 Thu, 16 Jan 2014 16:15:57 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161615.s0GGFvPE078813@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 16:15:57 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
 svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260778 -
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 16:15:58 -0000

Author: avg
Date: Thu Jan 16 16:15:56 2014
New Revision: 260778
URL: http://svnweb.freebsd.org/changeset/base/260778

Log:
  MFC r258720: MFV r258665: 4347 ZPL can use dmu_tx_assign(TXG_WAIT)

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/cddl/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c	Thu Jan 16 16:15:48 2014	(r260777)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_dir.c	Thu Jan 16 16:15:56 2014	(r260778)
@@ -951,7 +951,6 @@ zfs_make_xattrdir(znode_t *zp, vattr_t *
 		return (SET_ERROR(EDQUOT));
 	}
 
-top:
 	tx = dmu_tx_create(zfsvfs->z_os);
 	dmu_tx_hold_sa_create(tx, acl_ids.z_aclp->z_acl_bytes +
 	    ZFS_SA_BASE_ATTR_SIZE);
@@ -960,13 +959,8 @@ top:
 	fuid_dirtied = zfsvfs->z_fuid_dirty;
 	if (fuid_dirtied)
 		zfs_fuid_txhold(zfsvfs, tx);
-	error = dmu_tx_assign(tx, TXG_NOWAIT);
+	error = dmu_tx_assign(tx, TXG_WAIT);
 	if (error) {
-		if (error == ERESTART) {
-			dmu_tx_wait(tx);
-			dmu_tx_abort(tx);
-			goto top;
-		}
 		zfs_acl_ids_free(&acl_ids);
 		dmu_tx_abort(tx);
 		return (error);

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 16:15:48 2014	(r260777)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 16:15:56 2014	(r260778)
@@ -105,11 +105,18 @@
  *  (3)	All range locks must be grabbed before calling dmu_tx_assign(),
  *	as they can span dmu_tx_assign() calls.
  *
- *  (4)	Always pass TXG_NOWAIT as the second argument to dmu_tx_assign().
- *	This is critical because we don't want to block while holding locks.
- *	Note, in particular, that if a lock is sometimes acquired before
- *	the tx assigns, and sometimes after (e.g. z_lock), then failing to
- *	use a non-blocking assign can deadlock the system.  The scenario:
+ *  (4) If ZPL locks are held, pass TXG_NOWAIT as the second argument to
+ *      dmu_tx_assign().  This is critical because we don't want to block
+ *      while holding locks.
+ *
+ *	If no ZPL locks are held (aside from ZFS_ENTER()), use TXG_WAIT.  This
+ *	reduces lock contention and CPU usage when we must wait (note that if
+ *	throughput is constrained by the storage, nearly every transaction
+ *	must wait).
+ *
+ *      Note, in particular, that if a lock is sometimes acquired before
+ *      the tx assigns, and sometimes after (e.g. z_lock), then failing
+ *      to use a non-blocking assign can deadlock the system.  The scenario:
  *
  *	Thread A has grabbed a lock before calling dmu_tx_assign().
  *	Thread B is in an already-assigned tx, and blocks for this lock.
@@ -949,7 +956,6 @@ zfs_write(vnode_t *vp, uio_t *uio, int i
 	while (n > 0) {
 		abuf = NULL;
 		woff = uio->uio_loffset;
-again:
 		if (zfs_owner_overquota(zfsvfs, zp, B_FALSE) ||
 		    zfs_owner_overquota(zfsvfs, zp, B_TRUE)) {
 			if (abuf != NULL)
@@ -1001,13 +1007,8 @@ again:
 		dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_FALSE);
 		dmu_tx_hold_write(tx, zp->z_id, woff, MIN(n, max_blksz));
 		zfs_sa_upgrade_txholds(tx, zp);
-		error = dmu_tx_assign(tx, TXG_NOWAIT);
+		error = dmu_tx_assign(tx, TXG_WAIT);
 		if (error) {
-			if (error == ERESTART) {
-				dmu_tx_wait(tx);
-				dmu_tx_abort(tx);
-				goto again;
-			}
 			dmu_tx_abort(tx);
 			if (abuf != NULL)
 				dmu_return_arcbuf(abuf);
@@ -3394,12 +3395,9 @@ top:
 
 	zfs_sa_upgrade_txholds(tx, zp);
 
-	err = dmu_tx_assign(tx, TXG_NOWAIT);
-	if (err) {
-		if (err == ERESTART)
-			dmu_tx_wait(tx);
+	err = dmu_tx_assign(tx, TXG_WAIT);
+	if (err)
 		goto out;
-	}
 
 	count = 0;
 	/*
@@ -4495,19 +4493,13 @@ zfs_putapage(vnode_t *vp, page_t *pp, u_
 		err = SET_ERROR(EDQUOT);
 		goto out;
 	}
-top:
 	tx = dmu_tx_create(zfsvfs->z_os);
 	dmu_tx_hold_write(tx, zp->z_id, off, len);
 
 	dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_FALSE);
 	zfs_sa_upgrade_txholds(tx, zp);
-	err = dmu_tx_assign(tx, TXG_NOWAIT);
+	err = dmu_tx_assign(tx, TXG_WAIT);
 	if (err != 0) {
-		if (err == ERESTART) {
-			dmu_tx_wait(tx);
-			dmu_tx_abort(tx);
-			goto top;
-		}
 		dmu_tx_abort(tx);
 		goto out;
 	}

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c	Thu Jan 16 16:15:48 2014	(r260777)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c	Thu Jan 16 16:15:56 2014	(r260778)
@@ -1536,7 +1536,6 @@ zfs_extend(znode_t *zp, uint64_t end)
 		zfs_range_unlock(rl);
 		return (0);
 	}
-top:
 	tx = dmu_tx_create(zfsvfs->z_os);
 	dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_FALSE);
 	zfs_sa_upgrade_txholds(tx, zp);
@@ -1556,13 +1555,8 @@ top:
 		newblksz = 0;
 	}
 
-	error = dmu_tx_assign(tx, TXG_NOWAIT);
+	error = dmu_tx_assign(tx, TXG_WAIT);
 	if (error) {
-		if (error == ERESTART) {
-			dmu_tx_wait(tx);
-			dmu_tx_abort(tx);
-			goto top;
-		}
 		dmu_tx_abort(tx);
 		zfs_range_unlock(rl);
 		return (error);
@@ -1670,17 +1664,11 @@ zfs_trunc(znode_t *zp, uint64_t end)
 		zfs_range_unlock(rl);
 		return (error);
 	}
-top:
 	tx = dmu_tx_create(zfsvfs->z_os);
 	dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_FALSE);
 	zfs_sa_upgrade_txholds(tx, zp);
-	error = dmu_tx_assign(tx, TXG_NOWAIT);
+	error = dmu_tx_assign(tx, TXG_WAIT);
 	if (error) {
-		if (error == ERESTART) {
-			dmu_tx_wait(tx);
-			dmu_tx_abort(tx);
-			goto top;
-		}
 		dmu_tx_abort(tx);
 		zfs_range_unlock(rl);
 		return (error);
@@ -1771,13 +1759,8 @@ log:
 	tx = dmu_tx_create(zfsvfs->z_os);
 	dmu_tx_hold_sa(tx, zp->z_sa_hdl, B_FALSE);
 	zfs_sa_upgrade_txholds(tx, zp);
-	error = dmu_tx_assign(tx, TXG_NOWAIT);
+	error = dmu_tx_assign(tx, TXG_WAIT);
 	if (error) {
-		if (error == ERESTART) {
-			dmu_tx_wait(tx);
-			dmu_tx_abort(tx);
-			goto log;
-		}
 		dmu_tx_abort(tx);
 		return (error);
 	}

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 16:37:17 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id C97BB608;
 Thu, 16 Jan 2014 16:37:17 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id B45EB1E10;
 Thu, 16 Jan 2014 16:37:17 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
 by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GGbHNM086927;
 Thu, 16 Jan 2014 16:37:17 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
 by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GGbHxT086926;
 Thu, 16 Jan 2014 16:37:17 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161637.s0GGbHxT086926@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 16:37:17 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
 svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260780 -
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 16:37:17 -0000

Author: avg
Date: Thu Jan 16 16:37:17 2014
New Revision: 260780
URL: http://svnweb.freebsd.org/changeset/base/260780

Log:
  MFC r243518: add zfs_bmap to aid vnode_pager_haspage

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/cddl/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 16:28:19 2014	(r260779)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 16:37:17 2014	(r260780)
@@ -5737,6 +5737,30 @@ zfs_freebsd_getpages(ap)
 }
 
 static int
+zfs_freebsd_bmap(ap)
+	struct vop_bmap_args /* {
+		struct vnode *a_vp;
+		daddr_t  a_bn;
+		struct bufobj **a_bop;
+		daddr_t *a_bnp;
+		int *a_runp;
+		int *a_runb;
+	} */ *ap;
+{
+
+	if (ap->a_bop != NULL)
+		*ap->a_bop = &ap->a_vp->v_bufobj;
+	if (ap->a_bnp != NULL)
+		*ap->a_bnp = ap->a_bn;
+	if (ap->a_runp != NULL)
+		*ap->a_runp = 0;
+	if (ap->a_runb != NULL)
+		*ap->a_runb = 0;
+
+	return (0);
+}
+
+static int
 zfs_freebsd_open(ap)
 	struct vop_open_args /* {
 		struct vnode *a_vp;
@@ -6786,7 +6810,7 @@ struct vop_vector zfs_vnodeops = {
 	.vop_remove =	zfs_freebsd_remove,
 	.vop_rename =	zfs_freebsd_rename,
 	.vop_pathconf =	zfs_freebsd_pathconf,
-	.vop_bmap =	VOP_EOPNOTSUPP,
+	.vop_bmap =	zfs_freebsd_bmap,
 	.vop_fid =	zfs_freebsd_fid,
 	.vop_getextattr =	zfs_getextattr,
 	.vop_deleteextattr =	zfs_deleteextattr,

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 17:58:23 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 07C16EE9;
 Thu, 16 Jan 2014 17:58:23 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id E755C15EE;
 Thu, 16 Jan 2014 17:58:22 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70]) by
 svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GHwMXu021926;
 Thu, 16 Jan 2014 17:58:22 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
 by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GHwMfV021924;
 Thu, 16 Jan 2014 17:58:22 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161758.s0GHwMfV021924@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 17:58:22 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
 svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260783 -
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 17:58:23 -0000

Author: avg
Date: Thu Jan 16 17:58:22 2014
New Revision: 260783
URL: http://svnweb.freebsd.org/changeset/base/260783

Log:
  Revert r260780 "add zfs_bmap to aid vnode_pager_haspage"
  
  I thought that I had to have that commit in this branch, but now I
  decided to not bother.
  
  This is a direct commit, obviously.

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/cddl/   (props changed)
  stable/8/sys/cddl/contrib/opensolaris/   (props changed)

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 17:06:02 2014	(r260782)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 17:58:22 2014	(r260783)
@@ -5737,30 +5737,6 @@ zfs_freebsd_getpages(ap)
 }
 
 static int
-zfs_freebsd_bmap(ap)
-	struct vop_bmap_args /* {
-		struct vnode *a_vp;
-		daddr_t  a_bn;
-		struct bufobj **a_bop;
-		daddr_t *a_bnp;
-		int *a_runp;
-		int *a_runb;
-	} */ *ap;
-{
-
-	if (ap->a_bop != NULL)
-		*ap->a_bop = &ap->a_vp->v_bufobj;
-	if (ap->a_bnp != NULL)
-		*ap->a_bnp = ap->a_bn;
-	if (ap->a_runp != NULL)
-		*ap->a_runp = 0;
-	if (ap->a_runb != NULL)
-		*ap->a_runb = 0;
-
-	return (0);
-}
-
-static int
 zfs_freebsd_open(ap)
 	struct vop_open_args /* {
 		struct vnode *a_vp;
@@ -6810,7 +6786,7 @@ struct vop_vector zfs_vnodeops = {
 	.vop_remove =	zfs_freebsd_remove,
 	.vop_rename =	zfs_freebsd_rename,
 	.vop_pathconf =	zfs_freebsd_pathconf,
-	.vop_bmap =	zfs_freebsd_bmap,
+	.vop_bmap =	VOP_EOPNOTSUPP,
 	.vop_fid =	zfs_freebsd_fid,
 	.vop_getextattr =	zfs_getextattr,
 	.vop_deleteextattr =	zfs_deleteextattr,

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 18:01:58 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 60790302;
 Thu, 16 Jan 2014 18:01:58 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client
 certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id 4BC431696;
 Thu, 16 Jan 2014 18:01:58 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
 by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0GI1wpV025254;
 Thu, 16 Jan 2014 18:01:58 GMT (envelope-from avg@svn.freebsd.org)
Received: (from avg@localhost)
 by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0GI1waY025253;
 Thu, 16 Jan 2014 18:01:58 GMT (envelope-from avg@svn.freebsd.org)
Message-Id: <201401161801.s0GI1waY025253@svn.freebsd.org>
From: Andriy Gapon
Date: Thu, 16 Jan 2014 18:01:58 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
 svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260784 -
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 18:01:58 -0000

Author: avg
Date: Thu Jan 16 18:01:57 2014
New Revision: 260784
URL: http://svnweb.freebsd.org/changeset/base/260784

Log:
  fix a botched merge in r260775, MFC of r258739
  
  This is a direct commit.
  Pointyhat to:	avg

Modified:
  stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c

Modified: stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
==============================================================================
--- stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 17:58:22 2014	(r260783)
+++ stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c	Thu Jan 16 18:01:57 2014	(r260784)
@@ -574,6 +574,7 @@ again:
 				pp->valid = VM_PAGE_BITS_ALL;
 				vm_page_activate(pp);
 			}
+			vm_page_unlock_queues();
 		} else {
 			ASSERT3U(pp->valid, ==, VM_PAGE_BITS_ALL);
 		}

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 18:05:45 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 1033) id 5B90A6C3;
 Thu, 16 Jan 2014 18:05:45 +0000 (UTC)
Date: Thu, 16 Jan 2014 18:05:45 +0000
From: Alexey Dokuchaev
To: Andriy Gapon
Subject: Re: svn commit: r260783 -
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
Message-ID: <20140116180545.GA10827@FreeBSD.org>
References: <201401161758.s0GHwMfV021924@svn.freebsd.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <201401161758.s0GHwMfV021924@svn.freebsd.org>
User-Agent: Mutt/1.5.22 (2013-10-16)
Cc: svn-src-stable@freebsd.org, svn-src-all@freebsd.org,
 src-committers@freebsd.org, svn-src-stable-8@freebsd.org
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 18:05:45 -0000

On Thu, Jan 16, 2014 at 05:58:22PM +0000, Andriy Gapon wrote:
> New Revision: 260783
> URL: http://svnweb.freebsd.org/changeset/base/260783
>
> Log:
>   Revert r260780 "add zfs_bmap to aid vnode_pager_haspage"
>
>   I thought that I had to have that commit in this branch, but now I
>   decided to not bother.

Andriy, may I ask how soon you plan to finish merging to stable/8?  I have
a few patches (mostly GCC-related) to test against this branch, but rather
wait for the dust to settle first to avoid any interference.

./danfe

From owner-svn-src-stable-8@FreeBSD.ORG Thu Jan 16 18:15:51 2014
Return-Path:
Delivered-To: svn-src-stable-8@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 32179DD5;
 Thu, 16 Jan 2014 18:15:51 +0000 (UTC)
Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140])
 by mx1.freebsd.org (Postfix) with ESMTP id 81F3B182D;
 Thu, 16 Jan 2014 18:15:42 +0000 (UTC)
Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100])
 by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id UAA08887;
 Thu, 16 Jan 2014 20:15:41 +0200 (EET)
 (envelope-from avg@FreeBSD.org)
Received: from localhost ([127.0.0.1])
 by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD))
 id 1W3rTh-000KE1-8k; Thu, 16 Jan 2014 20:15:41 +0200
Message-ID: <52D82195.80703@FreeBSD.org>
Date: Thu, 16 Jan 2014 20:14:45 +0200
From: Andriy Gapon
User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Alexey Dokuchaev
Subject: Re: svn commit: r260783 -
 stable/8/sys/cddl/contrib/opensolaris/uts/common/fs/zfs
References: <201401161758.s0GHwMfV021924@svn.freebsd.org>
 <20140116180545.GA10827@FreeBSD.org>
In-Reply-To: <20140116180545.GA10827@FreeBSD.org>
X-Enigmail-Version: 1.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: svn-src-stable@FreeBSD.org, svn-src-all@FreeBSD.org,
 src-committers@FreeBSD.org, svn-src-stable-8@FreeBSD.org
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 16 Jan 2014 18:15:51 -0000

on 16/01/2014 20:05 Alexey Dokuchaev said the following:
> On Thu, Jan 16, 2014 at 05:58:22PM +0000, Andriy Gapon wrote:
>> New Revision: 260783
>> URL: http://svnweb.freebsd.org/changeset/base/260783
>>
>> Log:
>>   Revert r260780 "add zfs_bmap to aid vnode_pager_haspage"
>>
>>   I thought that I had to have that commit in this branch, but now I
>>   decided to not bother.
>
> Andriy, may I ask how soon you plan to finish merging to stable/8?  I have
> a few patches (mostly GCC-related) to test against this branch, but rather
> wait for the dust to settle first to avoid any interference.

I am done for today.

-- 
Andriy Gapon

From owner-svn-src-stable-8@FreeBSD.ORG Sat Jan 18 03:45:08 2014
Return-Path:
Delivered-To: svn-src-stable-8@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115])
 (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by hub.freebsd.org (Postfix) with ESMTPS id 144F8D3F;
 Sat, 18 Jan 2014 03:45:08 +0000 (UTC)
Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by mx1.freebsd.org (Postfix) with ESMTPS id F4163176C;
 Sat, 18 Jan 2014 03:45:07 +0000 (UTC)
Received: from svn.freebsd.org ([127.0.1.70])
 by svn.freebsd.org (8.14.7/8.14.7) with ESMTP id s0I3j751045878;
 Sat, 18 Jan 2014 03:45:07 GMT (envelope-from bryanv@svn.freebsd.org)
Received: (from bryanv@localhost)
 by svn.freebsd.org (8.14.7/8.14.7/Submit) id s0I3j7dS045877;
 Sat, 18 Jan 2014 03:45:07 GMT (envelope-from bryanv@svn.freebsd.org)
Message-Id: <201401180345.s0I3j7dS045877@svn.freebsd.org>
From: Bryan Venteicher
Date: Sat, 18 Jan 2014 03:45:07 +0000 (UTC)
To: src-committers@freebsd.org, svn-src-all@freebsd.org,
 svn-src-stable@freebsd.org, svn-src-stable-8@freebsd.org
Subject: svn commit: r260840 -
 stable/8/sys/dev/virtio/scsi
X-SVN-Group: stable-8
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-BeenThere: svn-src-stable-8@freebsd.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: SVN commit messages for only the 8-stable src tree
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 18 Jan 2014 03:45:08 -0000

Author: bryanv
Date: Sat Jan 18 03:45:07 2014
New Revision: 260840
URL: http://svnweb.freebsd.org/changeset/base/260840

Log:
  MFC r260566: Remove incorrect bit shift when assigning the LUN request field

Modified:
  stable/8/sys/dev/virtio/scsi/virtio_scsi.c
Directory Properties:
  stable/8/sys/   (props changed)
  stable/8/sys/dev/   (props changed)
  stable/8/sys/dev/virtio/   (props changed)

Modified: stable/8/sys/dev/virtio/scsi/virtio_scsi.c
==============================================================================
--- stable/8/sys/dev/virtio/scsi/virtio_scsi.c	Sat Jan 18 03:44:43 2014	(r260839)
+++ stable/8/sys/dev/virtio/scsi/virtio_scsi.c	Sat Jan 18 03:45:07 2014	(r260840)
@@ -1561,7 +1561,7 @@ vtscsi_set_request_lun(struct ccb_hdr *c
 	lun[0] = 1;
 	lun[1] = ccbh->target_id;
 	lun[2] = 0x40 | ((ccbh->target_lun >> 8) & 0x3F);
-	lun[3] = (ccbh->target_lun >> 8) & 0xFF;
+	lun[3] = ccbh->target_lun & 0xFF;
 }
 
 static void
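
Editorial note on r260840: the one-line change above is easier to judge with the byte layout spelled out. The following is only a sketch under stated assumptions, not the driver's code. The helper names encode_lun, decode_lun and old_lun_byte3 are hypothetical, the 8-byte field size and the zero fill of the trailing bytes are assumptions, and 0x40 is taken to select flat LUN addressing. The four assignments mirror the ones in the diff.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: fill an 8-byte request LUN field for a
 * single-level LUN, using the same four assignments as the diff.
 */
static void
encode_lun(uint8_t lun[8], uint8_t target_id, uint16_t target_lun)
{
	memset(lun, 0, 8);	/* assumption: remaining bytes are zero */
	lun[0] = 1;
	lun[1] = target_id;
	lun[2] = 0x40 | ((target_lun >> 8) & 0x3F);	/* high 6 bits */
	lun[3] = target_lun & 0xFF;			/* low 8 bits, unshifted */
}

/* Hypothetical inverse: recover the 14-bit LUN from bytes 2 and 3. */
static uint16_t
decode_lun(const uint8_t lun[8])
{
	return ((uint16_t)(((lun[2] & 0x3F) << 8) | lun[3]));
}

/* The expression removed by r260840, kept only to show the failure mode. */
static uint8_t
old_lun_byte3(uint16_t target_lun)
{
	return ((target_lun >> 8) & 0xFF);
}
```

With the removed expression, byte 3 repeated the high bits that byte 2 already carries, so any LUN below 256 had a low byte of zero and would presumably have been addressed as LUN 0. With the fix, encode_lun() followed by decode_lun() round-trips every 14-bit LUN value.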