From: Devin Teske
To: Cyrille Lefevre
Cc: freebsd-rc@freebsd.org
Subject: Re: sysrc(8) -- a sysctl(8)-like utility for managing rc.conf(5)
Date: Fri, 05 Nov 2010 10:40:58 -0700
Organization: Vicor, Inc
Message-Id: <1288978858.7362.154.camel@localhost.localdomain>
In-Reply-To: <4CD3731C.6020501@laposte.net>
References: <1286925182.32724.18.camel@localhost.localdomain>
 <1286996709.32724.60.camel@localhost.localdomain>
 <1287448781.5713.3.camel@localhost.localdomain>
 <1287510629.25599.2.camel@localhost.localdomain>
 <1288746388.7362.4.camel@localhost.localdomain>
 <17B64023-A64A-40DA-9CBC-A601710AB5BB@vicor.com>
 <1288919368.7362.35.camel@localhost.localdomain>
 <4CD3731C.6020501@laposte.net>
List-Id: "Discussion related to /etc/rc.d design and implementation."

On Fri, 2010-11-05 at 03:59 +0100, Cyrille Lefevre wrote:
> On 05/11/2010 02:09, Devin Teske wrote:
>
> hi, how about something like (untested, but should work) :
>
> # no $(cat << EOF needed here, so, no extra \ and $quot and so needed

The `$(cat << EOF' is very much needed. The expansion rules for text
appearing between "< [...]

> awkscript='
> # %s/\\$0/$0/;s/\\\\/\\/g
> BEGIN { ...; regex="^[[:space:]]*" varname "=" }

Won't work. You've replaced my usage of an embedded here-document
(which performs parameter expansion) with single-quote assignment
(which does not perform parameter expansion).

The point of failure here is that you've taken $varname (which is a
positional parameter passed to the shell script function) and turned
it into an awk variable (which is never assigned). So right off the
bat, your awk script will fail because regex will be assigned a value
of:

    ^[[:space:]]*=

Now, again, you can get around this problem by using a compound
string, like so:

    awkscript='
    BEGIN { ...; regex="^[[:space:]]*'"$varname"'=" }

But that's not as readable because it requires you to mentally switch
from reading awk code to reading shell, and back to awk, in the same
line.

> ...
> if ( ! match($0, regex) ) { print; next }

Fail due to bad regex value (see prior comments).

> ...
> if ( t1 ~ /[\'\$\\]/ )

Since we're picking nits:
- the backslash before the apostrophe is not needed.

> ...
> else if ( t1 == apos ) {

apos needs to be a shell-expanded parameter/variable, otherwise you'll
be forced to declare apos in the awkscript, which itself (without
using compound strings) will force vim syntax highlighting to break
(something that's not really important for the awkscript itself, but
it's rather disconcerting to see entire oceans of miscolored text
AFTER the embedded here-document).
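
A minimal sketch of the here-document versus single-quote distinction
described above. It is not taken from the sysrc script: the variable
names (awk_heredoc, awk_single, input) and the sample rc.conf-style
lines are assumptions for illustration, and the regex is simplified to
"^name=". In the here-document form the shell expands $varname and
turns \$0 into $0 before awk runs; in the single-quoted form nothing
is expanded, so the value only reaches awk if it is passed separately,
for example with awk's -v option.

```shell
#!/bin/sh
# Hypothetical stand-in for the function's positional parameter.
varname="sshd_enable"

# Unquoted here-document delimiter: the shell expands $varname, and
# \$0 reaches awk as a literal $0.
awk_heredoc=$(cat <<EOF
BEGIN { regex = "^$varname=" }
{ if (match(\$0, regex)) print "match: " \$0 }
EOF
)

# Single quotes: the shell expands nothing, so "varname" here is an
# awk variable that stays empty unless it is supplied with -v.
awk_single='
BEGIN { regex = "^" varname "=" }
{ if (match($0, regex)) print "match: " $0 }
'

# Sample input (illustrative only).
input='sshd_enable="YES"
ntpd_enable="NO"'

out_heredoc=$(printf '%s\n' "$input" | awk "$awk_heredoc")
out_single=$(printf '%s\n' "$input" | awk "$awk_single")
out_single_v=$(printf '%s\n' "$input" |
    awk -v varname="$varname" "$awk_single")

printf 'here-document:        [%s]\n' "$out_heredoc"
printf 'single quotes:        [%s]\n' "$out_single"
printf 'single quotes and -v: [%s]\n' "$out_single_v"
```

Without -v, the single-quoted script builds the regex "^=", which
matches neither sample line, so its output is empty.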

> sub("^" apos "[^" apos "]*", "", value)
> if ( length(value) == 0 ) t2 = ""
> sub("^" apos, "", value)

Same as above.

> ...
> else if ( t1 == "\"" ) {
> sub(/^"[^"]*/, "", value)
> if ( length(value) == 0 ) t2 = ""
> sub(/^"/, "", value)

$quot was used to protect the vim syntax highlighting from stray
marks. This block breaks that, whereas my block preserved highlighting
(using vim version 7.2.411 here).

> ...
> t1 = t2 = "\""

Again, stray mark. The reason is that back-slash expansion is not
performed within a here-document, so syntax-highlighting sees three
quotes, _not_ an escaped-quote surrounded by two double-quotes.

> ...
> else if ( t1 ~ /[[:space:]];#]/ )
> # parentheses aren't needed here, or wrap them as before
> t1 = t2 = "\""

Huh?

My code:

    else if ( t1 ~ /[^[:space:];#]/ ) {
        t1 = t2 = "$bquot"
        sub(/^[^[:space:]]*/, "", value)
    }

Your suggested replacement (???):

    else if ( t1 ~ /[[:space:]];#]/ )
        t1 = t2 = "\""

*cough*

You've removed the "NOT" operator (^) from my character class
expansion, removed the substitution line, and killed the braces. This
code does not do the same thing.

Either that, or you were attempting to combine the detection of the
non-quoted value-assignment and the null-assignment. This is wrong.
These cannot be combined or else the results differ.

> ...
> printf "%s%c%s%c%s\n", substr(\$0, 0, matchlen), \
> t1, awk_new_value, t2, value
> '

Just as before with $regex (which had $varname, which is $1, which is
the positional parameter passed to the shell function), this will fail
because awk_new_value is not defined by awk. Further, you've forgotten
to convert \$0 to $0 (considering that you've switched the context
from an embedded here-document, which performs parameter expansion --
hence the escape of the dollar-sign -- to a single-quoted assignment,
which does _NOT_ perform parameter expansion -- and hence no need for
the preceding back-slash).

>
> # ... | ... doesn't need a final \ when wrapped after the |
> local awk_new_value="$( echo "$new_value" |
> awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }' )"

Wrong. Fail. And here's why...

You are correct that a $( ... ) block can traverse newlines. However,
what $( ... ) functionally performs is a sub-shell. Each line within
the $( ... ) syntax is taken as a single line of shell to be executed.
Therefore, by deleting the back-slash at the end of the line, you've
turned one statement into two.

One statement:

    echo "$new_value" | \
        awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }'

(also written as -- if you have the guts to stomach line-wrapping
given ts=8 indentation within the script)

    echo "$new_value" | awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }'

However, you've turned the above into two statements:

First statement:

    echo "$new_value" |

Second statement:

    awk '{ gsub(/\\/, "\\\\"); gsub(/"/, "\\\""); print }'

The first statement will cause a shell syntax error within the
sub-shell.

> ...
> # you missed the " here
> new_contents=$( tail -r "$file" 2> /dev/null )

No, I did not. And here's why...

Many people are confused on this issue. Let me clarify.

When you make an assignment to a variable in the bare name-space using
the $( ... ) or ` ... ` syntaxes, quotes are optional.

When you make an assignment to a variable in the positional argument
name-space using the $( ... ) or ` ... ` syntaxes, quotes are
_required_ (if you want to preserve spaces).

Valid Examples:

    # bare name-space
    foo=$( echo " 123 456 ")
    echo "'$foo'"
    # produces: ' 123 456 '

    # bare name-space
    foo=` echo " 123 456 " `
    echo "'$foo'"
    # produces: ' 123 456 '

    # positional argument name-space
    env foo="$( echo " 123 456 " )"
    echo "'$foo'"
    # produces: ' 123 456 '

    # positional argument name-space (within a function)
    local foo="$( echo " 123 456 " )"
    echo "'$foo'"
    # produces: ' 123 456 '

    # positional argument name-space
    [ "$( echo " 123 456 " )" = "xxx" ]
    # works fine

Invalid Examples:

    # positional argument name-space
    env foo=$( echo " 123 456 " )
    # produces: env: 123: No such file or directory

    # positional argument name-space (within a function)
    local foo=$( echo " 123 456 " )
    # produces: local: 123: bad variable name

    # positional argument name-space
    [ $( echo " 123 456 " ) = "xxx" ]
    # produces: [: 123: unexpected operator

> # you may want to use printf "%s" "$new_contents" instead of echo
> # to avoid \ sequences interpretation if any
> new_contents=$( echo "$new_contents" |
> awk -v varname="$varname" -v apos="'" \
> -v new_value="$new_value" "$awkscript")

No. You are confused about the order in which the shell performs each
different word expansion. Expansion of escape sequences is not
performed on expanded parameters.

Here's proof:

    #!/bin/sh
    foo='123\456'
    echo "'$foo'"
    # produces: '123\456'

Here's the one-liner proof (first with compound strings):

    /bin/sh -c 'foo='"'"'123\456'"'"'; echo "'"'"'$foo'"'"'"'

again (with alternate-form compound strings):

    /bin/sh -c "foo='"'123\456'"'; echo "'"'"'"'$foo'"'"'"'

again (without compound strings; with escapes):

    /bin/sh -c "foo='123\\456'; echo \"'\$foo'\""

again (with compound strings and escapes):

    /bin/sh -c "foo='"'123\456'"'; echo \"'\$foo'\""

again (simpler -- as quotes are optional on second part):

    /bin/sh -c "foo='123\\456'; echo \$foo"

again:

    /bin/sh -c 'foo='"'"'123\456'"'"'; echo $foo'

All of the above produce the same output (well, almost: the last two
lack surrounding apostrophes -- which were themselves illustrative of
the fact that parameter expansion within double-quotes, even with
surrounding text, also lacks escape-sequence expansion) and serve to
illustrate that it is a common misconception that escape-sequences
within variables imply that one needs to use printf instead of echo.

Rather, the truth is that one never needs to escape the
escape-sequences within a variable unless the command-line that the
parameter was expanded in is itself then re-evaluated (e.g.
foo='\\' && eval echo $foo).

It is perfectly safe to pass a variable containing escape-sequences,
newlines, spaces, variables, etc. to echo. Just make sure that the
variable is quoted (and even then, the only thing you lose by not
having it quoted is that multiple spaces will become one -- because
rather than appearing as one argument, it will appear as multiple
arguments, and echo cannot conceivably preserve spaces between
arguments because it is the shell that obliterates the spaces between
positional parameters, not echo).

>
> of course, same remarks about the later awk script :-)

Which would be equally faulty. The above remarks apply to your same
remarks about the later awk script.
:-)

>
> also, %s|/bin/sh|$_PATH_BSHELL| && _PATH_BSHELL=/bin/sh

What could this possibly buy you? Other than increased CPU cycles
caused by foisting further expansions upon the runtime execution?

This script is coded for a single target: FreeBSD-CURRENT

Abstracting the shell in the above manner gives you nothing, though I
understand why you'd want it.

Alternatively, if you really desired the ability to switch the shell,
you'd do something like:

    _SH=$SHELL
    ...
    $_SH ...

This would allow you two nice things...

1. If someone modifies your script to change the invocation line:

    from: #!/bin/sh
    to:   #!/usr/bin/env sh
    or:   #!/usr/bin/env bash
    or:   #!/usr/local/bin/bash
    or:   #!/bin/bash
    or:   #!/usr/bin/bash
    etc...

   _SH will inherit the correct value from $SHELL and use the same
   script interpreter that is interpreting the script itself.

2. If someone modifies your script to change the _SH line:

    from: _SH=$SHELL
    to:   _SH=/bin/sh
    or:   _SH=/usr/bin/bash
    etc...

   The script can use a different sub-interpreter than the invocation
   interpreter.

However, I will use neither for two very good reasons:

a. When I do call /bin/sh in certain places, I need to pass specific
   command-line options that may be shell-specific, and thus it should
   not be considered allowable for some user to come along and change
   the following:

    /bin/sh -n /etc/defaults/rc.conf

   to use some other shell, because some other shell (be it
   inheritance of the invocation shell or otherwise) may not support
   the `-n' flag (which, btw, checks the syntax of a script without
   executing it).

b. If /bin/sh doesn't exist, you've got real problems. Not to mention,
   it's likely that the script itself will not execute, considering
   that the invocation line is itself:

    #!/bin/sh

If you truly want to write shell agnostic code, you can, and I have.
This involves using env(1) to find your shell by-name rather than
by-path (and env(1) is always at /usr/bin/env on every POSIX-compliant
OS including BSD*, SunOS, *BSD, Solaris, Linux, Cygwin, and Mac OS X).
It also involves using a dependency-calculating structure much like
the first iterations of this script (see the beginnings of this thread
over on the freebsd-hackers@ mailing list archives under the "sysconf"
thread).

However, as I'll point out again, this script is coded for a single
target (FreeBSD-CURRENT) and thus will not employ the abstractions so
commonly associated with platform agnosticism and/or embedded-OS
programming.

In other words, we will rely on things that exist in the base
structure of FreeBSD and make no qualms about directly referencing
pathnames that should exist. And in mentioning this, here's the
current dependency list:

    # Dependencies (sorted alphabetically):
    #
    #   awk(1)    cat(1)     chmod(1)   chown(8)    chroot(8)   env(1)
    #   grep(1)   jexec(8)   jls(8)*    mktemp(1)   mv(1)       rm(1)
    #   sh(1)     stat(1)    tail(1)
    #
    # *optional

Last, but not least...

Some pathnames have been hard-coded for security purposes. For
example, in the script, you will find three instances where PATH
expansion is not trusted and full-pathnames are used:

    /bin/sh           (6 occurrences)
    /usr/sbin/jexec   (2 occurrences)
    /usr/sbin/chroot  (1 occurrence)

The only reason I can possibly see to assign these full paths to
variables would be if we were going to target another OS which puts
these in different places. However, as it currently stands, this
script targets FreeBSD, and all versions of FreeBSD agree on the
location of these utilities.
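
A minimal sketch of calling a utility by full pathname rather than
trusting PATH, using the `/bin/sh -n' syntax check mentioned earlier.
The helper name check_syntax and the temporary files are assumptions
made for illustration; they are not taken from the script.

```shell
#!/bin/sh
# check_syntax is a hypothetical helper name. The interpreter is named
# by full pathname so that a modified PATH cannot substitute another
# program called "sh". The -n flag parses the file without running it.
check_syntax() # $file
{
    /bin/sh -n "$1" 2> /dev/null
}

# Throwaway files for demonstration only.
good=$(mktemp "${TMPDIR:-/tmp}/good.XXXXXX")
bad=$(mktemp "${TMPDIR:-/tmp}/bad.XXXXXX")
printf '%s\n' 'foo="bar"' > "$good"
printf '%s\n' 'if true; then' > "$bad"    # deliberately missing "fi"

if check_syntax "$good"; then good_ok=yes; else good_ok=no; fi
if check_syntax "$bad"; then bad_ok=yes; else bad_ok=no; fi
rm -f "$good" "$bad"

echo "well-formed file passes: $good_ok"
echo "malformed file passes:   $bad_ok"
```

Because the check depends on a flag of one specific shell, pinning the
pathname also pins the behavior of that flag.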

If it is discovered that this script is even usable on another
BSD-style OS that uses the rc.conf(5) files in a similar manner (that
is, /etc/defaults/rc.conf must exist and must declare the
source_rc_confs() function and must also define the rc_conf_files
variable), then I will consider localizing the paths to these
executables to variables IF (and only IF) said target platform uses
different pathnames. However, as it currently stands, even MidnightBSD
(which has taken this script into its base -- Thanks Lucas Holt!) has
these executables right where I expect them to be.

>
> PS : for non french user :
> %s/quot/dquot/;%s/apos/squot/;s/bquotquot/bquot/

Might you have instead meant:

    %s/[^b]?quot/dquot/g
    %s/apos/squot/g

Not sure where you got the bquotquot (not in my code).

NOTE: The leading '%' indicates to me that perhaps you meant this as a
vi/vim find/replace command (no '%' would instead indicate sed(1) or
perl(1)). You cannot perform multiple '%s///' operators separated by a
semi-colon. You would get the error "E488: Trailing characters" in vim
and the error "Usage: [line [,line]] s [[/;]RE[/;]repl[/;] [cgr]
[count] [#lp]]." in vi (but I digress from picking nits).

>
> Regards,
>
> Cyrille Lefevre

--
Cheers,
Devin Teske