Date:      Mon, 5 Oct 2015 23:58:12 +0200
From:      Polytropon <freebsd@edvax.de>
To:        Quartz <quartz@sneakertech.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: awk question
Message-ID:  <20151005235812.eee38247.freebsd@edvax.de>
In-Reply-To: <5612EF57.10207@sneakertech.com>
References:  <5611C922.4050007@hiwaay.net> <20151005042129.1f153ec6.freebsd@edvax.de> <5611F776.9090701@hiwaay.net> <56124479.9020505@sneakertech.com> <20151005165902.ad01c288.freebsd@edvax.de> <5612EF57.10207@sneakertech.com>

On Mon, 05 Oct 2015 17:44:55 -0400, Quartz wrote:
> 
> > The form "input | step1 | step2 | step3 | step4 > result" usually
> > is more readable
> 
> That's what I meant by being easier to understand conceptually. I agree
> about being more readable: even though this format sometimes needs the
> 'useless cat', it's often my preferred coding style, especially in
> scripts where the input might change around.

And the "useless cat" method also makes it easy to test the
script with varying input (for example, pre-generated test
input) before it "goes live". It also makes it easier to
"extend" the pre- or post-processing commands with new ones.
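
A minimal sketch of that idea, assuming a made-up test file with
"name count" lines (the data and the variable names are only
for illustration):

```shell
# Hypothetical pre-generated test input; a live script would read
# the real data source here instead.
input=$(mktemp)
printf 'alice 3\nbob 5\nalice 4\n' > "$input"

# input | step1 | step2: only the first stage knows where the data
# comes from, so swapping "$input" for another file or command
# leaves the rest of the pipeline untouched.
result=$(cat "$input" |
	awk '{ sum[$1] += $2 } END { for (n in sum) print n, sum[n] }' |
	sort)
echo "$result"
rm -f "$input"
```

Adding another processing step means adding one more "| command"
stage, without touching the existing ones.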



> > Additionally, awk isn't that hard to learn. Reading "man awk" will
> > provide you with a good background. And if you're already a C
> > programmer, you'll see that many things you can do in C will also
> > work similarly in awk, which _might_ not even be a good thing. :-)
> 
> The problem with awk is the whole BEGIN/END/braces thing and how commas 
> interact with the operands.

It's not that hard:

BEGIN { ... } will be executed _before_ any input is processed,
END { ... } will be executed _after_ all input has been processed.
/pattern/ { ... } will be executed for each matching input line,
(condition) { ... } will be executed for each input line for which
the condition is true,
and { ... } will be executed for _every_ input line.
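
To illustrate, here is a small sketch that uses all five forms in
one program (the fruit data and the variable n are made up for
the example):

```shell
out=$(printf 'apple 1\nbanana 20\ncherry 3\n' | awk '
BEGIN    { print "start" }            # runs once, before any input
/an/     { print "pattern:", $1 }     # lines matching the regex
($2 > 2) { print "condition:", $1 }   # lines where the test is true
         { n++ }                      # every input line
END      { print "lines:", n }        # runs once, after all input
')
echo "$out"
```

The rules are tried in order for each input line, so a line can
trigger more than one of them ("banana 20" matches both the
pattern and the condition).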

Regarding commas: You can use the "print a b c" form as well as
the more sophisticated C-like printf("format string", a, b, c)
form. For all other functions, commas are argument separators
just like in many other programming languages.

	% echo "a b c" | awk '{ print $3 $1 $2 }'
	cab
	% echo "a b c" | awk '{ print $3, $1, $2 }'
	c a b
	% echo "a b c" | awk '{ printf("%s-%s-%s\n", $3, $1, $2); }'
	c-a-b

Those are the three "main methods" of printing: concatenated,
separated by the output field separator OFS (a single space by
default), and custom formatted string. And the semicolon is
optional, it's just my C-contamination. :-)
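
A short sketch of what the comma does in print: it inserts the
value of OFS between the items, so changing OFS changes the
output, while plain juxtaposition only gets what you put there
yourself (out1 and out2 are just names for this example):

```shell
# Comma form: items are joined with OFS, set here to "-".
out1=$(echo "a b c" | awk 'BEGIN { OFS = "-" } { print $3, $1, $2 }')
echo "$out1"	# c-a-b

# Concatenation form: the separators are explicit string literals.
out2=$(echo "a b c" | awk '{ print $3 " " $1 " " $2 }')
echo "$out2"	# c a b
```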


> It's not very much like sh or C syntax (or 
> any other syntax) and new users tend to get really confused.

Hmmm... I don't know, could you provide an example where you
would say, like, "this is not intuitive" or even "this does
something totally strange"?



> Also, different versions of awk handle math (esp floating point) with 
> different rounding/precision/overflow, making calculations vary between 
> installations, only further adding to the confusion.

Yes, this is true, but keep in mind what awk is: a "pattern-directed
scanning and processing language". If you want higher precision
math, hand the calculation to dc and read the result back, for
example with "<math stuff> | dc" | getline result (system() only
returns the exit status, not the output); awk isn't really for
math, but integer math is usually fine. :-)
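
As a rough sketch of where "usually fine" ends, assuming an awk
that stores all numbers as IEEE 754 double precision values
(common, but not guaranteed for every implementation or
platform; the variable names are made up):

```shell
# Below 2^53 every integer is represented exactly, so adding 1
# gives a different number and the comparison is false (0).
below=$(awk 'BEGIN { print (2^52 + 1 == 2^52) }')

# At 2^53 the next integer is no longer representable, so with
# doubles the sum rounds back and the comparison is true (1).
above=$(awk 'BEGIN { print (2^53 + 1 == 2^53) }')

# Fractions are where results start to depend on rounding and on
# the output format; here the format is fixed explicitly.
third=$(awk 'BEGIN { printf "%.3f\n", 1 / 3 }')

echo "$below $above $third"
```

Stating the output format with printf, instead of relying on the
default number-to-string conversion, removes one source of
differences between installations.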



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...


