Date: Sat, 21 Aug 2010 00:21:02 +0100
From: krad <kraduk@googlemail.com>
To: Paul Schmehl <pschmehl_lists@tx.rr.com>
Cc: FreeBSD Questions <freebsd-questions@freebsd.org>
Subject: Re: Any awk gurus on the list?
Message-ID: <AANLkTinzMYBdC=0Gm2qr3XYN-uNx_Dg3au5GipzDG6Lq@mail.gmail.com>
In-Reply-To: <23BA961B74BA2B5CA8B523F9@utd65257.utdallas.edu>
References: <23BA961B74BA2B5CA8B523F9@utd65257.utdallas.edu>

On 20 August 2010 18:12, Paul Schmehl <pschmehl_lists@tx.rr.com> wrote:

> I'm trying to figure out how to use awk to parse values from a string of
> unknown length and unknown fields using awk, from within a shell script, and
> write those values to a file in a certain order.
>
> Here's a typical string that I want to parse:
>
> alert ip [
> 50.0.0.0/8,100.0.0.0/6,104.0.0.0/5,112.0.0.0/6,173.0.0.0/8,174.0.0.0/7,176.0.0.0/5,184.0.0.0/6]
> any -> $HOME_NET any (msg:"ET POLICY Reserved IP Space Traffic - Bogon Nets
> 2"; classtype:bad-unknown; reference:url,
> www.cymru.com/Documents/bogon-list.html; threshold: type limit, track
> by_src, count 1, seconds 360; sid:2002750; rev:10;)
>
> What I want to do is extract the value after "sid:", the value after
> "reference:" and the value after "msg:" and insert them into a file that
> would look like this:
>
> 2002750 || "ET POLICY Reserved IP Space Traffic - Bogon Nets 2" || url,
> www.cymru.com/Documents/bogon-list.html
>
> Yes, I know I could do this easily in Perl. I'm doing this to try and
> improve my understanding of awk. I *think* I've figured out that the right
> approach is to use an associative array, and this command:
>
> # awk '!/#/ { for (i=1; i<=NF; i++) { if ( $i ~ /sid/) {mtcmsg[sid]=$i;
> print mtcmsg[sid]}}}' < /usr/local/etc/snort/rules/mtc.rules.test
>
> produces this data:
> sid:299913;
> sid:52123;
> sid:3001441;
> sid:1444;
> sid:2008120;
> sid:5001684;
> sid:2001683;
> sid:22466;
> sid:2002750;
> sid:3000003;
> sid:292000032;
> sid:22000032;
> sid:3000000;
> sid:2003070;
> sid:2003484;
> sid:2003603;
> sid:31000004;
> sid:299998;
>
> So it appears (at least to me) that I'm on the right path, but I thought
> I'd query the awk gurus on the list. Is there a better way to approach
> this?
>
> The standard FS breaks the msg into multiple fields, which is unacceptable.
> So my thinking is that I would need to do something like this (pseudocode)
>
> !/#/; FS=";" {if ( $i ~ /sid/) then use tr to strip the "sid:" and ";" and
> insert the result into an element named sid
> if ($i ~ /reference/) then ditto into an element named ref
> if ($i ~ /msg/) then ditto into an element named msg
> then print array[sid]" || "array[msg]" || " array[ref] > resulting file.}
>
> But when I add an FS to the script, I get odd results:
>
> # awk '!/#/ { FS=";"; for (i=1; i<=NF; i++) { if ( $i ~ /sid/)
> {mtcmsg[sid]=$i; print mtcmsg[sid]}}}' <
> /usr/local/etc/snort/rules/mtc.rules.test
> sid:299913;
> sid:52123
> sid:3001441
> sid:1444
> sid:2008120
> sid:5001684
> sid:2001683
> sid:22466
> sid:2002750
> sid:3000003
> sid:292000032
> sid:22000032
> sid:3000000
> sid:2003070
> sid:2003484
> sid:2003603
> sid:31000004
> sid:299998
>
> Why is the first value indented and not stripped of the semi-colon?
>
> --
> Paul Schmehl, Senior Infosec Analyst
> As if it wasn't already obvious, my opinions
> are my own and not those of my employer.
> *******************************************
> "It is as useless to argue with those who have
> renounced the use of reason as to administer
> medication to the dead." Thomas Jefferson
>
> _______________________________________________
> freebsd-questions@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-questions
> To unsubscribe, send any mail to "freebsd-questions-unsubscribe@freebsd.org"
>

No need to use Perl; a simple sed in front of the awk should work. You
might want to tighten the regexp a little depending on your input data:

    sed -e "s/.*sid//" <file> | awk '{print $1}'
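
On the odd first line in the quoted output: under POSIX awk, an assignment to FS inside an action only takes effect from the next input record, so the first record has already been split on whitespace by the time FS=";" runs. Setting the separator before any input is read (with -F or a BEGIN block) avoids that. Also, for the sample rule on a single line, the sed one-liner as posted would print ":2002750;", so the regexp does need the tightening mentioned. The following is only a rough sketch of an awk-only alternative, not a tested solution for the real rules file. It assumes each rule sits on one line, that no msg contains a ";", and that only the first reference: per rule is wanted. The function name extract_rule_fields, the temporary file, and the commented-out line in the sample are hypothetical.

```shell
#!/bin/sh
# Sketch only: print "sid || msg || reference" for each rule line.
# Assumptions (not from the original post): one rule per line, no ';'
# inside msg, and only the first reference: option is kept.

extract_rule_fields() {
    # -F';' sets the field separator before the first record is read,
    # so every record (including the first) is split on ';'.
    awk -F';' '
    /^[ \t]*#/ { next }                          # skip commented-out lines
    {
        sid = ""; msg = ""; ref = ""
        for (i = 1; i <= NF; i++) {
            f = $i
            if (i == 1) sub(/^[^(]*\(/, "", f)   # drop the rule header up to "("
            sub(/^[ \t]+/, "", f)                # trim leading blanks
            if (f ~ /^sid:/)                          sid = substr(f, 5)
            else if (f ~ /^msg:/)                     msg = substr(f, 5)
            else if (f ~ /^reference:/ && ref == "")  ref = substr(f, 11)
        }
        if (sid != "") printf "%s || %s || %s\n", sid, msg, ref
    }' "$1"
}

# Sample input: the rule from the question joined onto one line, plus a
# hypothetical comment line to show that it is skipped.
rules=$(mktemp)
trap 'rm -f "$rules"' EXIT
cat > "$rules" <<'EOF'
# hypothetical commented-out line
alert ip [50.0.0.0/8,100.0.0.0/6,104.0.0.0/5,112.0.0.0/6,173.0.0.0/8,174.0.0.0/7,176.0.0.0/5,184.0.0.0/6] any -> $HOME_NET any (msg:"ET POLICY Reserved IP Space Traffic - Bogon Nets 2"; classtype:bad-unknown; reference:url, www.cymru.com/Documents/bogon-list.html; threshold: type limit, track by_src, count 1, seconds 360; sid:2002750; rev:10;)
EOF

extract_rule_fields "$rules"
```

Unlike the !/#/ test in the question, which drops any line containing a "#" anywhere, this sketch only skips lines that start with "#"; whether that matters depends on the rules file.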