Date:      Thu, 23 Jan 2014 13:57:20 -0800
From:      <dteske@FreeBSD.org>
To:        "'RW'" <rwmaillists@googlemail.com>, <freebsd-questions@freebsd.org>
Cc:        'Devin Teske' <dteske@FreeBSD.org>
Subject:   RE: awk programming question
Message-ID:  <04bd01cf1886$1844f390$48cedab0$@FreeBSD.org>
In-Reply-To: <20140123213352.5f289890@gumby.homeunix.com>
References:  <F01EB9CE742DEB17DB6B51C7@localhost> <alpine.BSF.2.00.1401230900270.76961@wonkity.com> <20140123185604.4cbd7611@gumby.homeunix.com> <04a201cf1878$8ebce540$ac36afc0$@FreeBSD.org> <alpine.BSF.2.00.1401231346520.80613@wonkity.com> <20140123213352.5f289890@gumby.homeunix.com>


> -----Original Message-----
> From: RW [mailto:rwmaillists@googlemail.com]
> Sent: Thursday, January 23, 2014 1:34 PM
> To: freebsd-questions@freebsd.org
> Subject: Re: awk programming question
> 
> On Thu, 23 Jan 2014 13:57:03 -0700 (MST) Warren Block wrote:
> 
> > On Thu, 23 Jan 2014, dteske@FreeBSD.org wrote:
> >
> > >> From: RW [mailto:rwmaillists@googlemail.com]
> > >> Note that awk supports +, but not newfangled things like *.
> > >
> > > With respect to regex, what awk really needs is the quantifier
> > > syntax...
> > >
> > > * = {0,} = zero or more
> > > + = {1,} = one or more
> > > {x,y} = any quantity from x inclusively up to y
> > > {x,} = any quantity from x or more
> >
> > I think RW meant to type that awk did not have the newfangled "?" for
> > non-greedy matches.
> 
> No I meant it doesn't support *, which had been used in all the previous awk
> examples in this thread, and would have been interpreted as a literal "*".
> 
> $ echo "sid:2008120; re" | awk ' {match($0,/[0-9]+/) ; \
>         s=substr($0,RSTART,RLENGTH) ; print "_",s,"_"} '
> _ 2008120 _
> 21:12 (bob) ~
> $ echo "sid:2008120; re" | awk ' {match($0,/[0-9]*/) ; \
>         s=substr($0,RSTART,RLENGTH) ; print "_",s,"_"} '
> _  _
> 

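Awk does support "*". The catch is that /[0-9]*/ can match zero digits, and
match() reports the leftmost match, which here is the empty string at
position 1 (RSTART=1, RLENGTH=0). You have to give match() something to
"anchor" to, such as one mandatory digit. For example...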

$ echo "sid:2008120; re" | awk '{match($0,/[0-9][0-9]*/); \
	s=substr($0,RSTART,RLENGTH); print "_",s,"_"}'
_ 2008120 _
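
To see what match() actually reports in each case, here is a minimal sketch
that prints RSTART and RLENGTH next to the matched text. The show_match
helper is a name made up for this illustration; it is not from the thread.

```shell
# show_match is a hypothetical helper for this illustration.
# $1 is an awk regular expression; the function prints RSTART, RLENGTH
# and the matched text in brackets for the sample line used above.
show_match() {
    echo "sid:2008120; re" | awk -v re="$1" '{
        match($0, re)
        print RSTART, RLENGTH, "[" substr($0, RSTART, RLENGTH) "]"
    }'
}

show_match '[0-9]*'        # prints: 1 0 []
show_match '[0-9][0-9]*'   # prints: 5 7 [2008120]
```

The first call succeeds with an empty match at position 1, which is why the
substr() in the quoted example came back empty rather than "*" being taken
literally.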


> 
> On Thu, 23 Jan 2014 12:20:26 -0800
> dteske@FreeBSD.org wrote:
> 
> > 1. sig-msg.map file according to OP shouldn't have the quotes that are
> > present from the snort rule input
> > 2. Doesn't ignore lines of disinterest
> 
> I know nothing about snort - I was just going on the previous posts, but
> FWIW removing the quotes is just a matter of changing:
> 
>     msg = substr($0,RSTART+4, RLENGTH-5)
> 
> to
> 
>     msg = substr($0,RSTART+5, RLENGTH-6)

The match() that preceded that (going back in the thread) was:

	match($0, /msg:[^;]+;/)

That was bad for a couple of reasons.

1. The msg could be the last property in the ( ... ) set, in which case no
semicolon follows it inside the parentheses: the msg would either be empty or
contain too much (running past the closing parenthesis and on to the next
semicolon).

2. If the msg is not double-quoted, you'll end up shaving off the first and
last bytes unexpectedly.

That is why I test the first byte of the msg and then split on a conditional
field separator, extracting the appropriate field afterwards with traditional
parsing logic (a little googling helped me find out how simple or complex the
snort rules file format is).
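
As a rough sketch of that idea (not the script posted earlier in the thread),
the following tests the first byte after "msg:" and picks the separator
accordingly. The extract_msg name and the sample rules in the usage lines are
made up for illustration, and the sketch assumes the message contains no
escaped double quotes.

```shell
# extract_msg is a hypothetical helper; it reads rule lines on stdin
# and prints the msg value of every line that has one.
extract_msg() {
    awk '
    # Lines without "msg:" never match, so they are skipped entirely.
    match($0, /msg:[ \t]*/) {
        rest = substr($0, RSTART + RLENGTH)
        if (substr(rest, 1, 1) == "\"") {
            # Quoted: split on double quotes; part[1] is empty and
            # part[2] is the message without its quotes.
            split(rest, part, "\"")
            print part[2]
        } else {
            # Unquoted: the message ends at the first ";" or ")".
            split(rest, part, "[;)]")
            print part[1]
        }
    }'
}

# Usage with made-up rules:
echo 'alert tcp any any -> any any (msg:"ET TEST foo"; sid:1;)' | extract_msg
echo 'alert tcp any any -> any any (sid:2; msg:bar)' | extract_msg
```

Because the quoted branch splits on the quote character rather than on ";",
a semicolon inside a quoted message no longer cuts the message short.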
-- 
Devin



