Date:      Thu, 13 Sep 2007 12:08:51 -0700 (MST)
From:      "Craig Whipp" <crwhipp@gmail.com>
To:        "Kurt Buff" <kurt.buff@gmail.com>
Cc:        Jerry McAllister <jerrymc@msu.edu>, questions@freebsd.org
Subject:   Re: Scripting question
Message-ID:  <62309.65.121.28.16.1189710531.squirrel@whippsthroughlife.servebeer.com>
In-Reply-To: <a9f4a3860709131119h2d7589aej59587749bb1fa2ef@mail.gmail.com>
References:  <a9f4a3860709131016w54c12b6fy94fc2b0f286aea3d@mail.gmail.com> <20070913172001.GA78799@gizmo.acns.msu.edu> <a9f4a3860709131032q21bfefc2hf8d78cae53637576@mail.gmail.com> <20070913175510.GA78984@gizmo.acns.msu.edu> <a9f4a3860709131119h2d7589aej59587749bb1fa2ef@mail.gmail.com>

> On 9/13/07, Jerry McAllister <jerrymc@msu.edu> wrote:
>> > The only space is the one separating the SMTP address from the OK or NO.
>>
>> Then you should be able to tell it to sort on the first token in
>> the string with white space as a separator and to eliminate
>> duplicates.   It has been a long time since I had need of sort. I
>> don't remember the arguments/flags but am sure that type of thing can be
>> done.
>>
>> ////jerry
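
For reference, the sort flags Jerry describes are most likely -k and -u.
A minimal sketch, assuming the data looks like "user@example.com OK" with
one pair per line; the function name is made up for illustration. Note
that this keeps one line for each repeated address rather than dropping
all of them, and which of the duplicate lines survives is not guaranteed.

```shell
#!/bin/sh
# Hypothetical helper: keep one line per distinct first field.
# -k1,1 limits the sort key to the first whitespace-separated field;
# -u then emits only one line from each group of lines with equal keys.
dedupe_first_field() {
	sort -u -k1,1 "$1"
}
```

Usage would be something like: dedupe_first_field list.txt > new_list.txt
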
>
> Ya know, it's really easy to get wrapped around the axle on this stuff.
>
> I think I may have a better solution. The file I'm trying to massage
> has a predecessor - the non-unique lines are the result of a
> concatenation of two files.
>
> Silly me, it's better to 'grep -v' with the one file vs. the second
> rather than trying to merge, sort and further massage the result. The
> fix will be to use sed against the first file to remove the ' NO',
> thus providing a clean argument for grepping the other file.
>
> Sigh.
>
> Kurt
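
The sed plus 'grep -v' idea above could look roughly like the sketch
below. It assumes the first file holds lines of the form
"user@example.com NO" and that the second file is the one to be filtered;
the function name and temporary file name are invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of file $2 that contain none of the
# addresses listed in file $1 (whose lines look like "user@example.com NO").
filter_out_listed() {
	patterns=`mktemp /tmp/patterns.XXXXXX` || return 1
	# Strip the trailing " NO" so that only the bare address remains.
	sed 's/ NO$//' "$1" > "$patterns"
	# -F: treat each pattern as a fixed string, not a regular expression
	# -f: read the patterns from a file, one per line
	# -v: print only the lines that match none of the patterns
	grep -v -F -f "$patterns" "$2"
	rm -f "$patterns"
}
```

One caveat: fixed-string matching is still substring matching, so an
address such as bob@example.com would also knock out
jimbob@example.com.
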


It sounds like you've found your solution, but how about the shell
script below?  It is probably woefully inefficient, but it should work.

- Craig

########### begin script ##############
#!/bin/sh
# Read in an input list of 2-column data pairs and output the pairs whose
# first column is unique.

INPUT_FILE="list.txt"
OUTPUT_FILE="new_list.txt"
NON_UNIQ_LIST=""

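# Collect every first-column value that occurs more than once, joined by "|".
# The trailing space in '^ *1 ' keeps counts such as 10 or 11 from being
# mistaken for a count of 1.
for NON_UNIQ in `awk '{print $1}' "$INPUT_FILE" | sort | uniq -c |
grep -vE '^ *1 ' | awk '{print $2}'`
do
	NON_UNIQ_LIST=$NON_UNIQ_LIST"|"$NON_UNIQ
done

# Drop the leading "|".
NON_UNIQ_LIST=`echo "$NON_UNIQ_LIST" | sed 's/^.//'`

# With no duplicates the pattern would be empty, so just copy the input.
if [ -n "$NON_UNIQ_LIST" ]; then
	grep -vE "^($NON_UNIQ_LIST) " "$INPUT_FILE" > "$OUTPUT_FILE"
else
	cp "$INPUT_FILE" "$OUTPUT_FILE"
fi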
########### end script ##############
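
A shorter variant of the same idea, as a sketch only: awk can count the
first column on one pass over the file and print on a second pass. It
compares the addresses as literal strings, so characters like "." or "+"
in an address are not treated as regular expression syntax. The function
name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print only lines whose first field occurs exactly once.
# The file is named twice so that awk reads it in two passes:
#   pass 1 (NR == FNR) counts how often each first field appears,
#   pass 2 prints the lines whose first field was counted exactly once.
unique_first_field() {
	awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1"
}
```

Usage would be something like: unique_first_field list.txt > new_list.txt
The surviving lines come out in their original order.
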



