Date: Fri, 6 May 2011 19:54:42 -0500 (CDT)
From: Robert Bonomi
Message-Id: <201105070054.p470sgYR092690@mail.r-bonomi.com>
To: freebsd-questions@freebsd.org, listreader@lazlarlyricon.com
In-Reply-To: <4DC48DB6.8030907@lazlarlyricon.com>
Subject: Re: Comparing two lists

> From owner-freebsd-questions@freebsd.org Fri May 6 19:27:54 2011
> Date: Sat, 07 May 2011 02:09:26 +0200
> From: Rolf Nielsen
> To: FreeBSD
> Subject: Comparing two lists
>
> Hello all,
>
> I have two text files, quite extensive ones. They have some lines in
> common and some lines are unique to one of the files. The lines that do
> exist in both files are not necessarily in the same location. Now I need
> to compare the files and output a list of lines that exist in both
> files. Is there a simple way to do this? diff? awk? sed? cmp? Or a
> combination of two or more of them?

If the files have only 'minor' differences -- i.e. no long runs of lines
that are in only one file -- *and* the common lines are in the same order
in each file, you can use diff(1), without any other shenanigans.

If the above is -not- true, and you need _only_ the common lines, AND
order is not important, then sort(1) both files and run diff(1) on the
two sorted versions; the lines diff(1) does not flag are the common ones.

Beyond that, it depends on what you mean by 'extensive'. Megabytes?
Gigabytes? Or what?
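
The sort-then-compare approach above can be sketched roughly as follows.
Note that this sketch swaps in comm(1) for diff(1): once both inputs are
sorted, 'comm -12' prints only the lines present in both, so there is no
diff output to filter. The function name common_lines and the temp-file
template are made up for illustration, and the sketch assumes that
duplicate lines and the original line order do not matter.

```shell
#!/bin/sh
# common_lines FILE1 FILE2 (hypothetical helper name)
# Prints the lines that occur in both files, sorted and de-duplicated.
# Assumption: the original ordering of lines is not needed.
common_lines() {
    # Temporary files for the sorted copies; the template is illustrative.
    tmp1=$(mktemp "${TMPDIR:-/tmp}/common.XXXXXX") || return 1
    tmp2=$(mktemp "${TMPDIR:-/tmp}/common.XXXXXX") || { rm -f "$tmp1"; return 1; }

    # Use the same collation (C locale) for sort and comm so that comm
    # sees its inputs in the order it expects.
    LC_ALL=C sort -u "$1" > "$tmp1"
    LC_ALL=C sort -u "$2" > "$tmp2"

    # -1 suppresses lines only in the first file, -2 lines only in the
    # second, leaving the lines found in both.
    LC_ALL=C comm -12 "$tmp1" "$tmp2"

    rm -f "$tmp1" "$tmp2"
}
```

With two hypothetical files list1.txt and list2.txt, this would be run as
'common_lines list1.txt list2.txt > common.txt'. How well it copes with
very large inputs depends mostly on sort(1), which typically spills to
temporary files when the data does not fit in memory.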