From: Giorgos Keramidas
To: Anton Shterenlikht
Cc: freebsd-questions@freebsd.org
Subject: Re: editing a binary file
Date: Fri, 18 Dec 2009 11:13:12 +0200
Message-ID: <87d42cve53.fsf@kobe.laptop>
In-Reply-To: <20091218012918.GA71118@mech-cluster241.men.bris.ac.uk>
References: <20091218005102.GA51064@mech-cluster241.men.bris.ac.uk> <4B2AD666.9090404@lazlarlyricon.com> <20091218012918.GA71118@mech-cluster241.men.bris.ac.uk>

On Fri, 18 Dec 2009 01:29:18 +0000, Anton Shterenlikht wrote:
>> My bet would be /usr/ports/editors/hexedit. Been a while since I've
>> used it, but AFAIR, it has a curses or a curses like interface, and
>> it's fairly simple to use, yet sufficiently powerful for most normal
>> binary editing. If you want a GUI, I believe gnome (and probably KDE
>> as well) has its own hex editor.
>
> thank you. hexedit does the job on small files, but is quite
> clunky. If I've a xGB file and I need to delete the first and the last
> record, this becomes quite hard, if at all possible.
>
> I didn't appreciate it's not that simple.
>
> Perhaps I can read a file with C and write back? I can't remember if C
> supports binary files, and whether it also writes some record
> delimiters.

Yes, C supports binary files and does not insert spurious 'record
delimiters' unless you instruct it to do so.

It may even be possible to use one of the scripting languages (Perl or
Python) to do the same work. It's often easier to hack together a
solution if you don't have to worry about some of the details C will
require.

I don't know what your record delimiters look like, but here's a small
sample of how Python can read a binary file of 32 bytes and strip the
last 2 bytes of each 16-byte record.

A binary file of two 16-byte records may look like this:

    keramida@kobe:/tmp$ hd binfile
    00000000  b6 b0 fc 58 96 48 56 d5  e9 10 f0 55 55 67 87 5d  |...X.HV....UUg.]|
    00000010  b0 c9 8b 49 db 53 26 28  57 d6 62 0d d5 1b c4 dc  |...I.S&(W.b.....|
    00000020

Reading the file in chunks of 16 bytes and stripping the last 2 bytes of
each record from Python is only a few lines of code:

    keramida@kobe:/tmp$ python
    Python 2.6.4 (r264:75706, Dec  3 2009, 23:31:07)
    [GCC 4.2.1 20070719  [FreeBSD]] on freebsd9
    Type "help", "copyright", "credits" or "license" for more information.
    >>> ifp = open('binfile', 'rb')     # open input file for binary reading
    >>> ofp = open('outfile', 'wb')     # open output file for binary writing
    >>> for rec in range(2):            # we'll transfer 2 records
    ...     data = ifp.read(16)         # of 16 bytes each
    ...     obytes = data[0:14]         # strip the last two bytes of each record
    ...     ofp.write(obytes)           # push to the output file
    ...
    >>> ifp.close()                     # close input
    >>> ofp.close()                     # close output
    >>>

The output file now looks like this:

    keramida@kobe:/tmp$ hd outfile
    00000000  b6 b0 fc 58 96 48 56 d5  e9 10 f0 55 55 67 b0 c9  |...X.HV....UUg..|
    00000010  8b 49 db 53 26 28 57 d6  62 0d d5 1b              |.I.S&(W.b...|
    0000001c

This is 4 bytes smaller than the original file, and the last two bytes
of each 16-byte record are gone. Bingo!

Now this example is really a very small and contrived sample of what you
can do. The script lacks serious error-checking too, and things may get
slightly more involved if you have variable record sizes. But the
general idea is that it *is* possible to hack together something that
loads and processes binary data. As long as you know the on-disk format
of the records you are reading, anything goes.
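
For the original problem (dropping the first and the last record of a
multi-GB file), a rough sketch along the same lines might look like the
code below. It assumes fixed-size records whose length you already know,
and it is written for Python 3. The function name
drop_first_and_last_record and its chunk_size parameter are made up for
this illustration. The file is copied in chunks, so it never has to fit
in memory.

```python
import os


def drop_first_and_last_record(src, dst, record_size, chunk_size=1 << 20):
    """Copy src to dst, leaving out the first and the last record.

    Assumes every record is exactly record_size bytes long
    (a hypothetical layout; adjust to the real on-disk format).
    """
    total = os.path.getsize(src)
    if record_size <= 0 or total % record_size != 0:
        raise ValueError("file size is not a multiple of the record size")
    if total < 2 * record_size:
        raise ValueError("file holds fewer than two records")

    # Number of bytes between the first and the last record.
    remaining = total - 2 * record_size

    with open(src, "rb") as ifp, open(dst, "wb") as ofp:
        ifp.seek(record_size)            # skip the first record
        while remaining > 0:
            chunk = ifp.read(min(chunk_size, remaining))
            if not chunk:
                raise IOError("unexpected end of file")
            ofp.write(chunk)
            remaining -= len(chunk)      # stop before the last record
```

Calling drop_first_and_last_record('bigfile', 'trimmed', 16) would write
everything except the first and last 16-byte record of 'bigfile' to
'trimmed'. Writing to a separate output file, rather than editing in
place, keeps the original intact if something goes wrong halfway.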