Date: Tue, 27 May 2008 08:29:03 +0200
From: Karel Miklav
To: Oliver Fromme
Cc: delphij@freebsd.org, chinsan, freebsd-questions@FreeBSD.ORG
Subject: Re: Sed, shell and hexadecimal character codes

Oliver Fromme wrote:
> Karel Miklav wrote:
> > There's a tip in the FreeBSD fortunes database that says:
> >
> > > Want to strip UTF-8 BOM (Byte Order Mark) from given files?
> > >
> > > sed -e '1s/^\xef\xbb\xbf//' < bomfile > newfile
>
> FreeBSD's sed(1) doesn't support hexadecimal or octal
> sequences. I think even GNU sed doesn't support it, but
> you might try it yourself (/usr/ports/textproc/gsed).
>
> I don't know why that fortunes entry exists. It's wrong.

That's what I thought. Maybe we should replace the recipe with the
awk version Oliver proposed below?

> > I can't make it work, and I can't find any other method to
> > work with hexa codes in scripts or on the command line so
> > I'm kind-a depressed :) I help myself with xxd now, but if
> > it is possible to avoid it, I'd like to hear about it.
>
> There is no standard for handling octal and hexadecimal
> sequences, unfortunately, so you have to consult the
> manual page to find out. For example, tr(1) supports
> octal sequences only (no hexadecimal), while awk(1)
> supports both. So the above line could be rewritten
> with awk:
>
> awk '{if(NR==1)sub(/^\xef\xbb\xbf/, "");print}' < bomfile > newfile
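
Since escape handling differs from tool to tool, one possible workaround is to let the shell produce the three BOM bytes with printf(1) octal escapes and hand them to sed as literal bytes. This is only a sketch: the function name strip_bom is made up for illustration, and it assumes a POSIX sh and a sed that passes bytes through untouched in the C locale.

```shell
# Hypothetical helper (name is an assumption, not from the thread):
# strips a UTF-8 BOM from the first line of stdin, writes to stdout.
strip_bom() {
    # printf(1) octal escapes: \357\273\277 are the bytes EF BB BF.
    # POSIX sh has no "local", so this variable is global.
    bom=$(printf '\357\273\277')
    # The C locale makes sed treat the input as single bytes; the
    # pattern holds the literal bytes, so sed needs no \x support.
    LC_ALL=C sed -e "1s/^${bom}//"
}
```

It would be used the same way as the recipes above, e.g. strip_bom < bomfile > newfile. Only line 1 is touched, so the same byte sequence appearing later in the file is left alone.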