Date:      Thu, 29 Oct 2015 11:31:33 +0100
From:      Matthias Apitz <guru@unixarea.de>
To:        freebsd-questions@freebsd.org
Subject:   tr(1) and LANG=de_DE.UTF-8
Message-ID:  <20151029103133.GA16882@sh4-5.1blu.de>


Hello,

I was wondering why I could not patch a byte \357 in a file with tr(1):

[guru@kant-r269739 ~]$ od -c /tmp/x
0000000    n   o   n       U   T   F   -   8  \n   n   o   n       U   T
0000020    F   -   8  \n   v   a   l   i   d       U   T   F   -   8  \n
0000040    H   e   l   l   o           W   o   r   l   d   !  \n   v   a
0000060    l   i   d       U   T   F   -   8  \n   H   e   l   l   o    
0000100  357 277 277       W   o   r   l   d   !  \n                    
0000113
[guru@kant-r269739 ~]$ LANG=de_DE.UTF-8 tr '\357' '\000' < /tmp/x  | od -c
0000000    n   o   n       U   T   F   -   8  \n   n   o   n       U   T
0000020    F   -   8  \n   v   a   l   i   d       U   T   F   -   8  \n
0000040    H   e   l   l   o           W   o   r   l   d   !  \n   v   a
0000060    l   i   d       U   T   F   -   8  \n   H   e   l   l   o    
0000100  357 277 277       W   o   r   l   d   !  \n                    
0000113

until I changed the LANG to C:

[guru@kant-r269739 ~]$ LANG=C tr '\357' '\000' < /tmp/x  | od -c
0000000    n   o   n       U   T   F   -   8  \n   n   o   n       U   T
0000020    F   -   8  \n   v   a   l   i   d       U   T   F   -   8  \n
0000040    H   e   l   l   o           W   o   r   l   d   !  \n   v   a
0000060    l   i   d       U   T   F   -   8  \n   H   e   l   l   o    
0000100   \0 277 277       W   o   r   l   d   !  \n                    
0000113

I know that the tr(1) man page contains a hint about LANG and
environment(7), but I would not expect this to mean that I cannot change
a single byte, given as an octal value, merely because the byte \357 is
not, by itself, a valid UTF-8 sequence.
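
For reference, the workaround above can be confined to the one command.
This is only a minimal sketch: the function name patch_byte is made up
for illustration, and LC_ALL is used instead of LANG on the assumption
that LC_ALL or LC_CTYPE might already be set in the environment, in
which case LANG=C alone would not take effect.

```shell
# Hypothetical helper: replace every byte 0357 on stdin with NUL.
# LC_ALL=C overrides LANG and all LC_* variables, for this tr(1) call only,
# so tr works on single bytes instead of multibyte characters.
patch_byte() {
    LC_ALL=C tr '\357' '\000'
}

# Sample input: the bytes 357 277 277 as seen in /tmp/x, between two words.
printf 'Hello \357\277\277 World!\n' | patch_byte | od -c
```

The locale of the calling shell is left unchanged, so other commands in
the same session still run under de_DE.UTF-8.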

Any ideas/comments on this?
Thanks

	matthias

-- 
Matthias Apitz               |  /"\   ASCII Ribbon Campaign:
E-mail: guru@unixarea.de     |  \ /   - No HTML/RTF in E-mail
WWW: http://www.unixarea.de/ |   X    - No proprietary attachments
phone: +49-176-38902045      |  / \   - Respect for open standards
                             | en.wikipedia.org/wiki/ASCII_Ribbon_Campaign


