From: Matthias Apitz
To: freebsd-questions@freebsd.org
Date: Sat, 17 Oct 2020 21:18:50 +0200
Subject: printf(1) and UTF-8 multi-byte chars
Message-ID: <20201017191850.GA11909@c720-r342378>
X-Operating-System: FreeBSD 13.0-CURRENT r342378 (amd64)

If you look at these two examples:

$ printf "[%-10s]\n" "xxxx?xxx"
[xxxx?xxx  ]
$ printf "[%-10s]\n" "xxxx¿xxx"
[xxxx¿xxx ]

you see that the first uses two blanks to fill the '%-10s' pattern, while
the second uses only one. The problem/bug surely has to do with '¿' being
a multi-byte UTF-8 char:

$ echo '¿' | od -tx1
0000000 c2 bf 0a

i.e. "xxxx¿xxx" is 8 chars but 9 bytes, so only one blank is added to reach
the width of 10, while "xxxx?xxx" is 8 chars and 8 bytes and gets two blanks.
This means printf(1) counts the field width in bytes, not in characters.

Is there a way to print it like this:

[xxxx¿xxx  ]
[xxxx?xxx  ]

Thanks

	matthias

-- 
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub
Без книги нет знания, без знания нет коммунизма (Влaдимир Ильич Ленин)
Without books no knowledge - without knowledge no communism (Vladimir Ilyich Lenin)
Sin libros no hay saber - sin saber no hay comunismo. (Vladimir Ilich Lenin)
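
One possible workaround, sketched here under a few assumptions, is to do the padding by hand instead of relying on the '%-10s' width. The helper names utf8_len and pad_left are made up for this illustration and are not part of printf(1) or any standard tool. The sketch assumes the input is valid UTF-8 and that every character occupies one terminal column, so combining or double-width characters would still misalign. It counts characters by deleting UTF-8 continuation bytes (0x80-0xBF) in the C locale, so the result should not depend on the current LC_CTYPE setting.

```shell
#!/bin/sh
# utf8_len STRING (hypothetical helper): print the number of UTF-8
# characters (code points) in STRING. Continuation bytes, octal 200-277,
# are deleted; each remaining byte starts exactly one character.
utf8_len() {
    printf '%s' "$1" | LC_ALL=C tr -d '\200-\277' | wc -c | tr -d ' '
}

# pad_left WIDTH STRING (hypothetical helper): print STRING followed by
# enough blanks to fill WIDTH characters, like '%-WIDTHs' but counting
# characters instead of bytes. Strings longer than WIDTH are printed
# untruncated. POSIX sh has no 'local', so these variables are global.
pad_left() {
    width=$1
    str=$2
    pad=$(( width - $(utf8_len "$str") ))
    printf '%s' "$str"
    while [ "$pad" -gt 0 ]; do
        printf ' '
        pad=$(( pad - 1 ))
    done
}

printf '[%s]\n' "$(pad_left 10 'xxxx?xxx')"
printf '[%s]\n' "$(pad_left 10 'xxxx¿xxx')"
```

With these helpers both lines should come out with two blanks before the closing bracket, because the padding is computed from the character count (8 in both cases) rather than the byte count.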