From: Hiroki Sato
Date: Wed, 04 Jul 2018 10:42:52 +0900 (JST)
To: jilles@stack.nl
Cc: daichigoto@icloud.com, lists@eitanadler.com, daichi@freebsd.org, gnn@FreeBSD.org, cem@freebsd.org, src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r335836 - head/usr.bin/top
Message-Id: <20180704.104252.1616889858955681927.hrs@allbsd.org>
In-Reply-To: <20180703211002.GA11832@stack.nl>

Jilles Tjoelker wrote
  in <20180703211002.GA11832@stack.nl>:

ji> > 3. Print the multibyte characters by using strvisx(3) family, which
ji> > supports multibyte character, or swprintf(3) family if you want to
ji> > format wide characters directly. Note that buffer length for
ji> > strvisx(3) must be calculated by using MB_LEN_MAX.
ji>
ji> In this case, calling setlocale() and then using strvisx() seems the
ji> right solution. If locales differ across processes this may result in
ji> mojibake but that cannot really be helped. Even analyzing other
ji> processes' locale variables is not fully reliable, since strings may be
ji> incorrectly encoded even in the process's real locale, environment
ji> variables cannot be read across users and the environment block may be
ji> overwritten by a program.
ji>
ji> In general, although using conversion to wide characters allows users a
ji> lot of flexibility, I don't think it is the best in all situations:
ji>
ji> * The result of mbstowcs() is a UTF-32 string which consumes a lot of
ji> memory. A loop with mbrtowc() may also be slow. Many operations can be
ji> done directly on UTF-8 strings with no or little additional complexity
ji> compared to byte strings.
ji>
ji> * If there is an invalid multibyte character, there is little
ji> flexibility to handle this usefully and securely, since so little is
ji> known about the encoding. The best handling may depend on the context.
ji>
ji> Therefore, in /bin/sh, I have only implemented multibyte support for
ji> UTF-8. All other encodings have bytes treated as characters.
ji>
ji> However, I do agree that getenv("LANG") is bad. Instead, setlocale()
ji> should be used. After that, nl_langinfo(CODESET) can be called and the
ji> result compared to "UTF-8".

Yes, I agree that mb->wc conversion is not always the best choice, and
that using strvisx() for cmdbuf, not only for argv, is enough in this
case. I thought it was difficult to avoid iswprint() because I was not
sure of the goal of r335836, and it looked to me as if it aimed to keep
the original printable() function. And, as you mentioned, it may not be
worth trying to correctly detect or support locales in different
processes, either.
Probably one of the simplest ways would be to rely on
LC_CTYPE+strvisx() and to document how top(1) handles multibyte
characters in the manual page.

-- Hiroki
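
For reference, the codeset check Jilles describes (setlocale(), then nl_langinfo(CODESET) compared to "UTF-8") and his "UTF-8 only, otherwise bytes are characters" approach could be sketched roughly as below. This is only an illustration under stated assumptions: codeset_is_utf8() and count_chars() are hypothetical names, not functions from top(1) or sh(1), and the UTF-8 branch does not validate its input.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true when the codeset of the current LC_CTYPE
 * locale is UTF-8.  The caller is expected to have called
 * setlocale(LC_CTYPE, "") beforehand instead of reading getenv("LANG").
 */
bool
codeset_is_utf8(void)
{
	const char *cs = nl_langinfo(CODESET);

	/* POSIX specifies an empty string, not NULL, for unknown items. */
	assert(cs != NULL);
	return (strcmp(cs, "UTF-8") == 0);
}

/*
 * Hypothetical helper: count the characters in s without converting to
 * wide characters.  In UTF-8 mode every byte that is not a continuation
 * byte (bit pattern 10xxxxxx) starts a character; in any other mode each
 * byte is treated as one character.  Malformed UTF-8 is not detected.
 */
size_t
count_chars(const char *s, bool utf8)
{
	size_t n = 0;

	if (!utf8)
		return (strlen(s));
	for (; *s != '\0'; s++)
		if (((unsigned char)*s & 0xC0) != 0x80)
			n++;
	return (n);
}
```

A caller would run setlocale(LC_CTYPE, "") once at startup and then use count_chars(s, codeset_is_utf8()). Working on the bytes directly avoids the memory cost of mbstowcs() mentioned above, at the price of supporting only one multibyte encoding.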