Date:      Tue, 22 Jan 2019 04:30:00 -0800
From:      <soralx@cydem.org>
To:        <tijl@FreeBSD.org>, <rigoletto@FreeBSD.org>
Cc:        <svn-ports-all@FreeBSD.org>
Subject:   Re: svn commit: r490800 - in head/net-p2p: transmission-cli[...]
Message-ID:  <20190122043000.3fcc2340@mscad14>

Tijl,

>> New Revision: 490800
>> URL: https://svnweb.freebsd.org/changeset/ports/490800
>> 
>> Log:
>>   net-p2p/transmission-cli: change transmission's data size unit
>>   conversion factors from 1000 to 1024, to match FreeBSD's blocksize.

> What blocksize?  The disk block sizes are determined by the hardware so

# env | grep BLOCKSIZE
BLOCKSIZE=K
# uname -r
11.2-STABLE

The 'K' stands for "kilo[bytes]", where 1K = 1024 [bytes] on FreeBSD.
This is the default on 9, 10, 11, 12, and 13-CURRENT (I forget when
BLOCKSIZE was changed from 512 to 1024... it was a lo-o-ong time ago).
Most basic tools like `ls` and `df` care about BLOCKSIZE.

> FreeBSD isn't any different from Linux.  Why should the application
> behave differently on FreeBSD than it does on Linux?

Um... Because FreeBSD is not a Linux distribution? It is an operating
system with its own style & philosophy -- and I think it would do the
user experience much good to put some effort into keeping programs from
ports & the base OS consistent. FreeBSD is all about consistency, no?

Pardon the bitterness, but I honestly fail to understand the argument
"that's how it works on Linux, and they're fine -- so we should do it
like that too!".

>> [...]
>> + #define MEM_G_STR "GiB"
>> + #define MEM_T_STR "TiB"

> Since they use GiB and TiB here...

>> +-#define DISK_K 1000
>> ++#define DISK_K 1024
>> + #define DISK_B_STR   "B"
>> + #define DISK_K_STR "kB"
>> + #define DISK_M_STR "MB"
>> + #define DISK_G_STR "GB"
>> + #define DISK_T_STR "TB"

> ...you should use KiB, MiB, GiB and TiB here...

Good point: I agree that measurement units should be consistent.
I wanted to keep changes to the minimum (plus I do not care so much
about displayed units personally, as long as the numbers are correct),
so that's why I didn't bother changing these. But the patch certainly
can be improved, yes.

There are no prefixes "ki", "Mi", "Gi", etc. in FreeBSD, so I think
we should simply change the MEM_* units to be similar to DISK_* and
SPEED_* ("kB", "MB", "GB"...). Honestly, how many times have you heard
anyone say: "There are 16 gibibytes of RAM in that machine!"?

>> +-#define SPEED_K 1000
>> ++#define SPEED_K 1024
>> + #define SPEED_B_STR  "B/s"
>> + #define SPEED_K_STR "kB/s"
>> + #define SPEED_M_STR "MB/s"

> ...and here as well (although using 1024 for bandwidth is weird).

Notice that bandwidth is displayed in bytes/s, not bits/s. Applying
SI prefixes to bps is desirable, but measuring throughput in 1000's
of BYTES/s would be weird and unexpected.

> But really I think you should revert this change.

I am a long-time user of transmission, and I've been patching it
locally ever since they changed the scaling factor; I've got so
many machines that patching locally wastes too much time, so I
did my duty to improve the port, and rolled a patch & submitted
it. Now you are telling me I've wasted my time, because I alone
would want such a change?

How about this: we patch the port to adapt T. better to FreeBSD, and
enable the fix by default? Then, if there are unhappy users who
care enough to complain and/or send a patch, the option can be
turned off by default.

> If the submitter wants
> this he can discuss that with the transmission developers.

The point is to adapt transmission to work on FreeBSD, not to convince
the program's developers to change their ways. Isn't the advantage
of ports the ability to customize?

Below I include a message previously sent to Alexandre that goes
into more detail about the issue.

===================================8<===================================
> I need to wait my mentor to approve it (or not) but if the patch get 
> approved I will just merge the patch itself making it default without
> the UNITS OPTION.  

I thought this was obvious, but perhaps I should explain.

In FreeBSD, we don't use SI prefixes [which apply to physical measures]
for scaling digital data [which is not physical, thus has nothing to do
with SI]; rather, we use traditional binary units. So, we have a base
unit of bytes, and derived units of kilobytes, megabytes, etc. -- and
to convert between them, a factor of 1024 is used (or 512, in case of
sectors).

Transmission, on the other hand, incorrectly applies prefixes that carry
a scale factor of 1000 (perhaps to match the behavior of the OSes it was
written for?) to the base unit "byte", which produces an error of +2.4%;
notably, this error multiplies when applying the conversion multiple
times -- so when you're dealing with gigabytes of data, for instance,
the error becomes 7.4%.
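
The compounding described above can be checked with a few lines of
Python. This is only a sketch of the arithmetic; PREFIXES and
prefix_error are illustrative names and do not come from transmission.

```python
# Relative error from reading a decimal-prefixed figure as a binary one.
# PREFIXES and prefix_error are illustrative names, not from transmission.
PREFIXES = ["k", "M", "G", "T"]


def prefix_error(power):
    """Return 1024**power / 1000**power - 1, as a fraction."""
    return (1024 ** power) / (1000 ** power) - 1


# One line per prefix: the error grows with every extra factor of 1024/1000.
for power, prefix in enumerate(PREFIXES, start=1):
    print(f"{prefix}: {prefix_error(power) * 100:.1f}%")
```

The "k" line corresponds to the 2.4% figure and the "G" line to the
7.4% figure.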

As a practical example, if I set a bandwidth limit of 512 "KB"/s in
unpatched transmission, the actual limit will be 500 KB/s -- not a big
difference, but technically not what I've asked for. If I have a torrent
that tr.-cli tells me is 40GB, `ls -alfh` will show that its actual
size is 37GB -- already a 3GB difference! And if, let us say, I've got
222 torrents summing to 3.03TB total according to t-cli, then their
actual size is 2.75TB, for a 0.28TB difference! Huge! 3TB of data will
not fit on a 3 "TB" disk, while 2.7TB might; it's a qualitative
difference.
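
For anyone who wants to reproduce these figures, here is a minimal
Python sketch. It assumes the tool reports sizes with a factor of 1000
and the OS with a factor of 1024; to_binary_units is a hypothetical
helper, not anything from transmission.

```python
# Convert sizes reported with a factor of 1000 into units with a factor of 1024.
# to_binary_units is a hypothetical helper, not part of transmission.


def to_binary_units(value, power):
    """value is in units of 1000**power bytes; return it in 1024**power units."""
    return value * 1000 ** power / 1024 ** power


limit = to_binary_units(512, 1)    # speed limit entered as 512 "KB"/s
torrent = to_binary_units(40, 3)   # torrent reported as 40 GB
total = to_binary_units(3.03, 4)   # 222 torrents reported as 3.03 TB

print(f"{limit:.0f} {torrent:.2f} {total:.2f}")
```

The three numbers printed are the speed limit, the single torrent and
the 222-torrent total, each expressed in 1024-based units.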

IMO, using SI for digital data is misguided, as, again, data is not
physical nor analog (i.e., you would not re-use the unit "byte" when
scaling below 1; for ex., there is no such thing as a microbyte), so
I think that we should not let transmission be trendy and fashionable
on FreeBSD, but instead fix it to match the way the OS calculates data
sizes.

[Note that a byte is itself a derived unit, being made of 8 bits most
commonly, but as far as the OS is concerned, it _is_ a base unit of
data storage. So while it makes a lot of sense to apply SI prefixes
to bits, the same cannot be said about bytes; the difference is that
bytes measure quantity of data always, while bits can measure amount
of information.]

Thus, I believe the UNITS option should not only be included, but
also made default, to make transmission more compatible with *BSD.
===================================8<===================================

P.S.:
 I live in North America, but I use SI units for physical measures.
 Why? is SI inherently better? Who cares; SI is _international_, and
 using SI reduces *ambiguity*. That's what matters. So, are you still
 in support of introducing multiple units for measuring data in bytes?

 Well, which one to settle on? Is *BSD currently wrong? Consider this.
 What's easier for a binary computer (and a human dealing with binary
 numbers) to calculate: divide by 1024 or divide by 1000? How often
 do you view a hexdump of a binary file with decimal instead of hex
 formatting?

-- 
[SorAlx]  ridin' VN2000 Classic LT


