From: Lasse Collin
To: "Ion-Mihai Tetcu"
Cc: ports@freebsd.org, Christian Weisgerber, Matthias Andree, portmgr@freebsd.org
Date: Mon, 21 Jun 2010 00:04:14 +0300
Subject: Re: FreeBSD ports USE_XZ critical issue on low-RAM computers

On 2010-06-20 Ion-Mihai Tetcu wrote:
> Personally I'd suggest keeping the option to limit the memory, but as
> an option, not as default.

OK.

> One thing I would really love to see going away is the default to
> delete the archive on decompression.

Being somewhat compatible with gzip and bzip2 command line syntax is useful, so even though I don't disagree with you, the default is and will be to delete the input file.

> Generally, I think programs should support both, the later overriding
> the first: .conf -> env -> command line

It means that I will need to create a config file on all my computers that have 512 MiB RAM or less to get the behavior I want. Probably other users with older computers have to do that too to avoid insanely slow compression and an unresponsive system when some script runs "xz -9". While I would prefer no need for a config file, people like me seem to be in a minority, and creating a config file isn't that big a deal.

Using a second environment variable would be quite similar. Only the place where the setting is put would differ. A config file could allow more flexibility though, e.g. it could be possible to even override the preset levels with user-defined custom values (at his or her own risk, of course).

> At the moment, what are the plans and the advantages of multithreading
> (both on compression and decompression)?

The "only" advantage is that threading makes things faster when there are multiple CPU cores to use. Disadvantages of threading:

  - Compression ratio might be worse. It depends on how the threading is done. Different ways have their own pros and cons.

  - Memory usage may be a lot higher for both compression and decompression.

The plan is to get some type of threaded compression support into liblzma after the 5.0.0 release. Considering my free time etc. I don't promise any kind of development schedule. The API will be done so that applications won't need to think about the details of threading too much, and can use the zlib-style loop like they do in single-threaded mode.
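
For readers unfamiliar with the "zlib-style loop" mentioned above, here is a minimal sketch of its shape. It uses Python's standard lzma module (a binding to liblzma) in ordinary single-threaded mode; it is not the planned threaded API, and the helper name compress_stream is made up for illustration.

```python
import lzma

def compress_stream(chunks, preset=6):
    """Feed input piece by piece and collect whatever output is ready."""
    comp = lzma.LZMACompressor(format=lzma.FORMAT_XZ, preset=preset)
    out = []
    for chunk in chunks:
        # compress() may return b"" while the encoder is still buffering.
        out.append(comp.compress(chunk))
    # flush() finishes the stream and returns the remaining output.
    out.append(comp.flush())
    return b"".join(out)

data = [b"hello " * 1000, b"world " * 1000]
packed = compress_stream(data)
```

The caller only pushes input and pulls output; how the library schedules the work internally stays hidden, which is the property the message wants to keep once threading is added.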

> > Next question could be how to determine how many threads could be
> > OK for multithreaded decompression. It doesn't "fully" parallelize
> > either, and would be possible only in certain situations. There
> > too the memory usage grows quickly when threads are added. To me,
> > a memory usage limit together with a limit on number of threads
> > looks good; with no limits, the decompressor could end up reading
> > the whole file into RAM (and swap). Threaded decompression isn't
> > so important though, so I'm not even sure if I will ever implement
> > it.
>
> I'd say offer an option if you want.

Sorry, I explained this poorly. A simple "number of threads = something" setting is not good for threaded decompression. In a generic situation you don't know beforehand how much RAM each decompressor thread would use.

If threaded decompression is implemented, maybe the default should be one thread just to keep things simple. But there should be an option to use the optimal number of threads so that the user doesn't need to worry about details too much. My idea for that would be to have a user-specified maximum number of threads and a memory usage limit. Then xz would use up to the allowed number of threads as long as the memory usage limit is not exceeded. Without a memory usage limit, memory usage could grow to insane amounts if there are very many cores.

It's somewhat similar for threaded compression, except that the amount of memory needed per thread at the given compression level is known before the compression is started. An option to easily tell xz to use the optimal number of threads would be useful e.g. in scripts, which may be used on different computers, and thus don't want to be bothered to figure out how many CPU cores there are. I think a thread limit combined with a memory usage limit is reasonable here too.
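
The thread-count rule described above can be sketched for the compression case, where the per-thread memory is known up front. This is only an illustration of the idea, not xz code: the function name pick_threads and the example figures are assumptions, and the fallback to one thread follows the single-threaded fallback the message describes for a limit that is too low.

```python
MiB = 1 << 20

def pick_threads(max_threads, mem_limit, mem_per_thread):
    """Largest thread count <= max_threads whose total memory fits in mem_limit.

    Falls back to a single thread when even one thread exceeds the limit.
    """
    if max_threads < 1 or mem_per_thread < 1:
        raise ValueError("max_threads and mem_per_thread must be positive")
    fitting = mem_limit // mem_per_thread  # threads that fit under the limit
    return max(1, min(max_threads, fitting))

# Hypothetical machine: 8 cores, 512 MiB limit, 100 MiB per encoder thread.
threads = pick_threads(8, 512 * MiB, 100 * MiB)
```

With these made-up numbers the memory limit, not the core count, decides the result. For decompression the same rule would need a per-thread figure that, as noted above, is not generally known beforehand.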

For the above use, there should be default values for the thread and memory limits, so that a config file or many command line options wouldn't be strictly required to get some threading with the "use optimal number of threads" setting. Number of CPU cores and some percentage of RAM could work. Users could set better values themselves, but defaults are still a nice starting point and may be enough for many.

Note that if I remove the current default memory usage limit from xz, the default memory usage limit used to calculate the optimal number of threads wouldn't be used for anything else; if the limit is too low, xz would just drop to single-threaded mode to use a minimal amount of RAM.

> We've pondered a bit about switching our packages from .tbz to .xz or
> tar.xz. Given that a package is made once, and downloaded and
> decompressed by a lot of users a lot of times, it would probably make
> sense to go for the smallest possible size;

I had the same reasoning when I got interested in LZMA in 2004. LZMA was also much faster to decompress than bzip2.

Slackware uses the .txz suffix for .tar.xz packages, so if you prefer a single three-letter suffix instead of .tar.xz, .txz is the way to go.

> however, if this would mean that some users won't be able to
> decompress the packages, then probably xz isn't the tool for us.

Decoder memory usage is all about the dictionary size. With a 2 MiB dictionary you can make most packages smaller with xz than with "bzip2 -9" while keeping the decoder memory usage (3 MiB) _lower_ than that of bzip2 (man page says 3700k without using the slower --small mode).

I would recommend using an 8 MiB dictionary for packages. That way 9 MiB of memory is needed to decompress. That's what I used for packages years ago, and it's also the default in xz ("xz -6"). A dictionary bigger than 8 MiB is not useful unless the uncompressed file is over 8 MiB.
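
The link between dictionary size and decoder memory can be tried out with Python's standard lzma module, which exposes liblzma's dictionary size option and decoder memory limit. This is a rough sketch: the helper names compress_with_dict and fits_in and the sample payload are made up for illustration, and the limits are chosen far from the 9 MiB figure above so the outcome does not depend on exact accounting.

```python
import lzma

MiB = 1 << 20

def compress_with_dict(data, dict_size):
    """Compress to .xz using preset 6 with an explicit dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "preset": 6, "dict_size": dict_size}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)

def fits_in(packed, memlimit):
    """True if the data decompresses within memlimit bytes of decoder memory."""
    try:
        lzma.LZMADecompressor(memlimit=memlimit).decompress(packed)
        return True
    except lzma.LZMAError:
        # Raised when the decoder would need more memory than allowed.
        return False

packed = compress_with_dict(b"example payload " * 4096, 8 * MiB)
roomy = fits_in(packed, 64 * MiB)  # well above the dictionary size
tight = fits_in(packed, 1 * MiB)   # well below the dictionary size
```

The decoder sizes its buffer from the dictionary size recorded in the file, so even a small payload compressed with an 8 MiB dictionary is refused under a 1 MiB limit.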

Using "xz -6e" might reduce the size a little more with some files, but it's not necessarily worth the extra CPU time.

Compressing with "xz -6" needs about 100 MiB of memory. It is much more than with "bzip2 -9" (man page says 7600k), but should be fine on the systems that create the packages.

Using "xz -9" for binary packages would be a bad choice. It doesn't save that much space over "xz -6" and can seriously annoy users of older computers. In contrast, decompressing files created with "xz -6" works nicely on a 100 MHz Pentium with 32 MiB RAM (16 MiB should be quite OK too). I will need to emphasize much more in the xz docs and possibly also in "xz --help" that using -9 really isn't usually what people want.

There are also additional filters that might help. Enabling them requires using advanced options. You can try e.g. "xz --x86 --lzma2" when compressing data that includes a significant amount of x86-32 or x86-64 code. That filter has a known problem that makes it perform poorly on static libraries (and Linux kernel modules), so applying it to all packages isn't necessarily a good idea.

In the future (I don't know when), there will be a better and easier-to-use filter that will use heuristics to detect when and what extra filtering should be useful.

> Speaking of sizes, do you have any statistical data regarding: source
> size, compression options, compression speed and decompression speed
> (and memory usage, since we're talking about it)?

No. It's good to note here that I haven't so far worked much on the actual compression algorithms. The critical parts are directly derived from Igor Pavlov's LZMA SDK (the code may look very different at first sight, but don't let that mislead you).

As I mentioned in an earlier email, I will tweak the compression settings mapped to the compression levels before the 5.0.0 release. To do that I will need to collect some data from many different compression settings.
It probably won't be high quality data, since I have limited time for experiments and I just need some rough guidelines to tweak the options.

Here are a few known things:

  - Decompression speed is roughly a constant number of bytes per second of _compressed_ data on the same machine. The better the compression has been, the faster the decompression tends to be. However, if the data doesn't fit in RAM and the system needs to swap out parts of the xz process, old floppy disks start to become competitive, because the memory is accessed quite randomly.

  - The dictionary keeps the most recently processed uncompressed data in a ring buffer. Using a dictionary bigger than the uncompressed file is useless.

  - Compressor memory usage is roughly 5-12 times the dictionary size. It depends on the match finder (see mf under --lzma2 on the man page). "xz -vv" shows the encoder memory usage. I might make a single -v show that info in the future along with the decoder memory usage.

  - Decompressor memory usage is a little more than the dictionary size. The currently supported extra filters don't use a significant amount of memory.

-- 
Lasse Collin | IRC: Larhzu @ IRCnet & Freenode