Date:      Mon, 12 Mar 2001 02:45:25 -0500 (EST)
From:      Trevor Johnson <trevor@jpj.net>
To:        Kris Kennaway <kris@obsecurity.org>
Cc:        <ports@FreeBSD.ORG>, Alistair Crooks <agc@pkgsrc.org>
Subject:   Re: new message digest support in pkgsrc (fwd)
Message-ID:  <20010310215713.Q23492-100000@blues.jpj.net>
In-Reply-To: <20010310180103.A28745@mollari.cthul.hu>

> We have two utilities in the base system which calculate
> MD5/SHA1/RIPEMD160 hashes (md5 and openssl). Actually, looks like md5
> only does md5, I thought it did the others too -- what is true is that
> we have two libraries which handle it -- libmd and libcrypto (and
> adding code to md5(1) would be trivial).

You're right.  If no one has any objection, I'll delete the digest port.
As for overloading the md5 utility, it seems counter-intuitive to run a
command called "md5" and get some other message digest from it.  The
OpenBSD folks have taken a similar approach.  They have one binary which
is hard-linked under different names:

	$ ls -li `which  md5` `which rmd160` `which sha1`
	11545 -r-xr-xr-x  3 root  wheel  69632 Nov  6 09:10 /bin/md5
	11545 -r-xr-xr-x  3 root  wheel  69632 Nov  6 09:10 /bin/rmd160
	11545 -r-xr-xr-x  3 root  wheel  69632 Nov  6 09:10 /bin/sha1

(Someone told me that using argv[0] that way is bad practice, perhaps
because the command won't work properly if it is renamed.)  They also have
openssl, under /usr/ and dynamically linked (like ours):

        $ ldd `which md5` `which openssl`
        ldd: /bin/md5: not a dynamic executable
        /usr/sbin/openssl:
                -lssl.2 => /usr/lib/libssl.so.2.4 (0x4004f000)
                -lcrypto.2 => /usr/lib/libcrypto.so.2.4 (0x4007e000)
                -lc.25 => /usr/lib/libc.so.25.2 (0x40127000)

Maybe they're paranoid about stuff that could be on a shared /usr/.
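
The hard-link arrangement can be illustrated with a minimal sketch.  This
is not OpenBSD's code: the function names and the name-to-algorithm table
are made up for illustration, and RIPEMD-160 support in Python's hashlib
depends on the underlying OpenSSL build.  It also shows why a renamed
binary stops working: the invocation name is the only selector.

```python
import hashlib
import os

# Hypothetical table: invocation name -> hashlib algorithm name.
ALGORITHMS = {"md5": "md5", "sha1": "sha1", "rmd160": "ripemd160"}

def algorithm_for(argv0):
    """Pick the digest from the name the program was invoked under."""
    name = os.path.basename(argv0)
    if name not in ALGORITHMS:
        # A renamed or oddly linked binary ends up here.
        raise ValueError(f"{name}: unknown invocation name")
    return ALGORITHMS[name]

def digest(argv0, data):
    """Hash data with the algorithm implied by argv[0].

    hashlib.new() raises ValueError if the local OpenSSL build lacks
    the algorithm (this can happen for ripemd160).
    """
    return hashlib.new(algorithm_for(argv0), data).hexdigest()

if __name__ == "__main__":
    # Simulate the three hard links instead of reading sys.argv[0].
    for path in ("/bin/md5", "/bin/sha1"):
        print(path, digest(path, b"hello"))
```

A real utility would pass sys.argv[0] to algorithm_for() and read the file
named on the command line.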

> I question the motivation for the NetBSD change.  There are some
> theoretical weaknesses in MD5, but they aren't known to impact
> real-world uses.

At
http://www.acm.org/pubs/citations/proceedings/commsec/191177/p210-van_oorschot/
I found the abstract of a 1994 paper, which says:

   [...] a $10 million custom machine for applying parallel collision
   search to the MD5 hash function could complete an attack with
   an expected run time of 24 days.

I haven't read the whole paper, but I conjecture that this parallel method
could work, less efficiently, on an array of compromised, general-purpose
microcomputers connected through the Internet.  A black-hat cracking
effort similar to this is described at
http://distributed.net/trojans.html.en .

An article in CryptoBytes, "the technical newsletter of RSA Laboratories,"
published in 1996, says:

        The presented attack does not yet threaten practical applications
        of MD5, but it comes rather close.  In view of the flexibility of
        the new analytic techniques it would be unwise to assume that the
        attack could not be improved.  Ron Rivest [16] commented on the
        status of MD4, after two-round attacks had been found, that it is
        "at the edge" in terms of risking successful cryptanalytic attack.
        Today this assessment characterizes the status of MD5.

        Therefore we suggest that in the future MD5 should no longer be
        implemented in applications like signature schemes, where a
        collision-resistant hash function is required.  According to our
        present knowledge, the best recommendations for alternatives to
        MD5 are SHA-1 and RIPEMD-160.

The newsletter is at
ftp://ftp.rsasecurity.com/pub/cryptobytes/crypto2n1.pdf .

RFC 1828, written in 1995, says (brackets are in original):

   At the time of writing of this document, it is known to be possible
   to produce collisions in the compression function of MD5 [dBB93].
   There is not yet a known method to exploit these collisions to attack
   MD5 in practice, but this fact is disturbing to some authors
   [Schneier94].

   It has also recently been determined [vOW94] that it is possible to
   build a machine for $10 Million that could find two chosen text
   variants with a common MD5 hash value.  However, it is unclear
   whether this attack is applicable to a keyed MD5 transform.

   This attack requires approximately 24 days.  The same form of attack
   is useful on any iterated n-bit hash function, and the time is
   entirely due to the 128-bit length of the MD5 hash.

   Although there is no substantial weakness for most IP security
   applications, it should be recognized that current technology is
   catching up to the 128-bit hash length used by MD5.  Applications
   requiring extremely high levels of security may wish to move in the
   near future to algorithms with longer hash lengths.
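
The remark about hash length can be made concrete with the generic
birthday bound: a collision search on an n-bit hash needs on the order of
2^(n/2) evaluations.  The sketch below is only arithmetic under that
assumption.  Scaling the quoted 24-day figure to 160 bits assumes the same
hypothetical machine and no better attack, which is a simplification.

```python
def collision_work(bits):
    """Approximate evaluations for a generic birthday collision search."""
    return 2 ** (bits // 2)

md5_work = collision_work(128)      # 2**64
sha1_work = collision_work(160)     # 2**80, same for RIPEMD-160
ratio = sha1_work // md5_work       # 2**16

# Naive extrapolation from the 24-day estimate for a 128-bit hash.
days_160 = 24 * ratio

print(f"128-bit hash: about 2**64 = {md5_work} evaluations")
print(f"160-bit hash: about 2**80 = {sha1_work} evaluations")
print(f"ratio: {ratio}")
print(f"same machine on 160 bits: about {days_160 / 365.25:.0f} years")
```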

I've heard that in a situation in which a hostile party can generate a
message with innocent contents, present it to a trusted party for signing,
then replace the message with one having hostile contents, the hostile
party can more easily arrange a hash collision than in a situation where
the innocent message is generated for innocent purposes.  Either scenario
can credibly happen in the ports collection.

The RIPEMD-160 home page at
http://www.esat.kuleuven.ac.be/~bosselae/ripemd160.html cites the same
articles.  It says that RIPEMD-160 was designed to replace RIPEMD-128
because of the ACM paper:

   RIPEMD-128 is a plug-in substitute for RIPEMD (or MD4 and MD5, for
   that matter) with a 128-bit result. In view of the result of Paul van
   Oorschot and Mike Wiener mentioned earlier, 128-bit hash results do
   not offer sufficient protection for the next ten years, and
   applications using 128-bit hash functions should consider upgrading to
   a 160-bit hash function.

It also says that some aspects of SHA-1 are kept secret by the U.S.
government (cue X-Files theme).

> I think switching to SHA1 for buzzword-compliance would be gratuitous.

Likewise, avoiding it purely because it has become a buzzword would be a
poor decision.

The SHA-1 algorithm is described at
http://www.rsasecurity.com/rsalabs/faq/3-6-5.html as "more secure" than
MD5 (MD5 is a trademark of RSA Security, for whom the algorithm was
developed).

> Even more ludicrous would be something like what OpenBSD does:
>
> MD5 (scanssh-1.4.tar.gz) = 843796cdb9361ed7e3d862a0e3a6ce16
> RMD160 (scanssh-1.4.tar.gz) = 8825be05348f1d5e8f53657a0de65f9b81320413
> SHA1 (scanssh-1.4.tar.gz) = 266d9de9a7965177b5d10ec0eed3de3e199ac237

At first glance, this looks crazy, but I see some advantages to it:  it is
"upward compatible" (so is the NetBSD way); users who find "make checksum"
too slow can still use just the MD5 hash (or one or two of the others); if
one of the hashes is shown to be weak, there's no need to panic because
another hash can be used, and it was already generated right after the
porter looked at the contents of the distfile; there was no need for a
flag day or to suddenly generate hundreds of new pkg/md5 files when the
change was made (just over two years ago).
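
To show how a file in the quoted OpenBSD format stays upward compatible, a
small parsing sketch follows.  The function names and the regular
expression are assumptions made for illustration, not the code in
bsd.port.mk.  A reader that only knows MD5 can ignore the other lines.

```python
import re

# One line per hash: LABEL (filename) = hexdigest
LINE = re.compile(r"^(\w+) \((.+)\) = ([0-9a-f]+)$")

def parse_distinfo(text):
    """Return {filename: {label: hexdigest}} for every recognised line."""
    result = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            label, filename, value = match.groups()
            result.setdefault(filename, {})[label] = value
    return result

def select(hashes, wanted):
    """Keep only the labels the user is willing to spend time checking."""
    return {label: value for label, value in hashes.items() if label in wanted}

SAMPLE = """\
MD5 (scanssh-1.4.tar.gz) = 843796cdb9361ed7e3d862a0e3a6ce16
RMD160 (scanssh-1.4.tar.gz) = 8825be05348f1d5e8f53657a0de65f9b81320413
SHA1 (scanssh-1.4.tar.gz) = 266d9de9a7965177b5d10ec0eed3de3e199ac237
"""

if __name__ == "__main__":
    entry = parse_distinfo(SAMPLE)["scanssh-1.4.tar.gz"]
    print(sorted(entry))
    print(select(entry, {"MD5"}))
```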

Two disadvantages are apparent.  One is that "make makesum" must run more
slowly.  A porter who feels inconvenienced by this could choose to provide
only the MD5 checksum, as before.  The other is that md5 (for us,
distinfo) files in CTM diffs or tarballs are bigger (on disk, most will
still take up one block on the usual filesystems).  CTM diffs and ports
tarballs on CD-ROMs are normally compressed, and with a compression
utility worth its salt, a 160-bit hash written as 40 hex characters should
only take up about 160 bits (20 bytes).  IMO these disadvantages are
trivial.
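
The size argument can be checked roughly with the sketch below.  It uses
zlib as a stand-in for whatever compressor is applied to CTM diffs or
tarballs (an assumption), and SHA-1 digests of small strings as
placeholders for real distfile hashes.  A hex digest stores 4 bits of
information per 8-bit character, so a good compressor should approach the
raw size, though it cannot go below it.

```python
import hashlib
import zlib

# Placeholder data: 1000 digests standing in for distfile hashes.
raw = [hashlib.sha1(str(i).encode()).digest() for i in range(1000)]
hex_blob = "\n".join(d.hex() for d in raw).encode()

raw_size = sum(len(d) for d in raw)            # 20 bytes per hash
hex_size = len(hex_blob)                       # 40 characters per hash, plus newlines
packed_size = len(zlib.compress(hex_blob, 9))  # compressed hex text

print("raw bytes:       ", raw_size)
print("hex text bytes:  ", hex_size)
print("compressed bytes:", packed_size)
```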

The change for OpenBSD can be viewed at
http://www.openbsd.org/cgi-bin/cvsweb/ports/infrastructure/mk/bsd.port.mk.diff?r1=1.74&r2=1.75
and for NetBSD, at
http://cvsweb.netbsd.org/bsdweb.cgi/pkgsrc/mk/bsd.pkg.mk.diff?r1=1.675&r2=1.676 .

Until Moore's Law is repealed, MD5 will only become less difficult to
crack.  Cryptographic experts have been recommending its replacement for
some purposes since at least 1995.  Better (longer) hash functions can be
calculated by openssl, which is in our base system.  The NetBSD and
OpenBSD projects have adopted these functions for their ports (pkgsrc)
collections.  The desirability of keeping more information about distfiles
was anticipated by us during last year's reorganization
(http://www.geocrawler.com/mail/msg.php3?msg_id=4418223&list=167), so
the "md5" files have already been renamed.

I'd like to see:
- the 160-bit hashes permitted (not required) in the distinfo file.
- a "makesum" target which generates all three hashes, using openssl.
- a "checksum" target which uses whichever hashes exist in distinfo.
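
The behaviour asked for in the list above could look roughly like this.
It is a sketch only: it uses Python's hashlib instead of the openssl
command, the function names and label table are assumptions, and the line
format is borrowed from the OpenBSD example quoted earlier.  An algorithm
missing from the local build (RIPEMD-160 may be) is skipped rather than
treated as a failure.

```python
import hashlib
import re

# Labels from the quoted OpenBSD example, mapped to hashlib names.
LABELS = {"MD5": "md5", "RMD160": "ripemd160", "SHA1": "sha1"}
LINE = re.compile(r"^(\w+) \((.+)\) = ([0-9a-f]+)$")

def hexdigest(algorithm, data):
    """Return the hex digest, or None if this build lacks the algorithm."""
    try:
        return hashlib.new(algorithm, data).hexdigest()
    except ValueError:
        return None

def makesum(filename, data):
    """Produce one distinfo-style line per available algorithm."""
    lines = []
    for label, algorithm in LABELS.items():
        value = hexdigest(algorithm, data)
        if value is not None:
            lines.append(f"{label} ({filename}) = {value}")
    return lines

def checksum(filename, data, distinfo_lines):
    """Verify data against whichever recorded hashes can be computed.

    Returns False on any mismatch, or if no hash could be checked.
    """
    checked = 0
    for line in distinfo_lines:
        match = LINE.match(line.strip())
        if not match or match.group(2) != filename:
            continue
        label, _, recorded = match.groups()
        algorithm = LABELS.get(label)
        actual = hexdigest(algorithm, data) if algorithm else None
        if actual is None:
            continue  # unknown or unavailable algorithm: skip it
        if actual != recorded:
            return False
        checked += 1
    return checked > 0

if __name__ == "__main__":
    recorded = makesum("example.tar.gz", b"distfile contents")
    print("\n".join(recorded))
    print(checksum("example.tar.gz", b"distfile contents", recorded))
```

A distinfo file that carries only an MD5 line still verifies, which is the
"permitted, not required" part of the proposal.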
-- 
Trevor Johnson
http://jpj.net/~trevor/gpgkey.txt





