Date:      Thu, 4 Apr 2019 23:32:27 +0000 (UTC)
From:      Conrad Meyer <cem@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   svn commit: r345896 - head/usr.bin/sort
Message-ID:  <201904042332.x34NWR0J029049@repo.freebsd.org>

Author: cem
Date: Thu Apr  4 23:32:27 2019
New Revision: 345896
URL: https://svnweb.freebsd.org/changeset/base/345896

Log:
  sort(1): randomcoll: Skip the memory allocation entirely
  
  There's no reason to order based on strcmp of ASCII digests instead of
  memcmp of the raw digests.
  
  While here, remove collision fallback.  If you collide two MD5s, they're
  probably the same string anyway.  If robustness against MD5 collisions is
  desired, maybe we shouldn't use MD5.
  
  None of the behavior of sort -R is specified by POSIX, so we're free to
  implement this however we like, e.g., by using a 128-bit counter and a block
  cipher to generate unique indices for each line of input.
  
  PR:		230792 (2/many)
  Relnotes:	This will change the sort order for a given dataset with a
  		given seed.  Other similarly breaking changes are planned.
  Sponsored by:	Dell EMC Isilon

Modified:
  head/usr.bin/sort/coll.c

Modified: head/usr.bin/sort/coll.c
==============================================================================
--- head/usr.bin/sort/coll.c	Thu Apr  4 23:30:27 2019	(r345895)
+++ head/usr.bin/sort/coll.c	Thu Apr  4 23:32:27 2019	(r345896)
@@ -990,8 +990,7 @@ randomcoll(struct key_value *kv1, struct key_value *kv
 {
 	struct bwstring *s1, *s2;
 	MD5_CTX ctx1, ctx2;
-	char *b1, *b2;
-	int cmp_res;
+	unsigned char hash1[MD5_DIGEST_LENGTH], hash2[MD5_DIGEST_LENGTH];
 
 	s1 = kv1->k;
 	s2 = kv2->k;
@@ -1004,24 +1003,16 @@ randomcoll(struct key_value *kv1, struct key_value *kv
 	if (s1 == s2)
 		return (0);
 
-	memcpy(&ctx1,&md5_ctx,sizeof(MD5_CTX));
-	memcpy(&ctx2,&md5_ctx,sizeof(MD5_CTX));
+	memcpy(&ctx1, &md5_ctx, sizeof(MD5_CTX));
+	memcpy(&ctx2, &md5_ctx, sizeof(MD5_CTX));
 
 	MD5Update(&ctx1, bwsrawdata(s1), bwsrawlen(s1));
 	MD5Update(&ctx2, bwsrawdata(s2), bwsrawlen(s2));
-	b1 = MD5End(&ctx1, NULL);
-	b2 = MD5End(&ctx2, NULL);
-	if (b1 == NULL || b2 == NULL)
-		err(2, "MD5End");
 
-	cmp_res = strcmp(b1,b2);
-	sort_free(b1);
-	sort_free(b2);
+	MD5Final(hash1, &ctx1);
+	MD5Final(hash2, &ctx2);
 
-	if (!cmp_res)
-		cmp_res = bwscoll(s1, s2, 0);
-
-	return (cmp_res);
+	return (memcmp(hash1, hash2, sizeof(hash1)));
 }
 
 /*
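
The core of the change is comparing raw digests held on the stack with
memcmp, rather than heap-allocated hex strings with strcmp. It can be
sketched outside of sort(1) roughly as follows. This is a minimal
illustration under stated assumptions, not the code in coll.c: toy_ctx,
toy_init, toy_update, toy_final and toy_randomcoll are hypothetical names,
and a 64-bit FNV-1a hash stands in for MD5 so the example builds without
libmd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	TOY_DIGEST_LENGTH	8	/* stand-in for MD5_DIGEST_LENGTH */

/* Hypothetical toy hash context (64-bit FNV-1a); NOT the MD5_CTX API. */
struct toy_ctx {
	uint64_t state;
};

static void
toy_init(struct toy_ctx *ctx)
{
	ctx->state = UINT64_C(0xcbf29ce484222325);	/* FNV offset basis */
}

static void
toy_update(struct toy_ctx *ctx, const void *data, size_t len)
{
	const unsigned char *p = data;
	size_t i;

	for (i = 0; i < len; i++) {
		ctx->state ^= p[i];
		ctx->state *= UINT64_C(0x100000001b3);	/* FNV prime */
	}
}

/* Write the raw digest bytes, most significant byte first. */
static void
toy_final(unsigned char digest[TOY_DIGEST_LENGTH], const struct toy_ctx *ctx)
{
	int i;

	for (i = 0; i < TOY_DIGEST_LENGTH; i++)
		digest[i] = (unsigned char)(ctx->state >> (56 - 8 * i));
}

/*
 * Order two keys by the raw digest of (seed || key).  Both contexts start
 * as copies of the pre-seeded context, the digests live on the stack, and
 * the result is a plain memcmp: no allocation and no hex conversion.
 */
static int
toy_randomcoll(const struct toy_ctx *seeded, const char *k1, size_t len1,
    const char *k2, size_t len2)
{
	struct toy_ctx ctx1, ctx2;
	unsigned char hash1[TOY_DIGEST_LENGTH], hash2[TOY_DIGEST_LENGTH];

	assert(seeded != NULL);

	ctx1 = *seeded;
	ctx2 = *seeded;

	toy_update(&ctx1, k1, len1);
	toy_update(&ctx2, k2, len2);

	toy_final(hash1, &ctx1);
	toy_final(hash2, &ctx2);

	return (memcmp(hash1, hash2, sizeof(hash1)));
}
```

A caller would seed one context once (toy_init followed by toy_update with
the seed bytes) and pass it to every comparison. Equal keys then always
compare equal, and the order of distinct keys depends only on the seed and
the key bytes.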