Date:      Tue, 19 Jul 2011 12:08:42 +0200
From:      "C. P. Ghost" <>
To:        Lars Eighner <>
Cc:        Frank Bonnet <>, "" <>
Subject:   Re: Tools to find "unlegal" files ( videos , music etc )
Message-ID:  <>
In-Reply-To: <alpine.BSF.2.00.1107190420560.40638@abbf.onfvpvfc.arg>
References:  <> <> <> <> <alpine.BSF.2.00.1107190420560.40638@abbf.onfvpvfc.arg>

On Tue, Jul 19, 2011 at 11:51 AM, Lars Eighner
<> wrote:
> On Tue, 19 Jul 2011, C. P. Ghost wrote:
>> Speaking with my university sysadmin hat on: you're NOT allowed to
>> peek inside personal files of your users, UNLESS the user has waived
>> his/her rights to privacy by explicitly agreeing to the TOS and
>> there's legal language in the TOS that allows staff to inspect files
>> (and then staff needs to abide by those rules in a very strict and
>> cautious manner). So unless the TOS are very explicit, a sysadmin or
>> an IT head can get in deep trouble w.r.t. privacy laws.
> Yes, but I am not an expert on privacy laws in France, and I suspect
> you are not either. Whether examining the magic number (first four bytes)
> of a file constitutes a breach of privacy is a matter for legal advice
> applicable to the particular jurisdiction. You certainly can look at the
> external package: file size and name.

Fair enough. Automatically scanning files, hashing them, etc. may or
may not run afoul of privacy laws, which vary widely from jurisdiction
to jurisdiction. And yes, I'm no expert on French privacy laws.
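
For concreteness, the hash-list matching discussed in this thread could
look roughly like the Python sketch below. It is only an illustration
under assumptions that are not from the original discussion: the
function names (sha256_of, flag_matches) are made up, SHA-256 is an
arbitrary choice of hash, and the copyright owner is assumed to supply
hex digests of whole files. It only flags paths; it decides nothing
about licensing or legality.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def flag_matches(root, known_hashes):
    """Walk `root` and return the sorted paths whose digest is in `known_hashes`.

    `known_hashes` is a set of hex digests (hypothetically supplied by
    the copyright owner). Unreadable files are skipped, not reported.
    """
    flagged = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if sha256_of(path) in known_hashes:
                    flagged.append(path)
            except OSError:
                continue  # permission denied, vanished file, etc.
    return sorted(flagged)
```

Note that a copy with even one byte appended hashes differently and is
not flagged, which is exactly the weakness raised further down in this
thread.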

>> What can technically be done is that the copyright owner provides a
>> list of hashes for his files, and requests that you traverse your
>> filesystems, looking for files that match those hashes. AND, even
>> then, all you can do is flag the files, and you'll have to check with
>> the user that he/she doesn't own a license permitting him/her to own
>> that file!
> You cannot generate a hash without at a certain automated level opening the
> file. If you can do that, couldn't you generate a hash of the first four
> bytes to match with hashes of known magic numbers? If you can "look" at the
> whole file, surely you can "look" at just the first four bytes.

To check the magic numbers, you don't need a hash. Just check the
magic numbers (where legally allowable). However, a magic number would
merely say: this is an MP3, this is an MPEG file, etc. It is just a
hint (and a very weak one at that) as to the type of file. You as
staff will STILL have to manually look at the file: the MP3 could
contain random noise, the MPEG could contain a private video or video
letter, etc.
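
To illustrate how little a magic-number check actually tells you, here
is a minimal Python sketch. The table and the function names
(MAGIC, guess_type, guess_file_type) are illustrative assumptions, and
the signature list is deliberately tiny and incomplete: for example, an
MP3 without an ID3v2 tag does not start with "ID3" and would not be
recognized here.

```python
# A few well-known leading byte signatures. Illustrative only; a real
# tool such as file(1) knows far more formats.
MAGIC = [
    (b"ID3", "MP3 with ID3v2 tag"),
    (b"\x00\x00\x01\xba", "MPEG program stream"),
    (b"\x00\x00\x01\xb3", "MPEG video stream"),
    (b"OggS", "Ogg container"),
    (b"Rar!", "RAR archive"),
]


def guess_type(header):
    """Return a label for the leading bytes, or None if unrecognized."""
    for magic, label in MAGIC:
        if header.startswith(magic):
            return label
    return None


def guess_file_type(path):
    """Read only the first four bytes of `path` and guess its type."""
    with open(path, "rb") as f:
        return guess_type(f.read(4))
```

The result is only a label for the container format. It says nothing
about what the file contains or whether the user is licensed to have it.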

So practically, you'll get a list of users owning multimedia
files. Unless your organization forbids files by content type, you
still face the problem of identifying the "infringingness" of said
files, and this can only be done reliably by manual (human)
inspection. And here, we're right back deep in privacy protection
land, where things get incredibly hairy.

>> However, even that isn't foolproof: nothing prevents a user from
>> flipping a bit or two, rescaling, resampling, splitting the files into
>> multiple files in a non-obvious manner, adding random bytes at the end
>> etc...: the result would still be infringing, but can't be detected
>> automatically (at least not in a reasonable amount of time).
> This is a bit like security. There is no absolute that can be achieved. You
> don't have to be smarter than God, you just have to be smarter than the
> users. Now the whole point of infringing schemes is that most dumb users
> have to be able to use the files they download. They can reasonably do
> things like rename the files or pass them through a commonly available
> decoder. No point in trying to "file share" if users have to be the NSA to
> play the music.
> You can scan (where legal) for the common stuff. You can't find stuff
> encoded by Dr. Evil Genius Hacker -- but neither can the party claiming to
> be infringed and neither can Suzie Shebop who just wants free music.


But Dr. Evil Genius Hacker could write a user-friendly program that
does all this, and John Stupiduser Doe would still be able to use
it. Just think of encrypted RAR files: how many users know how
encryption works? Yet it's the most widely used form of file sharing
nowadays among countless technically ignorant users.

> Lars Eighner
> 8800 N IH35 APT 1191 AUSTIN TX 78753-5266


Cordula's Web.
