Date:      Fri, 02 Jul 2010 14:14:43 -0500
From:      Tim Daneliuk <tundra@tundraware.com>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: 'file' Command Giving False Positives
Message-ID:  <4C2E3AA3.7080200@tundraware.com>
In-Reply-To: <20100702204249.1a7423ac.freebsd@edvax.de>
References:  <4C2DF07F.1020509@tundraware.com> <44630xq527.fsf@be-well.ilk.org>	<20100702173504.c53738b2.freebsd@edvax.de>	<44r5jln3oj.fsf@be-well.ilk.org> <20100702204249.1a7423ac.freebsd@edvax.de>

On 7/2/2010 1:42 PM, Polytropon wrote:
> On Fri, 02 Jul 2010 14:23:24 -0400, Lowell Gilbert <freebsd-questions-local@be-well.ilk.org> wrote:
>> Apparently, your memory is better than mine, because that was indeed
>> what I was thinking of.  Which leads to the question of why magic(5)
>> lists LZ as representing "MS-DOS executable (built-in)".  I'd be
>> hesitant to change that unless we knew for sure it was wrong.
>
> As it has been mentioned before, .EXE is *one* of the formats
> executable in DOS. .COM executables do not have specific headers
> (as they are loaded directly). Also, .BAT are executable, although
> they are text files, and finally .BTM are also text file executables,
> specific to NDOS. As far as I also remember, there's .EXE on OS/2,
> too. One could argue if "Windows" .PIF are also executables. Of
> course, VMS also has .COM... but I see I'm making a digression... :-)
>
>
>
>> Even if it _is_ wrong, the "problem" still remains for "MZ" at least:
>> Any file starting with those letters is going to be identified as an
>> MS-DOS executable, and there's no clear way to distinguish it from a
>> text file that happens to start with those letters.
>
> Well, there's a solution that is not *that* complicated: If the
> file contains characters that don't match isprint(), i. e. those
> outside the ASCII set used in real text files, it's likely to be
> an executable.
>
> A scriptable solution might be to diff <filename> vs. `strings
> <filename>`. If they differ, it's not a text, so it might be an
> executable.
>
> I'm not sure if the magic identification string starting with MZ
> could be enlarged with other specific characters immediately
> following MZ that are *only* present in executables...
>
> The problem is that "MZ" itself is completely sufficient:
>
> 	% echo "MZ" > foo
> 	% file foo
> 	foo: MS-DOS executable
>
> Of course, that's not correct.
>
>

All noted (and appreciated).  In this case, the client has
a situation where none of the above will work:  They can
receive encrypted files that happen to begin with MZ/LZ and
contain binary data thereafter, yet are NOT executables.
They want to flag real executables without getting false
positives.

At this point, I'm inclined to believe that 'file' alone is
insufficient to do this and, at best - even with more tools -
it's going to be a probabilities game - i.e. "What percentage
of false positives is acceptable?"
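
One way to tilt those odds, at least for Windows-style executables, is
to look past the first two bytes.  The sketch below is only an
illustration, not something 'file' or magic(5) is claimed to do here:
it assumes the usual layout in which the DOS header stores a 32-bit
little-endian offset at 0x3C that points at a "PE\0\0" signature.
The function name looks_like_pe is made up for this example.  A plain
DOS .EXE carries no such second signature, so this check rejects it,
and that case stays a probabilities game.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if data starts with MZ and its header points at a PE signature.

    Hypothetical helper for illustration; it does not recognise plain DOS
    executables, which have no second signature to check.
    """
    # The offset field lives at 0x3C, so at least 0x40 bytes are needed.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # 32-bit little-endian file offset of the new-style header.
    (offset,) = struct.unpack_from("<I", data, 0x3C)
    # The offset must leave room for the 4-byte signature.
    if offset + 4 > len(data):
        return False
    return data[offset:offset + 4] == b"PE\0\0"
```

Usage would be along the lines of looks_like_pe(open(path, "rb").read()).
Encrypted data that merely begins with MZ would also have to carry an
in-range offset at 0x3C and the right four bytes at that offset to pass,
which is far less likely than matching two letters, though still not
impossible.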


-- 
------------------------------------------------------------------------
Tim Daneliuk
tundra@tundraware.com


