Date:      Sun, 25 May 2008 13:34:16 +0200
From:      "Martin Laabs" <martin.laabs@mailbox.tu-dresden.de>
To:        "Bruce Evans" <brde@optusnet.com.au>
Cc:        "freebsd-gnats-submit@freebsd.org" <freebsd-bugs@freebsd.org>
Subject:   Re: misc/123939: msdosfs corruptes new files
Message-ID:  <op.ubpjreaw724k7f@martin>
In-Reply-To: <20080525100023.D17089@besplex.bde.org>
References:  <200805231916.m4NJGVXP001708@www.freebsd.org> <20080524134012.L69478@delplex.bde.org> <op.ubnw4xiy724k7f@martin> <20080525100023.D17089@besplex.bde.org>

Hi,

> This thread switched to private mail.  Did you mean that?  I don't mind,
> but sometimes useful PR info gets lost because it is not public.

Oh no, that was not my intention. I have now added the two CCs,
and for that reason I have quoted the last mail in full below.

Using "cmp -lx" directly on the device does not work, I think because
of the wrong block size.
Under the assumption that writing and reading with bs=2k works
properly, I tried to find out whether read, write or both are affected
by the bug.

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=2k count=200
200+0 records in
200+0 records out
409600 bytes transferred in 1.399798 secs (292614 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 55
00064001 46 89
00064002 30 e5
[...]

This is OK since I wrote exactly 0x64000 bytes.
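
As a quick check of that arithmetic (a sketch added for illustration, not part of the original session): 200 blocks of 2k are 409600 bytes, which is 0x64000, so the first reported difference sits exactly at the end of the written area.

```shell
# Bytes written by "dd bs=2k count=200", printed in decimal and in hex
# (hex to match the offsets in the cmp -lx output above).
written=$((200 * 2048))
printf '%d bytes = 0x%x\n' "$written" "$written"
```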

Next I tried 4k and 8k, which also worked fine.

With a block size of 10k I get a mismatch at address 0:

su:~$ dd if=/dev/da4 bs=10k|cmp -lx /boot/kernel/kernel - |head -n 15
00000000 7f 1d
00000001 45 04
00000002 4c 00
00000003 46 00
00000004 01 c4
00000005 01 16
00000006 01 00
00000007 09 00
[...]

I tried to determine the offset of the data that is read with
bs>8k and bs<=16k. It is exactly 0x2000 (8k).
With bs>16k, bs<=24k the offset is 0x4000; with bs>24k, bs<=32k
it is 0x6000.
So far I have only checked the data around address 0.
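
The offsets reported above fit a simple rule: the data seen at address 0 comes from (ceil(bs / 8k) - 1) * 8k. The sketch below only restates that observation. The helper name read_offset is hypothetical, and whether the rule holds for block sizes that were not tested is an assumption.

```shell
# Hypothetical helper: offset (in bytes) of the data seen at address 0
# when reading with block size $1, following the measurements above:
#   bs <= 8k -> 0, 8k < bs <= 16k -> 0x2000, 16k < bs <= 24k -> 0x4000, ...
read_offset() {
    chunk=8192
    echo $(( (($1 + chunk - 1) / chunk - 1) * chunk ))
}

for bs in 2048 8192 10240 16384 20480 32768; do
    printf 'bs=%-6d offset=0x%x\n' "$bs" "$(read_offset "$bs")"
done
```

For bs=10240 this gives 0x2000, and for bs=32768 it gives 0x6000, matching the values measured on the stick.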

Now the writing experiment:

As seen above, bs=2k is working OK. Now I try 4k:

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=4k count=100
100+0 records in
100+0 records out
409600 bytes transferred in 1.002872 secs (408427 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 00
00064001 46 00
00064002 30 00
[...]

And now 8k:

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=8k count=50
50+0 records in
50+0 records out
409600 bytes transferred in 0.748899 secs (546936 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00064000 f6 00
00064001 46 00
00064002 30 00
[...]

Both are OK.

With a bs of 10k I get the first byte mismatch at 0x2000 (8k):

su:~$ dd if=/boot/kernel/kernel of=/dev/da4 bs=10k count=40
40+0 records in
40+0 records out
409600 bytes transferred in 0.699931 secs (585201 bytes/sec)
su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |head -n 15
00002000 1d 7f
00002001 04 45
00002002 00 4c
00002003 00 46
00002004 c4 01
00002005 16 01
[...]

The offset of the readback data is -0x2000. This means the data at 0x2000
on the stick originally belongs at 0x0. Since it *is* already there (cmp
did not report any difference between the file and the first 0x2000 bytes),
it is on the stick a second time. This means that the data that originally
belonged at 0x2000 is lost.

The length of this "discontinuity" is 0x800 with not really regular
spacings. (writing bs was 10k)

su:~$ dd if=/dev/da4 bs=2k|cmp -lx /boot/kernel/kernel - |less

00002000 1d 7f
00002001 04 45
00002002 00 4c
[...]
000027f9 13 00
000027fc 00 49
000027fd 00 1e
00004800 de 00
00004801 0f 00
00004804 00 4f
[...]
00004ff9 00 02
00004ffc 00 3d
00004ffd 00 1e
00007000 00 bc
00007001 00 15
00007004 a0 be
[...]
000077f8 8a 83
000077f9 0f 1a
000077fc 12 dd
00009800 00 90
00009801 00 1d
00009804 00 83
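
In the excerpt above, the mismatching runs start at 0x2000, 0x4800, 0x7000 and 0x9800. These are 0x2800 (one 10k write block) apart, and each run stays within 0x800 bytes. That is what one would see if the last 2k of every 10k write were wrong. The sketch below prints the ranges this would predict for the first four write blocks. The 8k split point is an assumption inferred from the measurements, not something confirmed in the driver, and bad_range is a hypothetical helper name.

```shell
# Assumption: each write of bs bytes is correct only for its first 8k,
# so the bad run of write block n covers [n*bs + 8k, (n+1)*bs).
bs=$((10 * 1024))
good=$((8 * 1024))

# Print the predicted bad range (inclusive, hex) for write block $1.
bad_range() {
    start=$(( $1 * bs + good ))
    end=$(( ($1 + 1) * bs - 1 ))
    printf '%08x-%08x\n' "$start" "$end"
}

for n in 0 1 2 3; do
    bad_range "$n"
done
```

The predicted ranges begin at 00002000, 00004800, 00007000 and 00009800, the same starting addresses that cmp reported.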


So far,
  Martin


--------------------:<----------------------------

>>> This is probably a bug in the umass or da driver. da claims to support
>>> i/o's of DFLTPHYS = 64K, so lower level drivers must support this even
>>> if the hardware doesn't, but apparently some usb drives have a lower
>>> limit.
>>
>> Hey - you are right. First I tried direct copy with bs=2k (which
>> is the sector size of that device.) This was OK:
>>
>> u:~$ dd if=/boot/kernel/kernel of=/dev/da7 bs=2k
>> dd: /dev/da7: Invalid argument
>> 4501+1 records in
>> 4501+0 records out
>> 9218048 bytes transferred in 31.502305 secs (292615 bytes/sec)

> It's another bug that gives the EINVAL error for writing at EOF.
> This complicates debugging a little.  I think the disk size is not
> a multiple of the block size (2K here), so the last block would
> strictly cross the boundary at the end of the disk, and none of
> it is written, but the error handling would be different/better
> if the block were at the boundary, and maybe different/worse if
> the block were strictly beyond the boundary.  For larger blocks,
> the last one would be more likely to strictly cross the boundary.
> So just note the error above so as to ignore similar errors for
> larger blocks.
>
>> su:~$ dd if=/boot/kernel/kernel bs=2k count=4501 of=test.fs
>> 4501+0 records in
>> 4501+0 records out
>> 9218048 bytes transferred in 0.134802 secs (68382200 bytes/sec)
>
> No errors since it didn't go near EOF for either the input or output.
>
>> su:~$ diff test.umass test.fs
>>
>> Now I tried this with a block size of 128k and it did not work
>> anymore:
>>
>> su:~$ dd if=/boot/kernel/kernel of=/dev/da7 bs=128k
>> dd: /dev/da7: Invalid argument
>> 70+1 records in
>> 70+0 records out
>> 9175040 bytes transferred in 12.484369 secs (734922 bytes/sec)
>
> Better write only 70 blocks to avoid secondary errors.  I think you
> eliminated the secondary error above by checking only 70 blocks later.
>
>> su:~$ dd if=/dev/da7 of=test.umass bs=128k count=70
>> 70+0 records in
>> 70+0 records out
>> 9175040 bytes transferred in 9.297371 secs (986842 bytes/sec)
>>
>> su:~$ dd if=/boot/kernel/kernel of=test.fs bs=128k count=70
>> 70+0 records in
>> 70+0 records out
>> 9175040 bytes transferred in 0.127474 secs (71975736 bytes/sec)
>>
>> su:~$ diff test.umass test.fs
>> Files test.umass and test.fs differ
>
> Use cmp -lx to locate the error(s), especially the first one (expect
> a lot).  Copying from the disk using dd is good for eliminating
> secondary errors, but cmp -lx directly on the disk should work for
> a quick check, with a better chance of working than for diff.  (It
> depends on cmp's block size working (the block size only needs
> to be a multiple of the sector size, with no partial block at EOF) and
> not being large enough to cause the suspected error on input.  The
> suspected bug may affect input, output, or both.  I suspect both.)
>
> Bruce
>




