Date:      Sun, 17 Jan 1999 21:34:02 -0600 (CST)
From:      fgil@gc2.kloepfer.org
To:        FreeBSD-gnats-submit@FreeBSD.ORG
Cc:        fgil@limbic.gc2.kloepfer.org
Subject:   kern/9548: UNION fs corrupts data and has undefined getpages VOP
Message-ID:  <199901180334.VAA09847@limbic.gc2.kloepfer.org>

>Number:         9548
>Category:       kern
>Synopsis:       UNION fs corrupts data and has undefined getpages VOP
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Jan 17 19:40:00 PST 1999
>Closed-Date:
>Last-Modified:
>Originator:     Gil Kloepfer Jr.
>Release:        FreeBSD 3.0-RELEASE i386
>Organization:
None
>Environment:

	Probably the easiest and the most complete description comes
	from the kernel itself...

	avail memory = 46227456 (45144K bytes)
	Bad BIOS32 Service Directory!
	Probing for devices on PCI bus 0:
	chip0: <Intel 82437FX PCI cache memory controller> rev 0x02 on pci0.0.0
	chip1: <Intel 82371FB PCI to ISA bridge> rev 0x02 on pci0.7.0
	ide_pci0: <Intel PIIX Bus-master IDE controller> rev 0x02 on pci0.7.1
	vga0: <S3 968 graphics accelerator> rev 0x00 int a irq 5 on pci0.18.0
	Probing for devices on the ISA bus:
	sc0 at 0x60-0x6f irq 1 on motherboard
	sc0: VGA color <16 virtual consoles, flags=0x0> 
	ed0 at 0x300-0x31f irq 9 maddr 0xc8000 msize 16384 on isa
	ed0: address 00:00:c0:49:93:da, type SMC8216/SMC8216C (16 bit)
	sio0 at 0x3f8-0x3ff irq 4 flags 0x10 on isa
	sio0: type 16550A
	sio1 at 0x2f8-0x2ff irq 3 on isa
	sio1: type 16550A
	lpt0 at 0x378-0x37f irq 7 on isa
	lpt0: Interrupt-driven port
	lp0: TCP/IP capable interface
	psm0 at 0x60-0x64 irq 12 on motherboard
	psm0: model Generic PS/2 mouse, device ID 0
	fdc0 at 0x3f0-0x3f7 irq 6 drq 2 on isa
	fdc0: FIFO enabled, 8 bytes threshold
	fd0: 1.44MB 3.5in
	wdc0 at 0x1f0-0x1f7 irq 14 on isa
	wdc0: unit 0 (wd0): <WDC AC31600H>
	wd0: 1549MB (3173184 sectors), 3148 cyls, 16 heads, 63 S/T, 512 B/S
	wdc0: unit 1 (wd1): <Conner Peripherals 420MB - CFS420A>
	wd1: 406MB (832608 sectors), 826 cyls, 16 heads, 63 S/T, 512 B/S
	wdc1 at 0x170-0x177 irq 15 on isa
	wdc1: unit 0 (atapi): <SANYO CRD-256P/1.01>, removable, dma, iordis
	wcd0: 1033Kb/sec, 128Kb cache, audio play, 256 volume levels, ejectable tray
	wcd0: no disc inside, unlocked
	npx0 on motherboard
	npx0: INT 16 interface
	Intel Pentium F00F detected, installing workaround

>Description:

	1.  The UNION filesystem corrupts data.
	2.  The kernel reports a stale getpages routine.  This is a
	    diagnostic from /sys/vm/vnode_pager.c showing that
	    EOPNOTSUPP was returned when getpages was called, meaning
	    that union_vnops.c implements no getpages VOP.
	    The exact message from the kernel is:
	            vnode_pager: *** WARNING *** stale FS getpages
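
	As a rough illustration of the mechanism in (2), the following
	is a simplified userland model, not kernel code.  Every name in
	it (model_fs, model_call, model_pager_getpages and so on) is
	hypothetical and exists only for this sketch.  It assumes that
	a filesystem is a table of optional handlers and that a missing
	handler yields EOPNOTSUPP, which is what the diagnostic implies.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical operation identifiers; not the kernel's vop descriptors. */
enum model_op { MODEL_OP_READ, MODEL_OP_GETPAGES, MODEL_OP_COUNT };

typedef int (*model_op_fn)(void);

/* A filesystem is modelled as a table of optional handlers. */
struct model_fs {
	const char *name;
	model_op_fn ops[MODEL_OP_COUNT];
};

static int model_read(void) { return (0); }
static int model_getpages(void) { return (0); }

/* Dispatch an operation; an empty slot behaves like "not supported". */
int
model_call(const struct model_fs *fs, enum model_op op)
{
	assert(fs != NULL && op < MODEL_OP_COUNT);
	if (fs->ops[op] == NULL)
		return (EOPNOTSUPP);
	return (fs->ops[op]());
}

/* Model of the pager: warn when the filesystem lacks getpages. */
int
model_pager_getpages(const struct model_fs *fs)
{
	int error = model_call(fs, MODEL_OP_GETPAGES);

	if (error == EOPNOTSUPP)
		fprintf(stderr, "%s: stale FS getpages\n", fs->name);
	return (error);
}

/* Table without a getpages entry, like the unpatched union_vnops.c. */
const struct model_fs model_unpatched = {
	"unpatched", { [MODEL_OP_READ] = model_read }
};

/* Table with a getpages entry added. */
const struct model_fs model_patched = {
	"patched",
	{ [MODEL_OP_READ] = model_read, [MODEL_OP_GETPAGES] = model_getpages }
};
```

	In this model, calling model_pager_getpages(&model_unpatched)
	returns EOPNOTSUPP and prints the warning, while the same call
	on model_patched returns 0 silently.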

	The data corruption in (1) appears to be triggered by an
	mmap operation on a file that has been copied to the upper
	layer of the union filesystem.  For example, in the steps
	outlined in How-To-Repeat, a file can be copied with /bin/cp
	(which uses stdio) without harm, but if "/usr/bin/cmp -l"
	(which uses mmap) is run on the file, or if an executable
	file is executed, the file on the union filesystem becomes
	corrupt (basically filled with 0x00).
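
	The difference between the read path and the mmap path can be
	probed directly with a small test function.  This is only a
	sketch: count_read_mmap_mismatches is a hypothetical helper
	written for this report, not part of cmp or of any library.  It
	reads the file with read(2) first, then maps it with mmap(2)
	and counts the bytes that differ.  On a healthy filesystem the
	count should be 0; on the affected union mount it might be
	non-zero, or the process might fault while touching the mapping.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Compare the contents of a file as seen through read(2) with the
 * contents seen through mmap(2).  Returns the number of differing
 * bytes, or -1 on error.
 */
long
count_read_mmap_mismatches(const char *path)
{
	struct stat st;
	unsigned char *via_read = NULL;
	unsigned char *via_mmap = MAP_FAILED;
	long mismatches = -1;
	size_t len = 0, done = 0;
	int fd;

	assert(path != NULL);
	if ((fd = open(path, O_RDONLY)) < 0)
		return (-1);
	if (fstat(fd, &st) < 0)
		goto out;
	if (st.st_size == 0) {
		mismatches = 0;		/* nothing to compare */
		goto out;
	}
	len = (size_t)st.st_size;

	/* Pass 1: ordinary read(2) into a heap buffer. */
	if ((via_read = malloc(len)) == NULL)
		goto out;
	while (done < len) {
		ssize_t n = read(fd, via_read + done, len - done);

		if (n <= 0)
			goto out;
		done += (size_t)n;
	}

	/* Pass 2: map the same file and compare byte by byte. */
	via_mmap = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
	if (via_mmap == MAP_FAILED)
		goto out;

	mismatches = 0;
	for (size_t i = 0; i < len; i++)
		if (via_read[i] != via_mmap[i])
			mismatches++;
out:
	if (via_mmap != MAP_FAILED)
		munmap(via_mmap, len);
	free(via_read);
	close(fd);
	return (mismatches);
}
```

	A caller would pass the path of a file inside the union mount,
	for example the copied termcap from How-To-Repeat, and print
	the returned count.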

	I originally discovered all this because I wanted to keep the
	kernel sources on a CD, but mount some writable disk space on top
	in order to do a kernel build, thus avoiding the need to keep
	the kernel (and other) sources on disk.

>How-To-Repeat:

	Assume two filesystems, /fs1 and /fs2.

	mkdir /fs1/lower
	mkdir /fs2/upper
	mount -t union /fs2/upper /fs1/lower
	cd /fs1/lower   # really the union filesystem at this point
	cp /etc/termcap .
	cmp -l termcap /etc/termcap
	# data in /fs2/upper/termcap is now corrupt

	-- another example --
	mkdir /fs1/lower
	mkdir /fs2/upper
	mount -t union /fs2/upper /fs1/lower
	cd /fs1/lower   # really the union filesystem at this point
	cp /bin/cat .
	./cat /etc/termcap >/dev/null
	# will report "wrong architecture" because the file will become
	# filled with zeros

	-- example where it works correctly --
	mkdir /fs1/lower
	mkdir /fs2/upper
	mount -t union /fs2/upper /fs1/lower
	cd /fs1/lower   # really the union filesystem at this point
	cp /etc/termcap .
	cat termcap >/tmp/termcap
	cd /tmp
	cmp -l termcap /etc/termcap
	# file compare is good, because /tmp/termcap has not been corrupted
	# NOTE:  now umount the union filesystem, and do the following:
	cd /fs2/upper
	cmp -l termcap /etc/termcap
	# What will happen now is that cmp will core-dump (Segmentation fault)
	# and the kernel will report:
	# vm_fault: pager read error, pid 298 (cmp)
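
	To confirm the zero-fill symptom without going through mmap
	again, the upper-layer file can be inspected with plain
	read(2) calls.  The helper below is a sketch written for this
	report; file_is_zero_filled is a hypothetical name, and the
	4096-byte buffer size is an arbitrary choice.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Return 1 if every byte of the file is 0x00, 0 if any byte is
 * non-zero, and -1 on error.  An empty file counts as not zero-filled.
 */
int
file_is_zero_filled(const char *path)
{
	unsigned char buf[4096];
	ssize_t n;
	int fd, seen = 0;

	assert(path != NULL);
	if ((fd = open(path, O_RDONLY)) < 0)
		return (-1);
	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		seen = 1;
		for (ssize_t i = 0; i < n; i++) {
			if (buf[i] != 0) {
				close(fd);
				return (0);
			}
		}
	}
	close(fd);
	if (n < 0)
		return (-1);
	return (seen);
}
```

	Using read(2) here is deliberate: the check itself should not
	touch the mmap path that appears to trigger the corruption.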

>Fix:

	I have tried without any luck to find out exactly what is
	happening in #1, and why this behavior occurs.  I don't know
	enough about the workings of the kernel to understand what may
	be wrong.  (I did learn a bit about what a vnode is, however...:)

	For #2, I applied the following changes to
	/sys/miscfs/union/union_vnops.c as per the recommendations
	in vnode_pager.c.  However, I am not sure if this is the correct
	fix.  (Remove the leading tab from the context diff to use it.)

	*** union_vnops.c.ORIG  Sat Jan  9 18:04:08 1999
	--- union_vnops.c       Sun Jan 17 21:27:28 1999
	***************
	*** 67,72 ****
	--- 67,73 ----
	  static void   union_fixup __P((struct union_node *un, struct proc *p));
	  static int    union_fsync __P((struct vop_fsync_args *ap));
	  static int    union_getattr __P((struct vop_getattr_args *ap));
	+ static int    union_getpages __P((struct vop_getpages_args *ap));
	  static int    union_inactive __P((struct vop_inactive_args *ap));
	  static int    union_ioctl __P((struct vop_ioctl_args *ap));
	  static int    union_islocked __P((struct vop_islocked_args *ap));
	***************
	*** 99,104 ****
	--- 100,107 ----
	  static int    union_whiteout __P((struct vop_whiteout_args *ap));
	  static int    union_write __P((struct vop_read_args *ap));
	  
	+ extern int    vnode_pager_generic_getpages __P((struct vnode *, vm_page_t *, int, int));
	+ 
	  static void
	  union_fixup(un, p)
	        struct union_node *un;
	***************
	*** 1750,1755 ****
	--- 1753,1773 ----
	        return (error);
	  }
	  
	+ 
	+ /*
	+  * XXX - This getpages function is copied from the one used in mfs.
	+  * There really needs to be a fs-device-specific default getpages
	+  * vop function written...
	+  */
	+ 
	+ static int
	+ union_getpages(ap)
	+       struct vop_getpages_args *ap;
	+ {
	+       return(vnode_pager_generic_getpages(ap->a_vp, ap->a_m, ap->a_count, ap->a_reqpage));
	+ }
	+ 
	+ 
	  /*
	   * Global vfs data structures
	   */
	***************
	*** 1764,1769 ****
	--- 1782,1788 ----
	        { &vop_create_desc,             (vop_t *) union_create },
	        { &vop_fsync_desc,              (vop_t *) union_fsync },
	        { &vop_getattr_desc,            (vop_t *) union_getattr },
	+       { &vop_getpages_desc,           (vop_t *) union_getpages },
	        { &vop_inactive_desc,           (vop_t *) union_inactive },
	        { &vop_ioctl_desc,              (vop_t *) union_ioctl },
	        { &vop_islocked_desc,           (vop_t *) union_islocked },
>Release-Note:
>Audit-Trail:
>Unformatted:
