Date:      Fri, 4 Jan 2013 13:32:20 GMT
From:      Martin Birgmeier <Martin.Birgmeier@aon.at>
To:        freebsd-gnats-submit@FreeBSD.org
Subject:   ports/174968: virtualbox: CAM lockup when using more than one disk
Message-ID:  <201301041332.r04DWKst084342@red.freebsd.org>
Resent-Message-ID: <201301041340.r04De1c0029756@freefall.freebsd.org>

>Number:         174968
>Category:       ports
>Synopsis:       virtualbox: CAM lockup when using more than one disk
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-ports-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Fri Jan 04 13:40:00 UTC 2013
>Closed-Date:
>Last-Modified:
>Originator:     Martin Birgmeier
>Release:        FreeBSD 8.2.0 release
>Organization:
MBi at home
>Environment:
FreeBSD hal.xyzzy 8.2-RELEASE FreeBSD 8.2-RELEASE #0: Sat Nov  3 09:16:25 CET 2012     root@hal.xyzzy:/z/OBJ/FreeBSD/amd64/release/8.2.0-nfse/sys/XYZZY_SMP  amd64
>Description:
I am using virtualbox (4.1.22 previously, but now 4.2.6) to pre-test various installations. In this case, I am testing a zfs raidz2 setup.

The host is running FreeBSD 8.2.0 release.

The client is running FreeBSD 9.1.0 release with the latest NFSE patches.

The host is exporting 7 physical disk partitions to the client:
- 1 on IDE, used for UFS / and /usr
- 6 via either a single SCSI or a single SATA controller (the problems are the same using either)

In the client, the 6 SATA-attached (or SCSI-attached) partitions are used to form a raidz2 zpool.

The problem is that even with just a little disk activity, the CAM path seems to hang after just a few operations (regardless of using SATA or SCSI to attach the 6 zpool disks). This behavior can be triggered by something as simple as a 'zfs create'.

Typically, on the console of the client the following messages appear (ultimately, for all disks):

Jan  4 14:02:31 v904 kernel: Trying to mount root from ufs:/dev/ada0a [rw]...
Jan  4 14:02:31 v904 kernel: ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present;
Jan  4 14:02:31 v904 kernel: to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf.
Jan  4 14:02:31 v904 kernel: ZFS filesystem version 5
Jan  4 14:02:31 v904 kernel: ZFS storage pool version 28
Jan  4 14:02:31 v904 root: /etc/rc: WARNING: failed precmd routine for vmware_guestd
Jan  4 14:02:32 v904 kernel: .
Jan  4 14:07:06 v904 kernel: (ada5:ata6:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jan  4 14:07:06 v904 kernel: (ada5:ata6:0:0:0): CAM status: Command timeout
Jan  4 14:07:06 v904 kernel: (ada5:ata6:0:0:0): Retrying command
Jan  4 14:07:36 v904 kernel: (ada5:ata6:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jan  4 14:07:36 v904 kernel: (ada5:ata6:0:0:0): CAM status: Command timeout
Jan  4 14:07:36 v904 kernel: (ada5:ata6:0:0:0): Error 5, Retries exhausted
Jan  4 14:07:44 v904 kernel: .
Jan  4 14:07:45 v904 kernel: , 750.

(I have no idea where the lines with the single dots and the ", 750" come from.)
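
For triage, the timeout messages in a log like the one above can be tallied per device. The following is a minimal Python sketch and not part of the original report. The function name summarize_cam_timeouts and the regular expression are illustrative assumptions derived only from the message format shown above, and other CAM message layouts may need a different pattern.

```python
import re
from collections import defaultdict

# Matches CAM peripheral messages such as
#   "(ada5:ata6:0:0:0): CAM status: Command timeout"
# The pattern is an assumption based on the log excerpt in this report.
CAM_LINE = re.compile(
    r"\((?P<periph>[a-z]+\d+):(?P<sim>[a-z]+\d+):\d+:\d+:\d+\): (?P<msg>.*)$"
)


def summarize_cam_timeouts(lines):
    """Count command timeouts and exhausted retries per CAM peripheral."""
    summary = defaultdict(lambda: {"timeouts": 0, "exhausted": 0})
    for line in lines:
        match = CAM_LINE.search(line)
        if match is None:
            continue  # not a CAM peripheral message
        device = match.group("periph")
        message = match.group("msg").strip()
        if message == "CAM status: Command timeout":
            summary[device]["timeouts"] += 1
        elif "Retries exhausted" in message:
            summary[device]["exhausted"] += 1
    return dict(summary)


# Excerpt copied from the console log in this report.
SAMPLE = """\
Jan  4 14:02:32 v904 kernel: .
Jan  4 14:07:06 v904 kernel: (ada5:ata6:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jan  4 14:07:06 v904 kernel: (ada5:ata6:0:0:0): CAM status: Command timeout
Jan  4 14:07:06 v904 kernel: (ada5:ata6:0:0:0): Retrying command
Jan  4 14:07:36 v904 kernel: (ada5:ata6:0:0:0): FLUSHCACHE48. ACB: ea 00 00 00 00 40 00 00 00 00 00 00
Jan  4 14:07:36 v904 kernel: (ada5:ata6:0:0:0): CAM status: Command timeout
Jan  4 14:07:36 v904 kernel: (ada5:ata6:0:0:0): Error 5, Retries exhausted
Jan  4 14:07:44 v904 kernel: .
"""

if __name__ == "__main__":
    for device, counts in sorted(summarize_cam_timeouts(SAMPLE.splitlines()).items()):
        print(device, counts)
```

Run against the full client log (for example, the lines of /var/log/messages), this would show whether every zpool member disk eventually reports the same timeout pattern. In the excerpt, the first ACB byte, ea, is the ATA FLUSH CACHE EXT opcode, which matches the FLUSHCACHE48 label on the same line.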

Some activity in the client is still possible if it does not access the zpool.

Interestingly, as soon as the problem surfaces, the VirtualBox emulation process itself also becomes stuck immediately when trying to execute an action (e.g., hard reset) from its pull-down menu. The process can then be killed with a plain kill (i.e., signal 15, I assume; I am using zsh). The host does not seem to be adversely affected.

Because the emulation process itself is affected, I do not believe that the client OS is the culprit (nor NFSE); rather, I'd guess that it is a VirtualBox problem.

One more note: Similar problems seem to occur when running the client under a Windows 7 host. However, in that case the same real partitions on the FreeBSD 8.2 server are accessed using iSCSI from VirtualBox running on the Windows 7 host, and that might introduce additional problems (for example, I see a high rate of iSCSI disconnects/reconnects in this scenario). From this, I would guess that it is the upstream VirtualBox (vendor) source that has problems with multiple disks, because the problem occurs in a similar manner with both FreeBSD 8.2 and Windows 7 as hosts.

>How-To-Repeat:
See description above.
>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:


