Date: Wed, 7 Jan 2015 10:15:58 +0100
From: Andrea Brancatelli
To: freebsd-virtualization@freebsd.org
Subject: bhyve stuck

Hello everybody.

I had a VM doing some intense work on a volume that, on the host, is mapped to an iSCSI volume.

After some hours of correct operation the machine hung, displaying:

ahcich0: Timeout on slot 29 port 0
ahcich0: is 00000000 cs 00000000 ss f0000007 rs f0000007 tfd 50 serr 00000000 cmd 1000c217
(ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 e2 b9 7e 40 38 00 00 00 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 2 port 0
ahcich0: is 00000000 cs 00000000 ss 000001fc rs 000001fc tfd 50 serr 00000000 cmd 1000c817
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 8 port 0
ahcich0: is 00000000 cs 00000000 ss 00007f00 rs 00007f00 tfd 50 serr 00000000 cmd 1000ce17
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
ahcich0: Timeout on slot 14 port 0
ahcich0: is 00000000 cs 00000000 ss 001fc000 rs 001fc000 tfd 50 serr 00000000 cmd 1000d417
(ada0:ahcich0:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 e2 4e 3c 40 10 00 00 01 00 00
(ada0:ahcich0:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): Retrying command
Assertion failed: (aior != NULL), function ahci_handle_dma, file /usr/src/usr.sbin/bhyve/pci_ahci.c, line 494.

Now the VM is completely hung. Trying to kill bhyve doesn't work, not even with kill -9. I tried a bhyvectl --destroy and the VM disappeared from /dev/vmm, but I am very unsure about what to do now. The process is still there. Can I restart the VM?

Obviously I cannot restart the physical machine.

This is the state of the process:

root@environment-rm-01:/san_storage/VMfs/cloud31Slave # ps -ax | grep 91715
91715  5  T+  1465:19.24 bhyve: cloud31Slave (bhyve)
18037 14  S+     0:00.00 grep 91715
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # procstat -t 91715
  PID    TID COMM   TDNAME        CPU  PRI STATE  WCHAN
91715 100129 bhyve  mevent          2  120 stop   -
91715 100246 bhyve  blk-2:0         7  121 stop   getblk
91715 100247 bhyve  vtnet-4:0 tx    6  120 stop   -
91715 100248 bhyve  vcpu 0          8  120 stop   -
91715 100249 bhyve  vcpu 1          4  120 stop   -
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # procstat -kk 91715
  PID    TID COMM   TDNAME        KSTACK
91715 100129 bhyve  mevent        mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 doreti_ast+0x1f
91715 100246 bhyve  blk-2:0       mi_switch+0xe1 sleepq_wait+0x3a sleeplk+0x15d __lockmgr_args+0xc9e getblk+0x131 cluster_read+0xd0 ffs_read+0x1a9 VOP_READ_APV+0xa1 vn_read+0x165 vn_io_fault_doio+0x22 vn_io_fault1+0x7c vn_io_fault+0x18b dofileread+0x95 kern_preadv+0x92 sys_preadv+0x3a amd64_syscall+0x351 Xfast_syscall+0xfb
91715 100247 bhyve  vtnet-4:0 tx  mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 doreti_ast+0x1f
91715 100248 bhyve  vcpu 0        mi_switch+0xe1 thread_suspend_check+0x317 ast+0x4f5 doreti_ast+0x1f
91715 100249 bhyve  vcpu 1        mi_switch+0xe1 thread_suspend_switch+0x170 thread_single+0x357 sigexit+0x4e postsig+0x361 ast+0x427 Xfast_syscall+0x160
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # kill -CONT 91715
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # kill -9 91715
root@environment-rm-01:/san_storage/VMfs/cloud31Slave # ps -ax | grep 91715
91715  5  T+  1465:19.24 bhyve: cloud31Slave (bhyve)
18041 14  S+     0:00.00 grep 91715

-------
Andrea Brancatelli