Skip site navigation (1)Skip section navigation (2)
Date:      Mon, 29 Jan 2018 19:44:35 -0800
From:      Dexuan-BSD Cui <dexuan.bsd@gmail.com>
To:        freebsd-scsi@freebsd.org, freebsd-virtualization@freebsd.org,  scott4long@yahoo.com, scottl@freebsd.org, imp@freebsd.org,  jhibbits@freebsd.org, avg@freebsd.org, asomers@freebsd.org, cem@freebsd.org,  mav@freebsd.org
Cc:        decui@microsoft.com
Subject:   A strange SCSI-disk-detached issue for FreeBSD VM on Hyper-V/Azure
Message-ID:  <CABkgQk-kUSgZ_BvgOOZNkxHr=7gPF5G3qn_be9tzrQ2j_fHSZg@mail.gmail.com>

next in thread | raw e-mail | index | archive | help
With the the recent Azure host upgrade, some FreeBSD VMs can't
properly detect HDD-backed disks.
The issue happens to the latest 12-CURRENT too.

da0 (0:0:0:0=EF=BC=89is also backed by HDD, but somehow is not affected.
da1 (1:0:1:0) it not affected either, only the extra disks, e,g. da2
(3:0:0:0=EF=BC=89, are affected.

These are the logs of my VM, which has da0, da1 and da2. These 3 disks
are on 3 different disk controllers.


[root@decui-d4-repro ~]# dmesg -c > /dev/null
[root@decui-d4-repro ~]# camcontrol rescan all
Re-scan of bus 0 was successful
Re-scan of bus 1 was successful
Re-scan of bus 2 was successful
Re-scan of bus 3 was successful
Re-scan of bus 4 was successful
Re-scan of bus 5 was successful

[root@decui-d4-repro ~]# dmesg
(aprobe0:ata0:0:0:0): Disable ATA disks on simulated ATA controller (0x7111=
8086)
(aprobe0:ata0:0:1:0): Disable ATA disks on simulated ATA controller (0x7111=
8086)
(probe1:(probe0:blkvsc0:0:1:0): Down reving Protocol Version from 4 to 2?
blkvsc0:0:0:0): storvsc inquiry (36) [0 0 4 2 1f ... ]
(probe1:blkvsc0:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
(probe0:blkvsc0:0:0:0): storvsc scsi_status =3D 2, srb_status =3D 6
(probe0:(probe1:blkvsc0:0:0:0): storvsc inquiry (5) [0 0 0 5 0 ... ]
(probe0:blkvsc0:0:1:1): Down reving Protocol Version from 4 to 2?
blkvsc0:0:0:0): storvsc inquiry (6) [0 83 0 1c 1 ... ]
(probe1:blkvsc0:0:1:1): invalid LUN 1 for op: INQUIRY
(probe0:blkvsc0:0:0:1): invalid LUN 1 for op: INQUIRY
(probe0:blkvsc1:0:0:0): Down reving Protocol Version from 4 to 2?
(probe0:blkvsc1:0:0:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
(probe1:(probe0:blkvsc1:0:1:0): storvsc inquiry (36) [0 0 5 2 1f ... ]
(probe1:blkvsc1:0:0:1): Down reving Protocol Version from 4 to 2?
blkvsc1:0:1:0): storvsc scsi_status =3D 2, srb_status =3D 6
(probe1:blkvsc1:0:1:0): storvsc inquiry (5) [0 0 0 8 0 ... ]
(probe0:blkvsc1:0:0:1): invalid LUN 1 for op: INQUIRY
(probe1:blkvsc1:0:1:0): storvsc inquiry (6) [0 83 0 30 1 ... ]
(probe0:storvsc2:0:0:0): Down reving Protocol Version from 4 to 2?
(probe1:(probe0:storvsc2:0:1:0): Down reving Protocol Version from 4 to 2?
storvsc2:0:0:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
(probe1:(probe0:storvsc2:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
(probe1:storvsc2:0:0:1): Down reving Protocol Version from 4 to 2?
(probe0:storvsc2:0:1:1): Down reving Protocol Version from 4 to 2?
(probe1:storvsc2:0:0:1): invalid LUN 1 for op: INQUIRY
storvsc2:0:1:1): invalid LUN 1 for op: INQUIRY
(probe0:storvsc3:0:0:0): Down reving Protocol Version from 4 to 2?
(probe1:(probe0:storvsc3:0:1:0): Down reving Protocol Version from 4 to 2?
storvsc3:0:0:0): storvsc inquiry (36) [0 0 4 2 1f ... ]
(probe1:(probe0:storvsc3:0:1:0): storvsc inquiry (36) [7f 0 5 2 1f ... ]
storvsc3:0:0:0): storvsc scsi_status =3D 2, srb_status =3D 6
(probe0:(probe1:storvsc3:0:0:0): storvsc inquiry (5) [0 0 0 5 0 ... ]
(probe0:storvsc3:0:1:1): Down reving Protocol Version from 4 to 2?
storvsc3:0:0:0): storvsc inquiry (6) [0 83 0 1c 1 ... ]
(probe1:storvsc3:0:1:1): invalid LUN 1 for op: INQUIRY
GEOM: new disk da2
(probe0:pass3 at storvsc3 bus 0 scbus5 target 0 lun 0
storvsc3:0:0:1): invalid LUN 1 for op: INQUIRY
pass3: <Msft Virtual Disk 1.0> Fixed Direct Access SPC-2 SCSI device
pass3: 300.000MB/s transfers
pass3: Command Queueing enabled
pass3 at storvsc3 bus 0 scbus5 target 0 lun 0
pass3: <Msft Virtual Disk 1.0> detached
(da2:storvsc3:0:0:0): fatal error, could not acquire reference count
(pass3:(da2:storvsc3:0:storvsc3:0:0:0): storvsc inquiry (5) [0 b1 0 3c 0 ..=
. ]
0:0): Periph destroyed
da2: Delete methods: <NONE(*),ZERO>
Opened disk da2 -> 6
g_access(918): provider da2 has error
g_access(918): provider da2 has error
g_access(918): provider da2 has error
g_access(918): provider da2 has error


According to the log, first it looks da2 is indeed detected, but then
it's detached immediately:

pass3: <Msft Virtual Disk 1.0> detached

The above line comes from xpt_denounce_periph_sbuf(), which is called by:

cam_periph_invalidate() at cam_periph_invalidate+0xe7/frame 0xfffffe00004eb=
7f0
xpt_async_process_dev() at xpt_async_process_dev+0x162/frame 0xfffffe00004e=
b840
xptdevicetraverse() at xptdevicetraverse+0xa3/frame 0xfffffe00004eb890
xpttargettraverse() at xpttargettraverse+0x6a/frame 0xfffffe00004eb8d0
xpt_async_process() at xpt_async_process+0x114/frame 0xfffffe00004eb9e0
xpt_done_process() at xpt_done_process+0x38a/frame 0xfffffe00004eba20
xpt_done_td() at xpt_done_td+0x136/frame 0xfffffe00004eba70
fork_exit() at fork_exit+0x84/frame 0xfffffe00004ebab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00004ebab


I'm trying to figure out why CAM detaches da2 -- it looks something is wron=
g?

Currently, with the below hack, da2 can be recognized and
reading/writing da2 works fine:

--- a/sys/cam/cam_periph.c
+++ b/sys/cam/cam_periph.c
@@ -1942,8 +1942,10 @@ cam_periph_error(union ccb *ccb, cam_flags camflags,
                         * Let peripheral drivers know that this
                         * device has gone away.
                         */
+#if 0
                        xpt_async(AC_LOST_DEVICE, newpath, NULL);
                        xpt_free_path(newpath);
+#endif
                }
        }

And I'm further debugging. I appreciate any suggestion.

Thanks!

-- Dexuan



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?CABkgQk-kUSgZ_BvgOOZNkxHr=7gPF5G3qn_be9tzrQ2j_fHSZg>