Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 17 Dec 2019 20:29:47 +0000 (UTC)
From:      "Kenneth D. Merry" <ken@FreeBSD.org>
To:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-12@freebsd.org
Subject:   svn commit: r355862 - stable/12/sys/cam/scsi
Message-ID:  <201912172029.xBHKTlpE007701@repo.freebsd.org>

next in thread | raw e-mail | index | archive | help
Author: ken
Date: Tue Dec 17 20:29:47 2019
New Revision: 355862
URL: https://svnweb.freebsd.org/changeset/base/355862

Log:
  MFC r355299:
  
    ------------------------------------------------------------------------
    r355299 | ken | 2019-12-02 14:57:39 -0500 (Mon, 02 Dec 2019) | 52 lines
  
    Fix a hang introduced in r351599.
  
    My changes in 351599 (kindly committed by avg) made the cd(4) media check
    asynchronous to avoid a sleep while holding a mutex.
  
    There was a difficult to reproduce bug with those changes that caused a
    hang on boot on some single processor machines/VMs.  Leandro Lupori
    managed to reproduce the bug, diagnose it, and supplied a patch!  Here is
    his analysis, from the PR:
  
    ======
    I was able to reproduce the problem described in comment#14.
  
    Actually, I wasn't trying to reproduce it, I just started seeing it a few
    weeks ago, in CURRENT.
  
    I can reproduce it consistently, by using QEMU to run a PowerPC64 VM with a
    single core/thread (-smp 1).
  
    It happens only when there is no media in the emulated CD-ROM, a device
    that QEMU adds by default, unless -nodefaults is specified in command line.
  
    I've debugged it and this is what I've found:
  
    1- After the CD probe is successful, GEOM will try to open the device,
    which will end up calling cdcheckmedia(), that sets CD state to
    CD_STATE_MEDIA_PREVENT.
    2- Next, scsi_prevent() is executed and succeeds, the CD_FLAG_DISC_LOCKED
    flag is set and CD state moves to CD_STATE_MEDIA_SIZE.
    3- Next, scsi_read_capacity() is executed and fails, state is set to
    CD_STATE_MEDIA_ALLOW, cdmediaprobedone() is called and wakes up
    cdcheckmedia().
    4- Then, when cdstart() is invoked to process CD_STATE_MEDIA_ALLOW, it
    first checks if CD_FLAG_DISC_LOCKED is set, and if so skips directly to
    CD_STATE_MEDIA_SIZE state. This will repeat the steps of bullet 3, entering
    an infinite MEDIA_SIZE command loop.
  
    When there is a least another core/thread, the GEOM thread that performed
    the initial cdopen() will get scheduled again, closing the CD device, that
    will call cdprevent(PR_ALLOW) that clears the CD_FLAG_DISC_LOCKED flag and
    breaks the loop.
  
    So, apparently, the problem is CD_STATE_MEDIA_ALLOW being skipped when
    CD_FLAG_DISC_LOCKED is set. If I understand correctly, in this case, the
    state should be advanced to CD_STATE_MEDIA size only when the current state
    is CD_STATE_MEDIA_PREVENT.
    =====
    ------------------------------------------------------------------------
  
  PR:		kern/219857
  Submitted by:	Leandro Lupori <leandro.lupori@gmail.com>

Modified:
  stable/12/sys/cam/scsi/scsi_cd.c
Directory Properties:
  stable/12/   (props changed)

Modified: stable/12/sys/cam/scsi/scsi_cd.c
==============================================================================
--- stable/12/sys/cam/scsi/scsi_cd.c	Tue Dec 17 19:01:09 2019	(r355861)
+++ stable/12/sys/cam/scsi/scsi_cd.c	Tue Dec 17 20:29:47 2019	(r355862)
@@ -1030,7 +1030,8 @@ cdstart(struct cam_periph *periph, union ccb *start_cc
 		 * If the CD is already locked, we don't need to do this.
 		 * Move on to the capacity check.
 		 */
-		if ((softc->flags & CD_FLAG_DISC_LOCKED) != 0) {
+		if (softc->state == CD_STATE_MEDIA_PREVENT
+		 && (softc->flags & CD_FLAG_DISC_LOCKED) != 0) {
 			softc->state = CD_STATE_MEDIA_SIZE;
 			xpt_release_ccb(start_ccb);
 			xpt_schedule(periph, CAM_PRIORITY_NORMAL);



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?201912172029.xBHKTlpE007701>