From owner-freebsd-stable@FreeBSD.ORG Thu Apr 29 13:55:40 2010
Sender: Alexander Motin
Message-ID: <4BD98FC0.2030206@FreeBSD.org>
Date: Thu, 29 Apr 2010 16:55:12 +0300
From: Alexander Motin
User-Agent: Thunderbird 2.0.0.24 (X11/20100402)
MIME-Version: 1.0
To: Pete French
References: <4BD896AC.4080509@FreeBSD.org>
In-Reply-To: <4BD896AC.4080509@FreeBSD.org>
X-Enigmail-Version: 0.96.0
Content-Type: multipart/mixed; boundary="------------080401080207070003020809"
Cc: freebsd-scsi@freebsd.org, chuzzwassa@gmail.com, scottl@FreeBSD.org, freebsd-stable@freebsd.org
Subject: Re: MFC of "Large set of CAM improvements" breaks I/O to Adaptec 29160 SCSI controller
List-Id: Production branch of FreeBSD source code

This is a multi-part message in MIME format.
--------------080401080207070003020809
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Alexander Motin wrote:
> Pete French wrote:
>>> I have some 29160N locally and I'll try to reproduce this.
>> I would suggest you try gmirror across two drives - that is how
>> both myself and the original poster first noticed the issue.
>
> Thanks. First step successful - I can steadily reproduce problem on
> CURRENT. raidtest with 200 I/O streams over gmirror of two disks on same
> channel triggers issue in seconds. Any I/O on channel dying after both
> disks report "Queue full" error same time. The rest of system works
> fine. If I preliminarily manually adjust queue depth of one disk -
> everything works fine. I'll investigate it tomorrow.

It seems I have found the cause. The attached patch fixes the problem
for me.

This call was removed by mistake in the specified commit. It is not
needed during normal operation, only when the device queue is shrinking.
Even in that case, the problem was often not triggered if there were
more requests and the controller's request allocation queue was not
exhausted at that moment. That is why the problem was not detected
earlier and why gmirror increased its chances of occurring.

-- 
Alexander Motin

--------------080401080207070003020809
Content-Type: text/plain; name="cam.sched_fix.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="cam.sched_fix.patch"

--- cam_xpt.c.prev	2010-04-28 08:15:40.000000000 +0300
+++ cam_xpt.c	2010-04-29 16:01:23.000000000 +0300
@@ -4903,6 +4903,10 @@ camisr_runqueue(void *V_queue)
 			if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0
 			 && (--dev->tag_delay_count == 0))
 				xpt_start_tags(ccb_h->path);
+			if (!device_is_send_queued(dev)) {
+				runq = xpt_schedule_dev_sendq(ccb_h->path->bus,
+							      dev);
+			}
 		}
 
 		if (ccb_h->status & CAM_RELEASE_SIMQ) {
--------------080401080207070003020809--