Date:      Thu, 29 Apr 2010 16:55:12 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        Pete French <petefrench@ticketswitch.com>
Cc:        freebsd-scsi@freebsd.org, scottl@FreeBSD.org, freebsd-stable@freebsd.org
Subject:   Re: MFC of "Large set of CAM improvements" breaks I/O to Adaptec 29160 SCSI controller
Message-ID:  <4BD98FC0.2030206@FreeBSD.org>
In-Reply-To: <4BD896AC.4080509@FreeBSD.org>
References:  <E1O7AGM-0005ux-8E@dilbert.ticketswitch.com> <4BD896AC.4080509@FreeBSD.org>

This is a multi-part message in MIME format.
--------------080401080207070003020809
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Alexander Motin wrote:
> Pete French wrote:
>>> I have some 29160N locally and I'll try to reproduce this.
>> I would suggest you try gmirror across two drives - that is how
>> both myself and the original poster first noticed the issue.
> 
> Thanks. First step successful - I can steadily reproduce the problem on
> CURRENT. raidtest with 200 I/O streams over a gmirror of two disks on the
> same channel triggers the issue in seconds. All I/O on the channel dies
> after both disks report a "Queue full" error at the same time. The rest of
> the system works fine. If I manually adjust the queue depth of one disk
> beforehand, everything works fine. I'll investigate it tomorrow.

It seems I've found the cause. The attached patch fixes the problem for me.

This call was removed by mistake in the specified commit. It is not needed
during normal operation, only when the device queue is shrinking. And even
in that case the problem often wasn't triggered if there were more requests
and the controller request allocation queue wasn't exhausted at that
moment. That's why the problem wasn't detected and why gmirror increased
its chances.

-- 
Alexander Motin

--------------080401080207070003020809
Content-Type: text/plain;
 name="cam.sched_fix.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="cam.sched_fix.patch"

--- cam_xpt.c.prev	2010-04-28 08:15:40.000000000 +0300
+++ cam_xpt.c	2010-04-29 16:01:23.000000000 +0300
@@ -4903,6 +4903,10 @@ camisr_runqueue(void *V_queue)
 			if ((dev->flags & CAM_DEV_TAG_AFTER_COUNT) != 0
 			 && (--dev->tag_delay_count == 0))
 				xpt_start_tags(ccb_h->path);
+			if (!device_is_send_queued(dev)) {
+				runq = xpt_schedule_dev_sendq(ccb_h->path->bus,
+				    dev);
+			}
 		}
 
 		if (ccb_h->status & CAM_RELEASE_SIMQ) {

--------------080401080207070003020809--


