Date: Fri, 9 Jan 2009 14:09:11 +0100
From: Luigi Rizzo
To: arch@freebsd.org
Cc: fabio@gandalf.sssup.it
Subject: tagging disk requests (geom-related)
Message-ID: <20090109130911.GA17017@onelab2.iet.unipi.it>

Hi,
together with Fabio Checconi we are working on some disk schedulers
using a geom class.

One of the things we have to do is tag requests (struct bio) with the
identity of the thread issuing the request (or some other
classification information; which one is irrelevant here).

We cannot do the tagging when we reach our geom class because by then
we have lost the thread information (curthread is "g_down" at that
point).

So there are two issues here: 1) where to store the tag, 2) who does
the tagging.

(Background for non geom-aware people: each request is identified by a
'struct bio' (BIO); when moving from a GEOM class to the next one
downstream, a BIO is cloned and possibly split (e.g., to do slicing,
RAID, or simply breaking up large requests), and each child BIO has a
pointer to the parent BIO. Overall they are connected in a tree, even
though most often there is just a linear chain.)

For #1, to avoid adding a field to 'struct bio', we store the tag in
the bio_caller1 field (if available) of the root element of the BIO
tree. bio_caller1 is normally unused (except by ZFS), and we say it is
available if it contains NULL. For the reasons stated above, we cannot
store the mark in the BIO associated with our geom class, because that
BIO does not exist yet when we need to store the mark.

Re #2, we can put the code that does the marking either in the place
where the root BIO is created (but there may be many such places, and
in particular they can be in external modules that we are not even
aware of), or hook into a central place, g_io_request(), and walk up
the BIO tree until we find the root:

	{
		struct bio *top = bp;

		/* Walk up to the root of the BIO tree. */
		while (top->bio_parent)
			top = top->bio_parent;
		/* Tag the root only if bio_caller1 is not in use. */
		if (top->bio_caller1 == NULL)
			top->bio_caller1 = (void *)(uintptr_t)curthread->td_tid;
	}

We opted for the latter. The drawbacks of this approach are that we
are writing into a BIO that is not ours, and that bio_caller1 might be
unavailable (e.g. in the ZFS case).

The alternative approach is adding one field to the struct bio -- in
this case the marking could just be done on the current BIO in
g_io_request(), and propagated down in g_clone_bio() (or, in the
lookup only, by walking up the tree to the topmost marked bio).

So, do you have any better ideas, e.g. other fields in the topmost bio
that we can use?

	cheers
	luigi
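
For anyone who wants to compare the two options outside the kernel,
here is a minimal user-space sketch. It is not kernel code: 'struct
bio' is reduced to the two fields mentioned above plus a hypothetical
new field bio_tag, and tag_root(), tag_bio(), clone_bio() and
lookup_tag() are made-up names standing in for the logic that would
live in g_io_request() and g_clone_bio(). The tag is passed in as a
plain integer instead of being read from curthread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for the kernel's struct bio (assumption: only these fields). */
struct bio {
	struct bio *bio_parent;  /* NULL for the root of the BIO tree */
	void       *bio_caller1; /* option 1: tag stored in the root, NULL == free */
	uintptr_t   bio_tag;     /* option 2: hypothetical new field, 0 == untagged */
};

/*
 * Option 1: walk up to the root and store the tag in bio_caller1.
 * Returns 1 if this call stored the tag, 0 if the field was already
 * in use (by an earlier tag or by another consumer such as ZFS).
 */
int
tag_root(struct bio *bp, uintptr_t tag)
{
	struct bio *top = bp;

	assert(bp != NULL);
	while (top->bio_parent != NULL)
		top = top->bio_parent;
	if (top->bio_caller1 != NULL)
		return (0);
	top->bio_caller1 = (void *)tag;
	return (1);
}

/* Option 2: mark the current BIO itself, unless it already carries a tag. */
void
tag_bio(struct bio *bp, uintptr_t tag)
{
	if (bp->bio_tag == 0)
		bp->bio_tag = tag;
}

/* Option 2, propagation: the child inherits the parent's tag when cloned. */
void
clone_bio(struct bio *parent, struct bio *child)
{
	child->bio_parent = parent;
	child->bio_caller1 = NULL;
	child->bio_tag = parent->bio_tag;
}

/* Option 2, lookup variant: return the tag of the topmost marked BIO, or 0. */
uintptr_t
lookup_tag(const struct bio *bp)
{
	uintptr_t tag = 0;

	for (; bp != NULL; bp = bp->bio_parent)
		if (bp->bio_tag != 0)
			tag = bp->bio_tag;
	return (tag);
}
```

With option 1 a second tag_root() call on the same tree is a no-op,
which is why it is safe to call it at every level; with option 2 the
propagation in clone_bio() and the walk in lookup_tag() are
alternatives, and only one of them would be needed.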