From: Alan Somers
Date: Tue, 14 Nov 2017 11:26:29 -0700
Subject: Re: svn commit: r308217 - in head/sys/dev: mpr mps
To: Harry Schmalzbauer
Cc: Scott Long, "src-committers@freebsd.org", "svn-src-all@freebsd.org", "svn-src-head@freebsd.org"

On Thu, Nov 10, 2016 at 12:54 AM, Harry Schmalzbauer wrote:
> Regarding Scott Long's message of 09.11.2016 17:06 (localtime):
>>
>>> On Nov 4, 2016, at 3:18 AM, Harry Schmalzbauer wrote:
> …
>>> If it's really mps(4) that decides to store driveserial-targetID
>>> numbering in the "persistent non-manufacturing config pages" of the
>>> controller, mpsutil(8) should be able to reset it. Otherwise replacing
>>> failed drives, or - even more confusing - rearranging drive/zpool layouts
>>> is very unsatisfying.
>>>
>>> Maybe "-1" should be mentioned in the sysctl description, otherwise this is
>>> another very hard to find/influence behaviour.
>>>
>>
>> Thanks for the feedback. For the record, this problem happens on a
>> Supermicro X10SDV-7TP4F motherboard. It appears that the support
>> logic around the LSI controller is mis-configured to show the SAS ports
>> being part of an enclosure with 0 slots, instead of 8. This confuses
>> the device mapper logic in the driver that activates if the controller NVRAM
>> doesn’t specify a pre-existing mapping. Typically this is not the default,
>> the NVRAM persistent mappings are the default and are used by the driver,
>> so I considered this problem to be unique to our deployment. Maybe it’s
>> more of a problem than I estimated? Anyways, sounds like this new
>
> I haven't had too much diversity regarding Fusion-MPT and this mapping
> problem has never hit me yet, so I can't help estimating the severity of
> that specific problem.
>
> But I think I haven't described clearly that I'm having da(4)-numbering
> problems which are not directly enclosure-mapping related (at least not
> related to the nvram mapping page), but which hopefully could be worked
> around the same way:
>
> I frequently had problems replacing drives due to the eternal targetID
> assigning (every drive with a new/unknown serial gets a consecutive
> targetID regardless of the enclosure-slot or the number of attached drives).
> I guess this is stored in a completely different NVRAM page than the
> enclosure-mapping page.
> Your patch is intended to solve problems with invalid/absent
> enclosure-mapping page, but I guess I'll sooner or later need to try the
> "hw.mpr.use_phy_num=-1" sysctl
> to hopefully overwrite the targetID++ assigning, which causes "holes"
> every time a drive gets replaced.
> In that case it's just a cosmetic problem, but when rearranging old and
> new drives on the same controller, it can cause severe confusion for the
> admins – leading to fatal mistakes. And migrating disks to a new
> controller/chassis is even more problematic if the host had hard-wired
> da(4) assignments via device.hints.
>
> I couldn't search the driver to find out if the "save eternal targetID
> in nvram page N" is really present and not firmware-only induced, but
> since I saw a different behaviour on windows, I guess it is.
> I could circumvent the problem by simply using IR firmware since it is
> only active when mps(4) runs IT firmware.
>
> But having a way to disable "save eternal targetID in nvram page N" for
> mps(4)-IT (via sysctl, and possibly for mpr/mpt also) would be very welcome.
>
> Top on my wishlist was extending mpsutil(8) to be able to list and
> selectively delete single serial-targetID mappings, but I haven't even
> found a way to do that with any vendor provided tool, not even with
> LSIUtil – where I can at least erase _all_ mappings.
>
>
>> functionality should be properly documented in the driver.
>
> Thanks for your continuous support/help/improvements!
>
> -Harry

We've run into a situation where we had to use this fallback logic
too. In our case, it happened when upgrading mpr HBA firmware from
14.0.0.0 to 15.0.0.0, which corrupted the internal mapping tables. I
fixed the resulting panic in r325363, but that exposed a problem with
the fallback logic. Since the fallback logic uses the phy number as
the target ID, it doesn't work on SAS busses with more than one
expander. In that case, multiple drives can get the same target ID,
and only the first is usable. In our codebase, I fixed this by
setting the id to
((config_page.EnclosureHandle - 1) << 7) | config_page.Slot.
That formula works for up to 8 enclosures of up to 128 slots each.
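
As a rough illustration of that formula (a standalone sketch, not the
driver code: the struct and function names are made up for this
example, only EnclosureHandle and Slot are the config page fields
mentioned above, and enclosure handles are assumed to start at 1):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two config page fields used above. */
struct dev_page {
	uint16_t EnclosureHandle;	/* assumed to be >= 1 */
	uint16_t Slot;			/* assumed to be < 128 */
};

/*
 * Target ID per the formula above: the low 7 bits hold the slot,
 * the bits above them hold the zero-based enclosure index.
 */
unsigned int
fallback_target_id(const struct dev_page *pg)
{
	return ((((unsigned int)pg->EnclosureHandle - 1) << 7) | pg->Slot);
}

/* Worked examples; call this from main() or a test harness. */
void
fallback_target_id_examples(void)
{
	struct dev_page a = { 1, 5 };	/* first enclosure, slot 5 */
	struct dev_page b = { 2, 5 };	/* second enclosure, slot 5 */
	struct dev_page dup = { 2, 5 };	/* another drive reporting the same slot */
	struct dev_page last = { 8, 127 };

	assert(fallback_target_id(&a) == 5);
	assert(fallback_target_id(&b) == 133);		/* (1 << 7) | 5 */
	assert(fallback_target_id(&last) == 1023);	/* (7 << 7) | 127 */
	/* Two drives with the same handle and slot get the same ID. */
	assert(fallback_target_id(&dup) == fallback_target_id(&b));
}
```

Drives in the same slot of different enclosures get distinct IDs
here, which is what the plain phy number could not guarantee.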
However, it will obviously fail if the enclosure assigns the same Slot
to multiple drives. That sounds like the case for Scott. Are there
any alternatives I'm missing that would satisfy everyone?

-Alan