Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 6 Feb 2010 18:20:29 +0100
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Rick Macklem <rmacklem@uoguelph.ca>
Cc:        rmacklem@freebsd.org, dfr@freebsd.org, freebsd-stable@freebsd.org, Peter Jeremy <peterjeremy@acm.org>, John Baldwin <jhb@freebsd.org>
Subject:   Re: uma_zalloc_arg complaining about non-sleepable locks
Message-ID:  <20100206172029.GK77522@alchemy.franken.de>
In-Reply-To: <Pine.GSO.4.63.1002011131080.6641@muncher.cs.uoguelph.ca>
References:  <20100126073336.GA1955@server.vk2pj.dyndns.org> <20100131010618.GA1864@server.vk2pj.dyndns.org> <20100131162854.GC77522@alchemy.franken.de> <201002010946.57253.jhb@freebsd.org> <Pine.GSO.4.63.1002011131080.6641@muncher.cs.uoguelph.ca>

next in thread | previous in thread | raw e-mail | index | archive | help
On Mon, Feb 01, 2010 at 11:54:54AM -0500, Rick Macklem wrote:
> 
> 
> On Mon, 1 Feb 2010, John Baldwin wrote:
> 
> >>>I'd say that your patch works.
> >>
> >>John, are you okay with that patch?
> >>http://people.freebsd.org/~marius/fha_extract_info_realign2.diff
> >>
> >>It's intention is to:
> >>- Move nfs_realign() from the NFS client to the shared NFS code and
> >>  remove the NFS server version in order to reduce code duplication.
> >>  The shared version now uses a second parameter how, which is passed
> >>  on to m_get(9) and m_getcl(9) as the server used M_WAIT while the
> >>  client requires M_DONTWAIT, and replaces the the previously unused
> >>  parameter hsiz.
> 
> I don't think the client side can use M_WAIT/M_WAITOK if it wants to.
> (I believe that it was once called from the socket upcall and that was
>  why M_DONTWAIT was used, but that isn't how it is called in the client
>  now.)
> 
> I mentioned this to dfr@ because I use a version shared between client
> and server for the experimental one that does M_WAIT, but he didn't
> think the regular client needed to be changed. Basically, I think the 
> current code in the regular client can return ENOMEM for I/O system calls, 
> which probably isn't what the app. expected. In the NFS client, you end up
> with this "do I wait forever?" or "do I return an error to an I/O system
> call which an app. doesn't expect" issue cropping up here and there. I
> don't know the correct answer to this, but tend to lean in the "wait 
> forever" direction.

Well, as demonstrated by the problem with fha_extract_info() "waiting
forever" must also match the locking so I'd rather leave chanGing the
regular NFS client to someone who understands it better than I do :)

Btw., newnfs_realign() should use #ifdef __NO_STRICT_ALIGNMENT instead
of #if defined(__i386__) in order to bypass realigning on platforms
which can deal with unaligned accesses.

> 
> >>- Change nfs_realign() to use nfsm_aligned() so as with other NFS code
> >>  the alignment check isn't actually performed on platforms without
> >>  strict alignment requirements for performance reasons because as the
> >>  comment suggests only occasionally occurs with TCP.
> >>- Change fha_extract_info() to use nfs_realign() with M_NOWAIT rather
> >>  than M_DONTWAIT because it's called with the RPC sp_lock held.
> >>
> 
> Btw, from an historical perspective, this was originally added for the
> DEC PMAX port (MIPS R2000), which would trap whenever an unaligned ptr
> was used. Then, there was this involved chunk of MIPS assembly code that
> worked around the unaligned ptr access and the trap returned. Obviously 
> this was a major performance hit and happened fairly frequently for NFS 
> over ISO TP4. As you've seen, it happens infrequently enough over TCP 
> (back then, I think it was only when the TCP window size ended up really 
> small, although I'm way out of date on the TCP stack, so I have no idea 
> now:-) that I'd only do the re-alignment on achitectures that will crash
> if it isn't done?
> 
> I missed the beginning of this. Was there crashes occurring because
> the alignment wasn't being done or problems because the alignment
> wasn't working correctly? (I guess I'm trying to say that, if the
> arch works without nfs_realign(), then you might consider just not
> doing it for that arch. I suppose that isn't a good enough reason
> to not fix the function, though?;-)

The original problem with fha_extract_info() was that it didn't
realign the mbuf data at all but called nfsm_srvmtofh_xx() which
assumes the 4-byte alignment required by XDR.
As mentioned above, changing nfs_realign() to use nfsm_aligned()
has the effect you propose of not performing the realignment on
platforms which can deal with unaligned acesses. Actually one
could take this one step further and #ifdef the whole nfs_realign()
like you do with newnfs_realign() but nfsm_aligned() already
existed and I'm reluctant to rototill foreign code too much. Also
the compiler should be smart enough to optimize this down to just
incrementing nfs_realign_test when __NO_STRICT_ALIGNMENT is defined.

Marius




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20100206172029.GK77522>