Date:      Thu, 05 Sep 2013 15:40:45 +0300
From:      Alexander Motin <mav@FreeBSD.org>
To:        hackers@freebsd.org
Subject:   Again about pbuf_mtx
Message-ID:  <52287BCD.4090507@FreeBSD.org>

Hi.

Some may remember that not so long ago I complained about high lock
contention on pbuf_mtx. At that time, switching the mutex to padalign
reduced the problem. But now, after improving scalability in CAM and GEOM
and doing more than half a million IOPS on a 32-core system, I am again
hitting that problem heavily: hwpmc shows about 30% of CPU time spent
spinning on that mutex, and another 30% spent on threads trying to go to
sleep on that mutex and getting more collisions there.

Trying to mitigate that, I've made a patch
(http://people.freebsd.org/~mav/pcpu_pbuf.patch) to split the single queue
of pbufs into several. That definitely costs some amount of KVA and
memory, but in my tests it fixes the problem radically, removing any
measurable contention there. The patch is not complete and doesn't even
boot on i386 now, but I would like to hear opinions about the approach,
or maybe some better proposals.
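
To make the idea concrete, here is a minimal userspace sketch of
splitting one locked free list into several independently locked ones.
It is not the code from the patch above. All names (NSHARDS,
BUFS_PER_SHARD, pool_init, pool_get, pool_put) are hypothetical, pthread
mutexes stand in for kernel mutexes, and the 64-byte alignment is an
assumed cache-line size. Where a real allocator would likely sleep on
exhaustion, this sketch simply returns NULL.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NSHARDS        4   /* hypothetical: e.g. one shard per CPU or CPU group */
#define BUFS_PER_SHARD 8   /* hypothetical: buffers preallocated per shard */

struct buf {
    struct buf *next;      /* free-list link */
    int         shard;     /* home shard, so a buffer returns where it came from */
};

struct shard {
    /* Assumed 64-byte cache line: keeps each lock on its own line. */
    _Alignas(64) pthread_mutex_t lock;
    struct buf *free_list;
    struct buf  bufs[BUFS_PER_SHARD];
};

static struct shard shards[NSHARDS];

/* Build one private free list per shard. */
static void pool_init(void)
{
    for (int s = 0; s < NSHARDS; s++) {
        pthread_mutex_init(&shards[s].lock, NULL);
        shards[s].free_list = NULL;
        for (int i = 0; i < BUFS_PER_SHARD; i++) {
            struct buf *b = &shards[s].bufs[i];
            b->shard = s;
            b->next = shards[s].free_list;
            shards[s].free_list = b;
        }
    }
}

/*
 * Take a buffer, starting at the caller's preferred shard (for example
 * the current CPU number) and falling back to the others only when the
 * preferred one is empty. Returns NULL if every shard is empty.
 */
static struct buf *pool_get(unsigned hint)
{
    for (unsigned i = 0; i < NSHARDS; i++) {
        struct shard *sh = &shards[(hint + i) % NSHARDS];
        pthread_mutex_lock(&sh->lock);
        struct buf *b = sh->free_list;
        if (b != NULL)
            sh->free_list = b->next;
        pthread_mutex_unlock(&sh->lock);
        if (b != NULL)
            return b;
    }
    return NULL;
}

/* Return a buffer to its home shard; only that shard's lock is taken. */
static void pool_put(struct buf *b)
{
    assert(b != NULL);
    struct shard *sh = &shards[b->shard];
    pthread_mutex_lock(&sh->lock);
    b->next = sh->free_list;
    sh->free_list = b;
    pthread_mutex_unlock(&sh->lock);
}
```

The memory cost mentioned above shows up here as NSHARDS separate
preallocated arrays. In exchange, threads that pass different hints
normally touch different locks.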

Another patch I've made
(http://people.freebsd.org/~mav/si_threadcount.patch) removes lock
acquisition from dev_relthread() by using atomics for reference
counting. That fixes another point of contention I see. This patch looks
fine to me, and the only contention I see after that is on HBA driver
locks, but maybe I am missing something?
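
As an illustration of the general technique only, the following
userspace sketch drops a reference with a single atomic operation
instead of a lock/decrement/unlock sequence. It uses C11 <stdatomic.h>
rather than the kernel's atomic(9) primitives. The struct and function
names (dev_sketch, ref_thread, rel_thread) are hypothetical. It does not
model how the acquire side synchronizes with device destruction, which
is the part most likely to need care.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a device structure with a thread counter. */
struct dev_sketch {
    atomic_long threadcount;
};

/* Take a reference. No ordering is needed for a plain increment. */
static void ref_thread(struct dev_sketch *d)
{
    atomic_fetch_add_explicit(&d->threadcount, 1, memory_order_relaxed);
}

/*
 * Drop a reference without taking any lock. Release ordering makes this
 * thread's earlier accesses to the device visible before the count is
 * seen to drop. Returns the count remaining after the decrement.
 */
static long rel_thread(struct dev_sketch *d)
{
    long old = atomic_fetch_sub_explicit(&d->threadcount, 1,
                                         memory_order_release);
    assert(old > 0);   /* dropping a reference that was never taken is a bug */
    return old - 1;
}
```

Whoever waits for the count to reach zero before tearing the device
down would need a matching acquire load, or some other form of
synchronization, on its side.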

Thank you.

-- 
Alexander Motin


