From: pluknet <pluknet@gmail.com>
To: FreeBSD Stable Mailing List <freebsd-stable@freebsd.org>
Date: Wed, 28 Jan 2009 19:05:29 +0300
Subject: smp_tlb_shootdown bottleneck?

Hi.

Sometimes I see heavy contention in smp_tlb_shootdown while running sysbench:

sysbench --test=fileio --num-threads=8 --file-test-mode=rndrd --file-total-size=3G run

kern.smp.cpus: 8
FreeBSD 7.1-R

CPU:  0.8% user,  0.0% nice, 93.8% system,  0.0% interrupt,  5.4% idle
Mem: 11M Active, 2873M Inact, 282M Wired, 8K Cache, 214M Buf, 765M Free
Swap: 4096M Total, 4096M Free

  PID USERNAME  PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
  810 root      115    0 11568K  2408K RUN    3   0:04 89.60% sysbench
  810 root      114    0 11568K  2408K CPU0   0   0:18 87.06% sysbench
  810 root      114    0 11568K  2408K CPU2   2   0:18 86.67% sysbench
  810 root      114    0 11568K  2408K CPU6   6   0:18 84.47% sysbench
  810 root      114    0 11568K  2408K CPU3   3   0:04 84.08% sysbench
  810 root      114    0 11568K  2408K CPU7   7   0:17 81.69% sysbench
  810 root      113    0 11568K  2408K CPU4   4   0:17 78.08% sysbench
  810 root      113    0 11568K  2408K CPU1   1   0:17 77.88% sysbench

This high system load shows up only from time to time: usually the sysbench test completes in a few seconds, but when this happens a run takes more than 10 seconds.
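The backtraces below show the threads piling up on a spin mutex inside smp_tlb_shootdown. To illustrate why that pattern serializes so badly, here is a small user-space model I put together (an illustration only, not the kernel code; the kernel objects it loosely mimics, smp_ipi_mtx and the shootdown wait counter, are named from my reading of the amd64 code and may not match exactly). Every "CPU" that wants to do a shootdown has to take one global lock and then busy-wait until every other CPU acknowledges, so with 8 cores all shootdowns are fully serialized and the spinning shows up as system time. It needs a C11 compiler (cc -O2 -pthread ipi_model.c -o ipi_model); the numbers it prints are only meant to show how the cost grows with the thread count, not to be compared with the kernel.

/*
 * ipi_model.c -- toy user-space model of the serialization the backtraces
 * suggest: one global lock per shootdown, plus a busy-wait until all other
 * "CPUs" acknowledge.  Illustration only, not the FreeBSD kernel code.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define NCPU            8
#define SHOOTDOWNS      200000

static pthread_mutex_t ipi_mtx = PTHREAD_MUTEX_INITIALIZER; /* plays the role of smp_ipi_mtx */
static atomic_int generation;   /* bumped once per shootdown */
static atomic_int acks;         /* plays the role of the shootdown wait counter */
static atomic_int done;

static _Thread_local int seen;  /* last generation this thread acknowledged */

/* Acknowledge a pending shootdown, as a CPU would in its IPI handler. */
static void
service_ipi(void)
{
	int g = atomic_load(&generation);

	if (g != seen) {
		seen = g;
		atomic_fetch_add(&acks, 1);     /* "TLB flushed" */
	}
}

static void *
cpu(void *arg)
{
	(void)arg;

	for (int i = 0; i < SHOOTDOWNS; i++) {
		/* Spin for the one global lock; keep taking "IPIs" while
		 * spinning, as real CPUs do with interrupts enabled. */
		while (pthread_mutex_trylock(&ipi_mtx) != 0)
			service_ipi();

		/* Issue a shootdown and wait for everyone else to ack. */
		atomic_store(&acks, 0);
		seen = atomic_fetch_add(&generation, 1) + 1; /* don't ack our own */
		while (atomic_load(&acks) < NCPU - 1)
			;                       /* busy-wait, like the kernel's pause loop */
		pthread_mutex_unlock(&ipi_mtx);
	}

	/* Keep answering IPIs until every thread is finished. */
	atomic_fetch_add(&done, 1);
	while (atomic_load(&done) < NCPU)
		service_ipi();
	return (NULL);
}

int
main(void)
{
	pthread_t tid[NCPU];
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int i = 0; i < NCPU; i++)
		pthread_create(&tid[i], NULL, cpu, NULL);
	for (int i = 0; i < NCPU; i++)
		pthread_join(tid[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%d threads, %d shootdowns each: %.2f s\n", NCPU, SHOOTDOWNS,
	    (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
	return (0);
}

With one thread the loop is essentially free; with 8 threads every iteration has to wait for 7 acknowledgments under a single lock, which is the same shape of slowdown I see in the traces below.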
Breaking into ddb while the system load is that high shows many sysbench threads waiting for a spin mutex in smp_tlb_shootdown:

db> bt 100133
Tracing pid 810 tid 100133 td 0xffffff00032d56e0
cpustop_handler() at cpustop_handler+0x40
ipi_nmi_handler() at ipi_nmi_handler+0x30
trap() at trap+0x378
nmi_calltrap() at nmi_calltrap+0x8
--- trap 0x13, rip = 0xffffffff8074c912, rsp = 0xffffffffab79fff0, rbp = 0xffffffffb4acf6f0 ---
smp_tlb_shootdown() at smp_tlb_shootdown+0x82
pmap_invalidate_range() at pmap_invalidate_range+0xae
vfs_vmio_release() at vfs_vmio_release+0x120
getnewbuf() at getnewbuf+0x7be
getblk() at getblk+0x308
cluster_read() at cluster_read+0xc5
ffs_read() at ffs_read+0x37d
vn_read() at vn_read+0x17b
dofileread() at dofileread+0xa1
kern_preadv() at kern_preadv+0x66
pread() at pread+0x58
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (475, FreeBSD ELF64, pread), rip = 0x800cf7b5c, rsp = 0x7ffffeff8e78, rbp = 0x7ffffeff8f60 ---

db> bt 100132
Tracing pid 810 tid 100132 td 0xffffff00036426e0
cpustop_handler() at cpustop_handler+0x40
ipi_nmi_handler() at ipi_nmi_handler+0x30
trap() at trap+0x378
nmi_calltrap() at nmi_calltrap+0x8
--- trap 0x13, rip = 0xffffffff8048b49c, rsp = 0xffffffff80b564b0, rbp = 0xffffffffb4aca680 ---
_mtx_lock_spin() at _mtx_lock_spin+0x6c
_mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x124
smp_tlb_shootdown() at smp_tlb_shootdown+0x58
pmap_invalidate_range() at pmap_invalidate_range+0xae
vfs_vmio_release() at vfs_vmio_release+0x120
getnewbuf() at getnewbuf+0x7be
getblk() at getblk+0x308
cluster_read() at cluster_read+0xc5
ffs_read() at ffs_read+0x37d
vn_read() at vn_read+0x17b
dofileread() at dofileread+0xa1
kern_preadv() at kern_preadv+0x66
pread() at pread+0x58
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (475, FreeBSD ELF64, pread), rip = 0x800cf7b5c, rsp = 0x7fffff1f9e78, rbp = 0x7fffff1f9f60 ---

db> bt 100131
Tracing pid 810 tid 100131 td 0xffffff0003a75000
cpustop_handler() at cpustop_handler+0x40
ipi_nmi_handler() at ipi_nmi_handler+0x30
trap() at trap+0x378
nmi_calltrap() at nmi_calltrap+0x8
--- trap 0x13, rip = 0xffffffff8048b4a1, rsp = 0xffffffffab7a4ff0, rbp = 0xffffffffb4ac5680 ---
_mtx_lock_spin() at _mtx_lock_spin+0x71
_mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x124
smp_tlb_shootdown() at smp_tlb_shootdown+0x58
pmap_invalidate_range() at pmap_invalidate_range+0xae
vfs_vmio_release() at vfs_vmio_release+0x120
getnewbuf() at getnewbuf+0x7be
getblk() at getblk+0x308
cluster_read() at cluster_read+0xc5
ffs_read() at ffs_read+0x37d
vn_read() at vn_read+0x17b
dofileread() at dofileread+0xa1
kern_preadv() at kern_preadv+0x66
pread() at pread+0x58
syscall() at syscall+0x1ce
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (475, FreeBSD ELF64, pread), rip = 0x800cf7b5c, rsp = 0x7fffff3fae78, rbp = 0x7fffff3faf60 ---

Is this normal behavior, and if so, how can I help with it?

--
wbr, pluknet