From: Marcel Moolenaar
To: "freebsd-arch@freebsd.org Arch"
Cc: Tim LaBerge, Alan Cox
Subject: Behavior of madvise(MADV_FREE)
Date: Fri, 12 Oct 2012 09:25:34 -0700
Message-Id: <9FEBC10C-C453-41BE-8829-34E830585E90@xcllnt.net>

All,

Juniper has been intrigued for a while by the behaviour of
madvise(MADV_FREE). Let me give a bit of context before asking
questions:

1. We have an important daemon that needs lots of memory. It uses
   sbrk()/brk() to extend its address space and uses
   madvise(MADV_FREE) to inform the kernel when a chunk of memory
   is effectively unused (the chunk is first zeroed).
2. Most of the time memory usage of the daemon is pretty stable,
   but under certain conditions it can spike, after which it drops
   back to a new stability point (either higher or lower).
3. Obviously the daemon is not the only component in the system, so
   whatever it doesn't need we very likely need somewhere else --
   badly in some cases, so in those cases we would like the memory
   to be recycled immediately.

Now on to the questions:

1. madvise(MADV_FREE) marks the pages as clean and moves them to
   the inactive queue. Why isn't the reference state cleared on
   either the page or the TLB?
2. Why aren't the pages moved to the cache queue in the first place?
3. What would be the impact or consequence of changing the behaviour
   of madvise(MADV_FREE) to mark the page as clean and unreferenced
   and have the page moved to the cache queue (or even the free
   queue)?

Ad 1: When the system is under memory pressure, the pageout daemon
      scans the inactive queue to try to move pages to the cache or
      free queue. Because the MADV_FREE'd pages still have
      PG_REFERENCED set, or the underlying TLBs still have the access
      flag set, these pages actually get bumped to the active queue.

Ad 2: MADV_DONTNEED is there to signal that the pages contain valid
      data, but that the pages are not needed right now. Using this,
      pages get moved to the inactive queue. That makes sense. But
      MADV_FREE signals that there's no valid data anymore and that
      the page may be demand zeroed on next reference. The page is
      not inactive. It's free. If the page was zeroed before calling
      MADV_FREE, the page really caches contents that can be
      recreated later (by the demand zero).

Thanks,

-- 
Marcel Moolenaar
marcel@xcllnt.net
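
For reference, the usage pattern from item 1 of the context (zero a chunk, then hand it back with madvise(MADV_FREE)) can be sketched in C roughly as below. This is only an illustration under stated assumptions, not the daemon's actual code: the helper names chunk_alloc() and chunk_release() are hypothetical, and the sketch uses mmap() instead of sbrk()/brk() so that the region is guaranteed to be page aligned. On systems whose headers lack MADV_FREE it falls back to MADV_DONTNEED, which has different semantics.

```c
#define _DEFAULT_SOURCE          /* expose MAP_ANON and MADV_FREE on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: obtain len bytes of private anonymous memory.
 * The daemon in the mail uses sbrk()/brk(); mmap() is substituted here
 * because it always returns a page-aligned address.
 */
void *chunk_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}

/*
 * Hypothetical helper: zero the chunk, then advise the kernel that its
 * contents are no longer needed. Returns madvise()'s result: 0 on
 * success, -1 on error (for example if the running kernel rejects the
 * advice value).
 */
int chunk_release(void *p, size_t len)
{
    long pagesize = sysconf(_SC_PAGESIZE);

    /* madvise() expects a page-aligned start address. */
    assert(pagesize > 0 && (uintptr_t)p % (uintptr_t)pagesize == 0);

    memset(p, 0, len);           /* the chunk is zeroed first */
#ifdef MADV_FREE
    return madvise(p, len, MADV_FREE);
#else
    return madvise(p, len, MADV_DONTNEED);   /* fallback, not equivalent */
#endif
}
```

Because the chunk is zeroed before the call, the caller reads zeroes afterwards whether the kernel keeps the pages or replaces them with demand-zero pages on the next reference, which is the property the "Ad 2" argument relies on.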