From: Mark Johnston
Date: Tue, 29 May 2018 12:22:17 -0400
To: Andriy Gapon
Cc: freebsd-current, Julian Elischer, Bryan Drewery
Subject: Re: Bad link elm in vm_object_terminate [Was: crash on process exit.. current at about r332467]

On Tue, May 29, 2018 at 04:50:14PM +0300, Andriy Gapon wrote:
> On 23/04/2018 17:50, Julian Elischer wrote:
> > back trace at: http://www.freebsd.org/~julian/bob-crash.png
> >
> > If anyone wants to take a look..
> >
> > In the exit syscall, while deallocating a vm object.
> >
> > I haven't see references to a similar crash in the last 10 days or so..
> > But if it rings any bells...
>
> We have just got another one:
> panic: Bad link elm 0xfffff80cc3938360 prev->next != elm
>
> Matching disassembled code to C code, it seems that the crash is somewhere in
> vm_object_terminate_pages (inlined into vm_object_terminate), probably in one of
> TAILQ_REMOVE-s there:
>                 if (p->queue != PQ_NONE) {
>                         KASSERT(p->queue < PQ_COUNT, ("vm_object_terminate: "
>                             "page %p is not queued", p));
>                         pq1 = vm_page_pagequeue(p);
>                         if (pq != pq1) {
>                                 if (pq != NULL) {
>                                         vm_pagequeue_cnt_add(pq, dequeued);
>                                         vm_pagequeue_unlock(pq);
>                                 }
>                                 pq = pq1;
>                                 vm_pagequeue_lock(pq);
>                                 dequeued = 0;
>                         }
>                         p->queue = PQ_NONE;
>                         TAILQ_REMOVE(&pq->pq_pl, p, plinks.q);
>                         dequeued--;
>                 }
>                 if (vm_page_free_prep(p, true))
>                         continue;
> unlist:
>                 TAILQ_REMOVE(&object->memq, p, listq);
>         }
>
>
> Please note that this is the code before r332974 Improve VM page queue scalability.
> I am not sure if r332974 + r333256 would fix the problem or if it just would get
> moved to a different place.
>
> Does this ring a bell to anyone who tinkered with that part of the VM code recently?

This doesn't look familiar to me and I doubt that r332974 fixed the
underlying problem, whatever it is.

> Looking a little bit further, I think that object->memq somehow got corrupted.
> memq contains just two elements and the reported element is not there.

Based on the debugging session, it would be interesting to know if there
were any other threads somehow manipulating the (dead) object at the
time of the panic. Among the panics that you observed, is it the same
application that is causing the crash in each case?
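
For readers unfamiliar with the panic string: as far as I know, it comes from a
sanity check that sys/queue.h performs in TAILQ_REMOVE when the kernel is built
with INVARIANTS, verifying that the element's predecessor still points at the
element. Below is a minimal userland sketch of that invariant, not the kernel
code. The names struct page, pagelist, prev_link_ok, next_link_ok and
stale_element_detected are hypothetical, and removing an element and then
inspecting it again is only one assumed way to end up with a stale element. It
is not a diagnosis of the crash discussed in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a page linked into an object's memq. */
struct page {
	int id;
	TAILQ_ENTRY(page) listq;
};

TAILQ_HEAD(pagelist, page);

/* True when whatever precedes p (element or list head) still points at p. */
bool
prev_link_ok(struct page *p)
{
	return (*p->listq.tqe_prev == p);
}

/* True when whatever follows p (element or list head) points back at p. */
bool
next_link_ok(struct pagelist *head, struct page *p)
{
	struct page *next = TAILQ_NEXT(p, listq);

	if (next != NULL)
		return (next->listq.tqe_prev == &p->listq.tqe_next);
	return (head->tqh_last == &p->listq.tqe_next);
}

/*
 * Assumed scenario: b is removed once, then examined again as if it were
 * still on the list. Returns 1 if the "prev->next != elm" condition holds.
 */
int
stale_element_detected(void)
{
	struct pagelist memq = TAILQ_HEAD_INITIALIZER(memq);
	struct page a = { .id = 1 }, b = { .id = 2 }, c = { .id = 3 };

	TAILQ_INSERT_TAIL(&memq, &a, listq);
	TAILQ_INSERT_TAIL(&memq, &b, listq);
	TAILQ_INSERT_TAIL(&memq, &c, listq);
	assert(prev_link_ok(&b) && next_link_ok(&memq, &b));

	/* memq now holds only a and c; b keeps its old link pointers. */
	TAILQ_REMOVE(&memq, &b, listq);

	return (!prev_link_ok(&b));
}
```

Calling stale_element_detected() returns 1: after the removal, a's next pointer
refers to c, so following b's back-pointer no longer leads to b. A second
TAILQ_REMOVE of b with the debug check enabled would trip on exactly this
condition, which is consistent with the observation that memq holds two
elements and the reported element is not among them.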