From: Peter Wemm
To: freebsd-arch@freebsd.org, Sue Howard
Cc: arch@freebsd.org
Date: Wed, 22 Jun 2005 23:23:25 -0700
Subject: Re: Kernel Dump
Message-Id: <200506222323.26666.peter@wemm.org>
In-Reply-To: <1e89cd51050616062241e9e201@mail.gmail.com>
References: <1e89cd51050616062241e9e201@mail.gmail.com>
List-Id: Discussion related to FreeBSD architecture

On Thursday 16 June 2005 06:22 am, Sue Howard wrote:
> Hi,
>
> I find there are three states of kernel dump support currently.
> 1. ARM and PowerPC.
>    Not supported yet.
> 2. I386, AMD64, Alpha.
>    The dump file contains dump header plus the raw physical memory
>    image. The code in dump_machdep.c is almost same.
> 3. IA64, SPARC
>    The dump file has private header besides dump header and raw physical
>    memory image. The code is ready for importing a MI dump interface.
>
> I want to understand what IA64 and SPARC are different than i386.
> What is the private header is for? Is it tech problem or historic
> problem? Is it possible to remove the private header in order to make
> IA64 and SPARC share the dump code of i386?

The private header standardizes the way savecore(8) finds the dump
images and recovers them.

IA64 and Sparc systems usually had sparse memory configurations, and the
old raw format didn't have any facility to avoid storing holes. Suppose
you had a machine where 1GB of RAM appeared at physical address 0, and
the other 1GB of RAM appeared at physical address 7GB. The i386 and
alpha dump method would require 8GB of dump file and swap device usage
to record those 2 x 1GB chunks of data.

IA64 uses an ELF "coredump" format to record the memory segments that
are scattered around its address space. It is flexible and descriptive
enough to handle this. sparc64 uses its own custom format to achieve
the same thing.

I've just rewritten the AMD64 crashdump support to use ELF like ia64.
In fact, I reused most of the ia64 code.

We ran into serious problems at work, first on the amd64 platform and
now also on the i386 platform. The problem is that x86 machines
increasingly have memory holes.
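To put numbers on the sparse-memory example above (1GB of RAM at
physical address 0 and another 1GB at 7GB), here is a minimal Python
sketch. The function names and the (start, length) segment
representation are made up for illustration and are not taken from the
FreeBSD dump code; header overhead is ignored.

```python
# Hypothetical helpers; each segment is a (physical_start, length) pair in bytes.
GB = 1 << 30

def raw_dump_size(segments):
    """Bytes written when every address from 0 to the top of RAM is dumped."""
    return max(start + length for start, length in segments)

def segmented_dump_size(segments):
    """Bytes of memory contents written when only populated ranges are stored."""
    return sum(length for _, length in segments)

# The example from the text: 1GB at physical address 0, 1GB at 7GB.
segments = [(0 * GB, 1 * GB), (7 * GB, 1 * GB)]

print(raw_dump_size(segments) // GB)        # 8
print(segmented_dump_size(segments) // GB)  # 2
```

The same arithmetic applies to any layout with a hole: the raw format
pays for the hole, while a per-segment format only pays for the memory
that is actually there.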
The simplistic dump code has no way to skip the memory hole and tries
to dump things like the AGP frame buffer, pci card MMIO space,
PCI-Express configuration space (this means accessing hardware!!) and
so on.

We have already switched to ELF crashdump support at work (yahoo) on
amd64 and will be backporting to i386 and even RELENG_4 (PAE systems
expose the same problem on this hardware).

doghouse# file /var/crash/vmcore.1
/var/crash/vmcore.1: ELF 64-bit LSB core file AMD x86-64, invalid version (embedded)
doghouse# objdump --headers /var/crash/vmcore.1
Sections:
Idx Name          Size      VMA         LMA         File off  Algn
  0 load0         000a0000  0000000000  0000000000  00001000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY
  1 load1         7c1c0000  0000100000  0000100000  000a1000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY
  2 load2         00100000  007ff00000  007ff00000  7c261000  2**12
                  CONTENTS, ALLOC, LOAD, READONLY

It doesn't mean much, but at least the standard tools let you see
inside the crash dump. libkvm and gdb support is trivial.

Anyway, this isn't a new problem for i386 either. I have one older
machine in a server room that has 6GB of ram. 2GB at address 0-2GB, a
2GB pci hotplug hole, and 4GB at 4-8GB. This 5 year old machine does
nasty things if you crash it with dumps enabled.

BTW: elf core dumps are really simple. They're two headers (elf and
program headers) and then the memory contents. There isn't much to it.

> In my understanding, it should be possible. Since /dev/mem should be
> a physic memory image.

Anyway, if anything, the i386 code is going away. Machines with sparse
memory are becoming more common.

> Howard

-- 
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5