From owner-freebsd-current@FreeBSD.ORG Mon Nov 14 23:43:23 2011
From: Alexander Motin
Date: Tue, 15 Nov 2011 01:43:21 +0200
To: Sebastian Chmielewski
Cc: current
Subject: Re: Second SATA device lost after ZFS root is mount
Message-ID: <4EC1A799.7050102@FreeBSD.org>
References: <4EC198B8.1010901@FreeBSD.org> <20111115000055.58b7a103@o2.pl>
In-Reply-To: <20111115000055.58b7a103@o2.pl>
List-Id: Discussions about the use of FreeBSD-current

On 15.11.2011 01:00, Sebastian Chmielewski wrote:
> On Tue, 15 Nov 2011 00:39:52 +0200
> Alexander Motin wrote:
>
>> SATA device can be dropped because of error during reset/probe/
>> initialization sequence or because controller reported disconnection.
>> Verbose boot messages (boot -v from loader prompt) should give more
>> information about what happened there. Show please full verbose dmesg.
> Using rc_debug="YES" in rc.conf I've found that my device is dropped during
> sysctl_start. With empty sysctl.conf my device is not lost. The contents of
> file seems quite innocent:
>
> # Uncomment this to prevent users from seeing information about processes that
> # are being run under another UID.
> security.bsd.see_other_uids=1
>
> # Enable/disable coredump
> kern.coredump=1
>
> # Up the maxfiles to 4x default
> kern.maxfiles=49312
>
> kern.ipc.shmmax=67108864
> kern.ipc.shmall=32768
>
> # Allow users to mount CD's
> vfs.usermount=1
> vfs.hirunningspace=8388608
> vfs.lorunningspace=1048576
>
> kern.corefile="/var/coredumps/%U/%N.core"
>
> # Do not truncate command line arguments in ps(1) listing
> kern.ps_arg_cache_limit=10000
>
> # Tune for desktop usage
> kern.sched.preempt_thresh=224
>
> # Increase default setting - recommended for 2 GB of RAM
> kern.maxvnodes=400000
>
> dev.acpi_ibm.0.lcd_brightness=6
> dev.acpi_ibm.0.lcd_brightness=3
> net.link.tap.user_open=1
> net.link.tap.up_on_open=1
>
> The device is lost even when sysctl is started with new file when booting finishes (I did service sysctl restart from X session).
> # sysctl debug.bootverbose=1
> # service sysctl restart
> # dmesg
>
> ahcich1: DISCONNECT requested
> ahcich1: AHCI reset...
> ahcich1: SATA connect timeout time=10000us status=00000000
> ahcich1: AHCI reset: device not found
> (ada1:ahcich1:0:0:0): lost device
> (pass1:ahcich1:0:0:0): lost device
> (pass1:ahcich1:0:0:0): removing device entry
>
> Crazy, isn't it?

It is. I've never heard of such a thing. The reset status looks as if
the device was indeed disconnected or powered down. I don't even know
how to do that this way, at least on Intel chipsets. My laptop's BIOS
has a bug that disables the SATA port after suspend/resume, but there
the reset status shows that the port was explicitly disabled.

I have only one crazy idea: while setting the screen brightness you are
calling ACPI code, which is a black box by definition and can do
whatever it wants with the hardware, including using any possible custom
power control interfaces. Was the second disk originally planned for
this laptop? Laptop vendors, more than desktop ones, tend to hardcode
things.

I would try two things:
 - bisect the list of sysctls to find the one that causes this;
 - try enabling SATA interface power management for the device. If
power management was somehow enabled on the device behind the OS's
back, it may cause false DISCONNECT messages, though even then it
should not produce such a reset status. Setting
hint.ahcich.1.pm_level=1 in loader.conf will make the ahci(4) driver
ignore link loss events. If the device is indeed lost, you should see
command timeouts first and only then the device loss.

-- 
Alexander Motin
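
P.S. The bisection suggested above could be sketched roughly as follows.
This is only an illustration, not a tested tool: parse_sysctl_conf,
bisect_settings and the triggers_loss callback are hypothetical names,
and the sketch assumes that a single setting is responsible and that the
disk can be brought back between trials (for example by rebooting).

```python
from typing import Callable, List, Sequence


def parse_sysctl_conf(text: str) -> List[str]:
    """Return the active name=value lines, skipping blanks and comments."""
    settings = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#") and "=" in line:
            settings.append(line)
    return settings


def bisect_settings(settings: Sequence[str],
                    triggers_loss: Callable[[Sequence[str]], bool]) -> str:
    """Narrow the list down to one setting that triggers the device loss.

    triggers_loss is a hypothetical callback supplied by the user: it
    applies the given settings on a freshly booted system and reports
    whether the disk disappeared. Assumes exactly one culprit.
    """
    candidates = list(settings)
    if not candidates or not triggers_loss(candidates):
        raise ValueError("the full list does not reproduce the problem")
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        # Keep whichever half still reproduces the problem.
        if triggers_loss(first_half):
            candidates = first_half
        else:
            candidates = candidates[middle:]
    return candidates[0]
```

In practice triggers_loss would be a manual step: write the subset to a
temporary sysctl.conf, apply it, and check dmesg for the "lost device"
message. With about twenty settings this takes four or five trials
instead of one per setting.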