From: Travis Parker
Date: Wed, 9 Mar 2016 17:32:35 -0800
Subject: unresponsive process issue
To: freebsd-questions@freebsd.org

I've twice now had a process get into a stuck state that I don't believe should be possible: stopped (ps reports 'T', top reports 'STOP'), but unresponsive to any signal, including even CONT (KILL followed by CONT isn't clearing it).

It's a redis process, in a jail, that is listening on a stream Unix domain socket. This came to my attention both times because it stopped accepting connections on the socket. Since it is also unresponsive to signals, I'm left with no way to interact with it (or get rid of it).

truss(1) doesn't see any activity and I couldn't get a backtrace from gdb(1), although there's probably more information to be gleaned with better gdb-fu than mine. It has some 200 connection fds sitting around, but it's configured to accept up to 10000.

For the moment I have switched it over to TCP on localhost, and I'll have to wait and see if that works around whatever got it into this state (it takes a few days to occur).

I wanted to reach out to this list in case there's something obvious I'm missing before mailing freebsd-bugs. I've found a few descriptions of a similar issue (STOP state unresponsive to CONT) from googling, but only from around 2004 and always resolved.

Thanks in advance for any help,
Travis Parker
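
For comparison, the signal behaviour normally expected of a stopped process can be sketched as below. This is only an illustrative Python sketch under stated assumptions: it forks a throwaway sleeping child rather than touching redis, the function name stop_cont_kill_roundtrip is hypothetical, and it relies on the platform supporting WUNTRACED and WCONTINUED in waitpid(2). It does not reproduce or diagnose the stuck state described above.

```python
import os
import signal
import time


def stop_cont_kill_roundtrip():
    """Send STOP, CONT, then KILL to a child and record what waitpid reports.

    Hypothetical helper for illustration only; the child is a plain
    sleeping process, not the redis server from the report.
    """
    pid = os.fork()
    if pid == 0:
        # Child: sleep until signalled, and never return into the caller.
        try:
            time.sleep(60)
        finally:
            os._exit(0)

    observed = []

    # STOP should put the child into the stopped ('T') state.
    os.kill(pid, signal.SIGSTOP)
    _, status = os.waitpid(pid, os.WUNTRACED)
    observed.append("stopped" if os.WIFSTOPPED(status) else "unexpected")

    # CONT should resume a stopped child.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, os.WCONTINUED)
    observed.append("continued" if os.WIFCONTINUED(status) else "unexpected")

    # KILL cannot be caught or ignored, so the child should terminate.
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)
    killed = os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGKILL
    observed.append("killed" if killed else "unexpected")

    return observed


if __name__ == "__main__":
    print(stop_cont_kill_roundtrip())
```

On a healthy system each step would be expected to be reported in order. A process that stays in 'T' after CONT and KILL, as in the report, is not following this sequence, which suggests the cause lies outside ordinary signal delivery (for example, the process being held by the kernel or by a tracer).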