From owner-freebsd-geom@FreeBSD.ORG Sat Sep 27 14:57:41 2008 Return-Path: Delivered-To: freebsd-geom@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DF3AC1065686 for ; Sat, 27 Sep 2008 14:57:41 +0000 (UTC) (envelope-from auryn@zirakzigil.org) Received: from aurynhome1sv1.zirakzigil.org (mail.zirakzigil.org [82.63.178.63]) by mx1.freebsd.org (Postfix) with SMTP id DD5AC8FC14 for ; Sat, 27 Sep 2008 14:57:39 +0000 (UTC) (envelope-from auryn@zirakzigil.org) Received: (qmail 98748 invoked by uid 98); 27 Sep 2008 14:30:57 -0000 Received: from 192.168.229.11 by aurynhome1sv1.zirakzigil.org (envelope-from , uid 89) with qmail-scanner-1.25 ( Clear:RC:1(192.168.229.11):. Processed in 0.043597 secs); 27 Sep 2008 14:30:57 -0000 X-Qmail-Scanner-Mail-From: auryn@zirakzigil.org via aurynhome1sv1.zirakzigil.org X-Qmail-Scanner: 1.25 (Clear:RC:1(192.168.229.11):. Processed in 0.043597 secs) Received: from unknown (HELO aurynhome1ws2.zirakzigil.org) (postmaster@zirakzigil.org@192.168.229.11) by 0 with SMTP; 27 Sep 2008 14:30:57 -0000 Message-ID: <48DE439C.4050505@zirakzigil.org> Date: Sat, 27 Sep 2008 16:30:52 +0200 From: Giulio Ferro User-Agent: Thunderbird 2.0.0.0 (X11/20070513) MIME-Version: 1.0 To: freebsd-geom@freebsd.org, pjd@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Problems with journal? X-BeenThere: freebsd-geom@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: GEOM-specific discussions and implementations List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 27 Sep 2008 14:57:42 -0000 I'm experiencing very serious delay issues in 2 production servers. The server are dual intel quad core o.s. : freebsd 7 stable (yesterday) amd64, mfi driver for 2 physical controllers: 1) two sas disks in mirror (mfi0) 2) one dell md1000 disk enclosure with 15 sata disks in raid 60 (total ~7 TB) (mfi1) I created a single partition for the second disk, I put the geom_journal on it, I formatted it with ufs and mounted it on /data directory. from /etc/fstab /dev/mfid1.journal /data ufs rw,async 2 0 Everything seemed to work nice until I put the servers in production under a heavy load. In fact, I found that something was wrong when the standard maintenance script /etc/periodic/daily/450.status-security would very nearly freeze the server at night. When this happened the server would get incredibly slow, and it took minutes for me to ssh login and kill the process hanging the system. The process which hung the system was a simple find on the /data partition. During this the machine became unresponsive, and the db which stays on the same partition didn't serve requests any more. During this hang a "top" command would show a 20% system and 100% for the find process. In order to check whether the problem was with the journal, I enabled debugging: sysctl kern.geom.journal.debug=1 I started the "killer" script and waited. Nothing happened for several minutes, then the problem would show its ugly face again and here is what I got in the logs: (normal behavior) Sep 27 15:20:38 dcsdata1 kernel: GEOM_JOURNAL[1]: Sync time of /data: 0.043653s Sep 27 15:20:38 dcsdata1 kernel: GEOM_JOURNAL[1]: BIO_FLUSH time of mfid1: 0.000001s Sep 27 15:20:38 dcsdata1 kernel: GEOM_JOURNAL[1]: Suspend time of /data: 0.087174s Sep 27 15:20:38 dcsdata1 kernel: GEOM_JOURNAL[1]: Starting copy of journal. Sep 27 15:20:38 dcsdata1 kernel: GEOM_JOURNAL[1]: Switch time of mfid1: 0.003278s Sep 27 15:20:38 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.172709s Sep 27 15:20:38 dcsdata1 kernel: GEOM_JOURNAL[1]: Data has been copied. Sep 27 15:20:48 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.000005s Sep 27 15:20:58 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.000005s Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: Msync time of /data: 0.029507s Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: Sync time of /data: 0.053562s Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: BIO_FLUSH time of mfid1: 0.000002s Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: Suspend time of /data: 0.086662s Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: Starting copy of journal. Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: Switch time of mfid1: 0.003178s Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.183835s Sep 27 15:21:08 dcsdata1 kernel: GEOM_JOURNAL[1]: Data has been copied. (starts abnormal behavior) Sep 27 15:24:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Suspend time of /data: 181.380868s Sep 27 15:24:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Starting copy of journal. Sep 27 15:24:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Switch time of mfid1: 0.003105s Sep 27 15:24:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 181.473032s Sep 27 15:24:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Data has been copied. Sep 27 15:24:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.000007s Sep 27 15:25:00 dcsdata1 kernel: GEOM_JOURNAL[1]: Msync time of /data: 0.029305s Sep 27 15:25:00 dcsdata1 kernel: GEOM_JOURNAL[1]: Sync time of /data: 0.045913s Sep 27 15:25:00 dcsdata1 kernel: GEOM_JOURNAL[1]: BIO_FLUSH time of mfid1: 0.000001s After this no more logs would show, and after a long time I finally killed the "find" process. In the logs I got this: Sep 27 15:42:30 dcsdata1 kernel: GEOM_JOURNAL[1]: Suspend time of /data: 1050.197022s Sep 27 15:42:30 dcsdata1 kernel: GEOM_JOURNAL[1]: Starting copy of journal. Sep 27 15:42:30 dcsdata1 kernel: GEOM_JOURNAL[1]: Switch time of mfid1: 0.003276s Sep 27 15:42:30 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 1050.282609s Sep 27 15:42:30 dcsdata1 kernel: GEOM_JOURNAL[1]: Data has been copied. and then everything got normal again: Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Msync time of /data: 0.026546s Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Sync time of /data: 0.043528s Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: BIO_FLUSH time of mfid1: 0.000002s Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Suspend time of /data: 0.041300s Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Starting copy of journal. Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Switch time of mfid1: 0.005283s Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.133709s Sep 27 15:42:40 dcsdata1 kernel: GEOM_JOURNAL[1]: Data has been copied. Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Msync time of /data: 0.025201s Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Sync time of /data: 0.036892s Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: BIO_FLUSH time of mfid1: 0.000001s Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Suspend time of /data: 0.036874s Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Starting copy of journal. Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Switch time of mfid1: 0.002075s Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.108044s Sep 27 15:42:50 dcsdata1 kernel: GEOM_JOURNAL[1]: Data has been copied. Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: Msync time of /data: 0.026738s Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: Sync time of /data: 0.036973s Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: BIO_FLUSH time of mfid1: 0.000001s Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: Suspend time of /data: 0.035504s Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: Starting copy of journal. Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: Switch time of mfid1: 0.003822s Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: Entire switch time: 0.110049s Sep 27 15:43:01 dcsdata1 kernel: GEOM_JOURNAL[1]: Data has been copied. From what I can see the mfi driver and the hardware below it are not to blame for the abysmal performance, but the journal probably is. I hope any of you can help me look in the right direction, and point me to any further tests to try or tunable to set... Thanks in advance.