From owner-freebsd-questions@FreeBSD.ORG Mon Mar 28 15:35:22 2011 Return-Path: Delivered-To: freebsd-questions@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FAE6106566B for ; Mon, 28 Mar 2011 15:35:22 +0000 (UTC) (envelope-from tom@claimlynx.com) Received: from na3sys009aog111.obsmtp.com (na3sys009aog111.obsmtp.com [74.125.149.205]) by mx1.freebsd.org (Postfix) with ESMTP id 9871E8FC0C for ; Mon, 28 Mar 2011 15:35:21 +0000 (UTC) Received: from source ([209.85.220.176]) (using TLSv1) by na3sys009aob111.postini.com ([74.125.148.12]) with SMTP ID DSNKTZCquHkV05ZauXaYYGQJn7cwDXTXIlAm@postini.com; Mon, 28 Mar 2011 08:35:21 PDT Received: by vxa37 with SMTP id 37so2385877vxa.7 for ; Mon, 28 Mar 2011 08:35:15 -0700 (PDT) MIME-Version: 1.0 Received: by 10.52.95.15 with SMTP id dg15mr5560148vdb.228.1301324642206; Mon, 28 Mar 2011 08:04:02 -0700 (PDT) Received: by 10.52.66.129 with HTTP; Mon, 28 Mar 2011 08:04:02 -0700 (PDT) Date: Mon, 28 Mar 2011 10:04:02 -0500 Message-ID: From: Thomas Johnson To: freebsd-questions@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Eric Crist Subject: Slow filesystem sync after upgrading from 8.0 to 8.2? X-BeenThere: freebsd-questions@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: User questions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 28 Mar 2011 15:35:22 -0000 Since upgrading one of our servers to FreeBSD 8.2 (from 8.0p2), we have noticed that one of our shell scripts has begun failing. This particular script is run by cron every morning and backs up the configuration files of our routers. The script calls an expect script that connects to the routers and has them push their config to the server using tftp. The script then does some housekeeping on the resulting file. Since upgrading to 8.2, the problem we see is that the file resulting from the tftp operation does not seem to have any contents for ~ 5 seconds after the copy. The script is pasted below. Trouble starts around line 31, when the script greps to see that the hostname of the router exists in the tftp'd file (as a basic verification of the copy). Grep reports back that no match is found. I've found that by inserting a 'sleep 5' instruction before line 31, the grep command works, and the script completes successfully. I have also verified that the tftp transfer is completing before the script continues on to the grep check. For the sake of completeness, I have also included the lines I've added to the script to make it behave, the syslog output showing that the tftp transfer completes when expected, and [relevant] trace output of the script before and after adding the sleep statement. Does anyone have any idea what may have changed between 8.0 and 8.2 to cause this behavior? Nothing directly related to this script changed, aside from the OS upgrade. I'm not too proud to admit that this script isn't the greatest and could be improved; but I would like to know why a script that worked fine before, now fails. Also, if there is something that needs to be tweaked in 8.2, I want to make sure I'm not just fixing a symptom. ##### ##### Script ##### 1 #!/bin/sh 2 3 4 # Program Header 5 echo " NETwork-INFrastructure backup script." 6 echo 7 echo "Started: `date`." 8 echo "Executed from: `hostname`" 9 echo 10 11 12 13 ### Define static variables here 14 15 expect=/usr/local/bin/expect 16 working_dir=/d/net-infrastructure 17 script_dir=$working_dir/scripts 18 starttime=`date '+%s'` 19 backup_error=0 20 too_old=90 21 curr_ts=`date '+%Y%m%d.%H%M%S'` 22 log_file="$working_dir"/`date '+%A'`.log 23 24 25 # Define functions here. 26 27 28 run_backup () { 29 printf "%-38s" "Backing up $1" 30 expect "$script_dir"/"$1".exp >> $log_file 31 grep -qi $1 $working_dir/$1.test 32 if [ $? = 0 ]; then 33 echo "COMPLETE" 34 printf "%-38s" "====> Rotating $1" 35 mv -f "$working_dir"/$1.test "$working_dir"/$1/"$curr_ts".$1.bak >> $log_file 36 if [ $? = 0 ]; then 37 echo "COMPLETE" 38 else 39 echo "...ERROR" 40 backup_error=1 41 fi 42 43 else 44 echo " ERROR" 45 backup_error=1 46 fi 47 } 48 49 50 ######## Program Execution HERE ######### 51 52 # Overwrite existing log file. 53 echo `date` > $log_file 54 55 # If expect doesn't exist on this system, exit 56 57 if [ ! -e "/usr/local/bin/expect" ]; then 58 echo "EXPECT doesn't exist on this system. Please install and try again." > /dev/stderr 59 exit 1 60 fi 61 62 # Call the function run_backup for each host in our system. 63 echo "==============================================" 64 run_backup "router1" 65 run_backup "router2" 66 echo 67 echo "==============================================" 68 69 # Check to see if there were any errors during the execution of our script. 70 # If there are, send message to /dev/stderr and to syslog in /var/log/messages. 71 72 if [ $backup_error = 0 ]; then 73 echo " Cicso Backups complete." 74 else 75 echo " Cisco Backups completed with errors." 76 echo "Cisco backups have failed. Please see log file on `hostname`: $log_file" > /dev/stderr 77 logger "### Cisco Errors ###" 78 logger "Cisco backups have failed. Please see log file $log_file." 79 logger "### Cisco Errors ###" 80 fi 81 echo "==============================================" 82 echo "Removing stale files older than $too_old days." 83 find $working_dir -type f -maxdepth 2 -mtime +${too_old}d -name "*.bak" -print0 | xargs -0 rm -f 84 run_time=$(( `date '+%s'` - "$starttime" )) 85 echo "Run Time: $run_time seconds." ##### ##### Code inserted between lines 30 & 31 ##### 31 logger "$1 backup complete" 32 sleep 5 33 logger "Verifying $1 backup" ##### ##### Syslog output (from working case) ##### Mar 28 09:29:06 server tftpd[66536]: Filename: 'router1.test' Mar 28 09:29:06 server tftpd[66536]: Mode: 'octet' Mar 28 09:29:06 server tftpd[66536]: 10.10.83.247: write request for router1.test: success Mar 28 09:29:06 server tom: router1 backup complete Mar 28 09:29:11 server tom: Verifying router1 backup Mar 28 09:29:14 server tftpd[66548]: Filename: 'router2.test' Mar 28 09:29:14 server tftpd[66548]: Mode: 'octet' Mar 28 09:29:14 server tftpd[66548]: 10.10.83.246: write request for router2.test: success Mar 28 09:29:14 server tom: router2 backup complete Mar 28 09:29:19 server tom: Verifying router2 backup ##### ##### "Broken" trace output ##### + run_backup router1 + printf %-38s 'Backing up router1' Backing up router1 + expect /d/net-infrastructure/scripts/router1.exp + grep -qi router1 /d/net-infrastructure/router1.test + [ 1 = 0 ] + echo ' ERROR' ERROR + backup_error=1 + run_backup router2 + printf %-38s 'Backing up router2' Backing up router2 + expect /d/net-infrastructure/scripts/router2.exp + grep -qi router2 /d/net-infrastructure/router2.test + [ 1 = 0 ] + echo ' ERROR' ERROR + backup_error=1 + echo ##### ##### "Fixed" trace output ##### + run_backup router1 + printf %-38s 'Backing up router1' Backing up router1 + expect /d/net-infrastructure/scripts/router1.exp + logger 'router1 backup complete' + sleep 5 + logger 'Verifying router1 backup' + grep -qi router1 /d/net-infrastructure/router1.test + [ 0 = 0 ] + echo COMPLETE COMPLETE -- Thomas Johnson ClaimLynx, Inc.