Date:      Sat, 7 Aug 1999 17:14:25 -0400 (EDT)
From:      "Alok K. Dhir" <adhir@forumone.com>
To:        tcobb@staff.circle.net
Cc:        lightningweb@hotmail.com, freebsd-stable@FreeBSD.ORG, greg@lightningweb.com, jeremy@lightningweb.com, keith@lightningweb.com, criter@lightningweb.com
Subject:   RE: continued crashes with 3.1-Stable
Message-ID:  <Pine.BSF.4.10.9908071714070.26833-100000@orion.forumone.com>
In-Reply-To: <307D63ED6749CF11AAE9005004461A5B3FB8@FREYA>


You may want to try Solaris x86 on that same box...

On Sat, 7 Aug 1999 tcobb@staff.circle.net wrote:

> I think the problem is the SMP.  I've been having frequent
> freezes with SMP under heavy webserver load with 3.2-R,
> and 3.2-S.  I'm unfortunately led to believe that FreeBSD
> SMP is just not ready for primetime.  Too bad the $$ we blew
> on a dual PIII-550 box.
> 
> -Troy Cobb
>  Circle Net, Inc.
>  http://www.circle.net
> 
> >   -----Original Message-----
> >   From: lweb Lightningweb [mailto:lightningweb@hotmail.com]
> >   Sent: Friday, August 06, 1999 11:33 PM
> >   To: freebsd-stable@FreeBSD.ORG
> >   Cc: greg@lightningweb.com; jeremy@lightningweb.com;
> >   keith@lightningweb.com; criter@lightningweb.com
> >   Subject: continued crashes with 3.1-Stable
> >   
> >   
> >   We have not resolved our problem with frequent freezes with 
> >   our web server.  
> >   We had two responses to our first mail to this list, but 
> >   neither one was the 
> >   solution.  The problem is that the server will stop 
> >   responding to ANYTHING 
> >   except pings.  No telnet, no ssh, no web, no ftp, nothing.  
> >   Open telnet 
> >   sessions don't drop, there's just no response to keyboard activity.
> >   
> >   One suggestion was to fix the "pthreads library," which we 
> >   did.  The other 
> >   was: "You may have hardware problems."
> >   
> >   This server is going down more frequently.  Three times so 
> >   far today.  There 
> >   is no apparent pattern to the crashes.  They seem to happen 
> >   most often 
> >   during a MySQL database query, but it's happened many 
> >   times without any 
> >   queries (a few times just by running "pine" with a large 
> >   mailbox file).
> >   
> >   We cannot recreate the crash when we want, it just crashes 
> >   at random times.  
> >   We've tried hammering it with web, database queries, and 
> >   benchmarking 
> >   programs that slam the RAID array and memory and 
> >   processors, but it chugs 
> >   right along.
> >   
> >   We have replaced drives in the RAID array, we are now 
> >   replacing drive 
> >   caddies.  Next step I think will be the RAID controller.  I 
> >   have a strong 
> >   gut feeling that it is software however.  There's nothing 
> >   to substantiate 
> >   this, except that more often than not, the crash 
> >   happens during an 
> >   MySQL query.
> >   
> >   Some (NOT ALL) of the suspect errors that we've recorded 
> >   from the console 
> >   during a crash are:
> >   
> >   (da0:dpt0:0:0:0): Invalidating pack
> >   biodone: buffer already done
> >   spec_getpages: I/O read failure: (error code=6)
> >                  size: 32768, resid: 32768, a_count: 32768, valid: 0x0
> >                  nread: 0, reqpage: 0, pindex: 0, pcount: 8
> >   
> >   
> >   Everyone please take a second look at this and help us 
> >   brainstorm the 
> >   problem?  I am including a list of the hardware, the 
> >   original message we 
> >   sent to the list, and a recent dmesg:
> >   
> >   FreeBSD 3.1-STABLE #1
> >   Dual-Proc PII 450
> >   512MB RAM
> >   DPT PM334UW RAID controller
> >   - 16MB RAM
> >   - dual bus Ultra Wide
> >   - Six 9.1GB Quantum VikingII SCSI3 U2W drives
> >   - Three drives per bus, RAID5, one drive is hot-spare
> >   Intel EtherExpress Pro 10/100B Ethernet
> >   TOSHIBA CD-ROM XM-6201TA
> >   
> >   
> >   --------------
> >   I've recently had the job of system administration dumped 
> >   in my lap.  I'm
> >   looking forward to getting on top of it, but I'm a little 
> >   behind the 8-ball
> >   right now.  If my subject matter varies too far from the 
> >   allowed context of
> >   this list, please don't flame me too badly.
> >   
> >   Background:  We are running a dual PII 450 system with a 45 
> >   gig raid array,
> >   controlled by a DPT PM334.
> >   
> >   The O/S: FreeBSD 3.1-STABLE #1
> >   
> >   For several months this has been rock solid.  However, in 
> >   the past three
> >   weeks, we've had a number of crashes, most of which seem to 
> >   be related to
> >   mysql queries.  The system would be totally unresponsive to 
> >   ssh/telnet and
> >   web, but would still return pings.
> >   
> >   The server is colocated at our ISP, so it's been tricky to 
> >   track down the
> >   exact 'on screen' console errors.  Today, shortly after we 
> >   upgraded our
> >   mysql version, I did see the error.
> >   
> >   
> >   (da0:dpt0:0:0:0): Invalidating Pack
> >   (da0:dpt0:0:0:0): Invalidating Pack
> >   devstat_end_transaction: HELP!! busy_count for da0 is < 0 (-1)!
> >   biodone: buffer already done
> >   (da0:dpt0:0:0:0): Read  (10). CDB: 28 0 3 87 33 1f 0 0 80 0
> >   (da0:dpt0:0:0:0): ILLEGAL REQUEST asc:20,0
> >   (da0:dpt0:0:0:0): Invalid command operation code
> >   devstat_end_transaction: HELP!! busy_count for da0 is < 0 (-1)!
> >   biodone: buffer already done
> >   
> >   
> >   Followed by a complete system freeze, including the console.
> >   
> >   Some hunting and searching has led us to believe that we 
> >   are encountering a
> >   driver failure and that we should bring the OS back to -stable.
> >   
> >   As I said, I haven't done this before, so I'm a little 
> >   anxious.  Before I
> >   take that step, I would be very grateful to hear some input 
> >   from those who
> >   surely know more about this than I do.
> >   
> >   Is bringing the system back to -stable likely to correct 
> >   our problem?  Am I
> >   missing some indicator in the error above?   Has someone 
> >   else encountered
> >   similar trouble (and found a fix?)
> >   
> >   I'll be happy to take replies in private e-mail if this is 
> >   off topic.
> >   
> >   Any help would be great.
> >   
> >   Thanks,
> >   Jeremy
> >   
> >   
> >   
> >   
> >   To Unsubscribe: send mail to majordomo@FreeBSD.org
> >   with "unsubscribe freebsd-stable" in the body of the message
> >   
> 
> 
> 
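
On the "am I missing some indicator in the error above" question: the
Read (10) line in the quoted console output carries the raw SCSI command
block (CDB), and it can be decoded by hand to see what the failing read
was asking for.  Below is a minimal sketch in Python, assuming the CDB
bytes are printed in hex exactly as they appear in the log.  The function
name decode_read10 and the 512-byte sector size are my own assumptions
for illustration; neither comes from the thread or from the driver.

```python
def decode_read10(cdb_text):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Hypothetical helper for illustration; not part of any FreeBSD tool.
    """
    cdb = [int(b, 16) for b in cdb_text.split()]
    if len(cdb) != 10:
        raise ValueError("a READ(10) CDB is exactly 10 bytes")
    if cdb[0] != 0x28:
        raise ValueError("opcode is not READ(10) (0x28)")
    # Bytes 2-5: logical block address, big-endian.
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    # Bytes 7-8: transfer length in blocks, big-endian.
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")
    return {"opcode": cdb[0], "lba": lba, "blocks": blocks}


SECTOR_BYTES = 512  # assumption: sector size is not stated in the thread

# CDB copied from the console line "Read  (10). CDB: 28 0 3 87 33 1f 0 0 80 0"
info = decode_read10("28 0 3 87 33 1f 0 0 80 0")
request_bytes = info["blocks"] * SECTOR_BYTES
print(info, request_bytes)
```

Decoded this way, the request is an ordinary READ(10) for 0x80 (128)
blocks starting at block 0x0387331f, so the command itself looks
well-formed.  That makes the "Invalid command operation code" reply
look odd, but the decode alone does not say whether the controller, its
firmware, or the driver is at fault.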







Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?Pine.BSF.4.10.9908071714070.26833-100000>