Date:      Sat, 7 Aug 1999 00:15:38 -0400 
From:      tcobb@staff.circle.net
To:        lightningweb@hotmail.com, freebsd-stable@FreeBSD.ORG
Cc:        greg@lightningweb.com, jeremy@lightningweb.com, keith@lightningweb.com, criter@lightningweb.com
Subject:   RE: continued crashes with 3.1-Stable
Message-ID:  <307D63ED6749CF11AAE9005004461A5B3FB8@FREYA>

I think the problem is the SMP.  I've been having frequent
freezes with SMP under heavy webserver load with 3.2-R
and 3.2-S.  I'm unfortunately led to believe that FreeBSD
SMP is just not ready for primetime.  Too bad the $$ we blew
on a dual PIII-550 box.

-Troy Cobb
 Circle Net, Inc.
 http://www.circle.net

>   -----Original Message-----
>   From: lweb Lightningweb [mailto:lightningweb@hotmail.com]
>   Sent: Friday, August 06, 1999 11:33 PM
>   To: freebsd-stable@FreeBSD.ORG
>   Cc: greg@lightningweb.com; jeremy@lightningweb.com;
>   keith@lightningweb.com; criter@lightningweb.com
>   Subject: continued crashes with 3.1-Stable
>   
>   
>   We have not resolved our problem with frequent freezes with 
>   our web server.  
>   We had two responses to our first mail to this list, but 
>   neither one was the 
>   solution.  The problem is that the server will stop 
>   responding to ANYTHING 
>   except pings.  No telnet, no ssh, no web, no ftp, nothing.  
>   Open telnet 
>   sessions don't drop, there's just no response to keyboard activity.
>   
>   One suggestion was to fix the "pthreads library," which we 
>   did.  The other 
>   was: "You may have hardware problems."
>   
>   This server is going down more frequently.  Three times so 
>   far today.  There 
>   is no apparent pattern to the crashes.  They seem to happen 
>   most often 
>   during a MySQL database query, but it's happened many 
>   times without any 
>   queries (a few times just by running "pine" with a large 
>   mailbox file).
>   
>   We cannot recreate the crash when we want, it just crashes 
>   at random times.  
>   We've tried hammering it with web, database queries, and 
>   benchmarking 
>   programs that slam the RAID array and memory and 
>   processors, but it chugs 
>   right along.
>   
>   We have replaced drives in the RAID array, we are now 
>   replacing drive 
>   caddies.  Next step I think will be the RAID controller.  I 
>   have a strong 
>   gut feeling that it is software, however.  There's nothing 
>   to substantiate 
>   this, except that more often than not, the crash 
>   happens during an 
>   MySQL query.
>   
>   Some (NOT ALL) of the suspect errors that we've recorded 
>   from the console 
>   during a crash are:
>   
>   (da0:dpt0:0:0:0): Invalidating pack
>   biodone: buffer already done
>   spec_getpages: I/O read failure: (error code=6)
>                  size: 32768, resid: 32768, a_count: 32768, valid: 0x0
>                  nread: 0, reqpage: 0, pindex: 0, pcount: 8
>   
>   
>   Could everyone please take a second look at this and help us 
>   brainstorm the 
>   problem?  I am including a list of the hardware, the 
>   original message we 
>   sent to the list, and a recent dmesg:
>   
>   FreeBSD 3.1-STABLE #1
>   Dual-Proc PII 450
>   512MB RAM
>   DPT PM334UW RAID controller
>   - 16MB RAM
>   - dual bus Ultra Wide
>   - Six 9.1GB Quantum VikingII SCSI3 U2W drives
>   - Three drives per bus, RAID5, one drive is hot-spare
>   Intel EtherExpress Pro 10/100B Ethernet
>   TOSHIBA CD-ROM XM-6201TA
>   
>   
>   --------------
>   I've recently had the job of system administration dumped 
>   in my lap.  I'm
>   looking forward to getting on top of it, but I'm a little 
>   behind the 8-ball
>   right now.  If my subject matter varies too far from the 
>   allowed context of
>   this list, please don't flame me too badly.
>   
>   Background:  We are running a dual PII 450 system with a 45 
>   gig raid array,
>   controlled by a DPT PM334.
>   
>   The O/S: FreeBSD 3.1-STABLE #1
>   
>   For several months this has been rock solid.  However, in 
>   the past three
>   weeks, we've had a number of crashes, most of which seem to 
>   be related to
>   mysql queries.  The system would be totally unresponsive to 
>   ssh/telnet and
>   web, but would still return pings.
>   
>   The server is colocated at our ISP, so it's been tricky to 
>   track down the
>   exact 'on screen' console errors.  Today, shortly after we 
>   upgraded our
>   mysql version, I did see the error.
>   
>   
>   (da0:dpt0:0:0:0): Invalidating Pack
>   (da0:dpt0:0:0:0): Invalidating Pack
>   devstat_end_transaction: HELP!! busy_count for da0 is < 0 (-1)!
>   biodone: buffer already done
>   (da0:dpt0:0:0:0): Read  (10). CDB: 28 0 3 87 33 1f 0 0 80 0
>   (da0:dpt0:0:0:0): ILLEGAL REQUEST asc:20,0
>   (da0:dpt0:0:0:0): Invalid command operation code
>   devstat_end_transaction: HELP!! busy_count for da0 is < 0 (-1)!
>   biodone: buffer already done
>   
>   
>   Followed by a complete system freeze, including the console.
>   
>   Some hunting and searching has led us to believe that we 
>   are encountering a
>   driver failure and that we should bring the OS back to -stable.
>   
>   As I said, I haven't done this before, so I'm a little 
>   anxious.  Before I
>   take that step, I would be very grateful to hear some input 
>   from those who
>   surely know more about this than I do.
>   
>   Is bringing the system back to -stable likely to correct 
>   our problem?  Am I
>   missing some indicator in the error above?   Has someone 
>   else encountered
>   similar trouble (and found a fix?)
>   
>   I'll be happy to take replies in private e-mail if this is 
>   off topic.
>   
>   Any help would be great.
>   
>   Thanks,
>   Jeremy
>   
>   
>   To Unsubscribe: send mail to majordomo@FreeBSD.org
>   with "unsubscribe freebsd-stable" in the body of the message
>   
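
For anyone reading the quoted console output, the "Read  (10)" line and the "error code=6" line can both be decoded by hand. The sketch below is only an illustration and is not part of either original message. It assumes the driver prints the CDB as space-separated hex bytes in the standard SCSI READ(10) layout (opcode 0x28, 4-byte big-endian LBA, 2-byte big-endian transfer length). The helper name decode_read10 is made up for this example.

```python
import errno


def decode_read10(cdb_text):
    """Decode a SCSI READ(10) CDB given as space-separated hex bytes.

    Hypothetical helper for illustration; assumes the standard 10-byte
    READ(10) layout.
    """
    cdb = [int(b, 16) for b in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # Bytes 2..5 hold the logical block address, most significant byte first.
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    # Bytes 7..8 hold the number of blocks to transfer.
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")
    return {"opcode": cdb[0], "lba": lba, "blocks": blocks}


# CDB copied from the quoted console output.
info = decode_read10("28 0 3 87 33 1f 0 0 80 0")
print(hex(info["lba"]), info["blocks"])

# errno 6 is ENXIO; this lookup uses the errno table of the host running
# the script, which agrees with FreeBSD for this value.
print(errno.errorcode[6])
```

Under those assumptions the rejected command is an ordinary read of 128 blocks starting at LBA 0x0387331f, and error code 6 is ENXIO. That is at least consistent with the device becoming unavailable after the "Invalidating pack" messages, though it does not by itself say whether the controller, firmware, or driver is at fault.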

