Date:      Sat, 5 Oct 1996 00:20:18 -0700 (PDT)
From:      Dror Matalon <dror@dnai.com>
To:        freebsd-isp@freebsd.org
Subject:   How to solve the news server problem
Message-ID:  <Pine.NEB.3.93.961004234743.15684K-100000@mars.dnai.com>
In-Reply-To: <199610031825.NAA07158@brasil.moneng.mei.com>

Hi folks,

All this discussion of RAID, ccd, etc. got me thinking again about the
news server problem.

Our news server has far more problems than any other server we run. I
believe that this is typical for most ISPs. Our population
actually reads news less than those of many other ISPs. With around
3000 users I've never seen more than 30 concurrent readers
on our news server. Our server runs on:

128 Meg memory
4 Quantum XP34300W (Fast Wide, 4 Gig each). Yes, I know eight 2 Gig drives would be better.
Pentium 133 

Response time is fine, but not spectacular. I suspect that the next
step for speedup would be for us to have separate reader and feed
machines. Right now this machine connects to 4 other ISPs to send
and receive news.

I'm annoyed with how inefficient the news system is. I know the
history (pun intended) of Usenet and it all makes sense in the
context of uucp and store and forward on 56K lines to have a news
system where everyone keeps all the articles and everyone has a
full feed.  Today with our fast lines and 150 - 200 Megs of news
I believe that my news server is spending most of its time receiving,
writing to disk, organizing, and then removing files that NONE OF
MY USERS WILL EVER LOOK AT. To put it another way, the reason that
we all have these really full feeds (other than to be able to tell
someone who calls and wants to know, "oh yes we have 500,000
newsgroups"), is so that when one of our users wants to subscribe
to a new newsgroup, the articles are already there.

We could quite easily figure out which newsgroups our users subscribe
to, accept only articles for these newsgroups and reduce the traffic,
the disk space, the memory etc to ... 5%? 10%? 30%? The problem is
that we want to have newsgroups available when our users want to
subscribe to something new.

So, it looks like we could have some kind of algorithm that keeps
everything in subscribed newsgroups for 14 days. Keeps subscribed
binary newsgroups for 7 days, keeps everything else for 1 day.  This
way when someone subscribes to a new newsgroup they have something
to start with, and they'll see all the new stuff from the point of
subscription. The only thing they lose is that when they subscribe to a
new newsgroup they only get to see 1 day instead of 14 days of
articles.
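
As a rough illustration of the retention rule described above, here
is a minimal Python sketch. It is not taken from actgroups.pl. The
function names, the name-based test for binary groups, and the
example group names are all assumptions made for illustration. The
output lines follow my recollection of INN's expire.ctl layout
(pattern:modflag:keep:default:purge, with the last matching line
winning), which should be checked against the expire.ctl man page
before use.

```python
from fnmatch import fnmatchcase

# Retention periods in days, taken from the proposal in the message.
KEEP_SUBSCRIBED = 14
KEEP_SUBSCRIBED_BINARY = 7
KEEP_UNSUBSCRIBED = 1

# Assumption: a binary group can be recognised by its name alone.
BINARY_PATTERNS = ("*.binaries.*", "*.binaries")


def is_binary(group):
    """Return True if the group name looks like a binaries group."""
    return any(fnmatchcase(group, pattern) for pattern in BINARY_PATTERNS)


def retention_days(group, subscribed):
    """Days to keep articles in `group`, given the set of subscribed groups."""
    if group not in subscribed:
        return KEEP_UNSUBSCRIBED
    if is_binary(group):
        return KEEP_SUBSCRIBED_BINARY
    return KEEP_SUBSCRIBED


def expire_ctl_lines(subscribed):
    """Render expire.ctl-style lines (layout assumed, see the note above).

    The catch-all line comes first so that the more specific
    per-group lines that follow can override it.
    """
    lines = ["*:A:1:%d:%d" % (KEEP_UNSUBSCRIBED, KEEP_UNSUBSCRIBED)]
    for group in sorted(subscribed):
        days = retention_days(group, subscribed)
        lines.append("%s:A:1:%d:%d" % (group, days, days))
    return lines


if __name__ == "__main__":
    # Hypothetical subscription list, for demonstration only.
    example = {"comp.unix.bsd.freebsd.misc", "alt.binaries.pictures"}
    print("\n".join(expire_ctl_lines(example)))
```

The set of subscribed groups would have to come from somewhere, such
as the reader logs. That part is left out here because it depends on
the local setup.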

On the other hand, I just checked and our users only looked at 554
newsgroups out of the 17,000 or so we have (I lied we don't have
500,000).  So even if the binary newsgroups still carry most of the
same material, and even if we do keep a day's worth of the
unsubscribed newsgroups, we should be able to handle only 20% or so
of all the articles. Our disks will not work as hard since they'll
have a lot less material to look through, which should make them
more reliable and leave us less to worry about with disk failures,
RAID, ccd etc.
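
To make the arithmetic concrete, here is a small Python sketch. The
group counts are the ones quoted above. Everything else is an
assumption for illustration: the function names, the reading of the
feed size as megabytes per day, and the traffic shares used in the
example call, none of which were measured. The spool estimate simply
multiplies each class's daily volume by its retention period.

```python
GROUPS_READ = 554        # groups our users actually looked at
GROUPS_CARRIED = 17000   # "17,000 or so" groups carried


def percent_of_groups_read(read=GROUPS_READ, carried=GROUPS_CARRIED):
    """Share of carried groups that anyone reads, as a percentage."""
    return 100.0 * read / carried


def steady_state_spool_mb(daily_feed_mb, frac_text, frac_binary):
    """Rough spool size once expiry has settled.

    frac_text and frac_binary are the fractions of daily volume that
    land in subscribed text and subscribed binary groups. The rest is
    treated as unsubscribed and kept for 1 day.
    """
    frac_other = 1.0 - frac_text - frac_binary
    return daily_feed_mb * (14 * frac_text + 7 * frac_binary + 1 * frac_other)


if __name__ == "__main__":
    print("groups read: %.1f%%" % percent_of_groups_read())
    # Hypothetical shares: 5% subscribed text, 15% subscribed binaries.
    print("spool: about %.0f MB" % steady_state_spool_mb(200, 0.05, 0.15))
    # For comparison, keeping the whole feed for 14 days.
    print("keep-all: %d MB" % (200 * 14))
```

With those made-up shares the spool comes to roughly 510 MB against
2800 MB for keeping everything for 14 days. The real saving depends
entirely on how the traffic actually splits.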

Now, I know I'm not the only smart person in the world so I looked
around and sure enough I found 
    ftp://ftp.math.psu.edu/pub/INN/contrib/actgroups.pl

  #!/usr/local/bin/perl
  # Active Groups -- Detecting actively accessed news groups and setting
  #               expire.ctl accordingly
  $progname       =       "actgroups";
  $version        =       "Ver 0.03c,  30 August 1994";
  $author         =       "Yufan Hu <Y.Hu\@ulst.ac.uk>";
  #
  # Slightly modified by Alan Brown (alan@manawatu.planet.co.nz) 15 Dec 1994


But I couldn't find anyone using this. So folks, is this a good solution?






Dror Matalon                                            Voice: 510 649-6110
Direct Network Access                                   Fax:   510 649-7130
2039 Shattuck Avenue                                    Modem: 510 649-6116
Berkeley, CA 94704                                      Email: dror@dnai.com






