From owner-freebsd-hackers Mon Jul 16 18:14:39 2001
Delivered-To: freebsd-hackers@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67])
    by hub.freebsd.org (Postfix) with ESMTP id 9A1B437B401
    for ; Mon, 16 Jul 2001 18:14:09 -0700 (PDT)
    (envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost)
    by earth.backplane.com (8.11.4/8.11.2) id f6H1E5P33636;
    Mon, 16 Jul 2001 18:14:05 -0700 (PDT)
    (envelope-from dillon)
Date: Mon, 16 Jul 2001 18:14:05 -0700 (PDT)
From: Matt Dillon
Message-Id: <200107170114.f6H1E5P33636@earth.backplane.com>
To: Mike Silbersack
Cc: Len Conrad ,
Subject: Re: Weird named problem - IN A for nameservers being lost!
References: <20010716195409.U74787-100000@achilles.silby.com>
Sender: owner-freebsd-hackers@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

:
:
:On Mon, 16 Jul 2001, Matt Dillon wrote:
:
:> I've been trying to track down a weird problem with our mail system
:> suddenly believing that a host does not exist, or timing out in DNS.
:>
:> I tracked it down to the DNS server, but I am not entirely sure what is
:> going on. What appears to be happening is that the glue IN A record
:> for the NS server for a domain is getting lost, and the NS record is
:> remaining. When named gets into this state, it doesn't seem to be able
:> to recover... it sees the NS record but it can't resolve it because
:> the glue record is gone, and it doesn't try to get it after that.
:
:This looks like a problem brought up on the djbdns mailing list a long
:while ago. When the NS records listed with the roots and the NS records
:returned by the NSes don't match (or share any NSes whatsoever, for that
:matter), BIND breaks as you've described.
:
:The resolution, as I recall, was "don't do that!" Bind 9 might handle the
:case correctly, as might djbdns. In any case, the admins of
:jamcracker.com should be synchronizing their NS listings.
:
:Here's how it is now:

    I don't think that's it... if you look at the dumps, there were no
    timeouts in the 2-day range. The original glue NS records (from
    exodus) had already been completely replaced by the NS record from
    their zone. Everything in their zones is already synchronized.

                                                -Matt

:> dig jamcracker.com NS
:
:; <<>> DiG 8.3 <<>> jamcracker.com NS
:;; res options: init recurs defnam dnsrch
:;; got answer:
:;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4
:;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2
:;; QUERY SECTION:
:;; jamcracker.com, type = NS, class = IN
:
:;; ANSWER SECTION:
:jamcracker.com. 2D IN NS SCA03.SEC.DNS.EXODUS.NET.
:jamcracker.com. 2D IN NS SCA02.SEC.DNS.EXODUS.NET.
:
:> dig jamcracker.com NS @sca03.sec.dns.exodus.net
:
:; <<>> DiG 8.3 <<>> jamcracker.com NS @sca03.sec.dns.exodus.net
:; (1 server found)
:;; res options: init recurs defnam dnsrch
:;; got answer:
:;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 6
:;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
:;; QUERY SECTION:
:;; jamcracker.com, type = NS, class = IN
:
:;; ANSWER SECTION:
:jamcracker.com. 1H IN NS fuji.jamcracker.com.
:
:Mike "Silby" Silbersack

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message