From owner-freebsd-stable Thu Jan 11 9:26:29 2001
Date: Thu, 11 Jan 2001 12:21:51 -0500 (EST)
From: Mike Andrews
To: stable@freebsd.org
Subject: Weird sporadic DNS resolution problems

I'm having a bizarre DNS resolution problem that I'm having a hell of a
time tracking down. Someone tell me I'm just being stupid. :)

For a few months now, I'm *sporadically* unable to resolve *some*
external domains. This started happening approximately between
4.1.1-RELEASE and 4.2-RELEASE, when I believe Bind 8 was upgraded in the
source tree. (I don't remember the exact date, sorry)

Here's what appears to be going on: When one (but not both) of the
nameservers for a domain replies non-authoritatively, named will cache a
negative response, rather than asking the other nameserver. Subsequent
lookups return an immediate failure. Restarting the nameserver, and then
immediately querying the same problematic domain DOES work, but only the
first query. After a few minutes/hours the domain stops working again.

This is especially chronic because Sendmail tries to resolve domains on
incoming email (for spam protection purposes)... it will give "Domain of
sender address foo@bar does not resolve" and return a 451 code. This
causes the other end to retry periodically, unless the other end is
something like Outlook Express, in which case the customer calls me and
complains. :)

One example domain is "farmersfrankfort.com".
This was moved from us to another site yesterday, but we still do MX for
them. Looking at whois and at the root servers, you can see that their
two new nameservers are now "cerberus.sbscorp.com" and "ns1.qwest.net".
Querying the sbscorp server works great, querying qwest doesn't -- it
appears Qwest never added them to their nameserver config at all. (It
has been only about 24 hours, so it's not *too* surprising I guess...)
Anyway, when someone on one of our dialups tries to send mail with
@farmersfrankfort.com on the end, our Sendmail is unable to resolve it
and rejects the message. If I restart (not reload) named, it will start
working for a while, then die on its own again. My theory is that if it
happens to query sbscorp it's happy, if it happens to query qwest it
isn't, and caches the fact that it isn't.

Other examples are "setel.com" and "se-tel.com". We sometimes have
problems exchanging mail with them because one of their DNS servers
appears to be answering non-authoritatively. Again, I can flush my
backlog by restarting named and immediately running the sendmail queue
manually (and I could probably flush their backlog by telnetting to
their SMTP port and issuing an ETRN)... but obviously that's not exactly
elegant :)

I've tried adding "max-ncache-ttl 1" to my named config, hoping it would
help. It didn't.

In one sense this is "not my problem" because their name server
shouldn't be answering non-authoritatively in the first place. But the
fact that this started happening after a make world a few months ago,
and that I feel it should be a slight bit more tolerant of other
people's sloppy configurations, makes it my problem.

Anyone have any ideas as to what's going on, or can tell me what
debugging output to enable that I could send here that would help figure
it out? Configuration options to named that would revert to older
behavior? A whack on the head? (I could just compile an older named I
guess, but I fear opening up security holes/DoS attacks.)
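One way to test the theory above would be to ask each listed nameserver
directly, with recursion turned off, and look at the AA (authoritative
answer) bit in the reply header. Below is a minimal Python sketch using
only the standard library. The function names, the fixed query ID, and
the 3-second timeout are made up for illustration, and it only decodes
the 12-byte DNS header, not the answer records.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a non-recursive DNS query (default QTYPE 1 = A, class IN)."""
    # Header: ID, flags (all zero: standard query, RD bit clear), QDCOUNT=1,
    # then ANCOUNT, NSCOUNT, ARCOUNT all zero.
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_header(response):
    """Decode the fixed 12-byte DNS header of a reply."""
    if len(response) < 12:
        raise ValueError("reply shorter than a DNS header")
    qid, flags, _qd, an, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {
        "id": qid,
        "is_response": bool(flags & 0x8000),    # QR bit
        "authoritative": bool(flags & 0x0400),  # AA bit
        "rcode": flags & 0x000F,                # 0 = NOERROR, 3 = NXDOMAIN
        "answers": an,
    }


def check_server(server_ip, name, timeout=3.0):
    """Send one UDP query straight to server_ip and return its header fields."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server_ip, 53))
        data, _addr = sock.recvfrom(512)
    return parse_header(data)
```

Calling check_server() once per nameserver address for the same domain
(for instance the two servers listed for farmersfrankfort.com) and
comparing the "authoritative" and "rcode" fields should show which
server is the one answering non-authoritatively. It does not show what
named then does with that answer.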
Mike Andrews * mandrews@dcr.net * mandrews@bit0.com * http://www.bit0.com
VP, sysadmin, & network guy, Digital Crescent Inc, Frankfort KY
Internet access for Frankfort, Lexington, Louisville and surrounding counties
www.fark.com: If it's not news, it's Fark. (Or something like that.)