Date: Tue, 24 Apr 2007 17:50:24 +1000 (EST)
From: Ian Smith
To: Howard Leadmon
Cc: freebsd-stable@freebsd.org
In-Reply-To: <005101c785f8$62b04010$081872cf@Leadmon.local>
Subject: Re: FreeBSD DNS Resolver Issues?
List-Id: Production branch of FreeBSD source code

On Mon, 23 Apr 2007, Howard Leadmon wrote:

> OK, now I am a bit stumped, so wanted to post here in hopes someone might
> have an idea.  First off the FBSD machine in question is an x86 server running
> 6.2-STABLE, supped a few weeks ago, so is fairly current.
>
> I use said machine to handle all of my eMail and things in general seem to
> work great, though I have this one mystery.
>
> If we try to send mail to anyuser@wtplaw.com the mail will just sit in the
> queue forever, until it's returned as a failure.  Talking with the admins at
> wtplaw they are swearing their configs are correct, and it's something on our
> side.
> Looking at the mailq, I see:
>
> l3NEqolY011124    28697 Mon Apr 23 10:52
>                  (Deferred: Name server: mail.wtplaw.com.: host name lookup fa)
>
>
> So as it's quick and easy I used host and did a lookup:
>
> $ host wtplaw.com
> wtplaw.com has address 69.20.43.246
> wtplaw.com mail is handled by 10 mail.wtplaw.com.
>
>
> Then on mail.wtplaw.com:
>
> $ host mail.wtplaw.com
> mail.wtplaw.com has address 65.111.69.228
> mail.wtplaw.com has address 66.166.181.163
> Host mail.wtplaw.com not found: 2(SERVFAIL)
> ;; connection timed out; no servers could be reached

I'm getting the same results here, using dig rather than host (FWIW).
I'm also seeing inconsistent results re the listed NS for that domain:

=======
smithi on paqi% dig wtplaw.com any

; <<>> DiG 9.3.4 <<>> wtplaw.com any
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 15237
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;wtplaw.com.                    IN      ANY

;; ANSWER SECTION:
wtplaw.com.             86363   IN      NS      ns1.airband.net.
wtplaw.com.             86363   IN      NS      ns2.airband.net.
wtplaw.com.             86363   IN      A       69.20.43.246

;; AUTHORITY SECTION:
wtplaw.com.             86363   IN      NS      ns2.airband.net.
wtplaw.com.             86363   IN      NS      ns1.airband.net.

;; ADDITIONAL SECTION:
ns1.airband.net.        5687    IN      A       216.138.97.246
ns2.airband.net.        5687    IN      A       216.138.119.6

;; Query time: 236 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Tue Apr 24 15:54:01 2007
;; MSG SIZE  rcvd: 123
=======

but:

=======
smithi on paqi% dig mail.wtplaw.com

; <<>> DiG 9.3.4 <<>> mail.wtplaw.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33923
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.wtplaw.com.               IN      A

;; ANSWER SECTION:
mail.wtplaw.com.        3       IN      A       65.111.69.228
mail.wtplaw.com.        3       IN      A       66.166.181.163

;; AUTHORITY SECTION:
mail.wtplaw.com.        86399   IN      NS      lp1.wtplaw.com.
mail.wtplaw.com.        86399   IN      NS      lp2.wtplaw.com.

;; Query time: 466 msec
=======

Note different NS, with the As for mail.wtplaw.com returned.  Further:

=======
smithi on paqi% dig wtplaw.com mx

; <<>> DiG 9.3.4 <<>> wtplaw.com mx
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21494
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2

;; QUESTION SECTION:
;wtplaw.com.                    IN      MX

;; ANSWER SECTION:
wtplaw.com.             86400   IN      MX      10 mail.wtplaw.com.

;; AUTHORITY SECTION:
wtplaw.com.             62547   IN      NS      ns1.airband.net.
wtplaw.com.             62547   IN      NS      ns2.airband.net.

;; ADDITIONAL SECTION:
ns1.airband.net.        5671    IN      A       216.138.97.246
ns2.airband.net.        5671    IN      A       216.138.119.6

;; Query time: 1021 msec
=======

Here no A is returned for mail.wtplaw.com, and note the airband.net NS.
Pretty sure sendmail does an MX request, so that's what it'll get then,
which explains your mailq response.

At (one set of) the listed NServers:

=======
; <<>> DiG 9.3.4 <<>> @lp1.wtplaw.com. mail.wtplaw.com.
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 24202
;; flags: qr aa; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;mail.wtplaw.com.               IN      A

;; ANSWER SECTION:
mail.wtplaw.com.        3       IN      A       66.166.181.163
mail.wtplaw.com.        3       IN      A       65.111.69.228

;; Query time: 268 msec
;; SERVER: 65.111.69.226#53(65.111.69.226)
;; WHEN: Tue Apr 24 15:57:00 2007
;; MSG SIZE  rcvd: 65
=======

Note these servers do return the A records for mail.wtplaw.com,
authoritatively ('aa' flag set); same digging @lp2.wtplaw.com.

So trying the 'other' listed NServers above:

=======
smithi on paqi% dig @ns1.airband.net. wtplaw.com. any

; <<>> DiG 9.3.4 <<>> @ns1.airband.net. wtplaw.com. any
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 57836
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 2

;; QUESTION SECTION:
;wtplaw.com.                    IN      ANY

;; ANSWER SECTION:
wtplaw.com.             86400   IN      A       69.20.43.246
wtplaw.com.             86400   IN      MX      10 mail.wtplaw.com.
wtplaw.com.             86400   IN      TXT     "v=spf1 \
    ip4:65.111.69.230 ip4:66.166.181.165 ip4:65.111.69.228 \
    ip4:66.166.181.163 mx ~all"
wtplaw.com.             86400   IN      SOA     ns1.airband.net. \
    hostmaster.airband.net. 2007040708 10800 3600 604800 86400
wtplaw.com.             86400   IN      NS      ns2.airband.net.
wtplaw.com.             86400   IN      NS      ns1.airband.net.

;; ADDITIONAL SECTION:
ns1.airband.net.        3600    IN      A       216.138.97.246
ns2.airband.net.        3600    IN      A       216.138.119.6

;; Query time: 249 msec
;; SERVER: 216.138.97.246#53(216.138.97.246)
;; WHEN: Tue Apr 24 17:11:06 2007
;; MSG SIZE  rcvd: 292
=======

Still no A record for mail.wtplaw.com from the airband.net servers.
How's sendmail gonna mail it?

So their DNS is inconsistent, and pretty well b0rked (from here anyway).

As for the Solaris box working, dunno, maybe caching?  Or maybe the
results are flapping around, what with two sets of supposedly auth NS?

HTH, Ian

> As you can see I am getting a failure, which I know will make sendmail blow a
> gasket over the issue.  Oh, and yes, I have WorkAroundBrokenAAAA set in my
> configs.
>
> Here is where it gets interesting, and confuses me.  I also have a Sun SPARC
> server running Solaris-10, so figured I would try the same on it.  Note that
> both servers use the same DNS servers for resolution, plus I also tried the
> above specifying the actual listed nameservers for wtplaw.com and got the same
> results.
>
> OK, so let's try the above on my Solaris-10 server:
>
> $ host wtplaw.com
> wtplaw.com has address 69.20.43.246
> wtplaw.com mail is handled by 10 mail.wtplaw.com.
>
> and:
>
> $ host mail.wtplaw.com
> mail.wtplaw.com has address 65.111.69.228
> mail.wtplaw.com has address 66.166.181.163
>
>
> Note I am getting no failure messages from my Solaris machine.  So I even
> turned on the -v verbose option.
>
> Here is from the FreeBSD machine:
>
> $ host -v mail.wtplaw.com
> Trying "mail.wtplaw.com"
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 27765
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 1
>
> ;; QUESTION SECTION:
> ;mail.wtplaw.com.               IN      A
>
> ;; ANSWER SECTION:
> mail.wtplaw.com.        3       IN      A       65.111.69.228
> mail.wtplaw.com.        3       IN      A       66.166.181.163
>
> ;; AUTHORITY SECTION:
> mail.wtplaw.com.        85342   IN      NS      lp2.wtplaw.com.
> mail.wtplaw.com.        85342   IN      NS      lp1.wtplaw.com.
>
> ;; ADDITIONAL SECTION:
> lp2.wtplaw.com.         85864   IN      A       66.166.181.172
>
> Received 117 bytes from 207.114.24.13#53 in 22 ms
> Trying "mail.wtplaw.com"
> Host mail.wtplaw.com not found: 2(SERVFAIL)
> Received 33 bytes from 207.114.24.13#53 in 80 ms
> Trying "mail.wtplaw.com"
> ;; connection timed out; no servers could be reached
>
>
> Note the failures.  I have to honestly say I am not totally sure what it's
> trying to do at the end there, maybe someone can explain that one to me.
>
> Here is the Solaris-10 machine making the same query:
>
> $ host -v mail.wtplaw.com
> Trying "mail.wtplaw.com"
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 549
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 1
>
> ;; QUESTION SECTION:
> ;mail.wtplaw.com.               IN      A
>
> ;; ANSWER SECTION:
> mail.wtplaw.com.        3       IN      A       65.111.69.228
> mail.wtplaw.com.        3       IN      A       66.166.181.163
>
> ;; AUTHORITY SECTION:
> mail.wtplaw.com.        85225   IN      NS      lp1.wtplaw.com.
> mail.wtplaw.com.        85225   IN      NS      lp2.wtplaw.com.
>
> ;; ADDITIONAL SECTION:
> lp2.wtplaw.com.         85747   IN      A       66.166.181.172
>
> Received 117 bytes from 207.114.24.13#53 in 40 ms
>
>
> Again, the query seemed fine, no troubles.
>
> As stated earlier, talking to the sysadmin of the wtplaw.com site, they are
> swearing there is nothing wrong, they are responding to queries as they should
> be, and that we have a configuration problem on our end.
> If this is true, I'd
> sure love to know what it is, so I can fix it, and if not I'd love to know
> what to tell them is wrong with their DNS so I can get it corrected.  As right
> now I am bouncing mail from a few clients to this user, and I can't seem to
> find any resolution to this issue.
>
> When I realized that Solaris seems happy with their DNS, but FBSD is not, it
> just made this even more of a mystery.  If anyone can help shed any light on
> this it would sure be appreciated.
>
>
> ---
> -Howard
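
For anyone wanting to check this sort of thing mechanically, here is a
minimal offline sketch in Python. It only restates two observations from
the dig output in this thread: the NS set seen for wtplaw.com shares no
servers with the NS set seen for mail.wtplaw.com, and the MX reply
carries no address for its target, so a sender has to make a separate
lookup for mail.wtplaw.com. The helper names (normalize, compare_ns,
mx_targets_without_address) are hypothetical and not part of any DNS
library, and the record data is copied by hand from the transcripts.
A separate NS set for a host name can be a legitimate delegation, so
this shows where the lookups diverge, not that either set is wrong.

```python
"""Offline sketch: restate two observations from the dig output above.

All record data is transcribed from this thread; the helper functions are
hypothetical illustrations, not part of any DNS library.
"""


def normalize(name):
    """Lower-case a DNS name and make sure it ends with a trailing dot."""
    name = name.strip().lower()
    return name if name.endswith(".") else name + "."


def compare_ns(zone_ns, host_ns):
    """Split two NS lists into shared, zone-only and host-only names."""
    zone = {normalize(n) for n in zone_ns}
    host = {normalize(n) for n in host_ns}
    return {
        "shared": sorted(zone & host),
        "zone_only": sorted(zone - host),
        "host_only": sorted(host - zone),
    }


def mx_targets_without_address(mx_targets, additional):
    """Return MX targets that have no address in the ADDITIONAL section.

    `additional` maps owner name -> list of addresses from a single reply.
    """
    known = {normalize(n) for n, addrs in additional.items() if addrs}
    return [normalize(t) for t in mx_targets if normalize(t) not in known]


# NS set returned for wtplaw.com ("dig wtplaw.com any").
ZONE_NS = ["ns1.airband.net.", "ns2.airband.net."]
# NS set returned for mail.wtplaw.com ("dig mail.wtplaw.com").
HOST_NS = ["lp1.wtplaw.com.", "lp2.wtplaw.com."]
# MX target and ADDITIONAL section from "dig wtplaw.com mx".
MX_TARGETS = ["mail.wtplaw.com."]
MX_ADDITIONAL = {
    "ns1.airband.net.": ["216.138.97.246"],
    "ns2.airband.net.": ["216.138.119.6"],
}

ns_report = compare_ns(ZONE_NS, HOST_NS)
missing = mx_targets_without_address(MX_TARGETS, MX_ADDITIONAL)

if __name__ == "__main__":
    print("NS sets share no servers:", not ns_report["shared"])
    print("MX targets needing a separate address lookup:", missing)
```

Feeding in fresh answers from your own resolver (typed in by hand, as
here) would show whether the picture changes over time, which is one way
to test the "flapping" guess above.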