Date:      Tue, 20 Mar 2018 16:11:42 -0700
From:      Aleksandr Miroslav <alexmiroslav@gmail.com>
To:        freebsd-questions@freebsd.org
Subject:   weird network/DNS issues (nsd not returning answer)
Message-ID:  <CACcSE1yXgcp8vJS2CTmXWTwSs_cxrAU+NjirNn2n8Ls7fSsjgQ@mail.gmail.com>

I have a number of FreeBSD servers online. The other day, one of them,
which I set up a month ago, started exhibiting really weird behavior: it
gets no answers to the queries it sends to my two DNS servers, both of
which run nsd.

Initially I suspected pf or sshguard to be the issue, but this happens
with pf and sshguard turned off on all servers in question.

The other weird thing is that all other network traffic between these
servers is passing back and forth normally; only the nsd replies fail
to get through.

Here is the issue, roughly:

    - given multiple servers, labeled a-z
    - servers k and z run nsd
    - with the exception of server b, all other servers can communicate
      normally with servers k and z
    - with the exception of DNS queries, server b can communicate
      normally with servers k and z
    - b can ping, ssh, rsync and scp to and from servers k and z

The only issue is when b makes a DNS query to k or z. I see those two
servers get the query, and return the answer, but that answer never
reaches b. I have sniffed the network to confirm this.

Observe:

    # in these examples:
    # b.example.org = 66.66.66.66, the server that is misbehaving
    # k.example.org = 1.1.1.1, one of my DNS servers
    # c.example.org = 3.3.3.3, another server of mine, which I am
    #                 looking up in DNS

    # b makes the initial query to k
        14:11:46.912995 IP 66.66.66.66.18394 > 1.1.1.1.53: 22479+ A? c.example.org. (31)

    # k receives the query and immediately returns the answer
        14:11:46.931605 IP 66.66.66.66.18394 > 1.1.1.1.53: 22479+ A? c.example.org. (31)
        14:11:46.931854 IP 1.1.1.1.53 > 66.66.66.66.18394: 22479*- 1/2/1 A 3.3.3.3 (103)

    # this second line, the answer, never makes it to b
    # after a few seconds, b makes another query:
        14:11:51.969083 IP 66.66.66.66.12645 > 1.1.1.1.53: 22479+ A? c.example.org. (31)

    # k receives the second query and immediately returns the answer again
        14:11:51.991267 IP 66.66.66.66.12645 > 1.1.1.1.53: 22479+ A? c.example.org. (31)
        14:11:51.991508 IP 1.1.1.1.53 > 66.66.66.66.12645: 22479*- 1/2/1 A 3.3.3.3 (103)

    # tcpdump on b's interface still shows nothing to indicate that b
    # received the answer

    # [DNS names and IPs have been changed above.]
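
The comparison above (answers that k sends but b never sees) can also be
checked mechanically if the tcpdump text output from both machines is
saved to files. A rough sketch, assuming one packet per line as tcpdump
prints it and a nameserver on port 53; the function names and the file
names k.txt and b.txt are made up for illustration:

```shell
# Print the client port of every DNS answer (source port 53) found in a
# tcpdump text capture, one per line, sorted and de-duplicated.
answer_ports() {
    awk '$2 == "IP" && $3 ~ /\.53$/ && $4 == ">" {
        n = split($5, part, "[.]")   # e.g. 66.66.66.66.18394:
        sub(/:$/, "", part[n])       # strip the trailing colon
        print part[n]
    }' "$1" | sort -u
}

# List client ports that were answered in the first capture (taken on k)
# but for which no answer appears in the second capture (taken on b).
missing_answers() {
    k_ports=$(mktemp) || return 1
    b_ports=$(mktemp) || return 1
    answer_ports "$1" > "$k_ports"
    answer_ports "$2" > "$b_ports"
    comm -23 "$k_ports" "$b_ports"
    rm -f "$k_ports" "$b_ports"
}
```

Running `missing_answers k.txt b.txt` would then print one client port per
answer that left k without showing up on b.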

Here's what it looks like from b's command line:

    $ host c.example.org k.example.org
    # a few seconds delay
    ;; connection timed out; no servers could be reached
    $


b has the same problem with my other server z, which also runs nsd.

All my other servers can query k and z just fine. Only b is exhibiting
this problem.

All the servers run pf/sshguard. But these rules/configs have not been
updated in months.

I did do one other thing to debug. I shut down nsd on k, and set up a
listener on b like this:

    nc -l 10000

And on k, I did this:

    ls /etc | sudo nc -s 1.1.1.1 -p 53 b.example.org 10000

This produced the contents of /etc on b. So that means that without nsd
in the picture, k is able to talk to b via port 53 just fine.
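
One caveat about that test: nc uses TCP unless given -u, while the
lookups in the tcpdump output above went over UDP, so it only shows that
TCP from port 53 gets through. A UDP variant of the same test might look
like the sketch below. It reuses port 10000 and the addresses from above;
the two function names are made up, and they only wrap the commands so
that each can be run on the right machine:

```shell
# Run on b: wait for UDP datagrams on port 10000 (same port as the TCP test).
udp_listen_on_b() {
    nc -u -l 10000
}

# Run on k, with nsd stopped so that port 53 is free: send one datagram to b
# from source address 1.1.1.1 and source port 53, as an nsd answer would be.
# -w 1 asks nc to give up after one idle second instead of waiting forever.
udp_send_from_k() {
    echo "udp test from port 53" | sudo nc -u -w 1 -s 1.1.1.1 -p 53 b.example.org 10000
}
```

If the text shows up on b, UDP from port 53 gets through when nsd is not
involved. If it does not, that would suggest something between k and b is
dropping UDP with source port 53, independent of nsd.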


All the servers in question are running FreeBSD 11.1-RELEASE-p6.


I'm not exactly sure how to debug this problem further, because I can't
tell where the block is happening.

Any help appreciated.


