
Odd RDNS issue with HE's DNS Server

Started by comptech, August 29, 2009, 01:10:28 PM


comptech

If I do a dig for the RDNS entry of my router's internal interface, I get no answer (SERVFAIL) from either the IPv4 or the IPv6 address of HE's recursive DNS server.

; <<>> DiG 9.5.0-P2 <<>> @2001:470:20::2 -x 2001:470:1f11:1bc::
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 41203
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. IN PTR

;; Query time: 87 msec
;; SERVER: 2001:470:20::2#53(2001:470:20::2)
;; WHEN: Sat Aug 29 15:00:35 2009
;; MSG SIZE  rcvd: 90
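
For reference, the name in the QUESTION SECTION above is just the address expanded to 32 nibbles, reversed, and placed under ip6.arpa.  A minimal Python sketch of that mapping, using only the standard library (the helper name reverse_name is made up for illustration):

```python
import ipaddress

def reverse_name(addr: str) -> str:
    """Return the fully qualified reverse-lookup name for an IP address."""
    # reverse_pointer expands the address, reverses its nibbles and appends
    # the reverse-lookup suffix; the trailing dot makes the name absolute.
    return ipaddress.ip_address(addr).reverse_pointer + "."

print(reverse_name("2001:470:1f11:1bc::"))
```

This should print the same name that dig -x asks for.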


If I do a trace, it works fine, until the TTL expires and I get no answer again.

; <<>> DiG 9.5.0-P2 <<>> @2001:470:20::2 +trace -x 2001:470:1f11:1bc::
; (1 server found)
;; global options:  printcmd
.                       3599632 IN      NS      j.root-servers.net.
.                       3599632 IN      NS      h.root-servers.net.
.                       3599632 IN      NS      a.root-servers.net.
.                       3599632 IN      NS      b.root-servers.net.
.                       3599632 IN      NS      g.root-servers.net.
.                       3599632 IN      NS      e.root-servers.net.
.                       3599632 IN      NS      m.root-servers.net.
.                       3599632 IN      NS      f.root-servers.net.
.                       3599632 IN      NS      d.root-servers.net.
.                       3599632 IN      NS      i.root-servers.net.
.                       3599632 IN      NS      l.root-servers.net.
.                       3599632 IN      NS      k.root-servers.net.
.                       3599632 IN      NS      c.root-servers.net.
;; Received 496 bytes from 2001:470:20::2#53(2001:470:20::2) in 64 ms

ip6.arpa.               172800  IN      NS      SEC1.APNIC.NET.
ip6.arpa.               172800  IN      NS      TINNIE.ARIN.NET.
ip6.arpa.               172800  IN      NS      NS2.LACNIC.NET.
ip6.arpa.               172800  IN      NS      NS.ICANN.ORG.
ip6.arpa.               172800  IN      NS      NS-SEC.RIPE.NET.
;; Received 221 bytes from 192.33.4.12#53(c.root-servers.net) in 34 ms

0.7.4.0.1.0.0.2.ip6.arpa. 10800 IN      NS      ns3.he.net.
0.7.4.0.1.0.0.2.ip6.arpa. 10800 IN      NS      ns1.he.net.
0.7.4.0.1.0.0.2.ip6.arpa. 10800 IN      NS      ns5.he.net.
0.7.4.0.1.0.0.2.ip6.arpa. 10800 IN      NS      ns2.he.net.
0.7.4.0.1.0.0.2.ip6.arpa. 10800 IN      NS      ns4.he.net.
;; Received 186 bytes from 2001:610:240:0:53::4#53(NS-SEC.RIPE.NET) in 160 ms

c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 4900 IN NS xenon.it.cx.
;; Received 115 bytes from 2001:470:400::2#53(ns4.he.net) in 83 ms

0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 60 IN PTR router-sis0.comptech.it.cx.
c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 60 IN NS xenon.it.cx.
;; Received 150 bytes from 2001:470:1f11:1bc:20f:1fff:fe04:8733#53(xenon.it.cx) in 9 ms
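
The zone names in the trace follow from the prefix lengths: HE's /32 and the delegated /64 each end on a nibble boundary, so each maps to one ip6.arpa zone.  A small Python sketch of that calculation (reverse_zone is a hypothetical helper, and treating the tunnel's routed prefix as 2001:470:1f11:1bc::/64 is an assumption based on the 16-nibble delegation shown above):

```python
import ipaddress

def reverse_zone(prefix: str) -> str:
    """Return the ip6.arpa zone name that covers an IPv6 prefix."""
    net = ipaddress.ip_network(prefix)
    if net.version != 6 or net.prefixlen % 4:
        raise ValueError("expected an IPv6 prefix on a nibble boundary")
    # Keep only the nibbles covered by the prefix, then reverse them.
    nibbles = net.network_address.exploded.replace(":", "")[: net.prefixlen // 4]
    return ".".join(reversed(nibbles)) + ".ip6.arpa."

for prefix in ("2001:470::/32", "2001:470:1f11:1bc::/64"):
    print(prefix, "->", reverse_zone(prefix))
```

The first result is the zone served by ns1-5.he.net, the second is the zone delegated to xenon.it.cx.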


Response after the trace.

; <<>> DiG 9.5.0-P2 <<>> @2001:470:20::2 -x 2001:470:1f11:1bc::
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11463
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. IN PTR

;; ANSWER SECTION:
0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 60 IN PTR router-sis0.comptech.it.cx.

;; Query time: 195 msec
;; SERVER: 2001:470:20::2#53(2001:470:20::2)
;; WHEN: Sat Aug 29 15:05:24 2009
;; MSG SIZE  rcvd: 130


Any ideas?  It used to work correctly until a couple days ago.

maestroevolution

I'm seeing the same thing for my domain for the "test your mailserver has remote DNS".

I noticed that ns2 doesn't allow recursive queries, so it's hard to troubleshoot what it thinks the problem is.


jimb

Quote from: maestroevolution on August 30, 2009, 09:16:58 PM
I'm seeing the same thing for my domain for the "test your mailserver has remote DNS".

I noticed that ns2 doesn't allow recursive queries, so it's hard to troubleshoot what it thinks the problem is.


It's likely a slave for HE's 2001:470::/32 RDNS, since ns1 is the master.  So it shouldn't need to recurse for the delegations.

Also, are you aware that NS2 isn't the same as the anycasted caching only server?  Similar IPs, but not the same:
ns2.he.net  2001:470:200::2
anycast     2001:470:20::2


Just FYI in case you're mixing them up.  :P
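
To make the difference easier to spot, the two addresses can be expanded to their full form.  A quick Python sketch with the standard ipaddress module (the helper names groups and differing_groups are invented for this example):

```python
import ipaddress

SERVERS = {
    "ns2.he.net": "2001:470:200::2",
    "anycast": "2001:470:20::2",
}

def groups(addr: str) -> list:
    """Return the eight zero-padded 16-bit groups of an IPv6 address."""
    return ipaddress.IPv6Address(addr).exploded.split(":")

def differing_groups(a: str, b: str) -> list:
    """Return the indexes of the groups in which two addresses differ."""
    return [i for i, (x, y) in enumerate(zip(groups(a), groups(b))) if x != y]

for name, addr in SERVERS.items():
    print(f"{name:<12}{':'.join(groups(addr))}")
print("differing group indexes:", differing_groups(*SERVERS.values()))
```

Only the third group differs (0200 versus 0020), which is easy to miss in the compressed notation.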

I suspect some odd problem on the anycasted server(s).  It seems to be able to recurse the NS records for the delegated /64s and such, but gets a SERVFAIL when u query for PTR of /64, /128.  The ns1-5.he.net servers return NS records in those cases.  Almost like it refuses to recurse for subdomains of the /32.  I wonder if it's caching only for everything, including the HE ip6.arpas?  Mebbe it's an "unlisted slave" w/ bad data?  Or maybe there's a problem with it talking to the internal HE ns1-5.he.net servers 'cause of an issue related to anycast routing, so it gets SERVFAIL (like reply traffic gets routed back to the wrong server 'cause of routing paths at diff locations for the anycast)?  Guessing wildly here.  :P

maestroevolution

Quote from: jimb on August 31, 2009, 12:13:58 AM

Also, are you aware that NS2 isn't the same as the anycasted caching only server?  Similar IPs, but not the same:
ns2.he.net  2001:470:200::2
anycast     2001:470:20::2



Yes, I was aware.  When I run tests against the anycast, I'm hitting ns1.  The anycast DNS (ns1) seems able to resolve, it's only when I query the ns2 that I see failures.

Joel

gshaver


The NS name that the zone is delegated to resolves to both an A and a AAAA.  Pdns wants an A or a AAAA but not both.  For maximum compatibility it will always try to get an A record and will fall through to a AAAA if no A is available.

Here's the lookup order from one of the recursors

Where is this delegated? (we'll skip all the root server stuff and query ns1 & ns2)

dig -x 2001:470:1f11:1bc:: @ns1.he.net +nocomments

; <<>> DiG 9.4.2-P2 <<>> -x 2001:470:1f11:1bc:: @ns1.he.net +nocomments
;; global options:  printcmd
;0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. IN PTR
c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 4900 IN NS xenon.it.cx.
;; Query time: 1 msec
;; SERVER: 216.218.130.2#53(216.218.130.2)
;; WHEN: Mon Aug 31 17:57:30 2009
;; MSG SIZE  rcvd: 115

** Query from ns2 as well **

dig -x 2001:470:1f11:1bc:: @ns2.he.net +nocomments

; <<>> DiG 9.4.2-P2 <<>> -x 2001:470:1f11:1bc:: @ns2.he.net +nocomments
;; global options:  printcmd
;0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. IN PTR
c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 4900 IN NS xenon.it.cx.
;; Query time: 22 msec
;; SERVER: 2001:470:200::2#53(2001:470:200::2)
;; WHEN: Mon Aug 31 18:20:13 2009
;; MSG SIZE  rcvd: 115



From the above, we find that it's delegated to xenon.it.cx

host xenon.it.cx
xenon.it.cx has address 98.156.93.10
xenon.it.cx has IPv6 address 2001:470:1f11:1bc:20f:1fff:fe04:8733


ok we have an A and a AAAA

We'll check the AAAA first

dig -x 2001:470:1f11:1bc:: @2001:470:1f11:1bc:20f:1fff:fe04:8733  +nocomments

; <<>> DiG 9.4.2-P2 <<>> -x 2001:470:1f11:1bc:: @2001:470:1f11:1bc:20f:1fff:fe04:8733 +nocomments
;; global options:  printcmd
;0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. IN PTR
0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 60 IN PTR router-sis0.comptech.it.cx.
c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa. 60 IN NS xenon.it.cx.
;; Query time: 120 msec
;; SERVER: 2001:470:1f11:1bc:20f:1fff:fe04:8733#53(2001:470:1f11:1bc:20f:1fff:fe04:8733)
;; WHEN: Mon Aug 31 18:03:33 2009
;; MSG SIZE  rcvd: 150



Perfect!

Next the A

dig -x 2001:470:1f11:1bc:: @98.156.93.10  +nocomments

; <<>> DiG 9.4.2-P2 <<>> -x 2001:470:1f11:1bc:: @98.156.93.10 +nocomments
;; global options:  printcmd
;; connection timed out; no servers could be reached


Not so good.

The behavior you see from a dig +trace is the result of dig pulling the query apart and performing
it a piece at a time.  The v6 address gets cached in the recursor that you're using and remains until the
cache expires (this is why it would work for a little while after the dig +trace was done).

Here's the trace from pdns showing the failure.

Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: We have NS in cache for 'c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.' (flawedNSSet=0)
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: Cache consultations done, have 1 NS to contact
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: Nameservers: xenon.it.cx.(239ms)
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: Trying to resolve NS 'xenon.it.cx.' (1/1)
Aug 31 15:32:31 admin pdns_recursor[11530]: [23]   xenon.it.cx.: Looking for CNAME cache hit of 'xenon.it.cx.|CNAME'
Aug 31 15:32:31 admin pdns_recursor[11530]: [23]   xenon.it.cx.: No CNAME cache hit of 'xenon.it.cx.|CNAME' found
Aug 31 15:32:31 admin pdns_recursor[11530]: [23]   xenon.it.cx.: Found cache hit for A: 98.156.93.10[ttl=3556]
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: Resolved 'c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.' NS xenon.it.cx. to: 98.156.93.10
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: Trying IP 98.156.93.10, asking '0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.|PTR'
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: query throttled
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: Failed to resolve via any of the 1 offered NS at level 'c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.'
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] 0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.: failed (res=-1)
Aug 31 15:32:31 admin pdns_recursor[11530]: [23] answer to question '0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.|PTR': 0 answers, 0 additional, took 0 packets, 1 throttled, 0 timeouts, 0 tcp connections, rcode=2
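
The last log line ties back to the SERVFAIL that dig reported: rcode=2 is the DNS response code for SERVFAIL.  A rough Python sketch for pulling the interesting fields out of that summary line (the regular expression is written against the log format shown above only, and summarize is a hypothetical helper, not part of pdns):

```python
import re

# Standard DNS response codes 0-5.
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}

SUMMARY = re.compile(
    r"answer to question '(?P<qname>[^|']+)\|(?P<qtype>[A-Z]+)': "
    r"(?P<answers>\d+) answers, .*?(?P<throttled>\d+) throttled, "
    r".*?rcode=(?P<rcode>\d+)"
)

def summarize(line: str):
    """Extract name, type, counters and rcode from a summary line, or None."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    rcode = int(m["rcode"])
    return {
        "qname": m["qname"],
        "qtype": m["qtype"],
        "answers": int(m["answers"]),
        "throttled": int(m["throttled"]),
        "rcode": RCODES.get(rcode, str(rcode)),
    }

line = ("Aug 31 15:32:31 admin pdns_recursor[11530]: [23] answer to question "
        "'0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.c.b.1.0.1.1.f.1.0.7.4.0.1.0.0.2.ip6.arpa.|PTR': "
        "0 answers, 0 additional, took 0 packets, 1 throttled, 0 timeouts, "
        "0 tcp connections, rcode=2")
print(summarize(line))
```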


To work around this you may want to specify two nameservers: one with the v4 address and a
second with the v6 address (this could be the same machine if you wanted).  That should trick it into failing
over to v6 if v4 seems to be down.
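
As a rough illustration of why that helps, here is a toy Python model of the selection logic described above.  It is not PowerDNS code: the function names, the data layout and the ns-v4/ns-v6 host names are all invented for the example, and the only behavior it encodes is "one address per NS name, A preferred over AAAA".

```python
from typing import Dict, Optional, Set

def pick_address(addrs: Dict[str, str]) -> Optional[str]:
    """Pick a single address for an NS name: the A if present, else the AAAA."""
    return addrs.get("A") or addrs.get("AAAA")

def resolve(ns_set: Dict[str, Dict[str, str]], reachable: Set[str]) -> Optional[str]:
    """Return the first NS address that answers, or None (think SERVFAIL)."""
    for addrs in ns_set.values():
        addr = pick_address(addrs)  # only one address is tried per NS name
        if addr in reachable:
            return addr
    return None

V4 = "98.156.93.10"
V6 = "2001:470:1f11:1bc:20f:1fff:fe04:8733"
reachable = {V6}  # the IPv4 address timed out in the dig test above

# Current setup: one NS name carrying both an A and a AAAA.
single = {"xenon.it.cx.": {"A": V4, "AAAA": V6}}
# Suggested workaround: two NS names, one per address family (hypothetical names).
split = {"ns-v4.example.": {"A": V4}, "ns-v6.example.": {"AAAA": V6}}

print("single dual-stack NS:", resolve(single, reachable))
print("two single-family NS:", resolve(split, reachable))
```

In this model the single dual-stack name never gets past its dead A record, while the split pair falls through to the second name and reaches the v6 address.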

Regards,
Gary



jimb

Interesting.  So since the IPv4 for his NS was unreachable, it won't just fail over and try the IPv6?  But if there are two different NS, one w/ a v4 and one w/ a v6, it would?   ???

gshaver

From what the pdns trace shows, it's going to search until it finds an a or aaaa for the nameserver. 
If it fails for lookup with the ip that it got, it's done.  It doesn't seem to want a second opinion. 

If it has two ns' it has one to try in case the first one fails even if it's in a completely different address family.

A second address for the nameserver != second nameserver in this case.

There's probably some obscure RFC supporting/condemning the behavior... or both

Regards,
Gary

comptech

I fixed the IPv4 problem with my DNS server and it seems to be working fine now.   ;D

I'll have to create an IPv4 only and IPv6 only host name for my DNS server so this doesn't happen anymore.

jimb

Quote from: gshaver on August 31, 2009, 07:46:28 PM
From what the pdns trace shows, it's going to search until it finds an a or aaaa for the nameserver. 
If it fails for lookup with the ip that it got, it's done.  It doesn't seem to want a second opinion. 

If it has two ns' it has one to try in case the first one fails even if it's in a completely different address family.

A second address for the nameserver != second nameserver in this case.

There's probably some obscure RFC supporting/condemning the behavior... or both

Regards,
Gary

Interesting.  Do you find that behavior consistent among all name servers (BIND, MS, etc)?

Also, I sort of figured that the IPv6 oriented RDNS tests would just be done by some PHP or Perl script calling "dig aaaa" or somesuch against the IPv6 of the delegated NS to make sure the DNS server has IPv6 connectivity, etc, but I guess it's just easier to do a "dig" without pointing it to a specific NS, and just use the existing infrastructure.  Of course, this revealed this little issue with an unreachable A record, so it's also a more "thorough" test, if you look at it a certain way.  :P

maestroevolution

Ahh... I expect this is what my issue is, too.

I did create a pseudo-dummy A record for the mail server (as that's what the test does: checks for IPv6 reverse DNS of the mail server), but I did not for the DNS server.

I'll create an A record as soon as I finish migrating the vms to their new ESX home.

Thanks,

Joel