I'm trying to setup a cluster using a private network on subnet 10. One machine has two interfaces, one to connect to the regular network and the other to connect to all the nodes on subnet 10. This CentOS 6 machine (let's call it "zaza.domain.com") runs DHCP, DNS and currently both of these are managed by Cobbler, which may or may not be part of the problem (although disabling it and doing everything manually still gives me problems).
If I SSH into zaza, and then try to SSH from zaza into node1, I get a warning message like follows:
[root@zaza ~]# ssh node1
reverse mapping checking getaddrinfo for node1.cluster.local [10.69.0.1] failed - POSSIBLE BREAK-IN ATTEMPT!
I still get a password prompt and can still login OK.
I know from sshd warning, "POSSIBLE BREAK-IN ATTEMPT!" for failed reverse DNS and "POSSIBLE BREAK-IN ATTEMPT!" in /var/log/secure — what does this mean? and a bunch of other searching that the cause of this error typically is a PTR record not being set. However, it is set - consider the following:
[root@zaza ~]# nslookup node1.cluster.local
Server: 10.69.0.69
Address: 10.69.0.69#53
Name: node1.cluster.local
Address: 10.69.0.1
[root@zaza ~]# nslookup 10.69.0.1
Server: 10.69.0.69
Address: 10.69.0.69#53
1.0.69.10.in-addr.arpa name = node1.cluster.local.
The 10.69.0.69 IP address is second interface of zaza.
If I try a different tool like dig, to actually view the PTR record I get the following output:
[root@zaza ~]# dig ptr 1.0.69.10.in-addr.arpa
; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.47.rc1.el6_8.4 <<>> ptr 69.0.69.10.in-addr.arpa
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 29499
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;1.0.69.10.in-addr.arpa. IN PTR
;; ANSWER SECTION:
1.0.69.10.in-addr.arpa. 300 IN PTR node1.cluster.local.
;; AUTHORITY SECTION:
10.in-addr.arpa. 300 IN NS zaza.cluster.local.
;; ADDITIONAL SECTION: zaza.cluster.local. 300 IN A 10.69.0.69
;; Query time: 0 msec
;; SERVER: 10.69.0.69#53(10.69.0.69)
;; WHEN: Wed Mar 1 17:05:44 2017
;; MSG SIZE rcvd: 110
It looks to me like the PTR record is set, so I don't know why SSH would throw a hissy fit when I try to connect to one of the node machines. To give all the information, here's the relevant config files, spoilered to make things look just a tad more readable...
/etc/named.conf
[root@zaza ~]# cat /etc/named.conf
options {
listen-on port 53 { any; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/named/data/named_stats.txt";
memstatistics-file "/var/named/data/named_mem_stats.txt";
allow-query { any; }; # was localhost
recursion yes;
# setup DNS forwarding
forwarders {1.2.3.4;}; # Real IP goes in here
};
logging {
channel default_debug {
file "data/named.run";
severity dynamic;
};
};
zone "cluster.local." {
type master;
file "cluster.local";
# these two lines allow DNS querying
allow-update { any; };
notify no;
};
zone "10.in-addr.arpa." {
type master;
file "10";
# these two lines allow DNS querying
allow-update { any; };
notify no;
};
/var/named/cluster.local
[root@zaza ~]# cat /var/named/cluster.local
$TTL 300
@ IN SOA zaza.cluster.local. nobody.example.com. (
2017030100 ; Serial
600 ; Refresh
1800 ; Retry
604800 ; Expire
300 ; TTL
)
IN NS zaza.cluster.local.
zaza IN A 10.69.0.69
node1 IN A 10.69.0.1;
node2 IN A 10.69.0.2;
/var/named/10
[root@zaza ~]# cat /var/named/10
$TTL 300
@ IN SOA zaza.cluster.local. root.zaza.cluster.local. (
2017030100 ; Serial
600 ; Refresh
1800 ; Retry
604800 ; Expire
300 ; TTL
)
IN NS zaza.cluster.local.
69.0.69 IN PTR zaza.cluster.local.
1.0.69 IN PTR node1.cluster.local.
2.0.69 IN PTR node2.cluster.local.
If you have any ideas, it'd be much appreciated!