Bind 9.8.2 on CentOS 6.4 x64 on IBM X server.
Bind 9.8.2 from the official CentOS updates repo used to work fine, but then I needed to tweak it by compiling from source (nothing crazy here -- just compiled with different options than those offered in the CentOS RPM). Note that I used the source from bind-9.9.4, not 9.8.x, so maybe this is the problem? I doubt it, but it's possible.
Recently, I decided to go back to installing from RPM, but now, I can't get Bind to start.
The only messages I get tell me nothing:
# named -g -c /etc/named.conf
01-Dec-2013 15:46:57.899 starting BIND 9.8.2rc1-RedHat-9.8.2-0.17.rc1.el6_4.6 -g -c /etc/named.conf
01-Dec-2013 15:46:57.899 built with '--build=x86_64-redhat-linux-gnu' '--host=x86_64-redhat-linux-gnu' '--target=x86_64-redhat-linux-gnu' '--program-prefix=' '--prefix=/usr' '--exec-prefix=/usr' '--bindir=/usr/bin' '--sbindir=/usr/sbin' '--sysconfdir=/etc' '--datadir=/usr/share' '--includedir=/usr/include' '--libdir=/usr/lib64' '--libexecdir=/usr/libexec' '--sharedstatedir=/var/lib' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--with-libtool' '--localstatedir=/var' '--enable-threads' '--enable-ipv6' '--with-pic' '--disable-static' '--disable-openssl-version-check' '--with-dlz-ldap=yes' '--with-dlz-postgres=yes' '--with-dlz-mysql=yes' '--with-dlz-filesystem=yes' '--with-gssapi=yes' '--disable-isc-spnego' '--with-docbook-xsl=/usr/share/sgml/docbook/xsl-stylesheets' '--enable-fixed-rrset' 'build_alias=x86_64-redhat-linux-gnu' 'host_alias=x86_64-redhat-linux-gnu' 'target_alias=x86_64-redhat-linux-gnu' 'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic' 'CPPFLAGS= -DDIG_SIGCHASE'
01-Dec-2013 15:46:57.899 ----------------------------------------------------
01-Dec-2013 15:46:57.899 BIND 9 is maintained by Internet Systems Consortium,
01-Dec-2013 15:46:57.899 Inc. (ISC), a non-profit 501(c)(3) public-benefit
01-Dec-2013 15:46:57.899 corporation. Support and training for BIND 9 are
01-Dec-2013 15:46:57.899 available at https://www.isc.org/support
01-Dec-2013 15:46:57.899 ----------------------------------------------------
01-Dec-2013 15:46:57.899 adjusted limit on open files from 4096 to 1048576
01-Dec-2013 15:46:57.899 found 24 CPUs, using 24 worker threads
01-Dec-2013 15:46:57.900 using up to 4096 sockets
01-Dec-2013 15:46:57.907 loading configuration: failure
01-Dec-2013 15:46:57.907 exiting (due to fatal error)
Syntax errors in named.conf and file permission errors are typically listed just before the "loading configuration: failure" log line, but in this case there is no error, so I don't know what is going on.
The funny thing is, if I reinstall bind from my source compilation (make install), bind works just fine. Right now I'm just using the standard default named.conf. I won't bother posting it here because I know it is valid -- that's not the issue here. I feel like I may have accidentally deleted a shared lib or something, during my fiddling... sticky fingers? working when tired? who knows.
Here's the options I used compiling bind, if it helps:
./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin --sysconfdir=/etc --datadir=/usr/share --includedir=/usr/include --libdir=/usr/lib64 --libexecdir=/usr/libexec --sharedstatedir=/var/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-libtool --localstatedir=/var --enable-threads --with-dlz-filesystem=yes --with-gssapi=/usr/include/gssapi --enable-fixed-rrset --with-dlopen=yes
Ok, so I'll cut to the chase. I don't need my hand held through this process, but I do need a direction to go in. I have no idea how to further debug this issue. For example, how would I monitor whether or not a process (e.g. named) is querying a shared lib or a resource that isn't there? Bind doesn't have a flag for extra verbosity, aside from what "-g" does. The "-d" flag doesn't give any more verbosity for this error. How would one troubleshoot this further? Ktrace, or some other debugging tool? I'm at a loss, and would love some suggestions.
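To make the question concrete, here is a rough sketch of the kind of tracing I have in mind on Linux. It assumes strace is installed (it is not part of Bind), and the helper name failed_opens and the log path /tmp/named.strace are placeholders I made up. The idea is to record the file-related system calls named makes, then list the paths it looked for that do not exist:

```shell
#!/bin/sh
# Hypothetical helper (the name is my own): read strace output on stdin and
# print each path that an open/openat/stat/access call failed to find (ENOENT).
failed_opens() {
    # Lines written by "strace -f -o FILE" may start with a PID, so allow one.
    grep -E '^([0-9]+ +)?(open|openat|stat|access)\(' \
        | grep 'ENOENT' \
        | sed 's/^[^"]*"\([^"]*\)".*$/\1/' \
        | sort -u
}

# Example usage (assumes strace is installed; run as root):
#   strace -f -e trace=file -o /tmp/named.strace /usr/sbin/named -g -c /etc/named.conf
#   failed_opens < /tmp/named.strace
```

Plenty of ENOENT results are harmless (the dynamic loader probes several directories for each library), so this would only narrow things down, not point straight at the culprit.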
Things I've tried:
- yum reinstall on every package listed from repoquery --requires --recursive --resolve bind, repoquery --requires --recursive --resolve bind-utils, repoquery --requires --recursive --resolve bind-libs
- yum remove bind bind-utils bind-libs, then manually remove all remaining tidbits, then reinstall those three packages
- Running ldconfig after reinstalling bind from RPM (totally redundant, but what the heck)
- SELinux is disabled, and AppArmor is not installed
I really wish the ISC devs had made an uninstall make target, because there is really no easy way (that I'm aware of) to uninstall bind after compiling from source. (Feel free to enlighten me there.)
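Related to that, here is a rough sketch of how leftovers from the source build might be spotted. It assumes the stock ldd and rpm tools, and the helper names resolved_libs and unowned_libs are my own. It lists the shared libraries a binary resolves at load time and prints the ones that no installed RPM claims to own:

```shell
#!/bin/sh
# Hypothetical helper: read ldd output on stdin and print the absolute path
# of every shared library that was resolved ("name => /path (addr)" lines).
resolved_libs() {
    awk '$2 == "=>" && $3 ~ /^\// { print $3 }'
}

# Hypothetical helper: print the libraries used by the given binary that are
# not owned by any installed RPM (candidates for "make install" leftovers).
unowned_libs() {
    ldd "$1" | resolved_libs | while read -r lib; do
        # rpm -qf exits non-zero when no package owns the file.
        rpm -qf "$lib" >/dev/null 2>&1 || echo "$lib"
    done
}

# Example usage:
#   unowned_libs /usr/sbin/named
#   rpm -V bind bind-libs bind-utils   # verify installed files against the RPM database
```

This only covers libraries linked at load time. Anything named opens later at runtime would not show up here.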
Thanks for any pointers, and if you need more info, let me know.
*** EDIT
Output from su - named -c "/usr/sbin/named -d 9 -g -c /etc/named.conf" -s /bin/bash:
01-Dec-2013 17:16:30.994 Registering DLZ_dlopen driver
01-Dec-2013 17:16:30.994 Registering SDLZ driver 'dlopen'
01-Dec-2013 17:16:30.994 Registering DLZ driver 'dlopen'
01-Dec-2013 17:16:30.996 decrement_reference: delete from rbt: 0x7ff208214068 .
01-Dec-2013 17:16:31.001 load_configuration: failure
01-Dec-2013 17:16:31.001 loading configuration: failure
01-Dec-2013 17:16:31.001 exiting (due to fatal error)
Finally, a little more of a clue! I had been using "-d 10" with bind, a debug level that apparently doesn't exist, so no wonder it didn't give me any additional debug info. The above message isn't turning up anything concrete on Google, but I'll keep looking. If this sheds any light, please let me know.