I'm trying to solve a DNS problem I've been having with my web app.
It makes multiple requests to various, but fixed, external domains. I can't put the domains in a hosts file for obvious reasons: CloudFront, load balancing and other changes of IP.
Despite running timeouts and handling stale outbound connections, I've found that simulating DNS failures reproduces the failures I'm seeing within my web app.
Therefore I think I should be implementing a local DNS cache. I've chosen PowerDNS Recursor to handle my outbound lookups. It will deal with 500-1000 requests per second, all to the same 8 or so domains.
What I'm hoping to achieve is fewer DNS failures, whether communication errors, slow DNS responses or failed DNS responses. Believe it or not, we were using Google's DNS before, and occasionally it would fail to respond. That would make our app crash and, at peak times, make our threads hang and consume resources.
So have I got the right idea, running a local recursive DNS resolver?
I'm thinking of running the local resolver alongside Google in my resolv.conf, with rotate turned on along with other configuration.
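For reference, the kind of resolv.conf I have in mind looks roughly like the sketch below. The timeout and attempts values are illustrative assumptions, not settings I've tested, and 127.0.0.1 assumes the recursor listens on localhost.

```
# /etc/resolv.conf (sketch, values are assumptions)
nameserver 127.0.0.1      # local PowerDNS Recursor (assumed listen address)
nameserver 8.8.8.8        # Google public DNS as the second resolver
options rotate timeout:1 attempts:2
```

As I understand it, rotate spreads queries across both nameservers rather than always trying the local one first, so some lookups would still go to Google before falling back.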
What I'm not sure about is how PowerDNS actually resolves queries. I've configured no forward zones, yet a dig still returns within 30 ms, and all subsequent results come from cache.
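To compare the first lookup against cached ones from the application's side rather than with dig, a small timing sketch along these lines might help. It is only an illustration: `timed_lookup` and its `resolver` parameter are hypothetical names of my own, and it relies on Python's standard `socket.getaddrinfo`, which goes through the system resolver configured in resolv.conf.

```python
import socket
import time


def timed_lookup(hostname, port=443, resolver=socket.getaddrinfo):
    """Resolve hostname once.

    Returns (elapsed_ms, addresses), where addresses is a sorted list of
    unique IP strings, or None if the lookup failed.
    `resolver` is a hypothetical hook so the lookup can be swapped out.
    """
    start = time.perf_counter()
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # sockaddr[0] is the IP address for both IPv4 and IPv6.
        addresses = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        # Resolution failed (NXDOMAIN, SERVFAIL, resolver timeout, ...).
        addresses = None
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return elapsed_ms, addresses
```

Calling it a few times in a row for each of the 8 or so domains and logging the elapsed time should show whether repeat lookups are answered noticeably faster than the first one, and how often a lookup fails outright.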
Can you pick holes in my logic and tell me whether this is a good solution to my DNS reliability problem?
Thanks