0

I've read that a domain may appear in a daily zone file on multiple days through some change to the dns record. Unfortunately, the source didn't explain the circumstances of when it appears in the daily file. Could anybody enlighten me on this?

Also, (correct me if I'm wrong) once you have had an entire zone file, you then use the daily files to keep your local copy up to date. What mechanism can be used to determine when an entry should be deleted?

As an example... what I have is a large list of keywords. To begin with, I need to search for domains that include or are similar to those keywords. Going forward, I need to be able to perform a smaller search of the keywords over only new domains. The list of keywords can be added to, and the new keywords will need to be searched both historically and going forward. So, I will need a local database that contains only domains that actually exist, without having to query any nameserver to check for their existence.

I believe that registrars provide daily deltas but I don't know how expired domains are represented.

Hopefully, the example makes it a bit clearer what I'm trying to do.

I might have just found my own answer... http://bestwhois.org/domain_name_data/docs/README_01_document.html#sec12 They have 2 feeds - 1 for newly registered domains and another for dropped domains.

If anybody can see anything I've overlooked, please let me know.

Thanks!

cj.steele
  • I am afraid this kind of list doesn't exist. DNS as a whole is a question / answer system... Any list you may find is likely to be marketing-based and not a complete list... :-( Such a list is more or less based on some registrar's information and does not cover the whole domain space... After all, there are tens of TLDs nowadays. You can check (e.g.) https://novekoncovky.cz/domeny . I don't see a switch to English, so here is at least a short legend: gray - private, green - ready to register, (let's say) cyan - in registration process. – Kamil J Feb 20 '20 at 15:59
  • @KamilJ Zonefiles can be downloaded, see my answer. As for the list of TLDs (which is basically the root zonefile), the authoritative source is IANA, see https://www.iana.org/domains/root/db ; there are more than a thousand TLDs! If you want to track their launch dates the authoritative source is ICANN: https://gtldresult.icann.org/applicationstatus/viewstatus and https://newgtlds.icann.org/en/program-status/sunrise-claims-periods – Patrick Mevzek Mar 09 '20 at 04:47

3 Answers

1

A domain name may be absent from the zone file for a number of reasons:

  • it has expired (not renewed)
  • it is on hold (e.g. suspended for abuse)
  • or simply because it has no name servers

Thus if you have a domain name (say from a commercial provider), and you remove all the name servers, it's no longer provisioned in the zone file and not resolving.

Kate
  • Nitpick: it is not exactly the expiration that removes a domain from the zonefile. Typically nothing happens right after expiration. Some days later the registrar may put it on hold (hence it disappears from the zonefile), or wait some more days before deleting it, in which case it stays in the zonefile for those days. – Patrick Mevzek Mar 09 '20 at 04:26
0

Hmmm I am not sure about your wording - daily files...

Anyway, let's look at it in general. There are several TTL (Time To Live) parameters related to a zone file. A few of them are located in the SOA (Start Of Authority) record:

<domain> IN SOA <primary master FQDN> <administrator contact> (
        <serial> ;
        <refresh> ;
        <retry> ;
        <expire> ;
        <TTL / minimum>
)

refresh - the interval, in seconds, at which a DNS server in the slave role should check the master for updates ("backend" stuff between authoritative servers)

retry - if the slave server fails its update check, how long to wait before retrying ("backend" stuff between authoritative servers)

expire - if the slave server cannot reach the master to check the zone status, how long it may keep providing authoritative answers. After this period the zone on the slave expires and the server stops answering queries for this zone ("backend" stuff between authoritative servers)

TTL / minimum - when somebody requests an RR (DNS record) that does not exist, the negative answer comes with the SOA record attached (and therefore this value), so the client (DNS resolver) caches the negative answer for this period. The same answer is then given, without rechecking, to every query that reaches the same resolver during that period.
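To make the field order concrete, here is a minimal sketch that pulls the timers out of the RDATA part of an SOA record laid out like the template above. The `SoaTimers` type, the `parse_soa_rdata` helper and the sample values are made up for illustration, and the sketch assumes all timers are written as plain integers in seconds (real zone files may also use unit suffixes, which this does not handle).

```python
from typing import NamedTuple


class SoaTimers(NamedTuple):
    """Hypothetical container for the seven SOA RDATA fields."""
    primary: str
    contact: str
    serial: int
    refresh: int
    retry: int
    expire: int
    minimum: int


def parse_soa_rdata(rdata: str) -> SoaTimers:
    """Parse everything that follows 'IN SOA' in a zone file entry.

    Assumes plain integer seconds for all timers (an assumption of this sketch).
    """
    # Drop ';' comments line by line, then the grouping parentheses.
    lines = [line.split(";", 1)[0] for line in rdata.splitlines()]
    tokens = " ".join(lines).replace("(", " ").replace(")", " ").split()
    if len(tokens) != 7:
        raise ValueError(f"expected 7 SOA fields, got {len(tokens)}")
    primary, contact, *numbers = tokens
    return SoaTimers(primary, contact, *(int(n) for n in numbers))


# Illustrative values only, not taken from any real zone.
example = """ns1.example.com. hostmaster.example.com. (
        2020022001 ; serial
        7200       ; refresh
        3600       ; retry
        1209600    ; expire
        3600 )     ; minimum / negative-caching TTL
"""
soa = parse_soa_rdata(example)
print(soa.refresh, soa.retry, soa.expire, soa.minimum)
```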

Example of a resource record (RR):

<name> <TTL> IN A <IP>

Once you request a valid DNS record, you get an individual TTL for that record, which is used for local caching. You get ONLY the DNS records you requested, not the whole zone file. The exception is "hints" or additional records, which may be attached for infrastructure purposes, but that is far from the "whole zone".

"Whole zone file", as you used it in the question, can be requested only with a special request (a zone transfer request), which is permitted only to a limited set of servers. It is used for backend purposes, to distribute the zone file content between the valid authoritative servers in master / slave roles. A normal user / client will be refused on a zone transfer request.

So as a "normal" user you can get only a specific answer / RR records (optionally followed by "hints") or a negative answer. In both cases the answer comes with a TTL, which can be lower or even higher than a day (24 hours = 86400 seconds). During this period the cached value (negative or positive) is used instead of re-querying the authoritative DNS server.

Once there is a change in a DNS zone, clients that have not queried it recently (relative to the TTL) will see the new value directly from the authoritative server, but clients that queried it recently will still see the cached value. This may be called a delay in propagation of the new value. The "problem" in this case is not on the side of the authoritative servers but in the cached values on the client side... Clearing the cache on a server (selectively for one domain, or entirely) is called FLUSHing the cache, and can be done by an administrator. The normal way an out-of-date value is removed on the client side is to wait for the TTL to expire: once it does, the caching server invalidates the value, so on the next request it is gone (removed) or at least ignored, and a proper query to the authoritative server is made...
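The caching behaviour described above can be illustrated with a toy cache. This is only a sketch of the idea, not how any particular resolver is implemented: the `TtlCache` class and its method names are invented here, and a negative answer is represented simply as `None`.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy resolver cache: keeps an answer until its TTL runs out."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # name -> (expiry time, answer); answer None stands for a negative answer
        self._entries: Dict[str, Tuple[float, Optional[str]]] = {}

    def put(self, name: str, answer: Optional[str], ttl: float) -> None:
        self._entries[name] = (self._clock() + ttl, answer)

    def get(self, name: str) -> Tuple[bool, Optional[str]]:
        """Return (hit, answer); hit=False means the authoritative server must be asked."""
        entry = self._entries.get(name)
        if entry is None:
            return (False, None)
        expires_at, answer = entry
        if self._clock() >= expires_at:
            # TTL expired: the stale value is dropped and ignored.
            del self._entries[name]
            return (False, None)
        return (True, answer)

    def flush(self, name: Optional[str] = None) -> None:
        """Administrator-style flush: one name, or everything when name is None."""
        if name is None:
            self._entries.clear()
        else:
            self._entries.pop(name, None)
```

A client that cached a record with a TTL of 300 keeps seeing the old value for up to 300 seconds after the zone changes, unless the cache is flushed.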

Kamil J
  • Whoa, thanks for the detailed answer! Unfortunately, I think through lack of knowledge on my part, I've worded the question badly. I'll update my question with an example of what I'm trying to do. – cj.steele Feb 20 '20 at 08:21
  • OK. As I commented directly on the question, I am afraid this kind of list doesn't exist as a global view. Some partial lists could exist. DNS is also a question / answer system... – Kamil J Feb 20 '20 at 16:01
  • ""Whole zone file" as you have use in the question can be requested only using special request ( zone transfer request )" No; All gTLDs make their zonefile available, per ICANN regulations. See https://czds.icann.org/home – Patrick Mevzek Mar 09 '20 at 04:25
0

First, some generic explanations to clear up some confusion you may have.

Domains exist in TLDs. gTLDs are under contract with ICANN. A domain appears in a zonefile. Registries (the managers of TLDs) decide whether they publish zonefiles or not. Most, especially ccTLDs, will not, considering both that it is private data and that they are responsible for it. However, gTLDs are required to publish them under ICANN regulations. You can learn all about that at https://czds.icann.org/

In short, you create an account once, and you will then be able to download zonefiles.

gTLDs publish daily zonefiles. Hence a domain will appear one day if it has just been registered (with nameservers), if it previously had no nameservers and now has some, or if it has been taken off hold. Conversely, as @Anonymous listed in their reply, it will disappear when it is put on hold, deleted (before or after expiration), or changed to remove all nameservers.

Some other registries may allow DNS AXFR queries, which means you can fetch the full zonefile dynamically on request, but only a dozen or so TLDs do that.

Some other registries also provide "open data" services, through which you can get zonefiles or an equivalent. Some also publish daily on their websites the new names that have been registered. This is not the zonefile, but if you collect that data day after day, at some point you will be close to having a full zonefile. AFNIC, the registry of .FR, does both, for example.

Now back to your questions:

I've read that a domain may appear in a daily zone file on multiple days through some change to the dns record. Unfortunately, the source didn't explain the circumstances of when it appears in the daily file.

This should be clear now from the above. A domain is published (in a zonefile) once it exists (is registered), has nameservers and is not on hold. If it ceases to exist, does not have nameservers anymore or is put on hold, then it will disappear from the zonefile.
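As a rough illustration of "published means it has nameservers", the set of delegated domains can be extracted from a downloaded zonefile by collecting the owner names of its NS records. The sketch below assumes one record per line with whitespace-separated fields in the order owner, TTL, class, type, rdata, and fully qualified owner names; actual zonefile layouts vary by registry, so treat the format and the `delegated_domains` helper as assumptions.

```python
from typing import Iterable, Set


def delegated_domains(lines: Iterable[str], tld: str) -> Set[str]:
    """Collect registrable names under `tld` that own at least one NS record.

    Assumed line format (this sketch only): owner TTL class type rdata.
    """
    suffix = "." + tld.lower().strip(".") + "."
    domains: Set[str] = set()
    for line in lines:
        fields = line.split()
        if len(fields) < 5 or fields[0].startswith(";"):
            continue  # blank, truncated or comment line
        owner, _ttl, _cls, rtype = (f.lower() for f in fields[:4])
        if rtype != "ns" or not owner.endswith(suffix):
            continue
        label = owner[: -len(suffix)]
        # Keep only names directly under the TLD, skip deeper hostnames.
        if label and "." not in label:
            domains.add(label + suffix.rstrip("."))
    return domains


# Made-up sample lines for illustration.
sample = [
    "com.\t172800\tin\tns\ta.gtld-servers.net.",
    "example.com.\t172800\tin\tns\tns1.example.net.",
    "example.com.\t172800\tin\tns\tns2.example.net.",
    "EXAMPLE2.COM. 172800 IN NS ns1.example.net.",
    "ns1.example.com.\t172800\tin\ta\t192.0.2.1",
]
print(sorted(delegated_domains(sample, "com")))
```

For a zonefile with millions of lines, passing the open file object as `lines` keeps memory use down to the resulting set.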

Also, (correct me if I'm wrong) once you have had an entire zone file, you then use the daily files to keep your local copy up to date. What mechanism can be used to determine when an entry should be deleted?

gTLDs publish the full zonefile, each day. You are free to download it and then process it the way you want, based on your contract signed on CZDS. Other registries may impose also other conditions.

If domain A is in yesterday's zonefile but not in today's, then you know that the domain has been deleted, or its nameservers were removed, or it was put on hold. If you do a whois (or RDAP) query you will then see whether the domain exists or not, and whether it is on hold or not.
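With each day's zonefile reduced to a set of domain names, the comparison is a pair of set differences. A minimal sketch, where the `diff_snapshots` name and the sample domains are invented for illustration:

```python
from typing import Set, Tuple


def diff_snapshots(yesterday: Set[str], today: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (appeared, disappeared) between two daily sets of domain names."""
    appeared = today - yesterday      # newly published names
    disappeared = yesterday - today   # names no longer published
    return appeared, disappeared


yesterday = {"alpha-example.com", "beta-example.com", "gamma-example.com"}
today = {"beta-example.com", "gamma-example.com", "delta-example.com"}
appeared, disappeared = diff_snapshots(yesterday, today)
print(appeared, disappeared)
```

A local database can then add the `appeared` names and flag the `disappeared` ones for a follow-up check rather than deleting them outright, since a missing name is not necessarily an unregistered one.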

As an example... what I have is a large list of keywords. To begin with, I need to search for domains that include or are similar to those keywords. Going forward, I need to be able to perform a smaller search of the keywords over only new domains. The list of keywords can be added to, and the new keywords will need to be searched both historically and going forward. So, I will need a local database that contains only domains that actually exist, without having to query any nameserver to check for their existence.

Many services online do this. But basically you download the zonefiles and process them on your end in a way that conforms to the contract signed and technically so that you can use them the way you need. The keyword search and everything else is to be handled by yourself.
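One possible shape for the keyword search on your side is sketched below. It treats "includes" as a substring test on the first label and "similar" as a `difflib.SequenceMatcher` ratio above a cutoff; both choices, the `match_keywords` name and the 0.8 cutoff are assumptions of this sketch, not something the zonefile data prescribes.

```python
import difflib
from typing import Dict, Iterable, Set


def match_keywords(domains: Iterable[str], keywords: Iterable[str],
                   cutoff: float = 0.8) -> Dict[str, Set[str]]:
    """Map each keyword to the domains whose first label contains or resembles it."""
    lowered = [(kw, kw.lower()) for kw in keywords]
    hits: Dict[str, Set[str]] = {}
    for domain in domains:
        label = domain.split(".", 1)[0].lower()
        for kw, kw_l in lowered:
            contains = kw_l in label
            similar = difflib.SequenceMatcher(None, kw_l, label).ratio() >= cutoff
            if contains or similar:
                hits.setdefault(kw, set()).add(domain)
    return hits


# Made-up names for illustration.
found = match_keywords(
    ["mybank-login.com", "paypa1.com", "unrelated.org"],
    ["bank", "paypal"],
)
print(found)
```

Run it over the full local set whenever a keyword is added (the historical search), and over only the day's newly appeared names for the daily search. A pairwise comparison like this is slow at the scale of millions of domains, so a real setup would likely need an index.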

I believe that registrars provide daily deltas but I don't know how expired domains are represented.

First, registrars can only provide data they have, hence only on their own domains and not on all of them. So I guess you are really referring to registries, in which case see above.

Second, domain name expiration is a complicated process and depends on the TLD. Here are the generic rules:

  • when the expiration date arrives, the registry auto-renews the domain; hence it stays published (in zonefiles) if it was published before that event
  • the sponsoring registrar then has some time to decide whether to renew it or not (in gTLDs, this is 45 days)
  • during that time the registrar can decide to put the domain on hold, in which case the domain will cease to be published and hence cease to be in zonefiles
  • if the domain is finally deleted, it starts its redemption period, during which it may no longer be published; after some more time, if nothing happens, the domain is really deleted (and hence not published... until eventually someone registers it again).

I might have just found my own answer... http://bestwhois.org/domain_name_data/docs/README_01_document.html#sec12 They have 2 feeds - 1 for newly registered domains and another for dropped domains.

Anyone downloading zonefiles can then easily compute the differences:

  • new between yesterday and today = domains registered or updated with new nameservers or put out of hold
  • missing today from yesterday = domains deleted or updated without nameservers or put on hold

By doing a whois query you can see if the domain is still registered and you will see if it is on hold or not. Hence you will be able to discriminate between all the above cases.
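As a sketch of how that discrimination might look for a name that vanished from the zonefile, the function below takes the status list from an RDAP lookup, or `None` when the lookup reports that the domain does not exist. The lookup itself is left out. The function name and return labels are invented here, and the status strings are the RDAP spellings of the EPP hold and deletion statuses as I understand them, so verify them against the responses you actually receive.

```python
from typing import Iterable, Optional

# Assumed RDAP status spellings; check against real registry responses.
HOLD_STATUSES = {"client hold", "server hold"}
DELETION_STATUSES = {"redemption period", "pending delete"}


def classify_disappeared(rdap_status: Optional[Iterable[str]]) -> str:
    """Guess why a domain left the zonefile, given its RDAP status values.

    rdap_status is None when the lookup says the domain is not registered.
    """
    if rdap_status is None:
        return "deleted"
    statuses = {s.lower() for s in rdap_status}
    if statuses & DELETION_STATUSES:
        return "pending deletion"
    if statuses & HOLD_STATUSES:
        return "on hold"
    # Still registered and not held: most likely its nameservers were removed.
    return "no nameservers"


print(classify_disappeared(["client hold", "client transfer prohibited"]))
```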

There is nothing very complicated to do there, except:

  • volume of data: millions of domains
  • you rely on whois for part of it, which is typically query limited
  • you signed a contract to be able to download zonefiles and this contract will limit what you can do or not with the data.
Patrick Mevzek