There are two reasons: (a) security was not a strong priority, and (b) differences in 32-bit vs 64-bit architectures.
The NX bit is only supported on 64-bit architectures. The NX bit provides page-level execute permissions, so the operating system can mark some pages as non-executable. The NX bit is the standard way make certain pages non-executable and thus is the standard and cleanest way to implement DEP. However, 64-bit architectures took a while to gain popularity, so implementing support in the OS for NX-bit based DEP only became of significant benefit to security once many people started using 64-bit architectures.
Thus, implementation of DEP was partly slowed by slow adoption of 64-bit architectures. (If 32-bit Intel platforms has supported the NX bit, we might have seen earlier deployment of DEP -- but it didn't, so we didn't.) DEP didn't become widespread until 64-bit Intel/AMD chips became widespread. So, that's why it took so long for DEP to become widespread.
OK, now here's where I admit that the above is a little bit of an oversimplification -- though not too much. In fact, there is a way to implement DEP on 32-bit architectures, but it is much harder and does not mesh well with the way most operating systems are currently built.
On 32-bit Intel architectures, there are actually two different memory protection mechanisms: page-level protection, and segment-level protection. Most operating systems rely upon the page-level mechanism (page tables and such) for memory protection. The page-level mechanism is the most flexible, because each page can have its own protection level (e.g., read-only vs read/write; user-accessible vs kernel-only). However, 32-bit Intel processors also support segment-based memory protection. A segment is a consecutive region of memory, and you can have a few different segments. Each segment can receive its own protection access. Because page-level protection is more flexible, most operating systems do not use the segment-level protection (they effectively treat memory as one big segment, and turn of segmentation).
For some unknown reason, on 32-bit architectures, segment-level protection does allow to mark a segment as non-executable, but the page-level protection mechanism does not allow to mark a page as non-executable. (I dunno why, it is probably just an artifact of history.) This means that it is possible to implement DEP on a 32-bit architecture, by using the segment-level protections. However, this requires all sorts of contortions and major changes to the operating system. Doing so gets pretty messy and complicated, and also has some performance implications. For this reason, many operating systems were reluctant to implement DEP on 32-bit architectures.
So, it's not quite accurate to say that DEP is impossible on 32-bit architectures and first became possible on 64-bit architectures -- but, for engineering purposes, it's pretty close to the truth.