From the Wikipedia page on DEP.

DEP was introduced on Linux in 2004 (kernel 2.6.8[2]), on Windows in 2004 with Windows XP Service Pack 2,[3] while Apple introduced DEP when they moved to x86 in 2006.

Why did it take until 2004, when DEP was released, to properly enforce memory page access flags in hardware? Are there any limitations in the hardware or software architecture that made it difficult?

2 Answers

It's primarily a question of cost-benefit analysis on the hardware manufacturers' side, and a question of risk analysis on the part of OS vendors.

The x86 architecture has provided memory page access rights as far back as the 80386, but at that time such hardware enforcement would have been expensive to implement in the hardware, and wasn't seen as necessary. Computers were already prohibitively expensive for most users, and most small businesses, so hardware manufacturers didn't want to increase the cost for the sake of a feature their customers wouldn't understand or care about.

As time progressed, the cost of this complacency became more apparent. The era of buffer overflows was upon us, and it was starting to hurt big businesses. By 2003, computer security had proven to be a commercially profitable field, and hardware prices had dropped to commodity levels. AMD and Intel finally decided that hardware enforcement was a profitable feature, so they implemented Never eXecute (NX) in early 2004. AMD released the Athlon XP-M "Dublin" with NX support, and the first Intel processor to support it was the Pentium 4 Prescott (revision E0). However, at this point, the market share of processors with NX was minimal.

The major barrier to software implementation was that Physical Address Extension (PAE) had to be enabled and supported before NX was available for use. This meant extensive changes to core memory management inside the OS kernels.
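To make the PAE dependency concrete, here is a minimal sketch of the page-table entry flags involved. The NX bit is bit 63 of an entry, which only exists in the 64-bit entries used by PAE and long-mode paging; a legacy 32-bit entry simply has nowhere to put it. The helper names (`can_encode_nx`, `is_executable`) are made up for illustration, and the model ignores everything except these few flag bits.

```python
# Simplified model of x86 page-table entry flag bits.
PTE_PRESENT = 1 << 0    # P: page is mapped
PTE_WRITABLE = 1 << 1   # R/W: writes allowed
PTE_USER = 1 << 2       # U/S: user-mode access allowed
PTE_NX = 1 << 63        # NX/XD: instruction fetch forbidden

LEGACY_PTE_BITS = 32    # entry width with classic 32-bit paging
PAE_PTE_BITS = 64       # entry width with PAE or long-mode paging


def can_encode_nx(pte_bits: int) -> bool:
    """Return True if an entry of this width has room for bit 63."""
    return PTE_NX < (1 << pte_bits)


def is_executable(pte: int) -> bool:
    """Model whether instruction fetch from this page would be allowed."""
    if not pte & PTE_PRESENT:
        return False            # unmapped pages cannot be executed
    return not pte & PTE_NX     # mapped pages execute unless NX is set


# A typical DEP-style stack page: present, writable, user, not executable.
stack_pte = PTE_PRESENT | PTE_WRITABLE | PTE_USER | PTE_NX
# A typical code page: present, user, read-only, executable.
code_pte = PTE_PRESENT | PTE_USER
```

In this model `can_encode_nx(LEGACY_PTE_BITS)` is false and `can_encode_nx(PAE_PTE_BITS)` is true, which is the whole reason the kernel's paging code had to move to the wider entry format first.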

Another part of the problem on the software side was poorly written applications. Some code relied on the ability to erroneously execute data that had not been placed in memory pages marked as executable. With the advent of hardware enforcement of memory access flags, these programs would throw an access violation. Careful consideration of the benefits was needed before going ahead with the changes. DEP as a concept, without hardware support, was considered as a potential feature of Linux at an earlier date, but it was rejected due to complexity problems, breaking changes, and the fact that the software enforcement mechanisms could easily be bypassed.

Once hardware support was available, operating systems had to be modified to use it. This was not a trivial task. DEP marks certain important structures, such as the stack and heaps, as non-executable. This required every initialization routine that dealt with creation of threads and processes to be modified to support NX, as well as certain changes to memory management (e.g. heap allocations). Finally, some hardware abstraction layer code had to be modified, in order to properly recognise the NX flag on supporting processors.
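If you want to see the result of that work on a Linux system, the per-process memory map exposes the permissions of each region. The following is a rough sketch that parses lines in the `/proc/<pid>/maps` format and reports any stack or heap region that is still executable. The function names and the sample lines are invented for illustration; only the line format itself comes from Linux.

```python
from typing import Iterable, List, NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    perms: str   # e.g. "rw-p" or "r-xp"
    name: str    # e.g. "[stack]", "[heap]", a file path, or ""


def parse_maps_line(line: str) -> Mapping:
    """Parse one line in /proc/<pid>/maps format."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    name = fields[5].strip() if len(fields) > 5 else ""
    return Mapping(start, end, fields[1], name)


def executable_data_regions(lines: Iterable[str]) -> List[Mapping]:
    """Return the [stack] and [heap] mappings that carry the x permission."""
    flagged = []
    for line in lines:
        if not line.strip():
            continue
        mapping = parse_maps_line(line)
        if mapping.name in ("[stack]", "[heap]") and "x" in mapping.perms:
            flagged.append(mapping)
    return flagged


# Hypothetical sample: a non-executable stack and an executable heap.
SAMPLE = [
    "00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/example\n",
    "00e03000-00e24000 rwxp 00000000 00:00 0           [heap]\n",
    "7ffd1c0b5000-7ffd1c0d6000 rw-p 00000000 00:00 0   [stack]\n",
]
```

On a Linux machine you could feed it real data with `executable_data_regions(open("/proc/self/maps"))`; with DEP in effect you would normally expect an empty list.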

All in all, this actually happened surprisingly quickly. Once the hardware NX enforcement was available, both Windows XP and Linux released NX-compatible builds within 8 months.

Polynomial

There are two reasons: (a) security was not a strong priority, and (b) differences in 32-bit vs 64-bit architectures.

The NX bit is only supported on 64-bit architectures. It provides page-level execute permissions, so the operating system can mark some pages as non-executable. This is the standard way to make certain pages non-executable, and thus the cleanest way to implement DEP. However, 64-bit architectures took a while to gain popularity, so OS support for NX-based DEP only became a significant security benefit once many people started using 64-bit architectures.

Thus, implementation of DEP was partly slowed by slow adoption of 64-bit architectures. (If 32-bit Intel platforms had supported the NX bit, we might have seen earlier deployment of DEP -- but they didn't, so we didn't.) That's why DEP didn't become widespread until 64-bit Intel/AMD chips did.


OK, now here's where I admit that the above is a little bit of an oversimplification -- though not too much. In fact, there is a way to implement DEP on 32-bit architectures, but it is much harder and does not mesh well with the way most operating systems are currently built.

On 32-bit Intel architectures, there are actually two different memory protection mechanisms: page-level protection, and segment-level protection. Most operating systems rely upon the page-level mechanism (page tables and such) for memory protection. The page-level mechanism is the most flexible, because each page can have its own protection level (e.g., read-only vs read/write; user-accessible vs kernel-only). However, 32-bit Intel processors also support segment-based memory protection. A segment is a contiguous region of memory, and you can have a few different segments. Each segment can have its own access protections. Because page-level protection is more flexible, most operating systems do not use the segment-level protection (they effectively treat memory as one big segment, and turn off segmentation).

For some unknown reason, on 32-bit architectures, segment-level protection does allow marking a segment as non-executable, but the page-level protection mechanism does not allow marking a page as non-executable. (I dunno why, it is probably just an artifact of history.) This means that it is possible to implement DEP on a 32-bit architecture, by using the segment-level protections. However, this requires all sorts of contortions and major changes to the operating system. Doing so gets pretty messy and complicated, and also has some performance implications. For this reason, many operating systems were reluctant to implement DEP on 32-bit architectures.
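To give a flavour of what segment-based enforcement looks like, here is a rough model. The first part decodes the access byte of a segment descriptor, where one bit distinguishes code (executable) segments from data segments. The second part sketches one way such a scheme could work, assuming a flat layout with code at low addresses and the stack at high addresses: shrink the code segment's limit so that instruction fetches above it fault. The function names and the addresses are invented for illustration, and this is a simplification rather than a description of how any particular OS did it.

```python
# Access-byte bits of an x86 code/data segment descriptor.
SEG_PRESENT = 1 << 7       # P: segment is present
SEG_CODE_OR_DATA = 1 << 4  # S: 1 = code/data segment, 0 = system segment
SEG_EXECUTABLE = 1 << 3    # 1 = code segment, 0 = data segment


def segment_is_executable(access_byte: int) -> bool:
    """Return True if the descriptor describes a present code segment."""
    required = SEG_PRESENT | SEG_CODE_OR_DATA | SEG_EXECUTABLE
    return (access_byte & required) == required


def fetch_allowed(address: int, cs_base: int, cs_limit: int) -> bool:
    """Model the limit check applied to instruction fetches through CS.

    A fetch is allowed only if its offset from the segment base falls
    within the segment limit; anything beyond it faults.
    """
    offset = address - cs_base
    return 0 <= offset <= cs_limit


# Hypothetical layout: program text low in memory, stack near the top.
TEXT_ADDRESS = 0x08048000
STACK_ADDRESS = 0xBFFFF000
SHRUNK_CS_LIMIT = 0x0FFFFFFF   # code segment ends well below the stack
```

With these numbers, `fetch_allowed(TEXT_ADDRESS, 0, SHRUNK_CS_LIMIT)` is true while `fetch_allowed(STACK_ADDRESS, 0, SHRUNK_CS_LIMIT)` is false. The catch is that you only get one boundary per segment, not a per-page flag, which is where the contortions come from.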

So, it's not quite accurate to say that DEP is impossible on 32-bit architectures and first became possible on 64-bit architectures -- but, for engineering purposes, it's pretty close to the truth.

D.W.