
JTAG

System software debug support is for many software developers the main reason to be interested in JTAG. Many silicon architectures such as PowerPC, MIPS, ARM, x86 built an entire software debug, instruction tracing, and data tracing infrastructure around the basic JTAG protocol. - source

I'd like to know whether any malware detection solutions use the dedicated debug port on x86 motherboards, leveraging the JTAG protocol, to observe processes and detect malicious behavior signatures as they occur on the victim machine.

This port seems like a powerful answer to modern malware detection problems, because external hardware gets to monitor the system's every state change.

I have a lot of research left to do on how JTAG works, but some possibilities I considered for why it (using the dedicated physical debug port) might not work are:

  • Perhaps JTAG can only debug one core at a time, or not all cores at once, making it impossible to use for a system-wide monitoring solution. Relevant question

  • Perhaps the performance cost is too high. Relevant question

  • Perhaps I completely misunderstood the workings of this capability and various details make what I'm suggesting impossible.

Context

Based on this related question I asked recently about using an OS's debugging API to track a process state, you should be able to understand this question about JTAG a little better.

To recap, that question is about my research on applying machine learning to register and memory state change patterns, in order to defeat the evasive and polymorphic techniques modern malware uses to avoid the behavior-based signature recognition traditionally performed within emulator sandboxes.

By watching processes execute on the real machine, where they must demonstrate their behavior in order to accomplish their goal, we can avoid the weaknesses of emulator-based approaches (which would be an already defeated layer in our defense strategy by the time the solution I'm asking about now becomes relevant).

The question

Are there any existing JTAG (hardware) based malware detection systems, and if not, why?

J.Todd
  • I suppose there's also the question of whether or not this is the best SE site to ask this question. I know it's on-topic but not sure whether some other SE community might be more familiar with the subject matter. – J.Todd Jun 10 '21 at 17:05
  • This is similar to your other question. JTAG is often used to hook up a debugger and/or step through instructions. It's probably going to introduce a delay if you try to observe anything. – hft Jun 10 '21 at 19:19
  • Maybe the right subject area (or SE forum) would be one related to debugging real-time systems. – hft Jun 10 '21 at 19:20

1 Answer


The issue was in fact bandwidth

JTAG is a bit-serial interface that runs at a maximum of 100 Mbits/sec, including all of the overhead bits for the protocol. That's something less than 12.5 Mbytes/sec of actual data. If you want to record 64 bits (8 bytes) @ 3 GHz, that would produce 24 Gbytes/second of data, more than 2000× what the JTAG interface can handle. - Dave Tweed
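As a rough check of the arithmetic in that quote, here is a minimal Python sketch. The three constants are the figures quoted above (100 Mbits/sec, 8 bytes per sample, 3 GHz); the function and variable names are hypothetical, chosen only for illustration. Note that against the raw 12.5 Mbytes/sec ceiling the shortfall works out to 1920×, and it only grows once protocol overhead bits are subtracted from the usable rate.

```python
# Back-of-the-envelope check of the figures quoted above.
# All three constants come from the quote; nothing here is measured.
JTAG_MAX_BITS_PER_SEC = 100e6   # quoted JTAG ceiling, overhead bits included
BYTES_PER_SAMPLE = 8            # one 64-bit value per clock cycle
CLOCK_HZ = 3e9                  # 3 GHz core clock


def jtag_ceiling_bytes_per_sec(bits_per_sec=JTAG_MAX_BITS_PER_SEC):
    """Upper bound on JTAG payload rate, ignoring protocol overhead."""
    return bits_per_sec / 8


def required_bytes_per_sec(bytes_per_sample=BYTES_PER_SAMPLE, clock_hz=CLOCK_HZ):
    """Data produced when one sample is recorded every clock cycle."""
    return bytes_per_sample * clock_hz


ceiling = jtag_ceiling_bytes_per_sec()
required = required_bytes_per_sec()
shortfall = required / ceiling  # how many times too slow JTAG is

print(f"JTAG ceiling : {ceiling / 1e6:.1f} Mbytes/sec")
print(f"Required     : {required / 1e9:.1f} Gbytes/sec")
print(f"Shortfall    : {shortfall:.0f}x (larger once overhead is subtracted)")
```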

J.Todd
  • That's totally correct. You can, however, use JTAG to _analyze_ malware with OpenOCD and GDB, but that's done manually and cannot be done by automated systems for the reason you point out. – forest Jun 23 '21 at 00:25
  • @forest However, I found that Intel implements JTAG over a dual-bus system (XDP) that allows one bus to bypass the slower components, operate at a higher frequency tied into the cores, and support multiplexing. It's still too slow, but it demonstrates that if the security industry saw value in such a capability (and I would argue there is value, specifically for customers willing to pay for a specialized product), it should be feasible to develop a processor with enough bandwidth, probably involving a separate JTAG port for each physical core. You don't need 24 Gbytes/second, either (1/2) – J.Todd Jun 23 '21 at 15:32
  • (2/2) I realized we don't need to read a modern CPU's potential of writing 64 bits per op. Each instruction itself, although variable in length, will average 1-2 bytes, so we probably need just over 3 Gbytes/s, half the bandwidth of DDR3. I'm not a computer engineer, so it's hard for me to judge how hard it would be to add buses with that much capacity connecting each physical core to a JTAG port. But Hyperthreading already uses a second set of registers for each physical core, and that has proven insecure, so maybe that infrastructure could be repurposed. – J.Todd Jun 23 '21 at 15:40
  • @forest Maybe I was a bit unclear introducing that idea about needing only 3 Gbytes/s of bandwidth instead of 24. 24 would be observing a 64-bit write operation each clock cycle, but instead we only really need to observe the [Current Instruction Register](https://en.wikipedia.org/wiki/Instruction_register) or [Instruction pipeline](https://en.wikipedia.org/wiki/Instruction_pipelining), which should equate to 1 byte of data per clock cycle. The ability to read and write to those registers at 3 Gbytes/s via some extended version of XDP would create the ultimate security monitoring capability. – J.Todd Jun 23 '21 at 15:55
  • Then again, I'm forgetting that fetch-decode-execute is a 3-clock-cycle process, so we probably only need to read each fetch, meaning likely only ~1 Gbyte/s of bandwidth is needed per physical core on a modern CPU to make it possible. – J.Todd Jun 23 '21 at 16:19
  • I'm skeptical that a modern Intel CPU in probe mode will be able to run JTAG at any useful speeds. I don't believe the bottleneck is the ITP/XDP system but rather the fact that every instruction has to go through a slow path. – forest Jun 24 '21 at 00:22
  • @forest Interesting, so a whole new protocol and system would be needed. Maybe worth it. – J.Todd Jun 24 '21 at 10:55
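
For reference, the per-core estimates floated in the comments above can be laid side by side with the quoted JTAG ceiling. This is only a sketch under the thread's own assumptions: a 3 GHz clock, a 12.5 Mbytes/sec ceiling, and the commenters' guesses about bytes observed per cycle. The helper name `trace_bandwidth` and the scenario labels are hypothetical. Even the most optimistic estimate (~1 Gbyte/s) remains about 80× beyond that ceiling.

```python
# Compare the per-core trace bandwidth estimates from the comments
# against the JTAG ceiling quoted in the answer. All inputs are the
# thread's assumptions, not measurements.
CLOCK_HZ = 3e9                      # 3 GHz, as in the answer
JTAG_CEILING_BYTES_PER_SEC = 12.5e6 # 100 Mbits/sec divided by 8


def trace_bandwidth(bytes_per_event, cycles_per_event, clock_hz=CLOCK_HZ):
    """Bytes per second if `bytes_per_event` bytes are captured once
    every `cycles_per_event` clock cycles."""
    return clock_hz * bytes_per_event / cycles_per_event


# (label, bytes captured per event, clock cycles between events)
scenarios = [
    ("64-bit write every cycle", 8, 1),
    ("1 byte every cycle", 1, 1),
    ("1 byte per fetch, every 3 cycles", 1, 3),
]

results = {}
for label, nbytes, cycles in scenarios:
    rate = trace_bandwidth(nbytes, cycles)
    results[label] = (rate, rate / JTAG_CEILING_BYTES_PER_SEC)
    print(f"{label:34s} {rate / 1e9:5.1f} Gbytes/s "
          f"({results[label][1]:.0f}x the JTAG ceiling)")
```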