How do I tell where the hardware bottleneck is in my system when running a particular task

1

I realise that there are several duplicate questions around this topic already but all the questions that I have looked at have been for general computer slowdowns. In my case I have a very processor intensive task and I want to see what I can do to speed it up.

The task in question (but I am looking for general solutions, not specific to this task) is stitching 2 4K video files together to form one 360 video.

There are 2 phases to this: stitching and optimising.

Stitching is VERY CPU intensive and I recently bought an i9 processor with 28 threads that nearly doubled the speed. Now when running this part of the process the processor runs at around 80% (previously it was 100% all the time), so this implies that something else is slowing it down. My 32GB of memory is only 30% used, but perhaps it is memory speed?

Disk usage seems to be around 1 - 2%

GPU usage is at about 30%

I doubt I can further speed this up very much at all (just for reference it currently takes around an hour to stitch 1 hour of footage)

The second phase takes around 2 hours for 1 hour of footage and basically creates a lower res video to make editing easier.

During this phase CPU usage is only about 30% and nothing else is highly utilised either (based on resource monitor). There must be something hardware based that is slowing it down - how can I tell what?

Many Thanks

Some notes on my system (which aren't really relevant for answering this question but I put it here for completeness)

  • i9 7940X
  • 32GB (2x16GB) 2400MHz
  • 2 separate M.2 drives (one for read one for write)
  • Nvidia GTX 1080 Ti
  • ASUS ROG STRIX X299-E GAMING

Roaders

Posted 2018-10-30T20:09:14.203

Reputation: 195

What software are you using? – cybernard – 2018-10-30T22:04:56.757

Answers

2

There's a lot you can still do to monitor your hardware and narrow down things.

You didn't say whether you're running Windows or which OS version, but I am assuming it's a Windows 10 machine for now.

You can use a tool called Performance Monitor (perfmon). This lets you monitor key metrics for your system and, based on the values, determine where your bottleneck may be.

Have a look around; you may also find some metrics for the GPU and the like too.

Finding the right metrics is a bit of an art, and to be honest I don't have them all to hand, but try the following for now:

Memory | % Committed Bytes in Use: Tracks what percentage of your RAM is currently committed (“in use”). This should fluctuate as apps are opened and closed, but if it steadily increases, it could indicate a memory leak.

Network Interface | Bytes Total/sec: Tracks how many bytes are sent and received over a particular network interface (such as Wi-Fi or Ethernet). If this ever gets above 70% of an interface’s bandwidth, you should consider upgrading.

Paging File | % Usage: Tracks how much of your system’s paging file is being used. If this is consistently high, you should consider increasing your physical RAM or at least increase the size of your paging file.

Physical Disk | % Disk Time: Tracks how much of the hard drive’s time is spent handling read and/or write requests. If this is consistently high, you should consider upgrading to a solid state drive.

Physical Disk | % Disk Read Time: Same as above except only for read requests.

Physical Disk | % Disk Write Time: Same as above except only for write requests.

Processor | % Interrupt Time: Tracks how much time is spent by your CPU handling hardware interrupts. If this is consistently above 10-20%, it could indicate a potential issue in one of your hardware components.

Thread | % Processor Time: Tracks how much of your processor’s capabilities are being used by an individual process thread (an app could have multiple threads). Only useful if you can identify which thread to monitor.
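If you would rather log these counters during a full stitching run and look at them afterwards, one option is to export them to CSV and summarise the file. The sketch below is only an illustration: it assumes the counters were exported in the CSV layout that typeperf produces (for example with something like typeperf -cf counters.txt -si 1 -sc 60 -f CSV -o out.csv), where the first column is a timestamp and each further column is one counter. The function name, the machine name PC and the sample values are made up.

```python
import csv
import io
from statistics import mean


def summarise_counters(csv_text):
    """Return {counter_path: (average, maximum)} from typeperf-style CSV text."""
    # Keep only rows that have a timestamp plus at least one value;
    # this drops blank lines and any status messages in the capture.
    rows = [r for r in csv.reader(io.StringIO(csv_text)) if len(r) >= 2]
    header, samples = rows[0], rows[1:]
    summary = {}
    for col in range(1, len(header)):  # column 0 is the timestamp
        values = []
        for row in samples:
            try:
                values.append(float(row[col]))
            except (IndexError, ValueError):
                continue  # skip empty or malformed samples
        if values:
            summary[header[col]] = (mean(values), max(values))
    return summary


# Hypothetical two-sample capture, for illustration only.
SAMPLE = r'''"(PDH-CSV 4.0)","\\PC\Processor(_Total)\% Interrupt Time","\\PC\Memory\% Committed Bytes In Use"
"10/30/2018 20:09:14.203","1.5","30.0"
"10/30/2018 20:09:15.203","2.5","32.0"
'''

for counter, (avg, peak) in summarise_counters(SAMPLE).items():
    print(f"{counter}: avg={avg:.1f} max={peak:.1f}")
```

Comparing the average and the peak per counter across the stitching and optimising phases makes it easier to see which resource, if any, gets close to its limit.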

Some SQL ones that I have used:

PhysicalDisk(_Total)\Avg. Disk sec/Read and PhysicalDisk(_Total)\Avg. Disk sec/Write: These two counters tell you how quickly your I/O subsystem is responding to requests for data from the operating system; in other words, latency. The latency values returned are valid regardless of the type of I/O subsystem you're using, whether it's local physical magnetic disk, SAN drives, NAS drives, or solid state drives. Your latency values should normally not be more than 20ms; if you're using SSD, probably not more than 5ms. If you see latency values of a second or more, your I/O subsystem has issues that need to be addressed to keep performance at an acceptable level.
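As a rough illustration of those thresholds, the sketch below converts an Avg. Disk sec/Read or sec/Write sample (which is reported in seconds) to milliseconds and labels it. The function name and the labels are made up for this example; only the 20ms, 5ms and one-second figures come from the text above.

```python
def classify_latency(seconds_per_io, is_ssd=False):
    """Label a disk latency sample using the rule-of-thumb limits above."""
    ms = seconds_per_io * 1000.0        # the counter reports seconds
    limit_ms = 5.0 if is_ssd else 20.0  # 5ms for SSD, 20ms otherwise
    if ms >= 1000.0:
        return "severe"                 # a second or more per request
    if ms > limit_ms:
        return "high"
    return "ok"


# The same 12ms sample is fine for a magnetic disk but high for an SSD.
print(classify_latency(0.012))               # ok
print(classify_latency(0.012, is_ssd=True))  # high
print(classify_latency(1.5))                 # severe
```

With M.2 drives you would use is_ssd=True, so sustained readings above a few milliseconds during a run would be worth a closer look.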

System\Processor Queue Length: The Processor Queue Length counter tells you the number of threads that are waiting for time on the system processor. If this number is greater than 0, that means that there are more requests per core than the system can handle, and this can be a cause for significant performance issues. I once had a client that had a month-end process that had to be run during the business day, which would take 2.5 to 3 hours to run; when it ran, performance for everyone else on that system would be horribly slow. I looked at the Processor Queue Length counter: normally it would get no higher than 3 or 4 during the day, but during month-end it jumped to somewhere between 30 and 50. The client was running on a virtual machine with 4 processors, and I asked if they could double that. They did, and the next month-end completed in 45 minutes.
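To put the numbers from that story side by side, here is a small sketch that divides the queue length by the number of processors. This per-processor figure is just an illustrative normalisation and not something the counter reports itself; the function name is made up, and the calculation assumes the queue length would have stayed the same after the upgrade, which the story does not say.

```python
def queue_per_processor(queue_length, processors):
    """Average number of waiting threads per processor."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return queue_length / processors


# Month-end queue of 30 to 50 threads, before and after doubling 4 to 8.
for processors in (4, 8):
    low = queue_per_processor(30, processors)
    high = queue_per_processor(50, processors)
    print(f"{processors} processors: {low} to {high} waiting threads each")
```

On a 28-thread CPU like the one in the question, the same queue length would be spread over many more logical processors, so the raw counter value needs to be read with the processor count in mind.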

TheNerdyNerd

Posted 2018-10-30T20:09:14.203

Reputation: 408

Thanks for this. I haven't tried perfmon. There doesn't seem to be anything here to tell me if the processor is waiting for memory (other than the paging file bit). Memory is pretty much the only part I can upgrade. – Roaders – 2018-10-31T06:07:46.190