This is a very complicated question, so expect a few answers as people improve on each other's responses :)
The professor said that the best place to run programs is in cache.
Remember that cache is MANY times more expensive than normal RAM. Back when a 'big' computer had 8MB (not gigs, megabytes), you could find machines whose memory was all 'cache' (it's technically a special type of RAM called SRAM), but they were more expensive. Now that home machines have 4GB of memory, 4GB of SRAM wired to the chip would be VERY expensive. Besides, you have many smart folks working on programs and compilers to make the best use of cache. With the right caching algorithm, you get 95% of the benefit of cache at a small percentage of the cost. Of course, the guesses about what to keep in cache aren't always right. Google 'locality of reference' and 'cache replacement policies' for more info.
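To put rough numbers on that "95% of the benefit" claim, here is a small sketch of the usual average-access-time arithmetic. The latencies (1 ns for cache, 100 ns for RAM) are made-up round figures for illustration, not measurements of any real chip, and `average_access_ns` is just a name I picked.

```python
def average_access_ns(hit_rate, cache_ns, ram_ns):
    """Average time per memory access for a given cache hit rate."""
    # A hit costs cache_ns; a miss is modelled here as costing ram_ns.
    return hit_rate * cache_ns + (1.0 - hit_rate) * ram_ns

# Assumed, illustrative latencies (not real hardware figures).
CACHE_NS = 1.0
RAM_NS = 100.0

for rate in (0.0, 0.5, 0.95, 1.0):
    avg = average_access_ns(rate, CACHE_NS, RAM_NS)
    print(f"hit rate {rate:.0%}: about {avg:.2f} ns per access")
```

With these assumed numbers, a 95% hit rate brings the average down from 100 ns to about 6 ns, which is most of the way to the all-cache machine without paying for all that SRAM.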
I was wondering why programs can't be run in registers?
Registers are what's actually used to load and store data and addresses. Think of them as taxis. They deliver things back and forth, and what they deliver is your program's data and addresses. Every part of your program that's 'run' goes through a register.
I'm assuming you're asking why you can't just run completely from registers. One reason - there are so few of them. Classic Intel x86 registers are counted in bytes (eight 32-bit general-purpose registers on 32-bit x86, so 32 bytes), but programs are counted in megabytes or gigabytes. You'd have to be quite a rich person to have a chip that could run MS-Word out of registers.
Also, how can a program load itself into cache?
The program doesn't. The OS runs the program and sets up the Memory Management Unit (MMU) to map the areas of the program held in normal RAM. When the CPU then reads that memory, the cache hardware is smart and also keeps a copy of some of it in cache, with the idea that "I just used it, I may need to use it again soon."
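The "I just used it, I may need it again soon" idea can be sketched in software. This is only a toy model: real caches work in hardware on fixed-size lines and don't necessarily use a pure least-recently-used policy, and the `hit_rate` function and the access pattern are hypothetical, made up for this example.

```python
from collections import OrderedDict

def hit_rate(addresses, cache_lines):
    """Simulate a tiny least-recently-used cache; return the fraction of hits."""
    cache = OrderedDict()  # keys are cached addresses, least recently used first
    hits = 0
    for addr in addresses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True             # bring the address into the cache
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(addresses)

# Hypothetical access pattern: a loop touching the same 8 addresses 100 times.
loop_accesses = list(range(8)) * 100

print(hit_rate(loop_accesses, cache_lines=16))  # the loop fits in the cache
print(hit_rate(loop_accesses, cache_lines=4))   # the loop does not fit
```

When the loop fits, only the first pass misses and everything after that is a hit. When the cache is smaller than the loop, this simple policy keeps throwing out exactly the thing that is needed next, which is the "guesses aren't always right" case.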
Isn't the cache something that's controlled by the CPU and works automatically without software control?
Yes. Technically it's the cache controller and memory management hardware doing the work, not the part of the CPU that executes instructions. This hardware used to be a separate chip, but it is now part of the CPU block, to make communication faster.