I am testing my Linode server, which runs 64-bit Ubuntu 14 on the most basic plan they offer. I am using ApacheBench (ab) to test the server, as well as a multithreaded script I wrote in Python, but more on that later. With ab I get around 7k requests per second when it is run locally on the server itself, but only around 15 when it is run from an external network over the internet. Response time is about 150 ms for 1000 concurrent connections locally; remotely it is about 1.5-2.5 seconds for 100 concurrent connections. The network I'm running the remote tests from has plenty of bandwidth, and the computer I'm running them from has plenty of RAM and processor speed; it's a fast business network. I also tried two other networks elsewhere in the U.S., on two other computers, and the speeds are all about the same.
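As a rough sanity check on these numbers, throughput can be estimated from concurrency and response time using Little's law (requests in flight divided by time per request). The sketch below assumes the response times quoted above are typical per-request times at the stated concurrency; the function name is only illustrative.

```python
def estimated_rps(concurrency, response_time_s):
    """Little's law: throughput ~= requests in flight / time per request."""
    return concurrency / response_time_s


# Figures taken from the measurements described above.
local = estimated_rps(1000, 0.150)       # 1000 connections at ~150 ms
remote_best = estimated_rps(100, 1.5)    # 100 connections at ~1.5 s
remote_worst = estimated_rps(100, 2.5)   # 100 connections at ~2.5 s

print(f"local:  ~{local:.0f} req/s")
print(f"remote: ~{remote_worst:.0f}-{remote_best:.0f} req/s")
```

The local estimate (about 6667 req/s) is in line with the roughly 7k I measured. The remote estimate (about 40-67 req/s) is higher than the roughly 15 req/s I actually see, so either that run used a different concurrency or some requests take much longer than the typical figure.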
When running my multithreaded script from external networks, I notice it hiccups as soon as I try more than 100 concurrent requests. I haven't tried my script locally on the server yet, as I either need to upgrade Python on the server to 3+ or change my script to be 2.7 compatible. I tested this locally and get 150 ms response time when running the script with up to 1000 multithreaded connections; it's simply using urllib2.
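For context, a test script of this kind can be sketched roughly as follows. This is not my exact script: it is a minimal illustration that assumes Python 3 and uses urllib.request with a thread pool, and the names fetch and run_load are made up for the example.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, timeout=10):
    """Fetch one URL; return elapsed seconds, or None if the request failed."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
    except Exception:
        return None
    return time.perf_counter() - start


def run_load(url, concurrency, fetch_fn=fetch):
    """Issue `concurrency` requests at once, one per thread, and summarize."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(fetch_fn, [url] * concurrency))
    ok = [t for t in timings if t is not None]
    return {
        "sent": concurrency,
        "ok": len(ok),
        "failed": concurrency - len(ok),
        "mean_s": sum(ok) / len(ok) if ok else None,
        "max_s": max(ok) if ok else None,
    }
```

Calling run_load with the URL under test and a concurrency of 100, then 200, and so on, shows where failures or long response times start to appear.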
I am testing against nginx directly (serving a static file), against a pywsgi app behind nginx, and against the pywsgi app directly. The pywsgi app has a simple route that replies with a basic response, so it should be fast. Not surprisingly, nginx->pywsgi gives the best results, probably because of how nginx buffers the requests. Is there something specific to Linode's network that is causing this issue? The orders-of-magnitude difference between internal and external tests makes me wonder what the cause could be. The only other thing in the way is the iptables firewall, which only lets in HTTP/HTTPS and SSH.
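To give an idea of how little work the app does per request, here is a minimal sketch of an app of that shape. It is not my actual code: it assumes "pywsgi" means gevent.pywsgi, and the /ping route, the response body, and the host and port are placeholders.

```python
def app(environ, start_response):
    """Tiny WSGI app: one route that returns a fixed plain-text body."""
    if environ.get("PATH_INFO") == "/ping":  # hypothetical route
        status, body = "200 OK", b"ok\n"
    else:
        status, body = "404 Not Found", b"not found\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Serve the app with gevent's pywsgi server (assumes gevent is installed)."""
    from gevent.pywsgi import WSGIServer
    WSGIServer((host, port), app).serve_forever()
```

With nothing but a fixed response per request, the app itself should not be what limits the remote results.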
dmesg has no information pertaining to my tests.