
I've set up a small Python program that uses a websocket connection to acquire data from an API and write it to a PostgreSQL database.

There are only two (user) programs running: the websocket client that receives data and writes it to the database, and another program that is basically a while loop that runs every 15 seconds and checks whether data is still being written.

Both programs are daemonized with supervisor, and when no data has been written for 15 seconds, supervisor is restarted (to handle a dead websocket connection).
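
For context, a minimal sketch of what that watchdog loop might look like; the model name Tick, the settings module, and the restart target are made-up stand-ins, not the actual code:

    # Rough sketch of the 15-second watchdog described above.
    # "Tick", "myproject.settings" and "myapp" are placeholder names.
    import os
    import subprocess
    import time

    import django

    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'myproject.settings')  # assumption
    django.setup()

    from myapp.models import Tick  # hypothetical model the websocket writer fills

    last_count = Tick.objects.count()

    while True:
        time.sleep(15)
        current = Tick.objects.count()
        if current == last_count:
            # Nothing new in 15 s: assume the websocket connection died and
            # restart everything under supervisor.
            subprocess.run(['supervisorctl', 'restart', 'all'])
        last_count = current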

Also, I was (very) lazy and used the Django ORM to handle the database connection for me instead of using psycopg2 directly.
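
For comparison, the psycopg2-only write I ask about below would look roughly like this; the connection parameters and the ticks table/columns are made-up placeholders:

    # Rough psycopg2 equivalent of the ORM write; database name, credentials and
    # the "ticks" table/columns are placeholders.
    import psycopg2

    conn = psycopg2.connect(dbname='mydb', user='myuser', password='...', host='localhost')
    conn.autocommit = True  # one commit per insert, like Django's default autocommit mode

    def save_tick(symbol, price, ts):
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO ticks (symbol, price, ts) VALUES (%s, %s, %s)",
                (symbol, price, ts),
            )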

It works, but I'm seeing a steady high CPU load on the server. It is a 1 CPU / 1 GB memory server (AWS micro). The top command outputs the following:

top - 17:10:58 up 19 days, 15:03,  1 user,  load average: 1,57, 1,63, 1,58
Tasks: 116 total,   1 running, 115 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0,3 us,  0,3 sy,  0,0 ni,  0,0 id, 99,0 wa,  0,0 hi,  0,0 si,  0,3 st
KiB Mem :  1014552 total,    63440 free,    86572 used,   864540 buff/cache
KiB Swap:  1048572 total,   987380 free,    61192 used.   615096 avail Mem

The system is laggish and occasionally crashes.

I can see that this is caused by high IO load (99% wa) and lots of sleeping processes, but I'm only writing an average of 400 MB per day to the database.

I've tried tuning the PostgreSQL config for a high write load following the documentation, and setting up a 1 GB swap file, but neither helped ease the load average.
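
(For reference, the write-load tuning in question touches postgresql.conf settings of this kind; the values below are generic illustrations, not the actual changes made here:)

    # Illustrative postgresql.conf values for a write-heavy workload on a 1 GB machine;
    # generic examples only, not the settings actually applied in this setup.
    shared_buffers = 256MB                  # roughly 25% of RAM
    wal_buffers = 16MB
    checkpoint_completion_target = 0.9      # spread checkpoint writes out over time
    synchronous_commit = off                # fewer fsyncs, at the cost of durability on crash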

So, for a non-experienced user like me, I don't know if this is the load I should expect. Can I optimize this setup? If I drop the Django ORM and write the program using psycopg2 directly, would that improve things? Should I use a different method to check for dead websocket connections? Is there any server configuration that can be optimized for this use case?

Thanks!

Alex
  • If `wait` is at 99% then there's a huge amount of io happening. 400MB/day should be barely noticeable on any modern hardware. Saying that, 1GB of ram is quite small - what is the memory usage on this system? If you're enabling Swap then this could be that the system is using the disk in place of memory, leading to massive io spikes. See what effect `swapoff -a` has on io and memory usage? – match Feb 17 '18 at 17:58
  • Hi @match! Thanks for the response. Enabling the swap didn't change the load at all. I have 600 MB of free memory (seen in the top command). After your comment about 400 MB a day, I think my code is messed up somehow. How can I debug this IO access to check for sure whether the Python process is the bad guy? – Alex Feb 17 '18 at 18:04
  • You're right actually that memory isn't the issue - I just clocked the memory lines from top - most of it is in buffers, and swap is hardly used, so it's not swap at fault here. I'd check the python script (I assume it's the `1 running` in top), but also check that nothing has high amounts of debug logging enabled (potential source of lots of disk io). – match Feb 17 '18 at 18:09

1 Answer


I managed to figure out how to fix it using this link: http://bencane.com/2012/08/06/troubleshooting-high-io-wait-in-linux/

Basically, I ran iotop, and its output showed which process was causing the high IO load. There I found out that the load was due to reading the database, not writing to it. Then I realized I was doing something stupid: I was querying the entire database repeatedly, counting the number of entries every time I checked whether data was being recorded. That was averaging 500 MB/s of reads. So I changed the code to check the primary key of the last entry instead of counting entries and... it worked. Load average is now 0.01. Thanks @match for the heads-up.
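
In Django ORM terms the change was roughly the following; Tick is a made-up model name standing in for the real one:

    # Rough before/after of the watchdog check; "Tick" is a placeholder model name.
    from myapp.models import Tick

    def has_new_data_before(last_count):
        # Before: COUNT(*) over the whole table every 15 s -> a full scan of the table each time.
        return Tick.objects.count() > last_count

    def has_new_data_after(last_pk):
        # After: only fetch the primary key of the newest row -> a cheap index lookup.
        latest_pk = Tick.objects.order_by('-pk').values_list('pk', flat=True).first()
        return latest_pk != last_pk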
