Given that you have problems both with compression enabled and with compression disabled, it sounds like the root cause is not the compression but something else.
You explain how the symptoms change when you enable or disable compression. That change is only minor, however, and it is what you should expect from most compression schemes.
When you have compression disabled you see that the first part of the response is produced quickly. It is entirely possible that the part of the response produced quickly is a header with static contents independent of the user data, and that may be why it is produced so quickly. The problematic part is that the data source takes a long time to produce the rest of the output.
Once compression is introduced, the symptoms change. The header data is likely delivered to the compression code just as quickly, but to improve the compression ratio the compression code waits for more data before it sends any of the data to the client. So a small amount of data will sit in that buffer for a while.
Some APIs permit the code producing the data to instruct the compression code to flush the buffered data, which if used here could make the symptom look the same as in the uncompressed setup. But if the data produced thus far isn't useful on its own it isn't worthwhile to flush the buffer.
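To illustrate the buffering and the effect of a flush, here is a minimal sketch using Python's `zlib` module. It is not what your web server does internally; the function name and the sample header are made up for the example. It feeds a small chunk to a compressor and checks how much of it a decompressor on the other end can recover before and after an explicit flush.

```python
import zlib


def visible_before_and_after_flush(chunk):
    """Return what a receiver can decode before and after an explicit flush."""
    compressor = zlib.compressobj()
    decompressor = zlib.decompressobj()

    # Hand the chunk to the compressor. For a small chunk it emits little or
    # nothing of the payload, because it is still waiting for more input.
    pending = compressor.compress(chunk)
    before = decompressor.decompress(pending)

    # Z_SYNC_FLUSH forces everything buffered so far onto the wire, at the
    # cost of a slightly worse compression ratio.
    flushed = compressor.flush(zlib.Z_SYNC_FLUSH)
    after = decompressor.decompress(flushed)
    return before, after


before, after = visible_before_and_after_flush(b"static header\n")
print("decodable before flush:", before)
print("decodable after flush: ", after)
```

Without the flush the receiver sees none of the header; with it, the header arrives immediately, which is the same symptom you see in the uncompressed setup.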
What you should be looking into is why it takes such a long time before the last byte is sent. I am guessing that if you measure the time to last byte, it will be roughly the same with and without compression.
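One way to check that guess is to measure both timings from a client. The following is a rough sketch in Python using only the standard library; the function names are my own, and you would have to supply the URL of the slow page yourself. Note that `urlopen` returns once the response headers have arrived, so the "first byte" here is the first byte of the body.

```python
import time
import urllib.request


def time_first_and_last_byte(open_stream, chunk_size=4096):
    """Return (seconds to first body byte, seconds to last byte, total bytes)."""
    start = time.monotonic()
    stream = open_stream()

    # Read a single byte first so the measurement is not delayed by waiting
    # for a whole chunk to fill up.
    first_byte = stream.read(1)
    first = time.monotonic() - start if first_byte else None
    total = len(first_byte)

    while first_byte:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)

    last = time.monotonic() - start
    return first, last, total


def measure_url(url, accept_encoding):
    """Fetch url with the given Accept-Encoding ("identity" or "gzip")."""
    request = urllib.request.Request(
        url, headers={"Accept-Encoding": accept_encoding}
    )
    return time_first_and_last_byte(lambda: urllib.request.urlopen(request))
```

Calling `measure_url(your_url, "identity")` and `measure_url(your_url, "gzip")` a few times each should show whether the time to last byte really differs between the two setups.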
How is the load on the server?
If the load on the server is high, it may indicate that it is busy with CPU-intensive tasks to produce responses to the requests it is receiving. It could also indicate that the server is doing lots of I/O (which can usually be reduced by adding more RAM).
If the load on the server is low and it is still slow to respond, that usually indicates it is waiting for network communication with another server (perhaps a DB or a DNS server is slow to respond). It could also indicate a bug somewhere in your code.
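A slow DNS server is one of the easier suspects to rule out. As a small sketch (the function name is made up, and you would pass in whatever hostnames your application actually looks up, such as the DB host), you can time a lookup from the server itself:

```python
import socket
import time


def time_dns_lookup(hostname):
    """Return (seconds taken, sorted list of resolved addresses)."""
    start = time.monotonic()
    results = socket.getaddrinfo(hostname, None)
    elapsed = time.monotonic() - start
    # Each result is (family, type, proto, canonname, sockaddr); the address
    # is the first element of sockaddr.
    addresses = sorted({info[4][0] for info in results})
    return elapsed, addresses


elapsed, addresses = time_dns_lookup("localhost")
print(f"lookup took {elapsed:.3f} s -> {addresses}")
```

A lookup that takes seconds rather than milliseconds would point at the resolver. Keep in mind that a second lookup of the same name may be answered from a cache and look much faster than the first.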
If you cannot deduce what is causing the slowness from the server load, inspection of network traffic, or log files, you may need to add more logging to your application in order to find out what it is doing for such a long time.
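I don't know what language your application is written in, so take the following only as a sketch of the idea in Python. The helper `timed` and the step names are hypothetical. It wraps each phase of producing a response and logs how long that phase took, so the log shows where the time goes.

```python
import contextlib
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("request_timing")


@contextlib.contextmanager
def timed(step, durations=None):
    """Log how long the wrapped block took; optionally record it in a dict."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        log.info("%s took %.3f s", step, elapsed)
        if durations is not None:
            durations[step] = elapsed


# Example: the sleeps stand in for the real work done in each phase.
durations = {}
with timed("database query", durations):
    time.sleep(0.05)
with timed("render page", durations):
    time.sleep(0.01)
```

Once every major step is wrapped like this, the slow one should stand out in the log after a single slow request.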