4

I have recently launched a site that is very popular but I am having trouble with scalability. My site makes heavy use of FFmpeg and at peak times RAM usage hits the 2 GB point quickly and the swap file starts getting used. CPU usage starts rising too.

Users complain that the site is slow. That is because every FFmpeg instance runs very slowly when so many are running at the same time. Users make use of FFmpeg on my server in real time.

Is there anything I can do to reduce the load on the server and stop RAM usage from shooting up? Maybe there is something better than FFmpeg (!).

Is the only solution "throwing some cash" at a more powerful server?

I have given little information, please ask for more, so this problem can be solved.

Peter Mortensen
Abs
  • Don't really think this is suitable for community wiki. – Jonathan Prior May 23 '09 at 00:12
  • I didn't make it community wiki. This question was migrated from stackoverflow because it was not appropriate there. – Abs May 23 '09 at 11:44
  • Sorry, I have to reply in an answer because I can't comment. I don't have enough rep here, though I do on stackoverflow :) Anyway, very good answers. **Arkain** - I could do that, but the web application needs to be real time. Using some sort of queuing or scheduling system will mean some users will still have to wait longer than they need to. **crunchyt** - You are right, one box does everything! It might require some work to code my application so it can spread itself across many boxes, but this should be easy to do. Thank you for the references too, will read! **pQd** - I do have some static content. I act – Abs May 22 '09 at 23:23
  • Are you transcoding in real-time? If so, what's the source? Is it live, or pre-recorded? If pre-recorded, can you cache the output from previous runs of ffmpeg so that you're not repeatedly transcoding the same data? – Alnitak May 23 '09 at 09:32
  • Yes it is real time and the video is pre-recorded. – Abs May 23 '09 at 11:39
  • Good thought, but there are many (10,000s) pre-recorded videos and it will be too much of a burden in terms of storage! – Abs May 23 '09 at 15:11
  • Storage is a *lot* cheaper than CPU. – afrazier May 23 '10 at 13:00
  • Do you have some static content you serve over HTTP [for example, processed movies]? Do you use Apache for it? If so, move to a server better suited for that, for instance [mathopd](http://www.mathopd.org/), [lighttpd](http://www.lighttpd.net/) or [nginx](http://nginx.net/). And yes, the suggestions from crunchyt and Arkain are very good, assuming your service will grow. Sooner or later you will need to isolate tasks and spread the load while using some queuing system. Just make sure you design with that [and sharding] in mind from the very beginning so you don't have to rewrite the whole system. – pQd May 22 '09 at 22:46
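
Alnitak's caching suggestion in the comments above could look roughly like the sketch below. It is only an illustration: `cache_path`, `get_or_transcode`, the `transcode-cache` directory and the `flv` extension are made-up names, and `transcode` stands for whatever function in your app actually runs ffmpeg. The idea is to key each output file on the source and the options used, and only transcode when that file does not exist yet.

```python
import hashlib
import os

def cache_path(source, options, cache_dir="transcode-cache", ext="flv"):
    """Build a file name that is unique per (source, options) pair."""
    key = hashlib.sha256("\0".join([source, *options]).encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, key + "." + ext)

def get_or_transcode(source, options, transcode, cache_dir="transcode-cache"):
    """Return the cached output, calling transcode(source, options, dest) on a miss.

    `transcode` is a caller-supplied function (hypothetical here).
    Two simultaneous requests for the same key are not handled in this sketch.
    """
    out = cache_path(source, options, cache_dir)
    if not os.path.exists(out):
        os.makedirs(cache_dir, exist_ok=True)
        # Write to a temporary name first so a half-written file is never served.
        tmp = os.path.join(cache_dir, "tmp-" + os.path.basename(out))
        transcode(source, options, tmp)
        os.replace(tmp, out)
    return out
```

Whether this pays off depends on how often the same video is requested with the same settings, which is the storage versus CPU trade-off discussed in the comments.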

6 Answers

12

Well, an easy solution would be to queue the FFmpeg tasks, so that only a fixed number are running at any one time. You should also seriously consider running the FFmpeg processes on a separate machine from the web server.
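
A minimal sketch of such a queue, assuming the FFmpeg jobs are launched from Python: a thread pool with a fixed number of workers means that extra jobs wait in line instead of all running at once. `MAX_CONCURRENT`, `run_command` and `submit_transcode` are names invented for this example, and the cap of 2 is only a placeholder.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 2  # assumed cap; tune it to the number of CPU cores

# The worker threads only wait on child processes, so a thread pool is
# enough to limit how many commands run at the same time.
_pool = ThreadPoolExecutor(max_workers=MAX_CONCURRENT)

def run_command(cmd):
    """Run one command to completion and return its exit code."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode

def submit_transcode(cmd):
    """Queue a command; it starts as soon as one of the slots is free."""
    return _pool.submit(run_command, cmd)
```

Usage would be something like `future = submit_transcode(["ffmpeg", "-i", "in.avi", "out.flv"])` followed by `future.result()` to get the exit code. Requests beyond the cap wait in the queue rather than slowing down the jobs that are already running.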

Peter Mortensen
4

This is a common structuring problem, not so much a memory problem. Sounds like you are cramming everything onto one box? DB, Web and MPG processing? This won't scale very well!

Regardless of your application, anything processing intensive will work better across multiple machines using a batch system. By spreading the load across multiple boxes, and keeping the really intensive work away from the web tier, your users will thank you!

Your web tier should only be serving the interface. You should have 1+ machines dedicated to processing video in the background. This should then become available for serving by the web tier once ready.

The best reference on this topic I have found is Building Scalable Websites by Cal Henderson, ex-CTO of Flickr. The previous link is to Amazon, so you can preview the book on the cheap. This link to Google Books will also let you read up.

Good luck!

1

I think you could probably do some things to improve the memory usage, but when all is said and done you will most likely come out ahead by buying some more memory. I'm sure I will get voted down for this answer, but I'm just thinking about the economics of fixing this problem.

Jeff
  • You haven't been voted down yet. :) This will probably have to be the quick solution for now while I work on rewriting and improving how the application works. – Abs May 23 '09 at 11:40
1

I think the easiest, quickest thing to do would be just to buy a new server. Seriously, a Dell 2950 with 32 GB of RAM and 8 cores at 3.2 GHz was, I think, only $8k or $10k CAD. It would be easy to spend half that and still get something that can run lots of parallel tasks and have lots of RAM. You definitely wouldn't be capped at 2 GB and swapping to disk.

Peter Mortensen
Kevin Nisbet
0

ffmpeg is heavily CPU-bound, not just memory-bound. The app will hardly be any faster just because the box has more RAM: more instances means each one gets less CPU and runs slower.

Unless you can optimize ffmpeg itself or use an async queue, you need to get more machines.

Go for primarily CPU with enough RAM to not start swapping before you max out utilization on all CPUs.
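
As a rough way to check that balance, the arithmetic can be sketched like this. It assumes one ffmpeg instance keeps about one core busy and that you have measured its typical memory use yourself; `max_instances`, `mb_per_instance` and `reserved_mb` are made-up names, and every number you feed in is your own estimate.

```python
def max_instances(cpu_cores, ram_mb, mb_per_instance, reserved_mb=512):
    """Estimate how many ffmpeg instances a box can run without swapping.

    Assumes roughly one busy core per instance (an assumption, not a
    measured fact) and keeps `reserved_mb` back for the OS and other services.
    """
    by_cpu = cpu_cores
    by_ram = (ram_mb - reserved_mb) // mb_per_instance
    return max(0, min(by_cpu, by_ram))
```

If the RAM limit comes out lower than the core count, the box will start swapping before the CPUs are fully used, which is the situation to avoid when choosing hardware.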

Artem Russakovskii
  • You are right, CPU usage also rockets. I am looking into ways of optimizing ffmpeg now, but I doubt I will come up with anything that radically changes performance. :( – Abs May 23 '09 at 11:43
  • You could try to go for EC2 or some other cloud computing solution - it may just be cheaper. – Artem Russakovskii May 23 '09 at 18:22
0

At the risk of sounding like a marketing droid from a cloud computing company, this is what cloud computing is meant for.

I would strongly suggest using something like Amazon EC2 or Rackspace Cloud. Create a basic image that contains ffmpeg, and an interface that allows ffmpeg to be called remotely from your app. Create some instances of that image and make sure that, using the cloud provider of choice, you are able to create and destroy instances of that image to match the load. Your app should then delegate all ffmpeg tasks to your cloud servers and control the number of cloud servers that are up based on the number of ffmpeg jobs it needs to process synchronously. This will keep what I perceive as the bottleneck in your app, video transcoding and the like, separate from your app and able to scale at will.
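
The "control the number of cloud servers" part can be sketched without tying it to any provider. The function below only decides how many instances should be up for a given number of waiting jobs; actually starting and stopping them would go through your provider's own API, which is not shown. `desired_instances` and its default values are assumptions for illustration.

```python
import math

def desired_instances(pending_jobs, jobs_per_instance=2,
                      min_instances=1, max_instances=10):
    """Return how many transcoding servers should be running right now.

    All defaults are placeholders: `jobs_per_instance` is how many ffmpeg
    jobs one server handles at once, and the min/max bounds keep a warm
    server available and cap the bill.
    """
    needed = math.ceil(pending_jobs / jobs_per_instance)
    return max(min_instances, min(max_instances, needed))
```

Your app would call this periodically with the current queue length and then create or destroy instances until the running count matches the result.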

whaley