I'm trying to build a web application that is similar to YouTube (it's not a knockoff), but I don't understand very well how video is served on the internet.
I know how to build regular database-driven web applications, but nothing at the scale of YouTube. Every application I have built so far has run on a single server, with the files stored on the same box as the web server.
How does one decouple the application server, the file storage, and the media server from each other? I would more or less want four kinds of machines (or clusters of machines):
- Application servers -- Present the web page, handle user uploads, point the user's Flash player at the correct media server, etc.
- Database shards -- Store user information, check favorites, etc.
- File storage -- Store the media files
- Media servers -- Serve the media files
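To make the split concrete, here is a minimal sketch of how I picture the application server pointing the player at a media server. Everything in it is an assumption for illustration: the hostnames, the `media_url` helper, the `/videos/<id>.flv` path layout, and the idea of picking a host by hashing the video ID are all made up, not something I know YouTube does.

```python
import hashlib

# Hypothetical media hosts; these names are invented for illustration.
MEDIA_SERVERS = [
    "media1.example.com",
    "media2.example.com",
    "media3.example.com",
]


def media_url(video_id: str) -> str:
    """Return the URL the player should fetch for a given video ID.

    The host is chosen by hashing the ID, so the same video always
    maps to the same media server (an assumed scheme, not a known one).
    """
    digest = hashlib.md5(video_id.encode("utf-8")).hexdigest()
    host = MEDIA_SERVERS[int(digest, 16) % len(MEDIA_SERVERS)]
    return f"http://{host}/videos/{video_id}.flv"


if __name__ == "__main__":
    # The application server would embed this URL in the page it renders.
    print(media_url("abc123"))
```

In this picture the application server never touches the video bytes itself; it only tells the player where to fetch them. Is that roughly the right idea?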
How do I hook all of this together? Which technologies should I leverage? Where do I go to learn more about architecting this?
How does YouTube's embeddable Flash player work? I want to embed my own Flash player on other websites and have it tie into my architecture.
Note: I have already looked at http://highscalability.com/youtube-architecture
But I still don't get the overall picture of how this all ties together. Can someone explain in high-level terms how it works?
Are there dedicated client servers running internally to shuffle all of this around between the application servers, file storage, etc.? Is it all done over HTTP using JSON? What is going on here?
Thanks!