1) How do you set up multiple web servers while maintaining the same content (would I have to upload the files to all the servers everytime they were updated or is there another way?)
Situations get more complicated with scale. Starting out, you would simply deploy all content to all servers at the same time manually. You could write a script to do this.
As you can larger, configuration management software can help. Also, OpenEFS would be a suitable solution.
With static content and certain types of content, there are more options available. These decisions because more glaring when you attempt to modify or upload content via the load balanced application itself, as if you do not handle this well, you will result in data partition. For example, an uploaded file will only be on one of the load balanced servers.
In no particular order:
- Use a content delivery network.
- Proxy your static content with caching yourself.
- Abstract the data storage to a highly available filesystem.
- Often, ideally, the preferred solution is to store the content in a database.
2) Similarly, if I had to scale out the DB into more than one server, is there any other way apart from replication or is mysql replication the best way to go about it.
It depends on your end goal: consistency, availability, or partition tolerance. Compromises will likely be necessary. This is an expansive subject, where you would benefit from reading a book such as High Performance MySQL. Common options:
- Dual master replication using a technology like Linux-HA, VRRP, or multi master MySQL. This would have a floating IP. You would need to implement auto-id offset and be aware of application performance.
- Using a solution like DRBD for block level storage replication and then again using a technology like Linux-HA to failover the resources in case of failure.
MySQL also has various white papers published.
3) I have read that separating the database from the web server is a good idea, why is that? If I had 2 servers, can't I have both the DB and the files on both servers.
It is better to dedicate role to purpose, as it reduces complexity and security risk. You would likely benefit from a minimum of two load balancers, two database servers, and two Web servers. Be aware of additional points of failure, as high availability will not stop there. Your network will likely be the next obvious single point of failure.
It also enables standard builds, scaling, and transferring roles with less complication. Nevertheless, these features are not unique to separating server roles.
4) Is something known as a load balancer needed and would it help balance mysql queries as well if replication were set up?
For write queries and MySQL, it's often easier to scale up vertically. To scale out horizontally, which would be preferred with many modern solutions, you need to employ an architecture such as sharding. Ideally, your application would have to be designed to support that. There are also various middleware solutions that I am generally wary of.
You could load balance easily to MySQL replication slave servers, which would allow you to use a load balanced VIP for readonly queries.