Regarding uplinking one switch into another:
No, they don't share MAC address tables. Each switch maintains its own bridging table, which it builds by listening to the traffic it receives on a given port. Consider the following example (apologies for the terrible ASCII art):
________     1________2      2________1     ________
|Host A|-----|Switch 1|------|Switch 2|-----|Host B|
--------     ----------      ----------     --------
Host A is connected to Switch 1, Port 1. Host B is connected to Switch 2, Port 1. The two switches are interconnected via Port 2 on both.
Assume that at the start the bridging tables of both switches are empty. Host A wants to send a frame to Host B. (To simplify things, we'll assume that Host A and Host B have static ARP entries for each other, so there is no need to ARP for MAC addresses.)
- Host A sends a frame to Host B. The source MAC of the frame is AA:AA:AA:AA:AA:AA, the destination MAC is BB:BB:BB:BB:BB:BB.
- Switch 1 currently has an empty bridging table. On receipt of the frame it does two things:
  - It creates a new entry in its bridging table recording that AA:AA:AA:AA:AA:AA exists on Port 1.
  - As it doesn't know where BB:BB:BB:BB:BB:BB is, it floods the frame out of every port except the one it was received on (Port 1).
- Switch 2 receives the flooded frame on its Port 2. Again, as its bridging table is initially empty, it follows the same process:
  - New entry: AA:AA:AA:AA:AA:AA exists on Port 2 (recall this is on Switch 2, which has a bridging table independent of Switch 1's).
  - The frame is flooded out of all ports except the one it was received on (Port 2).
At this point, Host B receives the frame. When Host B sends a response, the following happens.
- Host B sends a frame to Host A. The source MAC of the frame is BB:BB:BB:BB:BB:BB, the destination MAC is AA:AA:AA:AA:AA:AA.
- Switch 2 currently has one entry in its bridging table (AA:AA:AA:AA:AA:AA -> Port 2). On receipt of the frame it does two things:
  - It creates an entry in its bridging table recording that BB:BB:BB:BB:BB:BB exists on Port 1.
  - As it has a specific bridging table entry for the destination MAC (AA:AA:AA:AA:AA:AA), it forwards the frame out of Port 2 only, rather than flooding as it did before.
- Switch 1 receives the forwarded frame on its Port 2. Again, it follows the same process:
  - New entry: BB:BB:BB:BB:BB:BB exists on Port 2.
  - There is a specific bridging table entry (AA:AA:AA:AA:AA:AA -> Port 1), so the frame is forwarded out of that port only.
In terms of learning MAC addresses, this same process is followed regardless of the number of switches and the number of devices connected to them. As you add more complexity to your switched network (VLANs, Spanning Tree), more subtleties come into play, but the base algorithm remains the same.
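If it helps to see the learn-then-forward-or-flood logic in one place, here is a minimal Python sketch of the exchange above. The `Switch` class and its `receive` method are names made up for illustration, not any real switch API, and I have assumed four ports per switch (ports 3 and 4 unused) so that flooding and forwarding give visibly different results.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy learning switch (hypothetical model, not a vendor API)."""
    name: str
    num_ports: int
    table: dict = field(default_factory=dict)  # MAC address -> port number

    def receive(self, src, dst, in_port):
        """Learn the source MAC, then return the ports the frame is sent out of."""
        # Learning: the source MAC is reachable via the port the frame arrived on.
        self.table[src] = in_port
        if dst in self.table:
            # Known destination: forward out of that one port only
            # (or drop the frame if it would go back out of the ingress port).
            out_port = self.table[dst]
            return [] if out_port == in_port else [out_port]
        # Unknown destination: flood out of every port except the ingress port.
        return [p for p in range(1, self.num_ports + 1) if p != in_port]


A = "AA:AA:AA:AA:AA:AA"
B = "BB:BB:BB:BB:BB:BB"
s1 = Switch("Switch 1", num_ports=4)  # assumption: 4 ports, 3 and 4 unused
s2 = Switch("Switch 2", num_ports=4)

# Host A -> Host B: both tables start empty, so both switches flood.
out1 = s1.receive(A, B, in_port=1)  # [2, 3, 4]
out2 = s2.receive(A, B, in_port=2)  # [1, 3, 4]

# Host B -> Host A: both switches have already learned where A is.
out3 = s2.receive(B, A, in_port=1)  # [2]
out4 = s1.receive(B, A, in_port=2)  # [1]

print(s1.table)  # Switch 1: A on port 1, B on port 2
print(s2.table)  # Switch 2: A on port 2, B on port 1
```

Note that the two `table` dictionaries end up with different contents for the same MAC addresses, which is the whole point: each switch only knows which of its own ports leads towards a given address.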
Regarding your second and third questions:
2) My personal bias is to minimise switching wherever possible. Spanning tree is the bane of many professional lives. Add to that the fact that Ethernet has no loop protection of its own: frames carry no TTL or hop count, so a frame caught in a loop is never discarded. A minor misconfiguration can therefore lead to broadcast storms that require you to intervene manually and down links before they subside. Even if your network is small, have at least one router off which all your layer 2 subnets hang; it's just easier in my opinion.
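To put a rough number on the broadcast storm point, here is a small Python sketch that counts the copies of a single broadcast frame in flight between switches that are fully meshed with spanning tree disabled. It is a deliberately simplified model under my own assumptions: every copy moves one hop per step, host ports and link capacity are ignored, and `frames_in_flight` is just an illustrative name.

```python
def frames_in_flight(num_switches, hops):
    """Count copies of one broadcast frame on inter-switch links after each hop.

    Assumes a full mesh of switches with no spanning tree, and a broadcast
    that enters the network at switch 0.
    """
    # Each in-flight copy is a (sender, receiver) pair.
    in_flight = [(0, peer) for peer in range(1, num_switches)]
    counts = [len(in_flight)]
    for _ in range(hops):
        next_round = []
        for sender, receiver in in_flight:
            # The receiver floods the broadcast to every neighbour
            # except the one it arrived from.
            for peer in range(num_switches):
                if peer != receiver and peer != sender:
                    next_round.append((receiver, peer))
        in_flight = next_round
        counts.append(len(in_flight))
    return counts


print(frames_in_flight(3, 4))  # [2, 2, 2, 2, 2]
print(frames_in_flight(4, 3))  # [3, 6, 12, 24]
```

With three switches in a triangle the two copies simply circle forever; with four fully meshed switches the count doubles on every hop. Nothing in the frame ever tells a switch to stop, which is why you end up pulling links by hand.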
3) It depends very much on the scale of your network, and how much intranet vs. internet traffic you expect to see. If there will be a lot of communication between departments, it may make sense to have a hierarchy of routers so that purely internal traffic does not impact internet access for everyone else. If, on the other hand, you expect everyone to access only a common set of services (AD, email) and the internet, then a single core router (or a pair, for redundancy) may be sufficient.
In terms of giving each department a router and meshing them, how is this network to be administered? If there is going to be one administrative IT authority, then just build a hierarchical network; having users served by shared routers won't be a problem. If each department is going to maintain their own IT staff, then a router per department and internal peering may be required, but it will most likely complicate your network design.