It's mainly about being able to verify an entity's identity without any prior interaction.
One of the main purposes of a CA is to assert certain pieces of information about one entity, in such a way that another entity that trusts this CA can trust this information, even if the two entities have never been in contact before.
It is effectively a trusted party that lets you check the identity (generally, but also possibly other attributes) of an entity you need to interact with, but may never have met.
You can compare this to a passport issuing authority (i.e. generally speaking, a country).
Even if you have never met a person, if someone comes to you with a passport (from a country you deem responsible enough in its passport administration), you can trust the binding between picture, name (and other properties), and attach a name to the person in front of you with reasonable certainty.
The same applies to Public Key Certificates: the issuing CA allows you to trust the binding between the public key and the identity. For an SSL/TLS server, the identity is its host name (see RFC 6125), and the public key is the one used during the SSL/TLS handshake.
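To make the binding a bit more concrete, here is a minimal sketch using Python's standard `ssl` module. The helper names `make_verifying_context` and `fetch_verified_names` are mine, purely for illustration; the point is only that a default client context checks both the chain up to a trusted CA and the host name.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    # create_default_context() loads the platform's trusted CA certificates
    # and enables both chain verification and host name checking.
    return ssl.create_default_context()


def fetch_verified_names(host: str, port: int = 443, timeout: float = 5.0):
    """Connect to host and return the certificate's subjectAltName entries.

    The handshake raises ssl.SSLCertVerificationError if the chain does not
    lead to a trusted CA or if the certificate does not match `host`.
    """
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname is the identity we expect to be bound to the key.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert().get("subjectAltName", ())


context = make_verifying_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```

Calling `fetch_verified_names("example.org")` needs network access, so it is left out of the runnable part above.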
So a Certificate Authority certifies that a website corresponds to a given organization. Why is this necessary?
Linking a website to an organisation isn't strictly necessary. It's mostly the aim of Extended Validation certificates.
It just makes some people feel more confident about the website they are visiting. It is generally useful against attackers who control host names similar to those of legitimate websites you would visit.
I interact with them physically in real life, or through mail, or what have you. I can just get their public key when I meet with them.
Certainly, but there are a few problems with that:
- Bootstrapping: if you've never met that person, the fact that they're giving you their public key the first time you meet them doesn't make them who they say they are. This is very much a problem on the web, since there's always a first time you go to a website, without having visited it before on this machine. Even with notary systems like the Perspectives Project, you need a first visitor.
- You can't always interact with websites in a physical manner.
- Even if you somehow manage to interact with the website in a physical manner, most people will have no idea what you are talking about when you ask for their public key. Try asking for your bank's public key next time you walk into a branch...
- Managing a handful of public keys of people you've met is possible, but there's generally a large number of websites you can visit. There's a point where you wouldn't want to manage a list, especially when using multiple machines, for example.
- On top of this, you generally remember someone's face when you see them later. In contrast, a website can change its public key (typically every year, but sometimes much more often). You'd still have to find a way for them to tell you about this change in a manner you can trust.
I interact with them solely digitally. I start with no reason to trust them, but gain trust over time. I only really need to make sure someone doesn't impersonate them, so I just check that it is the same key each time.
Trusting a website is unfortunately much more binary than progressively building trust with someone.
There's an expectation of wanting every interaction, even the first one, not to have been intercepted and altered by a MITM attack.
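The "same key each time" approach is essentially trust on first use. Here is a toy sketch of it, to show where it falls short; `PinStore`, the host name and the byte strings standing in for keys are all made up for the example.

```python
import hashlib


class PinStore:
    """Toy trust-on-first-use store: remembers one key fingerprint per host."""

    def __init__(self):
        self._pins = {}

    @staticmethod
    def fingerprint(public_key_bytes: bytes) -> str:
        return hashlib.sha256(public_key_bytes).hexdigest()

    def check(self, host: str, public_key_bytes: bytes) -> str:
        seen = self.fingerprint(public_key_bytes)
        pinned = self._pins.get(host)
        if pinned is None:
            # Nothing to compare against: whatever key shows up is accepted.
            self._pins[host] = seen
            return "first-use"
        return "match" if pinned == seen else "mismatch"


store = PinStore()
results = [
    store.check("bank.example", b"key seen on first visit"),
    store.check("bank.example", b"key seen on first visit"),
    store.check("bank.example", b"a different key"),
]
print(results)  # ['first-use', 'match', 'mismatch']
```

If the first visit was intercepted, the attacker's key is the one that gets pinned. And a later "mismatch" doesn't tell you whether it's an attack or just the site renewing its key.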
A friend refers me to a site. I can get the key from them.
The CA system is very hierarchical. To some extent, it's a particular case of the Web-of-Trust system, where you can delegate trust to people you know, who in turn will delegate trust to people they know, and so on.
WoT is a valid system, but it comes with its own problems:
- Trusting that someone's identity is real doesn't imply that this person is generally trustworthy: knowing someone's real name doesn't mean they won't lie about someone else's identity.
- As you said, you'd solve that problem by getting the key from someone you know to be a friend, but how would you know this friend has made the relevant identity checks rigorously?
Essentially, the set of people whose identity I know is a superset of the people I generally trust to have good intentions, which is itself a superset of the people I believe to be sufficiently competent to check other people's identity too.
Modelling trust and trust delegation can get complex very quickly when you take these parameters into account.
The purpose of the CA is to be an entity that you can trust to have properly vetted what it puts in its certificates. Of course, CAs fail at their job once in a while, but it's a reasonable compromise in order to make the system sufficiently simple to be widely usable. More complex models would require much more complex user interfaces and explanations for users. I think many users would find the complexity overwhelming, and end up disregarding any warnings (which would be worse in the end).
Of course, there are ways to mitigate some of the problems with CAs (e.g. intermediate notary systems), but there ultimately needs to be a judgement call by the user. The CA system is far from perfect.
How does a website having a CA certificate make me trust it more than if it doesn't?
The website itself doesn't have a CA certificate as such; it has a certificate issued by a CA. It's not about trusting the website itself, but about trusting that the website is who it says it is (and that it's the host name you are looking for).
Of course this only moves the bootstrapping one step further: why would you trust the CA certificates bundled with your browser or OS?
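If you want to see where those bundled trust anchors come from on a given machine, Python's `ssl` module can show you what its default context relies on. This is only a quick sketch: browsers often ship their own stores, and the paths and counts vary from one system to another.

```python
import ssl

# Where the underlying OpenSSL build looks for trusted CA certificates.
paths = ssl.get_default_verify_paths()
print("CA file:", paths.cafile or paths.openssl_cafile)
print("CA directory:", paths.capath or paths.openssl_capath)

# What a default client context has loaded so far. Certificates found via a
# CA directory are loaded on demand, so this count can be 0 on some systems.
context = ssl.create_default_context()
stats = context.cert_store_stats()
print("CA certificates loaded so far:", stats["x509_ca"])
```

Every entry in that store is a CA you are trusting by default, without having vetted it yourself.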