How does peer-to-peer work over the internet?

From what I understand, there is no way to send a packet to a computer in a local network from outside the network, unless we know the routing mechanism employed by the router.

Assuming we have a set-up that looks like this:

Computer-A, IP 192.168.1.2 (default gateway 192.168.1.1)
Computer-B, IP 192.168.1.3 (default gateway 192.168.1.1)
Router-C, IP 192.168.1.1 (external IP 1.1.1.1)
Router-D (external IP 2.2.2.2)

Computer-A, Computer-B, and Router-C belong to the same local network. Router-D wants to send data to Computer-A, but it can't do this without going through Router-C.

Now Router-C will forward packets to Computer-A if the destination port is 1000, and will forward packets to Computer-B if the destination port is 2000. But surely, the only device that knows this routing mechanism is Router-C itself! Not even Computer-A nor Computer-B will know about it, right?

So Router-D can send a packet to Computer-A if it sends a packet to Router-C through port 1000, but how is Router-D to know to send packets through port 1000, and not say port 1001?

How do peer-to-peer programs like BitTorrent get past this problem? The only solution I can think of is for Router-D to send the packet to Router-C through all ports, such that it gets forwarded to Computer-A, but is there a better solution?

Pacerier

Posted 2012-10-03T17:25:04.320

Reputation: 22 232

Answers

Your confusion stems from some incorrect assumptions.

But surely, the only device that knows this routing mechanism is Router-C itself! Not even Computer-A nor Computer-B will know about it, right?

What, why‽ Then why was the router configured to forward those ports to those IPs? You have to set up the P2P client to use a specific port and then set up the router to correspond.

but how is Router-D to know to send packets through port 1000, and not say port 1001?

Because you configure the P2P client to use a specific port (standard or non-standard for that protocol).

The only solution I can think of is for Router-D to send the packet to Router-C through all ports, such that it gets forwarded to Computer-A, but is there a better solution?

It is much simpler than that. When the client makes a connection to a peer, it specifies which port it wants to use, so the peer sends the data on that port.

Hmm, but Bittorrent doesn't change the router's behavior right? Since some routing mechanism could have been dynamic as demonstrated in superuser.com/a/187190/78897, how is Computer-A able to know about it?

The client doesn’t directly affect the router, but the router can be configured/intelligent enough to adapt to the client’s behavior. You can enable UPnP in both the router and client to automatically configure the connection and most routers have stateful inspection abilities as part of their port-forwarding mechanism.

Taken together, this means that a connection can be made dynamically on a random port, and the router can keep track of what is happening instead of treating everything as random, meaningless connections. That way, it can forward an incoming connection as necessary because, for example, it is a response to an outgoing connection that was just made.
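To make the two mechanisms concrete, here is a minimal toy model in Python. It is only a sketch, not how any particular router is implemented: the `NatRouter` class, its method names, and the port numbers are all made up for illustration, and it assumes the router keeps the LAN port unchanged on the public side (real routers may pick a different one).

```python
from dataclasses import dataclass, field


@dataclass
class NatRouter:
    """Toy NAT router: static port forwards plus tracking of outbound connections."""
    public_ip: str
    # public port -> (LAN IP, LAN port), configured by hand or via UPnP
    static_forwards: dict = field(default_factory=dict)
    # (public port, remote IP, remote port) -> (LAN IP, LAN port), learned from outbound traffic
    tracked: dict = field(default_factory=dict)

    def outbound(self, lan_ip, lan_port, remote_ip, remote_port):
        # Remember who opened this connection so the reply can be sent back.
        # Assumption: the public port equals the LAN port.
        self.tracked[(lan_port, remote_ip, remote_port)] = (lan_ip, lan_port)
        return (self.public_ip, lan_port)

    def inbound(self, public_port, remote_ip, remote_port):
        # 1. Is this a reply to a connection a LAN host opened earlier?
        key = (public_port, remote_ip, remote_port)
        if key in self.tracked:
            return self.tracked[key]
        # 2. Otherwise fall back to the static forwards; None means "dropped".
        return self.static_forwards.get(public_port)


router_c = NatRouter(
    "1.1.1.1",
    static_forwards={1000: ("192.168.1.2", 1000), 2000: ("192.168.1.3", 2000)},
)

# Computer-A opens a connection to a peer behind 2.2.2.2.
router_c.outbound("192.168.1.2", 51000, "2.2.2.2", 6881)

print(router_c.inbound(51000, "2.2.2.2", 6881))   # ('192.168.1.2', 51000) - tracked reply
print(router_c.inbound(1000, "2.2.2.2", 40000))   # ('192.168.1.2', 1000)  - static forward
print(router_c.inbound(1001, "2.2.2.2", 40000))   # None                   - dropped
```

The last line is the asker's "port 1001" case: with no forward and no tracked connection, the router has nowhere to send the packet.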

The problem comes when you have multiple systems using the same program. Having multiple systems connected to the same router, sharing the same IP and using dynamic ports quickly becomes unmanageable and even with stateful inspection, it is difficult if not impossible to get it to work correctly. In that case, static ports (default or otherwise) will need to be used.

If you use a program like SmartSniff or TCPView to monitor your connections, you will notice that the P2P connections will usually have the port you configured (or the default for the client) as the destination for incoming connections and either the default or a custom/random port for the source, and vice versa for outgoing connections.

Synetech

Posted 2012-10-03T17:25:04.320

Reputation: 63 242

Hmm, but Bittorrent doesn't change the router's behavior right? Since some routing mechanism could have been dynamic as demonstrated in http://superuser.com/a/187190/78897, how is Computer-A able to know about it?

– Pacerier – 2012-10-03T17:46:33.437

Port Forwarding. You set it up beforehand. – UtahJarhead – 2012-10-03T17:51:24.257

@Pacerier If both the router and the p2p app use UPnP, port forwarding can be done dynamically. In this case, yes Bittorrent does change the router's behavior.

– zero2cx – 2012-10-03T17:55:53.003

@zero2cx, is there any way to do peer-to-peer without changing the router's behavior? – Pacerier – 2012-10-03T18:03:44.490

@Pacerier, if you want to have NAT (multiple systems sharing a connection), then you must somehow configure the router to know where to make connections (well, technically you could just have everybody spew every connection to all ports like you suggested, but that would be at best, horrible). You can do this statically (configure IPs and ports) or dynamically (UPnP). – Synetech – 2012-10-03T18:05:12.667

@Pacerier Yeah, but then incoming connection requests are stopped at the router. Outgoing BitTorrent connection requests will not be affected, and many BitTorrent users are set up exactly this way. – zero2cx – 2012-10-03T18:07:09.290

Your question touches the heart of the Internet and the very definition of routing. In your example, Router D sends data to Computer A based on two premises:

It's been told to send data to Computer A.
It's already processed data from Computer A.

Your scenario seems to assume the first option - Router D wants to send to Computer A. But how does it get there? It does so through the use of routing tables which are shared by routers amongst each other.

Router C regularly sends updates to all routers it knows about - including Router D - that it "knows" the "192.168.*" network (in reality, this wouldn't happen because that network isn't routed - it's considered private. But ignore that.) So, Router D already knows that Router C knows that network.

So when data is destined for Computer A, it's addressed by network first. So, Router D asks, "I need to find the 192.168.* network. Do I know it? Nope. Do I know someone else who does? Yes. Router C does. How do I get to Router C? Through my 2.2.2.2 interface."

Router D then sends the data to Router C. Router C gets it and says, "Oh, I have data from Router D but it's for the 192.168 network. Do I know that network? Yes, through my 192.168.1.1 interface." And then forwards it.
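The "do I know this network?" question above is a routing-table lookup. Here is a rough Python sketch of it using the standard `ipaddress` module. The table contents are hypothetical and continue the simplification above (a private 192.168 network being advertised, which wouldn't happen on the real Internet); the `next_hop` name is made up too.

```python
import ipaddress

# Hypothetical routing table for Router D: (network, where to send it next).
ROUTES = [
    (ipaddress.ip_network("192.168.0.0/16"), "Router C (1.1.1.1)"),
    (ipaddress.ip_network("0.0.0.0/0"), "default upstream"),
]


def next_hop(destination):
    """Return the next hop for a destination IP using longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    # Every route whose network contains the address is a candidate...
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # ...and the most specific one (longest prefix) wins.
    _, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


print(next_hop("192.168.1.2"))  # Router C (1.1.1.1)
print(next_hop("8.8.8.8"))      # default upstream
```

Note that nothing in the lookup looks at a port number; only the destination IP matters.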

There's some other work to be done to resolve IP and MAC addressing, but I'm covering routing, per se, not ARP and local networking.

You'll notice your first assumption - the remote router must know the routing mechanism - does not come into play here. Router D does not care if Router C is using EIGRP, RIP, RIPv2, OSPF, or whatever. All it cares is that it got an update. (Of course, how it got an update is important to ensure the two stay in synch. But again, that's a different issue.)

Your second assumption - that port number is a factor in routing - is also incorrect. Routers (generally) don't need port information to make routing decisions. (That has changed slightly due to some newer network technologies, and applies mainly to firewalls and proxies, but the broader point still applies to "true" routers.)

Continuing with your example, Router C will forward data on port 1000 (per your scenario) because it's possible there is a service on Computer A expecting data on that specific port. But it only knows to do so because Router D sent it on port 1000. And Router D only sends it on that port because the originator of the data sent it to Router D on that port.

I don't understand your inclusion of BitTorrent or P2P programs as reflective of the question you ask. The same explanations would apply. Routers can also be configured with port triggering, which associates a particular device (or IP) with a particular port, such that when traffic comes in on port 1234, the router knows to send the data to Device ABCD. This is usually associated with an outgoing TCP port, i.e. if I send traffic on port 7890, the router knows incoming traffic will be on port 1234 and sends it to me.

But port triggering is not associated with (remote) routing decisions - instead it relates to the internal MAC/IP table the router uses for the LAN.

Update/edit: To further answer and elaborate after your comment. Router D knows Computer A only by its IP address (192.168.1.2). But Router C knows Computer A by its IP address and by its MAC address. The MAC (Media Access Control) address is a unique (usually...) 48-bit identifier that is defined by international standard. Every device connected to a LAN (wired or wireless) is supposed to have a unique MAC address.

The router (Router C) associates the IP address and MAC address together in a table (the ARP table). So when traffic comes into Router C, and the router realizes it's "local" to it, it looks up the destination IP in that table. The router then literally changes the frame addressing information.

It reconstructs (rewrites) the Layer 2 destination information to have the destination MAC address of Computer A but keeps the IP address information (Layer 3) to be the same.

If the router does NOT know the MAC address, or does not have an IP-MAC relationship in its table, it does something called an ARP (Address Resolution Protocol) request to ask, "HEY, everyone on this network. Who has this IP address?" The appropriate device responds with its MAC address and the router builds its IP-MAC table.
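As a rough illustration of that last hop, here is a small Python sketch. Everything in it is invented for the example (the `Frame` class, the MAC addresses, and `arp_request`, which stands in for a real broadcast on the wire); it only shows the lookup, the fallback to ARP, and the Layer 2 rewrite that leaves Layer 3 alone.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    dst_mac: str    # Layer 2 destination
    dst_ip: str     # Layer 3 destination
    payload: bytes


# Hypothetical hosts on Router C's LAN (IP -> MAC); stands in for the real network.
LAN_HOSTS = {
    "192.168.1.2": "aa:aa:aa:aa:aa:02",  # Computer-A
    "192.168.1.3": "aa:aa:aa:aa:aa:03",  # Computer-B
}

arp_table = {}  # Router C's learned IP -> MAC associations


def arp_request(ip):
    """Stand-in for broadcasting 'who has <ip>?' and waiting for a reply."""
    return LAN_HOSTS.get(ip)


def deliver(frame):
    """Rewrite the frame's destination MAC for the LAN; the IP stays the same."""
    mac = arp_table.get(frame.dst_ip)
    if mac is None:
        mac = arp_request(frame.dst_ip)
        if mac is None:
            return None  # nobody answered, so the frame cannot be delivered
        arp_table[frame.dst_ip] = mac  # remember the answer for next time
    return replace(frame, dst_mac=mac)


incoming = Frame(dst_mac="router-c-wan-mac", dst_ip="192.168.1.2", payload=b"hello")
print(deliver(incoming).dst_mac)  # aa:aa:aa:aa:aa:02
print(arp_table)                  # {'192.168.1.2': 'aa:aa:aa:aa:aa:02'}
```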

John

Posted 2012-10-03T17:25:04.320

Reputation: 1 869

So Router-D sends a packet to Router-C through port 60000 (preconfigured), Router-C receives the packet, but how does it know that this packet is to be forwarded to Computer-A? – Pacerier – 2012-10-03T17:55:53.800

Router D already knows the final destination - Computer A. It (Router D) knows Computer A only by its IP address: 192.168.1.2. But Router C knows Computer A by two methods: its IP address (192.168.1.2) and something called its MAC address. I'll update the answer with more info. – John – 2012-10-03T18:05:42.527

Port Triggering. How does a web server send a webpage to you after you've requested it? Because you've requested it. When you request it, the router knows to expect a reply and when it gets it, it forwards it to the appropriate PC. Some programs are written to trigger an opening in anticipation of a signal from a specific PC, even if one isn't really on its way.

Some models have a central server used for basic communication. For example:

Client1 signs in with Server for 2-way communications.
Client2 signs in for the same thing.

Server now knows all files that Client1 and Client2 have.

Client2 says "I want file X from Client1" to Server.
Server tells Client1 "Client2 wants X file."
Client 1 sends a garbage piece of data to Client2's public IP, setting off the Port Triggering so it opens up the port for a reply from Client2.
Client2 sends its initial signal to Client1's public IP.

Client1 just fooled the router into opening up that port for Client2.
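The sequence above can be simulated in a few lines of Python. This is only a sketch under simplifying assumptions: the `Nat` class is made up, both routers are assumed to keep the client's port number unchanged on the public side, and the port numbers are hypothetical values that each client would have learned from Server.

```python
class Nat:
    """Toy router that only lets a packet in if a matching one went out first."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.open = set()  # (local port, remote IP, remote port) we expect replies from

    def send(self, local_port, remote_ip, remote_port):
        # Sending outward opens the way for replies from that exact remote endpoint.
        self.open.add((local_port, remote_ip, remote_port))

    def accepts(self, local_port, remote_ip, remote_port):
        return (local_port, remote_ip, remote_port) in self.open


nat1 = Nat("1.1.1.1")  # Client1's router
nat2 = Nat("2.2.2.2")  # Client2's router
PORT1, PORT2 = 5000, 6000  # hypothetical ports, exchanged via Server

# Before anything is sent, Client2 cannot reach Client1.
before = nat1.accepts(PORT1, nat2.public_ip, PORT2)

# Client1 sends the garbage packet. It opens nat1, but nat2 drops it on arrival.
nat1.send(PORT1, nat2.public_ip, PORT2)
garbage_delivered = nat2.accepts(PORT2, nat1.public_ip, PORT1)

# Client2 sends its initial signal. nat1 treats it as the expected reply.
nat2.send(PORT2, nat1.public_ip, PORT1)
after = nat1.accepts(PORT1, nat2.public_ip, PORT2)

print(before, garbage_delivered, after)  # False False True
```

The garbage packet never needs to arrive; its only job is to create the entry in Client1's own router.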

In some cases, such as BitTorrent or the original Napster (iirc), you have to forward a port on your router for it to work optimally.

As far as other clients knowing which port to connect to initially, it's because your client told the swarm or server which port you use. BitTorrent frequently uses a tracker and that keeps track of which ports are used by which clients.
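For example, with an HTTP tracker the client puts its listening port into the announce request. Below is a minimal Python sketch that only builds such a URL. The query parameter names (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`) are the ones from the BitTorrent tracker protocol, but the tracker address, the peer ID, and the hash are placeholders made up for the example (a real `info_hash` is the SHA-1 of the torrent's info dictionary, not of an arbitrary string).

```python
import hashlib
from urllib.parse import urlencode, urlsplit, parse_qs


def announce_url(tracker, info_hash, peer_id, port, left):
    """Build a tracker announce URL that advertises our listening port."""
    query = urlencode({
        "info_hash": info_hash,  # 20 raw bytes, percent-encoded by urlencode
        "peer_id": peer_id,      # 20 bytes identifying this client
        "port": port,            # the port other peers should connect to
        "uploaded": 0,
        "downloaded": 0,
        "left": left,            # bytes still to download
    })
    return f"{tracker}?{query}"


url = announce_url(
    "http://tracker.example/announce",          # placeholder tracker
    hashlib.sha1(b"demo torrent").digest(),     # placeholder 20-byte hash
    b"-XX0001-123456789012",                    # placeholder 20-byte peer ID
    port=6881,
    left=1024,
)
print(parse_qs(urlsplit(url).query)["port"])  # ['6881']
```

The tracker pairs that port with the public IP it sees the request coming from and hands the pair to other peers, which is why the port still has to be forwarded (or opened some other way) for incoming connections to get through.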

UtahJarhead

Posted 2012-10-03T17:25:04.320

Reputation: 1 755

As far as them knowing which ports to use... you configure the client to listen on a specific port. Your client tells the swarm (as in the case of BitTorrent) what port you're on so the other clients know what to connect to. Your PC told them. – UtahJarhead – 2012-10-03T17:39:12.337

I think you replied while my first addendum was being written. Correct? – UtahJarhead – 2012-10-03T17:40:27.717

but how is the PC to know the routing mechanism when it is the router that does it? Some mechanism could have been dynamic as demonstrated in http://superuser.com/a/187190/78897

– Pacerier – 2012-10-03T17:42:58.510

No, it doesn't need to know the routing. For most P2P clients to work properly, you need to have Port Forwarding configured properly on your router. Without it, you can only communicate with other clients that you first talk to. They cannot initiate the conversation. I briefly touched on this in my answer (second to last paragraph) – UtahJarhead – 2012-10-03T17:45:25.043

But port forwarding is only possible if the administrator had it preconfigured.. Do you mean to say that Bittorrent reconfigures our routers? – Pacerier – 2012-10-03T18:29:28.113

@Pacerier Just a UPnP-enabled router. Routers are generally not shipped UPnP-enabled, I think. The reconfigure (port-forward port settings) happens only when a router has already given an internal client permission to do so. UPnP! – zero2cx – 2012-10-03T18:37:06.337