AWS VPC: Internet Gateway vs NAT

Question

This and this and this are quite related to my question. Although they seem to have answered quite a lot of people's doubts, I am still struggling to understand whether this setup is specific to AWS or applies to networking in general. If it's the latter, then I need to revisit my basics. I suspect that already, hence the question.

My understanding

If a private network is connected to the Internet, then its hosts need to have public IPs to be uniquely identifiable on the Internet. All traffic to the Internet, inbound or outbound, happens with this public IP address. A host from this network gets a public IP when it is connected to the Internet. A packet originating from this host to, say, www.google.com will carry the host's private address, which is ultimately replaced with its public IP address by the NAT device (installed on the router/default gateway), as illustrated here. This is how most of the Internet (except IPv6) runs.

Now, in AWS

- When you create a Public Subnet and enable auto-assign Public IPs, you are essentially telling the Internet Gateway to swap the private address of the EC2 instance for its public IP address in packets originating from that instance on their way out to the internet, and vice versa on the way in. Is my understanding right?
- When you create a Private Subnet (by not attaching it to the Internet Gateway), you are keeping it private. We then consciously keep auto-assign Public IP disabled, so when we launch EC2 instances inside this private subnet we do not see public IPs on the EC2 console. This also means that instances in this subnet are not visible to the internet. Now, if I connect this private subnet to a NAT device (which, of course, is on the public subnet) (please do not confuse me with what a NAT Gateway does better at this moment), then I am essentially leaving the NAT device to figure out which public IP to assign to a specific host X from the private subnet that has requested to communicate with the internet, since a public IP is needed to communicate with the Internet.

Now,
- Is this not something that a Router/(Internet) Gateway already does, both in AWS and in general networking? Isn't assigning public IPs to hosts on a network, and replacing the private IP address with the public IP address in packets that originate from a host on this network on their way out to the Internet, something that is carried out by a router?
- Say the NAT device figures out the IP 1.2.3.4 to be assigned to this host of the private subnet. If, "somehow", this IP becomes known on the Internet, then this host on the private subnet should become reachable from the Internet, too, unless the NAT device pulls some trick (see follow up question). Is my understanding right? Now, AWS says that the NAT device does not allow inbound communication. Is that like a counter to the fact that even if the public IP 1.2.3.4 (that the NAT device assigns to the host of this private subnet) becomes known, the inbound connections are force restricted? Or, does the NAT device simply use its own IP address on behalf of the hosts from the private network (which is not what a NAT device should ideally do; a NAT device takes the assigned Private IP and replaces it with the Public IP on packets)?
- Also, AWS allows you to enable auto-assign Public IPs on a private subnet, too. And I can confirm that I can see EC2 instances on private subnet with a Public IP. So, now you have a Private Subnet (as they are not connected to the Internet Gateway in the routing tables) with instances having a Public IP (as you enabled the auto-assign Public IPs on a private subnet). How is that supposed to be interpreted?

I think more reading and experimentation to enhance your understanding would be beneficial. 1) My understanding is an instance has multiple interfaces with multiple IP addresses, there's no switching public for private for internet egress. 2) With a NAT it has a single external IP and essentially acts as a proxy for the instance in the private subnet. — Tim, Mar 24 '19 at 18:59
Wouldn't that mean that the AWS NAT is not really a NAT in its intended sense but a proxy, given that a NAT replaces the private IP of a host with the public IP while a proxy uses its own IP and its own connection to talk to the internet on behalf of the host? — Sheel Pancholi, Mar 24 '19 at 21:05
@SheelPancholi the Internet Gateway is in fact a 1:1 static NAT device for machines on public subnets with a public IP address assigned, while the NAT Gateway would more correctly be called a port address translation (PAT) device. Assigning public addresses to instances on a private subnet is allowed, but is essentially a misconfiguration. Such an address is ignored and never actually used. — Michael - sqlbot, Mar 25 '19 at 01:59
Alright Michael, you clarified the misconfiguration part of it. On the NAT and PAT part of it, your and MLu's answers are in line, and I have an understanding now and a couple of questions based on that below. Would you agree with my understanding? — Sheel Pancholi, Mar 25 '19 at 05:39
It is worth noting that the AWS [docs](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat.html) clearly state that NAT devices (either a NAT Instance or a NAT Gateway) perform PAT as well: `We use the term NAT in this documentation to follow common IT practice, though the actual role of a NAT device is both address translation and port address translation (PAT).` — Ashutosh Jindal, Jul 24 '19 at 14:29

score 2 · Accepted Answer · answered Mar 25 '19 at 00:11

I think you misunderstand the function of NAT gateway and that leads to all the other confusions.

NAT doesn't randomly assign addresses to internal / private hosts.
A NAT device usually has two interfaces: an internal one with a private IP, e.g. 10.0.0.1, and an external one with a public IP, e.g. 1.2.3.4.
Hosts on the internal network that have, for example, some 10.0.0.x address (and no public address) send all their outbound traffic to the NAT gateway, and the NAT gateway replaces the source IP in the packet (e.g. 10.0.0.123) with its own public IP (i.e. 1.2.3.4). Then it sends the packet on to the destination on the internet, e.g. to Google.
TCP packets carry not only source and destination addresses but also source and destination ports. The NAT gateway may also replace the source port to avoid collisions when multiple hosts happen to use the same source port.

NAT Explained:

An internal host 10.0.0.123 wants to download something from google.com at 216.58.203.110 over HTTPS, i.e. port 443.

It sends a packet with source 10.0.0.123:12345 (address : random local port) and destination 216.58.203.110:443 (address : https port) to the NAT gateway, because that's the best next hop for all addresses that are not 10.0.0.x.
The NAT gateway rewrites the source from 10.0.0.123:12345 to its own public IP and some random port, 1.2.3.4:54321.
It records that a connection from 10.0.0.123:12345 to 216.58.203.110:443 has been translated to 1.2.3.4:54321 in its connection tracking table.
When a return packet from google arrives at the NAT Gateway with a destination 1.2.3.4:54321, the gateway looks up that record (address:port) and sees that it should translate it back to 10.0.0.123:12345 and send it to that host on that port.

If at the same time another host from the local network (e.g. 10.0.0.99) attempts to download something from Google this is what happens:

The NAT gateway again translates the source IP to its own public IP 1.2.3.4, but the source port will be different from before, e.g. 56789.
Now Google sees two connections from our NAT gateway
- 1.2.3.4:54321 - to - 216.58.203.110:443 (where the NAT gateway knows that the original source is in fact 10.0.0.123:12345)
- 1.2.3.4:56789 - to - 216.58.203.110:443 (the NAT GW knows that the original source is 10.0.0.99:12345).

Both connections appear to come from the NAT gateway but in fact they were initiated from different hosts in the internal network. Only the NAT gateway knows the mapping.
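The walk-through above can be sketched in a few lines of Python. This is only a toy model of the idea, not how a NAT Gateway or any real NAT device is implemented: the `PatGateway` class, its method names and the sequential port allocator starting at 50000 are all assumptions made for illustration (a real device picks ports differently and tracks much more state).

```python
from itertools import count


class PatGateway:
    """Toy model of port address translation (PAT)."""

    def __init__(self, public_ip, first_port=50000):
        self.public_ip = public_ip
        self._ports = count(first_port)  # hypothetical sequential port allocator
        self._out = {}   # (private_ip, private_port, dst_ip, dst_port) -> public_port
        self._back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, src, dst):
        """Rewrite the source of an outgoing packet and remember the mapping."""
        key = src + dst
        if key not in self._out:
            port = next(self._ports)
            self._out[key] = port
            self._back[port] = src
        return (self.public_ip, self._out[key]), dst

    def inbound(self, src, dst):
        """Rewrite the destination of a reply, or return None if no mapping exists."""
        ip, port = dst
        if ip != self.public_ip or port not in self._back:
            return None
        return src, self._back[port]


nat = PatGateway("1.2.3.4")
google = ("216.58.203.110", 443)

# Two internal hosts use the same local port towards the same destination.
a_src, _ = nat.outbound(("10.0.0.123", 12345), google)
b_src, _ = nat.outbound(("10.0.0.99", 12345), google)
print(a_src, b_src)                # same public IP, two different public ports
print(nat.inbound(google, a_src))  # the reply is mapped back to 10.0.0.123:12345
```

Google only ever sees `1.2.3.4` with two different source ports; the table inside the gateway is the only place where the original hosts are known.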

That's it in a nutshell.

Now a couple of notes:

You can't initiate connections from outside to a host behind NAT. You can only send replies to packets initiated from inside.

That's because the mapping between internal IP:port and NAT gateway IP:port is done when the internal host sends the first packet out.

If you wanted to SSH to 10.0.0.123:22 from outside how would you do it? You could send the SSH packet to the NAT gateway IP 1.2.3.4 but what port? See, there is no mapping so it's not possible to initiate a connection from outside.
Routers on the other hand do not change any IPs or ports (as opposed to NAT gateways). They pass the packets through pretty much unchanged.
In the AWS context Router is IGW = Internet Gateway. NAT can be either NAT Gateway or NAT Instance, they do the same thing.
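To make the first note concrete, here is a minimal sketch of what the lookup for an inbound packet might look like. The `conntrack` dictionary and the `deliver_inbound` function are hypothetical names for this illustration, and the table is keyed by public port only; a real device would also check the remote address and protocol.

```python
# Connection-tracking table as it might look after 10.0.0.123 opened one
# outbound HTTPS connection (values taken from the example above).
conntrack = {
    # public port on 1.2.3.4 -> (private ip, private port)
    54321: ("10.0.0.123", 12345),
}


def deliver_inbound(dst_port):
    """Return the internal target for an inbound packet, or None to drop it."""
    return conntrack.get(dst_port)


print(deliver_inbound(54321))  # reply to the existing connection: forwarded
print(deliver_inbound(22))     # unsolicited SSH attempt: no mapping, dropped
```

Nothing actively blocks the SSH attempt here; there is simply no entry telling the gateway which internal host the packet should go to.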

Hope that explains it :)

Ok MLu, so you say that the NAT device is in fact a PAT device and, as Michael puts it, the Internet Gateway is a 1:1 NAT device. In home LAN connections, we usually say we are behind NAT. This NAT, I infer, is configured on the home router, and the public IP we get to see on whatismyip.com from all devices in the LAN is the same and is the address of the home router (the device on which NAT is configured), which means the home network is on PAT? And the 1:1 NAT that Michael talks about is where individual hosts get their own public IP and the 1:1 NAT keeps a map of the private and public IP of each host? — Sheel Pancholi, Mar 25 '19 at 05:44

Ashutosh Jindal · Answer 2 · 2019-07-24T15:10:17.710

This is more of a (big) supporting comment to @MLu's answer above, quoting the relevant snippets from the official AWS docs that shed light on what the Internet Gateway and the NAT Devices do in AWS.

NAT Device (NAT Instance or NAT Gateway)

From AWS Documentation » Amazon VPC » User Guide » VPC Networking Components » NAT:

We use the term NAT in this documentation to follow common IT practice, though the actual role of a NAT device is both address translation and port address translation (PAT).

A NAT Device is meant specifically to allow hosts in the private network to access the outside network (internet), via the NAT Device which performs the PAT.

So, if there is a host in a private subnet with an IP of 10.0.1.1/32, and it sends a request to google.com (216.58.211.174/32), the NAT Device will record the connection from 10.0.1.1:SOMESOURCEPORT to 216.58.211.174:SOMEDESTPORT and change the source in the request (packet) to its own public address and a port of its choosing, NAT-PUB-IP:SOMERANDOMPORT.

The mapping recorded would be: 10.0.1.1:SOMESOURCEPORT -> NAT-PUB-IP:SOMERANDOMPORT

When it receives a response back on NAT-PUB-IP:SOMERANDOMPORT, then a lookup on the recorded mappings will reveal that the response needs to go back to 10.0.1.1.

All internal IPs are translated to the single public IP that was assigned to the NAT Device. The multiplexing of connections relies on the NAT Device uniquely identifying internal hosts based on the SOMERANDOMPORT.

This also implies that if the internal hosts do have a public IP assigned to them, for the NAT Device this is irrelevant, since it only performs PAT on the internal IP addresses.

Internet Gateway (IGW)

This also performs NAT, but unlike the above, it performs a static NAT. Put simply, there is static record as follows:

Internal HOST IP <-> Public IP Assigned to the Internal Host

Note that a host inside an AWS VPC is only aware of its own private IP within the VPC. The public IP assigned to it is only used by the Internet Gateway.

For the same request as above (internal host to google.com), if:

- the host is inside a public subnet (which by definition means that the subnet is associated with a route table that has a default 0.0.0.0/0 route entry sending all non-VPC traffic to the IGW), AND
- it has a public IP address assigned,

then a packet originating from the internal host with a destination outside the VPC is routed to the IGW, which converts the source IP in the packet according to the map it has:

10.0.1.1 -> Public_IP_Of_Internal_Host

When it receives the response back with a destination of Public_IP_Of_Internal_Host, that destination simply gets translated back to 10.0.1.1.
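As a rough sketch of the static 1:1 translation described above (not AWS's actual implementation), the mapping can be modelled as a pair of dictionaries. The function names are made up for this example, and `203.0.113.10` is a documentation-range placeholder standing in for Public_IP_Of_Internal_Host.

```python
# Hypothetical static mapping: private IP <-> public IP assigned to that host.
PRIVATE_TO_PUBLIC = {"10.0.1.1": "203.0.113.10"}
PUBLIC_TO_PRIVATE = {pub: priv for priv, pub in PRIVATE_TO_PUBLIC.items()}


def igw_outbound(src_ip, src_port):
    """Swap the private source IP for its public IP; the port is left alone."""
    public_ip = PRIVATE_TO_PUBLIC.get(src_ip)
    if public_ip is None:
        return None  # no public IP assigned, so there is nothing to translate to
    return public_ip, src_port


def igw_inbound(dst_ip, dst_port):
    """Swap the public destination IP back to the private IP of the host."""
    private_ip = PUBLIC_TO_PRIVATE.get(dst_ip)
    if private_ip is None:
        return None
    return private_ip, dst_port


print(igw_outbound("10.0.1.1", 40000))   # source IP rewritten, port unchanged
print(igw_inbound("203.0.113.10", 22))   # inbound works: the mapping is static
print(igw_outbound("10.0.1.2", 40000))   # host without a public IP: no mapping
```

Unlike the PAT case, the mapping exists before any packet is sent and ports are untouched, which is why a host behind this kind of translation can also receive connections initiated from outside.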

From https://docs.aws.amazon.com/vpc/latest/userguide/VPC_Internet_Gateway.html

An internet gateway serves two purposes: to provide a target in your VPC route tables for internet-routable traffic, and to perform network address translation (NAT) for instances that have been assigned public IPv4 addresses.
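The "target in your VPC route tables" part can be illustrated with a small route lookup. This is a simplified sketch assuming the most specific matching route wins; the route tables, the 10.0.0.0/16 VPC range and the target labels `local`, `igw` and `nat` are invented for this example and are not real AWS identifiers.

```python
import ipaddress

# Hypothetical route tables; the target names are labels for this sketch only.
PUBLIC_SUBNET_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw"}
PRIVATE_SUBNET_ROUTES = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat"}


def next_hop(routes, destination):
    """Pick the most specific (longest-prefix) route that contains destination."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


print(next_hop(PUBLIC_SUBNET_ROUTES, "10.0.1.5"))         # stays inside the VPC
print(next_hop(PUBLIC_SUBNET_ROUTES, "216.58.211.174"))   # goes to the IGW
print(next_hop(PRIVATE_SUBNET_ROUTES, "216.58.211.174"))  # goes to the NAT device
```

In this model the only difference between a "public" and a "private" subnet is where the default route points.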

As stated before, the reason an assigned public IP is required is that the IGW can only do static NAT and therefore needs the public IP in order to create the mapping between it and the internal IP of the host.
