
I have a VPC that I created a long time ago, before NAT gateways were a thing. Like many setups of that era, it uses a NAT instance to route outbound traffic. Yesterday my NAT instance crashed. I was able to reboot it, but it caused enough of a headache that I decided to try migrating to a NAT gateway.

I don't care whether the outbound IP stays the same. As a test, I first created a new VPC with an instance to make sure I had the settings correct. I then created the NAT gateway in the existing VPC and changed my main route table to use it as the gateway instead of the NAT instance.

My setup is two subnets, one public that points to an internet gateway and one private pointing to the NAT gateway. I use OpenVPN on an instance to reach my private instances.

However, when I swap to the NAT gateway, my existing instances can no longer route outbound, BUT a newly created instance works fine. The problem is the same in reverse: if I change the subnet of my new instance (the one that works with the NAT gateway) to use the NAT instance instead, it can no longer see the outside world.

I'm not changing the route tables on the instances themselves, only the routes in the web console.

I also tried restarting networking on my test instance and then rebooting it, but neither helped.

I've read a couple of migration guides that seem to indicate this should work. The routes obviously change, but outbound traffic stops working right after that. Is there a magic trick to this, or am I going to have to recreate my instances to work with the NAT gateway?

EDIT: Another wrinkle in this. On a whim I changed the security group of my test instance (I used a default one when I created it). Switching to an existing security group with the same outbound rules made the instance able to connect, but when I swap it back to the original group, it can't connect.

EDIT 2: Changing the security group of one of my existing instances doesn't seem to work. Changing security group only seems to work on the new instance that I set up as a test.

As suggested, here are some screenshots of my setup:

Here are my route tables. The named one I added as a test, and it routes to the NAT gateway. I have just one subnet in it, where I have an instance as a test. The second in the list is the main route table, which all other subnets use; it routes through the NAT instance.

[screenshot: route tables]

Here's verification that my private-subnet-1e routes to the NAT Gateway:

[screenshot: subnet 1e route]

And verification that one of my production server subnets uses the main route table, which routes through the NAT instance:

[screenshot: instance route]
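For anyone who wants to check the same thing outside the console, here is a rough sketch of how I understand the lookup: a subnet uses its explicitly associated route table if it has one, and falls back to the VPC's main route table otherwise. The dicts are shaped like the `RouteTables` list returned by EC2's DescribeRouteTables (for example via boto3's `describe_route_tables`), but every ID below is made up for illustration and the helper names are my own, not part of any library.

```python
def effective_route_table(route_tables, subnet_id):
    """Return the route table a subnet actually uses.

    An explicit subnet association wins; otherwise the table flagged
    as Main is used.
    """
    main = None
    for table in route_tables:
        for assoc in table.get("Associations", []):
            if assoc.get("SubnetId") == subnet_id:
                return table
            if assoc.get("Main"):
                main = table
    return main


def default_route_target(table):
    """Return the target ID of the 0.0.0.0/0 route, or None if absent."""
    for route in table.get("Routes", []):
        if route.get("DestinationCidrBlock") == "0.0.0.0/0":
            for key in ("NatGatewayId", "InstanceId", "GatewayId"):
                if route.get(key):
                    return route[key]
    return None


# Hypothetical data: all IDs are invented, not taken from my account.
SAMPLE_TABLES = [
    {
        "RouteTableId": "rtb-main",
        "Associations": [{"Main": True}],
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
            {"DestinationCidrBlock": "0.0.0.0/0", "InstanceId": "i-natinstance"},
        ],
    },
    {
        "RouteTableId": "rtb-test",
        "Associations": [{"Main": False, "SubnetId": "subnet-1e"}],
        "Routes": [
            {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"},
            {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-test"},
        ],
    },
]

for subnet in ("subnet-1e", "subnet-prod"):
    table = effective_route_table(SAMPLE_TABLES, subnet)
    print(subnet, table["RouteTableId"], default_route_target(table))
```

With the sample data, `subnet-1e` resolves to the test table and the NAT gateway, while `subnet-prod` has no explicit association and falls back to the main table and the NAT instance, which matches what the console shows me.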

Security group for the test instance. Outbound rules (everything is allowed):

Security group for a prod instance that can't route if I change the table. Same here: everything is allowed.
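Since swapping between two groups with "the same" outbound rules changed the behaviour, it may be worth comparing the rules programmatically rather than by eye. This is only a sketch: the dicts mimic the `IpPermissionsEgress` field of a DescribeSecurityGroups response, it looks at IPv4 ranges only, and the group IDs and helper names are hypothetical.

```python
def egress_rules(group):
    """Flatten a group's IPv4 egress rules into a set of comparable tuples."""
    rules = set()
    for perm in group.get("IpPermissionsEgress", []):
        proto = perm.get("IpProtocol")
        ports = (perm.get("FromPort"), perm.get("ToPort"))
        for ip_range in perm.get("IpRanges", []):
            rules.add((proto, ports, ip_range.get("CidrIp")))
    return rules


def allows_all_outbound(group):
    """True if the group has the 'all traffic to 0.0.0.0/0' egress rule."""
    return ("-1", (None, None), "0.0.0.0/0") in egress_rules(group)


# Hypothetical groups; IpProtocol "-1" means all protocols and ports.
test_group = {
    "GroupId": "sg-test",
    "IpPermissionsEgress": [
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
}
prod_group = {
    "GroupId": "sg-prod",
    "IpPermissionsEgress": [
        {"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    ],
}

print(egress_rules(test_group) == egress_rules(prod_group))
print(allows_all_outbound(test_group), allows_all_outbound(prod_group))
```

If the two sets come out equal, as they appear to in my screenshots, then the outbound rules themselves are probably not what differs between the groups.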

Cfreak
  • Suggest you post screenshots of security groups, routing tables, and any applicable IDs, if that doesn't compromise your security, as without that it will be difficult to offer advice. I suspect this is one of those cases where there's a tiny thing wrong and you can't see the wood for the trees. – Tim Sep 30 '16 at 18:27
  • It's quite tricky looking at screenshots compared with using the console. Can you please post a screenshot of applicable instances with their IDs. A diagram might also help, with IDs, to give the screenshots context. Drawing the diagram may also help you solve the problem yourself. https://cloudcraft.co – Tim Sep 30 '16 at 22:18
