Egress VPC and AWS Transit Gateway (Part 1)

Gilles Chekroun


Lead VMware Cloud on AWS Solutions Architect
---
My blog posts are usually customer-driven, and recently I have been working on a design that includes an Egress VPC and an AWS Transit Gateway (TGW).
This customer is going to use both the VMware Managed Transit Gateway and the AWS Transit Gateway.
I will split this post into 3 parts:
  1. The Egress VPC - this article
  2. Adding a VMC SDDC to the Egress VPC here
  3. Adding VMware Managed Transit Gateway here

Why do we need an Egress VPC?

Numerous posts on the AWS site describe how to build an Egress VPC and the subtleties of the various route tables of the TGW and of the Egress VPC itself.
The main goal is to have only ONE Internet Gateway, located in the Egress VPC, through which all workloads go out to the internet.
One of the most important points is redundancy across multiple Availability Zones.
Applications usually reside in private subnets, while NAT Gateways reside in a public subnet.

NAT Gateways

To funnel internet access through a single point, we can create an Egress VPC and route all egress traffic from the other VPCs via NAT Gateways sitting in the Egress VPC, leveraging the AWS TGW for connectivity.
In this example, the Egress VPC uses 2 NAT Gateways in 2 Availability Zones. NAT Gateways scale well and can support up to 55,000 simultaneous connections to each unique destination.
The AWS TGW is not a load balancer and will not distribute your traffic evenly across NAT Gateways in multiple AZs.
Usually, traffic over a TGW stays within an AZ if possible. If the initiating EC2 instance is in AZ A, traffic flows out of the TGW ENI in the same AZ A in the Egress VPC.

Lab Setup

The lab setup is quite simple: 2 Application VPCs (100 and 200) with EC2 workloads access the internet via the Egress VPC only, so there is no IGW in the Application VPCs.

Subnets, Availability Zones and CIDR Allocations

For the Application VPCs, we have 2 subnets, one in each AZ (A and B).
For the Egress VPC, we have 4 subnets in 2 AZs (A and B):
  • 2 subnets in the public area
  • 2 subnets in the private area
The subnet associations are done so that each route table has exactly one subnet associated with it in each AZ.
The TGW VPC attachment creates an Elastic Network Interface (ENI) in each AZ of the VPC.
Each ENI has its own subnet route table.
The public area hosts the 2 NAT Gateways and also has a route table per subnet.
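
The layout above can be sketched in a few lines of Python. This is only an illustration: the 10.0.0.0/16 range, the /24 subnet size, and all subnet and route table names are hypothetical, since the lab's real CIDRs are not listed in this post.

```python
import ipaddress

# Hypothetical Egress VPC CIDR; the lab's real ranges are not given in this post.
EGRESS_VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")


def plan_egress_subnets(vpc_cidr, azs=("a", "b"), new_prefix=24):
    """Return one private and one public subnet per AZ, each with its own route table."""
    blocks = vpc_cidr.subnets(new_prefix=new_prefix)  # generator of /24 blocks
    plan = []
    for tier in ("private", "public"):
        for az in azs:
            plan.append({
                "name": f"egress-{tier}-{az}",
                "az": az,
                "tier": tier,  # private hosts the TGW ENI, public hosts the NAT Gateway
                "cidr": next(blocks),
                "route_table": f"rt-egress-{tier}-{az}",  # one route table per subnet
            })
    return plan


subnet_plan = plan_egress_subnets(EGRESS_VPC_CIDR)
for s in subnet_plan:
    print(f'{s["name"]:18} {str(s["cidr"]):14} {s["route_table"]}')
```

The point of the sketch is the one-to-one mapping between subnets and route tables, which is what later allows each private subnet to point at the NAT Gateway of its own AZ.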

TGW and VPC route tables

The most important point is to have a proper understanding of the VPC subnets and their associated route tables. The same goes for the TGW attachments.

Below is a description of that:

TGW creation

When creating the TGW, make sure to UN-SELECT Default Route Table Association and Propagation.
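
For those who script the setup rather than use the console, the same choice could look roughly like the sketch below. It only builds the argument dictionary; the description string is a made-up placeholder, and the final call is left as a comment because it assumes boto3 and AWS credentials are available.

```python
def transit_gateway_params(description):
    """Build keyword arguments for boto3's EC2 create_transit_gateway call."""
    return {
        "Description": description,
        "Options": {
            # Equivalent of un-selecting both check boxes in the console.
            "DefaultRouteTableAssociation": "disable",
            "DefaultRouteTablePropagation": "disable",
        },
    }


params = transit_gateway_params("egress-lab-tgw")  # hypothetical description
print(params)

# With AWS credentials configured, this could be passed on as:
#   import boto3
#   boto3.client("ec2").create_transit_gateway(**params)
```

Disabling both defaults means no attachment lands in a route table automatically, so the Apps and Egress route tables described next have to be created and associated by hand.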

Routing Tables Description

TGW attachments and route tables

Attach the 3 VPCs to the newly created TGW.
Create 2 route tables: one for Apps and one for Egress.
The Apps route table has the 2 Apps VPCs associated and a default route to the Egress VPC.
The Egress route table has the Egress VPC associated and 2 routes, one to each Apps VPC.
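
To make the association and route logic concrete, here is a small Python model of the two TGW route tables. It is a sketch for reasoning only, not AWS code: the attachment names and the 10.100.0.0/16 and 10.200.0.0/16 ranges are assumptions chosen for illustration.

```python
import ipaddress

# Hypothetical attachment names and CIDRs, for illustration only.
ASSOCIATIONS = {
    "att-vpc100": "rt-apps",
    "att-vpc200": "rt-apps",
    "att-egress": "rt-egress",
}
ROUTES = {
    "rt-apps": {"0.0.0.0/0": "att-egress"},
    "rt-egress": {"10.100.0.0/16": "att-vpc100", "10.200.0.0/16": "att-vpc200"},
}


def next_attachment(source_attachment, destination_ip):
    """Use the route table associated with the source, then pick the longest matching prefix."""
    table = ROUTES[ASSOCIATIONS[source_attachment]]
    dest = ipaddress.ip_address(destination_ip)
    matches = [ipaddress.ip_network(c) for c in table if dest in ipaddress.ip_network(c)]
    if not matches:
        return None  # no matching route: the packet is dropped
    best = max(matches, key=lambda net: net.prefixlen)
    return table[str(best)]


print(next_attachment("att-vpc100", "8.8.8.8"))      # outbound from an Apps VPC
print(next_attachment("att-egress", "10.200.1.10"))  # return traffic from the Egress VPC
```

The association decides which table is consulted, and the routes in that table decide where the packet goes. Outbound traffic from the Apps VPCs only ever sees the default route, while return traffic from the Egress VPC only sees the two specific routes.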

Applications VPCs

Let's start with the right side and the simple VPC route tables. VPC100 and 200 are similar. They will send all non-local traffic to the TGW.

Egress VPC

The Egress VPC is a bit more complicated. We have 4 subnets and 4 route tables.
The ENIs from the TGW attachment are in the private subnets. The default route of each private subnet points to the NAT Gateway in its own AZ.

The default route of the public subnets points to the Internet Gateway, and this is our way out.
For the way back, the public subnets have routes to the respective Application VPCs over the TGW.
We can of course summarise these routes if the Application VPCs are in similar ranges.
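
As a rough illustration of that summarisation, the sketch below finds the smallest single network that contains a list of CIDRs. The ranges are hypothetical examples, not the lab's actual allocations.

```python
import ipaddress


def summary_route(cidrs):
    """Return the smallest single network that contains every CIDR in the list."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    candidate = nets[0]
    while not all(n.subnet_of(candidate) for n in nets):
        candidate = candidate.supernet()  # widen by one bit and retry
    return candidate


# Hypothetical Application VPC ranges.
print(summary_route(["10.100.0.0/16", "10.200.0.0/16"]))
print(summary_route(["10.100.0.0/16", "10.101.0.0/16"]))
```

A summary route usually covers more addresses than the VPCs it stands for, so it is worth checking that the wider range does not overlap anything that should be reached another way.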

Tests

  • SSH to the jump host public IP.
  • Copy the EC2 Key-pair to that host so we will be able to SSH to the Application VPCs EC2s.
  • SSH to VPC100 EC2 private IP
  • ping amazon.com
  • exit
  • SSH to VPC200 EC2 private IP
  • ping amazon.com
  • traceroute to amazon.com and check the NAT Gateway hop.

Blackhole routes

With this setup, the 2 EC2 instances in the Apps VPCs are able to communicate with each other and, surprisingly, this happens via the NAT Gateway!
Since the Apps route table only has a 0.0.0.0/0 route for the Apps VPCs, traffic destined to the other VPC goes to the NAT Gateway, which reroutes it to the destination VPC.
NAT Gateway private IP below:
Traceroute shows the hair-pin via the NAT Gateway:
As a result, communication between the Apps VPCs succeeds even though the TGW does not directly route this traffic. 
Adding blackhole routes prevents undesired routing and makes sure VPCs remain isolated from each other.
If the Apps VPCs do need to communicate, we need to add routes to the TGW Apps route table as follows:
And now there is no hop via the NAT Gateway.
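
The three variants of the Apps route table discussed in this section can be compared with a small longest-prefix-match model. This is a simplified sketch with hypothetical attachment names and CIDRs; it only mimics how a more specific route wins over the default route.

```python
import ipaddress

BLACKHOLE = "blackhole"


def lookup(route_table, destination_ip):
    """Longest-prefix match; returns the target, BLACKHOLE, or None when nothing matches."""
    dest = ipaddress.ip_address(destination_ip)
    best = None
    for cidr, target in route_table.items():
        net = ipaddress.ip_network(cidr)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, target)
    return best[1] if best else None


# Three versions of the TGW Apps route table (names and CIDRs are assumptions).
default_only = {"0.0.0.0/0": "att-egress"}
isolated = {**default_only, "10.100.0.0/16": BLACKHOLE, "10.200.0.0/16": BLACKHOLE}
direct = {**default_only, "10.100.0.0/16": "att-vpc100", "10.200.0.0/16": "att-vpc200"}

vpc200_host = "10.200.1.10"  # hypothetical EC2 private IP in VPC200
for name, table in [("default only", default_only), ("blackhole", isolated), ("direct", direct)]:
    print(f"{name:13} -> {lookup(table, vpc200_host)}")
```

With only the default route, traffic to VPC200 is sent to the Egress VPC and hairpins via the NAT Gateway. The blackhole entries are more specific than 0.0.0.0/0, so they win and the traffic is dropped, while internet-bound traffic still follows the default route.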

Cost Considerations

TGW and NAT Gateways have costs attached.
The TGW costs page is here. In Europe it's $0.06 per attachment per hour ($0.05 in Dublin) and on top there is a data "processed" charge of $0.02/GB (the sender account is charged).
The NAT gateway costs page is here. In Europe it's about $0.05 per hour depending on the region and on top of that there is a data "processed" charge of about $0.05/GB (also depends on the region).
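
To get a feel for the numbers, here is a back-of-the-envelope estimate using the rates quoted above. It is a simplification under stated assumptions: 730 hours per month, a hypothetical 1,000 GB of traffic, each GB counted once for the TGW and once for the NAT Gateway, and no internet data transfer charges.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def monthly_cost(attachments, nat_gateways, egress_gb,
                 tgw_hourly=0.06, tgw_per_gb=0.02,
                 nat_hourly=0.05, nat_per_gb=0.05):
    """Estimate the monthly USD cost of the egress path from the rates quoted above."""
    tgw = attachments * tgw_hourly * HOURS_PER_MONTH + egress_gb * tgw_per_gb
    nat = nat_gateways * nat_hourly * HOURS_PER_MONTH + egress_gb * nat_per_gb
    return round(tgw + nat, 2)


# Lab topology: 3 VPC attachments, 2 NAT Gateways; the traffic volume is hypothetical.
print(monthly_cost(attachments=3, nat_gateways=2, egress_gb=1000))
```

The fixed hourly part grows with every extra attachment and NAT Gateway, so a second AZ for redundancy has a visible price even when little traffic flows.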

Thanks for reading.

