
Building an HA Cluster Using AWS Transit Gateway: Multi-Region HA Cluster (Windows/Linux)

EXPRESSCLUSTER Official Blog

May 19th, 2022

This article is partially machine translated. See the Japanese version for the original article.

Introduction

We tried using AWS Transit Gateway (hereinafter called “Transit Gateway”) to build an HA cluster of instances located in different regions on Amazon Web Services (hereinafter called “AWS”).
For the HA cluster configuration, we selected the "HA cluster based on VIP control" and used a VIP to switch destinations.

In a previous blog, we introduced the following article on connecting to a VIP using Transit Gateway.
Building an HA Cluster Using AWS Transit Gateway: Accessing AWS Virtual IP Resources from Outside the VPC

The VIP control method has the challenge of VIP routing, and in the article above the instances for the HA cluster are located in the same VPC.
This time, we introduce the procedure to build a multi-region HA cluster by setting up the routing between regions using scripts.


1. HA Cluster Configuration

The diagram of the HA cluster to be built this time is as follows:

This is a multi-region configuration in which the instances for the HA cluster (Server-A, Server-B) are located in the N. Virginia and Singapore regions, respectively.

In this configuration, since the instances for the HA cluster are located in different regions, we create one Transit Gateway in each region.
The instances for the HA cluster are located in the private subnets, but since communication with the regional endpoints is needed when controlling the route tables with the AWS CLI, we create a public subnet in each region and add an Internet Gateway and a NAT Gateway.

Note that VPC endpoints are not usable in this case.
This is because when a VIP becomes active, you need to update the route tables not only in the active server’s region but also in the other region, and a VPC endpoint only accepts access from its own region. (For example, you cannot update the route tables in the Singapore region through a VPC endpoint from an instance in the N. Virginia region.)

Clients accessing the VIP would commonly be on-premises, in a separate VPC, or in another availability zone, but this time we locate the client in the public subnet of the same VPC as Server-A to simplify the configuration.

1.1 Routing

In a configuration using Transit Gateway, in addition to the VPC route tables, each Transit Gateway has its own route table (hereinafter called “TGW Route Table”).
In the TGW Route Tables, the VIP destination and its association with a Transit Gateway Attachment (hereinafter called “TGW Attachment”) are registered as route information.

First, consider the case where the VIP is active on Server-A. The VIP route information in each route table is as follows:

[A] and [B] are the TGW Route Tables, and [1] to [4] are the VPC route tables. Usually, the route tables also contain fixed route information such as routes to other VPCs, but in this diagram we show only the VIP route information.

In this case, two types of TGW Attachments are used: VPC and Peering Connection.
The VPC type TGW Attachments (tgw-attach-A and tgw-attach-B in the diagram) connect the VPC and the Transit Gateway in each region, and the Peering Connection type TGW Attachment (tgw-attach-peering in the diagram) connects the two Transit Gateways across the regions.

When an instance in VPC-A (the same VPC as Server-A) accesses the VIP, it refers to a route table in VPC-A ([1] or [2]).
In these route tables, the VIP targets the ENI of Server-A, so the instance reaches Server-A directly.

On the other hand, when an instance in VPC-B accesses the VIP, it first refers to a route table in VPC-B ([3] or [4]).
There, the VIP targets the Transit Gateway in VPC-B (tgw-B), so the packets are forwarded to tgw-B.

Next, tgw-B refers to its TGW Route Table ([B]), where the TGW Attachment (tgw-attach-peering) is specified as the path to the VIP, so the packets are forwarded to the Transit Gateway in VPC-A (tgw-A).
In tgw-A's TGW Route Table ([A]), the TGW Attachment (tgw-attach-A) is specified as the path to the VIP, so the packets are forwarded to VPC-A.
After that, the packets are forwarded to Server-A by following the route tables in VPC-A ([1] and [2]).

If the VIP switches to Server-B, the route information is as follows:
You can see that the route information is the reverse of the previous case.
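
The switching rule described in this section can be summarized in a short sketch (ours, for illustration only; the names follow the diagram in this section and are not real AWS IDs):

```python
# Illustrative sketch (not part of EXPRESSCLUSTER): the target each route
# table should hold for the VIP, depending on which server is active.

def vip_route_targets(active_server):
    """Return the VIP target for each route table when `active_server` holds the VIP."""
    if active_server == "Server-A":
        return {
            "vpc-A-route-tables": "eni-of-Server-A",    # [1][2]: direct to the active ENI
            "vpc-B-route-tables": "tgw-B",              # [3][4]: forward to the local TGW
            "tgw-A-route-table": "tgw-attach-A",        # [A]: into VPC-A
            "tgw-B-route-table": "tgw-attach-peering",  # [B]: across the peering to tgw-A
        }
    else:  # Server-B active: the mirror image of the above
        return {
            "vpc-A-route-tables": "tgw-A",
            "vpc-B-route-tables": "eni-of-Server-B",
            "tgw-A-route-table": "tgw-attach-peering",
            "tgw-B-route-table": "tgw-attach-B",
        }
```

On failover, every entry that differs between the two dictionaries must be rewritten, which is exactly what the script introduced below does.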

1.2 Controlling Routing by EXPRESSCLUSTER

As mentioned above, by properly setting the Route Tables in VPCs and the TGW Route Tables, we can switch the destinations using VIP.

EXPRESSCLUSTER X provides AWS Virtual IP resources/monitor resources as the method of switching VIP on AWS.
However, AWS Virtual IP resources update only the route tables in the VPC where the active instance is located; the route tables in other VPCs and the TGW Route Tables are not updated.

So we run AWS CLI commands from a script to update, with the appropriate route information, the route tables that AWS Virtual IP resources do not cover.
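
As a rough sketch of what such a script does (the `aws ec2 replace-route` and `aws ec2 replace-transit-gateway-route` subcommands are real AWS CLI commands, but the helper function and all IDs here are illustrative, not the actual script used in this article):

```python
# Sketch: the kind of AWS CLI calls a routing script issues in one region
# when the VIP moves. All IDs passed in are placeholders.

def build_failover_commands(vip_cidr, region, vpc_rtb_id, vpc_target_tgw,
                            tgw_rtb_id, tgw_attach_id):
    """Build the two route updates needed in one region after a failover."""
    # Point the VPC route table's VIP entry at this region's Transit Gateway.
    vpc_route = [
        "aws", "ec2", "replace-route", "--region", region,
        "--route-table-id", vpc_rtb_id,
        "--destination-cidr-block", vip_cidr,
        "--transit-gateway-id", vpc_target_tgw,
    ]
    # Point the TGW Route Table's VIP entry at the right attachment
    # (the VPC attachment in the active region, the peering attachment elsewhere).
    tgw_route = [
        "aws", "ec2", "replace-transit-gateway-route", "--region", region,
        "--transit-gateway-route-table-id", tgw_rtb_id,
        "--destination-cidr-block", vip_cidr,
        "--transit-gateway-attachment-id", tgw_attach_id,
    ]
    return [vpc_route, tgw_route]

# A real script would hand each list to subprocess.run() and check the result.
```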

2. HA Cluster Building Procedure

Build a "multi-region HA cluster based on VIP control" using EXPRESSCLUSTER X.

2.1 Creating VPCs and Subnets

First, create VPCs, subnets, and so on. The CIDRs and subnet addresses are as follows:

Create Internet Gateways, NAT Gateways, and so on. The Network ACL settings are optional; this time the default settings are fine.
The Route Tables and Security Groups are described later.

N. Virginia Region (us-east-1)

  • VPC (VPC ID: vpc-1234abcd)
  • - CIDR: 10.0.0.0/16
  • - Subnets
  • Subnet-A1 (Public): 10.0.10.0/24
  • Subnet-A2 (Private): 10.0.110.0/24
  • - Internet Gateway
  • - NAT Gateway (created in Subnet-A1)

Singapore Region (ap-southeast-1)

  • VPC (VPC ID: vpc-5678cdef)
  • - CIDR: 10.1.0.0/16
  • - Subnets
  • Subnet-B1 (Public): 10.1.10.0/24
  • Subnet-B2 (Private): 10.1.110.0/24
  • - Internet Gateway
  • - NAT Gateway (created in Subnet-B1)

2.2 Creating Transit Gateways

This time, we use two regions, so create two Transit Gateways. Also create TGW Attachments: one connecting the Transit Gateway and the VPC in each region, and one connecting the two Transit Gateways between the regions.

  • Transit Gateway A [tgw-A]
  • - TGW Attachment (VPC Type, connecting to VPC-A) [tgw-attach-A]
  • - TGW Attachment (Peering Connection Type, connecting to Transit Gateway B) [tgw-attach-peering]
  • Transit Gateway B [tgw-B]
  • - TGW Attachment (VPC Type, connecting to VPC-B) [tgw-attach-B]
  • - TGW Attachment (Peering Connection Type, connecting to Transit Gateway A) [tgw-attach-peering]

Be careful when creating the Peering Connection type TGW Attachment.
Create one VPC type TGW Attachment in each region, but create the Peering Connection type TGW Attachment in only one of the two regions, and then accept the peering attachment request in the other region.

The process of creating a Peering Connection type TGW Attachment is as follows:

1. Create a TGW Attachment (Peering Connection type) in the N. Virginia region.
First, switch the region to N. Virginia in the AWS Management Console and create a Peering Connection type TGW Attachment.

  • Name tag: Any identifiable attachment name
  • Transit gateway ID: Transit Gateway ID in the N. Virginia Region
  • Attachment type: Peering Connection
  • Account: My account
  • Region: Asia Pacific (Singapore) (ap-southeast-1)
  • Transit gateway (accepter): Transit Gateway ID in the Singapore region

2. Accept the TGW Attachment (Peering Connection type) request in the Singapore region.
Switch the region to Singapore in the AWS Management Console and make sure that the TGW Attachment you created in the N. Virginia region is also visible in the Singapore region.

However, the state of the TGW Attachment in the Singapore region remains "Pending Acceptance", so it does not work yet.
Select the TGW Attachment, choose [Actions] - [Accept transit gateway attachment], and wait until the state becomes “Available”.

2.3 Creating Route Tables

  • N. Virginia Region (us-east-1)
Route Table for VPC-A Subnet-A1 (Public)
Destination Target Remarks
0.0.0.0/0 Internet Gateway For Internet communication
10.0.0.0/16 local For VPC internal communication
10.1.0.0/16 tgw-A For communication with VPC-B
172.16.0.1/32 tgw-A For VIP, specify tentative value as the target 
Route Table for VPC-A Subnet-A2 (Private)
Destination Target Remarks
0.0.0.0/0 NAT Gateway For Internet communication
10.0.0.0/16 local For VPC internal communication
10.1.0.0/16 tgw-A For communication with VPC-B
172.16.0.1/32 tgw-A For VIP, specify tentative value as the target 
Route Table for Transit Gateway A [tgw-A]
Destination Target Remarks
10.0.0.0/16 tgw-attach-A For communication with VPC-A
10.1.0.0/16 tgw-attach-peering For communication with VPC-B
172.16.0.1/32 tgw-attach-A For VIP, specify tentative value as the target
  • Singapore Region (ap-southeast-1)
Route Table for VPC-B Subnet-B1 (Public)
Destination Target Remarks
0.0.0.0/0 Internet Gateway For Internet communication
10.1.0.0/16 local For VPC internal communication
10.0.0.0/16 tgw-B For communication with VPC-A
172.16.0.1/32 tgw-B For VIP, specify tentative value as the target
Route Table for VPC-B Subnet-B2 (Private)
Destination Target Remarks
0.0.0.0/0 NAT Gateway For Internet communication
10.1.0.0/16 local For VPC internal communication
10.0.0.0/16 tgw-B For communication with VPC-A
172.16.0.1/32 tgw-B For VIP, specify tentative value as the target
Route Table for Transit Gateway B [tgw-B]
Destination Target Remarks
10.1.0.0/16 tgw-attach-B For communication with VPC-B
10.0.0.0/16 tgw-attach-peering For communication with VPC-A
172.16.0.1/32 tgw-attach-B For VIP, specify tentative value as the target

2.4 Creating Security Groups

Create Security Groups as needed for each VPC.
Set the Security Groups appropriately according to the policies of your system.

For the instances that may be accessed through the VIP via Transit Gateway, we also create Security Groups that allow access via Transit Gateway, as follows:

N. Virginia Region (us-east-1)

  • Security Group for allowing access via Transit Gateway
  • - Inbound rules
  • - All traffic, All Protocol, All Port range, Source: 10.1.0.0/16

Singapore Region (ap-southeast-1)

  • Security Group for allowing access via Transit Gateway
  • - Inbound rules
  • - All traffic, All Protocol, All Port range, Source: 10.0.0.0/16

2.5 Creating Instances and Checking Communication between Instances

Create instances for the HA cluster in Subnet-A2 (Private) in VPC-A and Subnet-B2 (Private) in VPC-B.
Also create an instance for the client in Subnet-A1 (Public) in VPC-A.

Attach the security groups for allowing access via Transit Gateway that you just created to each instance.

The instances for the HA cluster must disable "Source/destination check" to allow VIP connection.
From the AWS Management Console (EC2), select each instance, choose [Actions] - [Networking] - [Change source/destination check], and set it to disabled. If the firewall on the OS of each instance is enabled, disable it or configure it to allow ICMP communication.

After the instances have started, run the ping command on each instance against the private IP address of the other instance.
If the routing and Security Groups are set properly, the ping will succeed.
 (Note that a ping to the VIP cannot be confirmed yet at this point, because the VIP is not active.)

2.6 Building an HA Cluster Based on VIP Control

Build an "HA cluster based on VIP control". The EXPRESSCLUSTER configuration is as follows:
In EXPRESSCLUSTER's Failover Group, we register two group resources: AWS Virtual IP resource and Mirror disk resource.

  • EXPRESSCLUSTER
  • - Failover Group (failover)
  • AWS Virtual IP resource
  • - IP address: 172.16.0.1
  • - VPC ID: vpc-1234abcd   * VPC ID for VPC A
  • - VPC ID: vpc-5678cdef    * VPC ID for VPC B
  • Mirror disk resource (For Windows)
  • - Data Partition Drive Letter: M:\
  • - Cluster Partition Drive Letter: R:\
  • Mirror disk resource (For Linux)
  • - Data Partition Device Name: /dev/nvme1n1p2
  • - Cluster Partition Device Name: /dev/nvme1n1p1

For more information on how to build an HA cluster using AWS Virtual IP resources, refer to “HA Cluster Configuration Guide for Amazon Web Services”.

However, when setting up the server-individual parameters of the AWS Virtual IP resource, set "VPC ID" to the ID of the VPC where each instance is located.

Individual settings of Server-A

Server-A

Individual settings of Server-B

Server-B

[Reference]
Documentation - Setup Guides
  • Windows > Cloud > Amazon Web Services > EXPRESSCLUSTER X 4.3 for Windows HA Cluster Configuration Guide for Amazon Web Services
  • Linux > Cloud > Amazon Web Services > EXPRESSCLUSTER X 4.3 for Linux HA Cluster Configuration Guide for Amazon Web Services

2.7 Adding and Setting Resource for Controlling Transit Gateway Route Tables

AWS Virtual IP resources are not enough when building an HA cluster across multiple regions or VPCs, as described in “1.2 Controlling Routing by EXPRESSCLUSTER”. To make up for this, execute the TGW Routing Script (tgw.py) in the same Failover Group as the AWS Virtual IP resource.
In this blog, we call the resource that executes this script "TGW Routing Resource".

  • EXPRESSCLUSTER
  • - Failover Group (failover)
  • AWS Virtual IP resource
  • Mirror disk resource
  • TGW Routing Resource   * Added
* The TGW Routing Script (tgw.py) is provided on request. If you would like it, please contact us through the form at the end of this blog.

The AWS CLI and Python must be executable in order to run the TGW Routing Script.
If AWS Virtual IP resources have started successfully, the AWS CLI and Python are already installed.

1.  Install PyYAML.
Check if PyYAML in the Python module is installed.

  • pip show pyyaml

If it is not installed, install it.

  • pip install pyyaml

2.  Create a configuration file (tgw.conf) for the TGW Routing Script.
The file is written in YAML format, as in the example below.
This configuration file is common to all instances.

Description of the configuration file (replace each “<>” with the actual host name or AWS resource ID)

  • servers:
        <Server-A>:                  * Hostname of Server-A
            vpc: <VPC-A>             * VPC ID of Server-A

        <Server-B>:                  * Hostname of Server-B
            vpc: <VPC-B>             * VPC ID of Server-B

    gateways:
        <tgw-A>:                     * Transit Gateway ID of the N. Virginia region
            type: tgw
            region: us-east-1        * Region of Transit Gateway A
            routeTable: <Route Table ID of Transit Gateway A>
            attaches:
                <tgw-attach-A>: <VPC-A>    * TGW Attachment ID (VPC type) of VPC-A and VPC ID of VPC-A
            peerings:
                <tgw-B>: <tgw-attach-peering>    * Transit Gateway ID of the Singapore region and TGW Attachment ID (Peering Connection type)

        <tgw-B>:                     * Transit Gateway ID of the Singapore region
            type: tgw
            region: ap-southeast-1   * Region of Transit Gateway B
            routeTable: <Route Table ID of Transit Gateway B>
            attaches:
                <tgw-attach-B>: <VPC-B>    * TGW Attachment ID (VPC type) of VPC-B and VPC ID of VPC-B
            peerings:
                <tgw-A>: <tgw-attach-peering>    * Transit Gateway ID of the N. Virginia region and TGW Attachment ID (Peering Connection type)

Example of a configuration file

  • servers:
        server-a.test.local:
            vpc: vpc-1234abcd

        server-b.test.local:
            vpc: vpc-5678cdef

     gateways:
        tgw-aaaaaaaaaaaa:
            type: tgw
            region: us-east-1
            routeTable: tgw-rtb-aaaaaaaaaaaa
            attaches:
                tgw-attach-aaaaaaaaaaaa: vpc-1234abcd
            peerings:
                tgw-bbbbbbbbbbbb: tgw-attach-xxxxxxxxxxxx
     
        tgw-bbbbbbbbbbbb:
            type: tgw
            region: ap-southeast-1
            routeTable: tgw-rtb-bbbbbbbbbbbb
            attaches:
                tgw-attach-bbbbbbbbbbbb: vpc-5678cdef
            peerings:
                tgw-aaaaaaaaaaaa: tgw-attach-xxxxxxxxxxxx
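
To catch mistakes such as a missing key or a wrong VPC ID early, the loaded configuration can be sanity-checked with a short snippet like the following (our sketch, not part of the provided script; the config below is the example above expressed as the equivalent Python dict, so it does not depend on PyYAML at check time):

```python
# Sketch: validate that a loaded tgw.conf (e.g. the result of yaml.safe_load)
# has the structure the TGW Routing Script expects.

EXAMPLE_CONF = {
    "servers": {
        "server-a.test.local": {"vpc": "vpc-1234abcd"},
        "server-b.test.local": {"vpc": "vpc-5678cdef"},
    },
    "gateways": {
        "tgw-aaaaaaaaaaaa": {
            "type": "tgw",
            "region": "us-east-1",
            "routeTable": "tgw-rtb-aaaaaaaaaaaa",
            "attaches": {"tgw-attach-aaaaaaaaaaaa": "vpc-1234abcd"},
            "peerings": {"tgw-bbbbbbbbbbbb": "tgw-attach-xxxxxxxxxxxx"},
        },
        "tgw-bbbbbbbbbbbb": {
            "type": "tgw",
            "region": "ap-southeast-1",
            "routeTable": "tgw-rtb-bbbbbbbbbbbb",
            "attaches": {"tgw-attach-bbbbbbbbbbbb": "vpc-5678cdef"},
            "peerings": {"tgw-aaaaaaaaaaaa": "tgw-attach-xxxxxxxxxxxx"},
        },
    },
}

def validate_conf(conf):
    """Check the keys the script relies on; raise ValueError on a malformed config."""
    for host, server in conf.get("servers", {}).items():
        if "vpc" not in server:
            raise ValueError(f"server {host} is missing 'vpc'")
    for name, gw in conf.get("gateways", {}).items():
        for key in ("type", "region", "routeTable", "attaches", "peerings"):
            if key not in gw:
                raise ValueError(f"gateway {name} is missing '{key}'")
    # Every VPC referenced by a VPC type attachment should belong to a known server.
    server_vpcs = {s["vpc"] for s in conf["servers"].values()}
    for gw in conf["gateways"].values():
        for vpc in gw["attaches"].values():
            if vpc not in server_vpcs:
                raise ValueError(f"attachment VPC {vpc} has no matching server")
```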

3. Place the TGW Routing Script (tgw.py) in an appropriate location on all instances for the HA cluster. Make sure the path is common to all instances.
For Linux, also grant it execute permission.

4. Place the configuration file for the TGW Routing Script (tgw.conf) in an appropriate location on all instances for the HA cluster as well. It can be in a different location than tgw.py, but make sure the path is common to all instances.

5. In the Script resource (Windows) or EXEC resource (Linux), write the following contents in “Start Script”. (Leave “Stop Script” empty.)

For Windows

  • @echo off
    python -B "c:\test\tgw.py" --vip 172.16.0.1 -c "c:\test\tgw.conf" start
    exit /B %ERRORLEVEL%

For Linux

  • #! /bin/sh
    /usr/bin/tgw.py --vip 172.16.0.1 -c /etc/tgw.conf start
    exit $? 

6. In the Custom monitor resource, write the following contents.
Set the monitor timing to “Active” and specify the TGW Routing Resource as the target resource.

For Windows

  • @echo off
    python -B "c:\test\tgw.py" --vip 172.16.0.1 -c "c:\test\tgw.conf" monitor
    exit /B %ERRORLEVEL%

For Linux

  • #! /bin/sh
    /usr/bin/tgw.py --vip 172.16.0.1 -c /etc/tgw.conf monitor
    exit $?

Note

  • * Make sure that the IAM role or user used when running the AWS CLI has permissions to describe and update the VPC route tables and the Transit Gateway Route Tables in each region.
  • * Check the command paths. In addition, since the script runs as the SYSTEM user on Windows, the command paths must be set in the system environment variable PATH.
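
As a reference, an IAM policy along the following lines would cover these operations (a sketch only; the action names are standard EC2 IAM actions, but narrow the actions and resources according to your own security policy):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeRouteTables",
        "ec2:ReplaceRoute",
        "ec2:CreateRoute",
        "ec2:DeleteRoute",
        "ec2:DescribeTransitGatewayRouteTables",
        "ec2:SearchTransitGatewayRoutes",
        "ec2:ReplaceTransitGatewayRoute",
        "ec2:CreateTransitGatewayRoute",
        "ec2:DeleteTransitGatewayRoute"
      ],
      "Resource": "*"
    }
  ]
}
```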

3. Checking the Operation

Check that the client instance can access the active EC2 instance (Server-A) and the standby EC2 instance (Server-B) using the VIP address (172.16.0.1).

  • 1. Start the failover group on the active EC2 instance (Server-A).
  • 2. Access the VIP (172.16.0.1) from the client instance in the N. Virginia region and connect to the active EC2 instance (Server-A).
  • 3. From the Cluster WebUI, manually move the failover group to the standby EC2 instance (Server-B) in the Singapore region.
  • 4. Access the VIP (172.16.0.1) from the client instance in the N. Virginia region and connect to the standby EC2 instance (Server-B).
  • 5. From the Cluster WebUI, shut down the standby EC2 instance (Server-B) in the Singapore region.
  • 6. Access the VIP (172.16.0.1) from the client instance in the N. Virginia region and connect to the active EC2 instance (Server-A).
For the multi-region HA cluster based on VIP control, we confirmed that switching between the active and standby instances works.

Conclusion

This time, we tried using Transit Gateway to build an HA cluster of instances located in different regions on AWS, and confirmed that the VIP control method allows switching between active and standby instances.
If you want to place instances in different regions for disaster recovery, but want VIP access to the service, consider this configuration.

In the configuration introduced this time, the client was located in the same VPC as an instance of the HA cluster, but it is also possible to locate the client in a different VPC or region from the HA cluster instances.
We would like to introduce such a configuration in another article.

If you are considering the configuration described in this article, you can validate it with the trial module of EXPRESSCLUSTER. Please do not hesitate to contact us if you have any questions.
