Node replacement on EC2 instances with 0 downtime

Introduction

This document outlines the steps to create a TigerGraph cluster running on AWS EC2 instances backed by EBS volumes. The steps outlined here allow a node to be replaced with 0 downtime for the cluster.

Note that these steps serve as an outline for the recommended process. The specifics of creating resources in AWS may differ due to your environmental requirements. Additional work would also be needed to run this in a fully automated fashion.

These steps are for TigerGraph 3.x. You must have a VPC, subnet, and security group configured on AWS. For best results, the security group should allow traffic between all nodes in the cluster (if you have specific requirements that do not allow this, please contact Support).

The cluster that you install must be an HA cluster (replicas > 2).

Instructions

Table of contents

1. Create EC2 instances

2. Mount EBS volumes

3. Download and install TigerGraph

4. Back up additional necessary files to EBS

5. Deregister one node's GPE process from ZooKeeper and shut down GPE on that node

6. Terminate the old node and detach its disk

7. Create a new node, attach the disk from the old node, and assign the private IP from the old node

8. On the new node, mount the disk, create the TigerGraph user, and restore data

9. Start TigerGraph services

Create EC2 instances and EBS volumes, and attach the volumes

For the purposes of this guide, we will use a 3-node cluster. These steps apply to any cluster of size 3 or greater, as long as the replication factor is greater than 2.

# Fill in the values for your environment
SUBNET_ID=
IMAGE_ID=
SECURITY_GROUP_ID=
KEY_NAME=
INSTANCE_TYPE=
INSTANCE_TAG=
# Launch the 3 instances
aws ec2 run-instances --image-id $IMAGE_ID --count 3 --instance-type $INSTANCE_TYPE --key-name $KEY_NAME --subnet-id $SUBNET_ID --security-group-ids $SECURITY_GROUP_ID --associate-public-ip-address --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value="$INSTANCE_TAG"}]"
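Before creating and attaching volumes, you may want to wait until the instances are fully running. A minimal sketch using the standard AWS CLI waiter, filtering on the same $INSTANCE_TAG used above:

aws ec2 wait instance-running --filters "Name=tag-value,Values=$INSTANCE_TAG"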

SIZE=
AVAILABILITY_ZONE=
VOLUME_TAG=
for i in {1..3}; do
  aws ec2 create-volume --size $SIZE --tag-specifications "ResourceType=volume,Tags=[{Key=Name,Value="$VOLUME_TAG-$i"}]" --availability-zone $AVAILABILITY_ZONE
done

INSTANCE_IDS=$(aws ec2 describe-instances --filters "Name=tag-value,Values=$INSTANCE_TAG" --query "Reservations[].Instances[].InstanceId" --output text)
VOLUME_IDS=$(aws ec2 describe-volumes --filters "Name=tag-value,Values=$VOLUME_TAG-*" --output text --query "Volumes[].VolumeId")

# Note that modifications to this command are necessary to run it for all instances at once.
# Alternatively, you can echo the above variables and attach the volumes one by one.
aws ec2 attach-volume --volume-id REPLACE_VOLUME_ID_HERE --instance-id REPLACE_INSTANCE_ID_HERE --device /dev/sdb
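As a sketch of the all-at-once approach, the instance and volume IDs gathered above can be paired in a small loop. This assumes the ordering returned by the two describe calls is acceptable for your environment; echo the pairing first to verify before running the attach commands.

INSTANCES=($INSTANCE_IDS)
VOLUMES=($VOLUME_IDS)
for i in 0 1 2; do
  echo "Attaching ${VOLUMES[$i]} to ${INSTANCES[$i]}"
  aws ec2 attach-volume --volume-id "${VOLUMES[$i]}" --instance-id "${INSTANCES[$i]}" --device /dev/sdb
done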

Mount the volume on each node

These commands must be run on each node. Note that this could be combined with our documentation on creating an encrypted filesystem for use by TigerGraph: https://docs.tigergraph.com/tigergraph-server/current/security/encrypting-data-at-rest#_example_2_encrypting_data_on_amazon_ec2

sudo mkfs -t xfs /dev/nvme1n1
# Create the mount point before mounting
sudo mkdir /data
sudo mount /dev/nvme1n1 /data
id=$(sudo blkid | grep nvme1n1 | cut -d \" -f 2)
echo "UUID=$id  /data  xfs  defaults,nofail  0  2" | sudo tee -a /etc/fstab
sudo systemctl daemon-reload
# Test that the fstab entry works
sudo umount /data
sudo mount -a
# Change permissions; note that you may wish to `chown -R tigergraph /data` instead
sudo chmod 777 /data
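To confirm the volume is mounted where TigerGraph expects it, a quick check on each node (standard Linux tooling, nothing TigerGraph-specific):

lsblk
df -hT /data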

Download and install TigerGraph on the cluster

This should only be run on 1 node. Please note that you will also need to copy whatever key you are using for the installation to the node that will perform the installation.

If you are looking to automate this, we highly recommend creating a templated version of install_conf.json and keeping it in an S3 bucket or other such location for use during installation.

The most important point for the installation is that the root directory values should be under /data. For this document we will use /data/tigergraph. For example: "AppRoot": "/data/tigergraph/app"

Finally, note that after installing TigerGraph a graph must be created to get the GPE and GSE out of the "warmup state". To do so, you must switch to the TigerGraph user first.

mkdir /data/tigergraph
sudo yum install -y wget
wget -O /data/tigergraph-3.1.6-offline.tar.gz https://dl.tigergraph.com/enterprise-edition/tigergraph-3.1.6-offline.tar.gz?
cd /data
tar -xvf tigergraph-3.1.6-offline.tar.gz
# Update root dirs, IPs, key file, and replicas
vi /data/tigergraph-3.1.6-offline/install_conf.json
/data/tigergraph-3.1.6-offline/install.sh -n
# Switch to the tigergraph user and create a graph to bring GPE/GSE out of warmup
sudo su tigergraph
gsql "create graph test()"

Back up required files to the EBS volume

Note that there are additional ways to handle this, such as keeping these files in an S3 bucket to restore after replacing a node (a sketch of this follows the commands below). The only one of these files that is likely to change is ~/.tg.cfg.

These commands must be run on all 3 nodes. If you are just following this tutorial for testing, we will be replacing the m3 node, so at a minimum make sure they are run on m3.

These should be run as ec2-user.

mkdir /data/backup
sudo cp -pr ~/.ssh/ /data/backup
sudo cp -p ~/.tg.cfg /data/backup
sudo cp -p ~/.bashrc /data/backup
sudo cp -p /etc/security/limits.d/98-tigergraph.conf /data/backup
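If you prefer the S3 approach mentioned above, a minimal sketch is shown here; the bucket name and prefix are placeholders for your own.

# YOUR_BUCKET and the tigergraph-node-backup/m3 prefix are placeholders
aws s3 cp ~/.tg.cfg s3://YOUR_BUCKET/tigergraph-node-backup/m3/.tg.cfg
aws s3 cp ~/.bashrc s3://YOUR_BUCKET/tigergraph-node-backup/m3/.bashrc
aws s3 cp --recursive ~/.ssh s3://YOUR_BUCKET/tigergraph-node-backup/m3/.ssh
aws s3 cp /etc/security/limits.d/98-tigergraph.conf s3://YOUR_BUCKET/tigergraph-node-backup/m3/98-tigergraph.conf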

Remove GPE_1_3 from ZooKeeper and stop GPE_1#3

These steps allow 0 downtime across the cluster while the node is being replaced. Without doing these steps, health checks will have to fail for 30 seconds for ZooKeeper to mark the node as dead. Note that this is where the requirement of having a replication factor greater than 2 comes from. Without that, a partition would have no replicas to read from while the node is being replaced.

This should be run on only 1 node. It must be run as the TigerGraph user.

# Make the bundled Java available so zkCli.sh can run
export JAVA_HOME=$(dirname $(find $(gadmin config get System.AppRoot)/.syspre -name java))
export PATH=$PATH:$JAVA_HOME
# List the registered GPE runtime nodes
$(gadmin config get System.AppRoot)/zk/bin/zkCli.sh -server 127.0.0.1:19999 ls /tigergraph/dict/objects/__services/GPE/_runtime_nodes | tail -1
# example output: [GPE_1_1, GPE_1_2, GPE_1_3]
# Deregister GPE_1_3 from ZooKeeper and stop the process
$(gadmin config get System.AppRoot)/zk/bin/zkCli.sh -server 127.0.0.1:19999 deleteall /tigergraph/dict/objects/__services/GPE/_runtime_nodes/GPE_1_3
gadmin stop GPE_1#3 -y
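To verify the deregistration before terminating the node, you can re-run the ls command from above and check service status; GPE_1_3 should no longer appear in the runtime node list.

$(gadmin config get System.AppRoot)/zk/bin/zkCli.sh -server 127.0.0.1:19999 ls /tigergraph/dict/objects/__services/GPE/_runtime_nodes | tail -1
# expected output: [GPE_1_1, GPE_1_2]
gadmin status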

Terminate the old node and detach its disk

Note the private IP of the node being replaced; it is needed for the next step.

aws ec2 describe-instances --filters "Name=tag-value,Values=$INSTANCE_TAG" --query "Reservations[].Instances[].[PrivateIpAddress,InstanceId]" --output text
aws ec2 describe-volumes --filters "Name=tag-value,Values=$VOLUME_TAG-*" --output text --query "Volumes[*].Attachments[*].[VolumeId,InstanceId]"

# Fill in the values for the node being replaced (m3)
INSTANCE_ID=
VOLUME_ID=
PRIVATE_IP=
aws ec2 terminate-instances --instance-ids $INSTANCE_ID
aws ec2 detach-volume --volume-id $VOLUME_ID
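The volume cannot be attached to a new instance until it has fully detached, and the private IP cannot be reused until the old instance releases it. A sketch using the standard AWS CLI waiters:

aws ec2 wait volume-available --volume-ids $VOLUME_ID
aws ec2 wait instance-terminated --instance-ids $INSTANCE_ID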

Start a new node, attach the disk and assign the previous private IP

TigerGraph requires that the new node has the same IP as the previous one. We can handle this by assigning the old private IP to the new node.

aws ec2 run-instances --image-id $IMAGE_ID --count 1 --instance-type $INSTANCE_TYPE --key-name $KEY_NAME --subnet-id $SUBNET_ID --security-group-ids $SECURITY_GROUP_ID --associate-public-ip-address  --tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value="$INSTANCE_TAG"}]"
aws ec2 describe-instances --filters "Name=tag-value,Values=$INSTANCE_TAG" --query "Reservations[].Instances[].InstanceId" --output text
# Update the INSTANCE_ID variable to the new instance ID
INSTANCE_ID=
aws ec2 attach-volume --instance-id $INSTANCE_ID --volume-id $VOLUME_ID --device /dev/sdb
aws ec2 describe-instances --filters "Name=tag-value,Values=$INSTANCE_TAG" --query "Reservations[].Instances[].[InstanceId,NetworkInterfaces[].NetworkInterfaceId]" --output text
# Set NETWORK_INTERFACE_ID to the new instance's network interface
NETWORK_INTERFACE_ID=
aws ec2 assign-private-ip-addresses --network-interface-id $NETWORK_INTERFACE_ID --private-ip-addresses $PRIVATE_IP
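To confirm the disk attached and that the old private IP is now on the new instance's interface, something like the following can be used (standard AWS CLI calls; adjust the query to taste):

aws ec2 wait volume-in-use --volume-ids $VOLUME_ID
aws ec2 describe-network-interfaces --network-interface-ids $NETWORK_INTERFACE_ID --query "NetworkInterfaces[].PrivateIpAddresses[].PrivateIpAddress" --output text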

On the new machine, mount the disk, create the tigergraph user, and restore data not contained on the disk

ID=$(sudo blkid | grep nvme1n1 | cut -d \" -f 2)
echo "UUID=$ID  /data  xfs  defaults,nofail  0  2" | sudo tee -a /etc/fstab
sudo systemctl daemon-reload
# Create the mount point before mounting
sudo mkdir /data
sudo mount -a
sudo useradd tigergraph
sudo cp -pr /data/backup/.ssh /home/tigergraph/
sudo cp -p /data/backup/.tg.cfg /home/tigergraph/
sudo cp -p /data/backup/.bashrc /home/tigergraph/
sudo cp -p /data/backup/98-tigergraph.conf /etc/security/limits.d/
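One caveat worth checking: useradd on the new instance may not assign the same UID/GID that the tigergraph user had on the original node, in which case the restored files and the TigerGraph directories on /data would be owned by the wrong numeric IDs. A quick check, with a fix that is only needed if the IDs do not match:

id tigergraph
ls -ln /data/tigergraph
# Only needed if the UID/GID shown above do not match:
sudo chown -R tigergraph:tigergraph /home/tigergraph /data/tigergraph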

Start TigerGraph services

sudo su tigergraph
gadmin start all
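After the services start, you can confirm that the cluster is healthy again and that GPE_1_3 has re-registered itself in ZooKeeper, re-using the ls command from the deregistration step (run it on the node where JAVA_HOME was exported earlier, or export it again first):

gadmin status
$(gadmin config get System.AppRoot)/zk/bin/zkCli.sh -server 127.0.0.1:19999 ls /tigergraph/dict/objects/__services/GPE/_runtime_nodes | tail -1
# expected output once GPE_1#3 is running: [GPE_1_1, GPE_1_2, GPE_1_3]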

Outcome: We have now replaced a node in the cluster with 0 downtime. These steps can then be run again on other nodes to replace every node in the cluster.