Enable GSE compaction on 2.x/3.x

Introduction

The GSE stores index data in memory for faster access. As long as the data is never modified, the storage used is proportional to the data size. With frequent deletes or updates, however, the size can grow significantly over time. This is because the GSE metadata works in append mode: entries remain even after the underlying data is deleted.

Note: Prior to TigerGraph 3.2.0, the GSE compaction feature is OFF by default. Work with TigerGraph support when enabling or disabling this feature to make sure data integrity is maintained. Disabling GSE compaction is not as simple as updating the configuration parameters: it requires the data to be available in ARCHIVE_DIR so that the GSE can be rebuilt.

Instructions

With GSE compaction enabled, the GSE does not retain deleted entries and therefore shrinks. When vertices are deleted, the GSE’s data size is reduced, which helps reduce the size on disk. Additionally, each time the GSE is restarted, the new GSE instances start faster and consume less memory.
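The difference between append mode and compaction can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: the "<op> <vertex_id>" line format and the helper names append_log_size and compact_log are hypothetical and do not reflect the real GSE storage format or any TigerGraph tool.

```shell
#!/bin/sh
# Toy model only: each line is "<op> <vertex_id>".
# This is NOT the real GSE on-disk format.

# Append mode: deletes are recorded as extra entries, so the log only grows.
append_log_size() {
  wc -l < "$1" | tr -d ' '
}

# Compaction: keep only the ids whose most recent operation is not a delete.
compact_log() {
  awk '{ last[$2] = $1 }
       END { for (id in last) if (last[id] != "del") print "add", id }' "$1" |
    sort -k2,2n
}

log_file=$(mktemp)
printf '%s\n' "add 1" "add 2" "add 3" "del 2" "del 3" > "$log_file"

echo "entries in append-only log: $(append_log_size "$log_file")"
echo "entries after compaction:   $(compact_log "$log_file" | wc -l | tr -d ' ')"
```

In this toy log, three vertices are added and two are deleted. The append-only log holds five entries, while the compacted view holds one.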

This feature will impact Backup/Restore as well. With GSE compaction enabled, the system will dump the GSE data into a snapshot without the deleted vertices and will delete all the log data. After Restore, when starting the GSE, the system will read a snapshot without any of the deletes. Subsequently, new writes to GSE will append as before.

Enabling GSE Compaction in TigerGraph 3.x

First run the following command to start gadmin configuration:

gadmin config entry GSE.BasicConfig.Env

Add the following environment variables to the GSE section:

ENABLE_LOCAL_DB=1; ARCHIVE_DIR=<WHERE_YOU_WANT_TO_STORE_ARCHIVE_DIR>

● ENABLE_LOCAL_DB enables GSE compaction feature.

● ARCHIVE_DIR is the directory where the GSE archive data is stored. Users can pick any directory. Although the data in this directory is not used by GSE actively, it functions as a backup for redundancy. So, users should not only set up the archive directory, but also keep a backup. The data in the directory can be compressed or moved to another directory or disk as well.
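Preparing the archive directory and keeping a compressed copy of it could be scripted roughly as follows. This is a minimal sketch, not a TigerGraph tool: the function names gse_compaction_env and backup_archive_dir are hypothetical, and it assumes the script runs as the OS user that owns the TigerGraph installation so that the directory is writable by the GSE.

```shell
#!/bin/sh
# Hypothetical helpers; not part of gadmin or TigerGraph.

# Create the archive directory if needed, check that it is writable,
# and print the value to enter for GSE.BasicConfig.Env.
gse_compaction_env() {
  archive_dir=$1
  if [ -z "$archive_dir" ]; then
    echo "usage: gse_compaction_env <archive_dir>" >&2
    return 1
  fi
  mkdir -p "$archive_dir" || return 1
  if [ ! -w "$archive_dir" ]; then
    echo "error: $archive_dir is not writable" >&2
    return 1
  fi
  printf 'ENABLE_LOCAL_DB=1; ARCHIVE_DIR=%s\n' "$archive_dir"
}

# Compress a copy of the archive directory into a timestamped tarball
# and print the path of the tarball.
backup_archive_dir() {
  archive_dir=$1
  backup_dir=$2
  mkdir -p "$backup_dir" || return 1
  out="$backup_dir/gse_archive_$(date +%Y%m%d_%H%M%S).tar.gz"
  tar -czf "$out" -C "$(dirname "$archive_dir")" "$(basename "$archive_dir")" || return 1
  printf '%s\n' "$out"
}
```

For example, gse_compaction_env /data/gse_archive (an illustrative path) prints the string to paste at the gadmin prompt, and backup_archive_dir /data/gse_archive /backup/gse writes a tarball under /backup/gse. If the Env entry already contains other variables, it is probably safest to append the two settings to the existing value rather than replace it.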

Apply the settings and restart services:

gadmin config apply -y
gadmin restart -y

Enabling GSE Compaction in TigerGraph 2.x

First run the following command to start gadmin configuration:

gadmin --config runtime

Add the following environment variables to the GSE section:

ENABLE_LOCAL_DB=1 ARCHIVE_DIR=<WHERE_YOU_WANT_TO_STORE_ARCHIVE_DIR>

You can also set the configuration by modifying the yaml file directly:

● /home/tigergraph/.gsql/fab_dir/configs/runtime_config.yaml

Apply the settings and restart services:

gadmin config-apply
gadmin restart -y