ZooKeeper service start failure due to ZK#1:check_ready timeout

Problem

Starting TigerGraph services failed with the error shown below. This was encountered on version 3.0.5, but these steps should apply to later versions as well.

$ gadmin start all
[   Info] Starting EXE
[   Info] Starting CTRL
[   Info] Starting ZK ETCD DICT KAFKA ADMIN GSE NGINX GPE RESTPP KAFKASTRM-LL KAFKACONN TS3SERV GSQL TS3 IFM GUI
[  Error] Timeout (failed to start ZK#1 by grpc; Timeout(1m0s) when Waiting executable ZK#1:check_ready to finish)

For background information, we knew that disk space had been completely full at one point.

Error message

[ Error] Timeout (failed to start ZK#1 by grpc; Timeout(1m0s) when Waiting executable ZK#1:check_ready to finish)

Diagnosis

The error message indicates that the “check_ready” step failed. Searching the codebase shows that this name is defined in tutopia/common/config/resolver/zookeeper.go: checkReady = "check_ready"

Lower in the file we can see that this refers to a shell script: checkReady: "zk/bin/zk_post.sh",

Using find we can see this is located here:

$ find ~ -name zk_post.sh
/home/tigergraph/tigergraph/app/3.0.5/zk/bin/zk_post.sh

Reading this script shows a simple readiness check: it sends the stat command to the ZooKeeper port once per second and waits for a reply containing "Mode", until the timeout expires:

#!/bin/bash

zk_port=$1
# in seconds
timeout=$2

if (( $# < 2 )); then
  echo "Usage: $0 zk_port timeout"
  exit 1
fi

start=$(date +%s)
if ! which nc > /dev/null 2>&1; then
  echo 'nc not found'
  exit 2
fi
while true; do
  now=$(date +%s)
  if (( now - start > timeout )); then
    echo "Timeout to wait Zookeeper ready"
    exit 3
  fi
  if echo stat | nc 127.0.0.1 "$zk_port" | grep Mode; then
    break
  fi
  sleep 1
done
echo 'Zookeeper is ready'

So we know the issue is caused by ZooKeeper failing to accept connections. Because this occurred on TG Cloud, we can rule out firewall rules. Additionally, since the ZK server is never running (verify with ps aux | grep zk), we can assume the service is crashing on startup.
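The probe that zk_post.sh performs can also be run by hand to confirm the symptom. The following is a minimal sketch, not TigerGraph tooling: both function names are hypothetical, it assumes that nc is installed and accepts -w <seconds>, and it assumes the ZooKeeper stat command is enabled on the server, as the readiness script implies.

```shell
# Return success if a ZooKeeper `stat` reply contains a "Mode:" line,
# which is what zk_post.sh greps for.
zk_reply_has_mode() {
  printf '%s\n' "$1" | grep -q '^Mode: '
}

# Hypothetical helper: send `stat` once to the given port on localhost
# and report whether ZooKeeper answered.
zk_probe_once() {
  port=$1
  reply=$(echo stat | nc -w 2 127.0.0.1 "$port" 2>/dev/null)
  if zk_reply_has_mode "$reply"; then
    echo "ZooKeeper is answering on port $port"
  else
    echo "no ZooKeeper reply on port $port"
    return 1
  fi
}
```

For example, zk_probe_once 19999 checks the client port that appears in the ZooKeeper log later in this article. If the server is crashing on startup, the probe would be expected to report no reply.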

The Infra offers two useful flags for troubleshooting: --debug and --dry-run. The --debug flag prints additional information, such as the commands being run when the operation is performed. In this case, either its output was not useful or the flag was not used.

The --dry-run flag reveals exactly what is executed to start the binary. Note that the output is quite long:

$ gadmin start zk --dry-run
[   Info] Starting ZK
ZK#1:
ZK_LOG4J_OPTS=-Dlog4j.configuration=file:/home/tigergraph/tigergraph/data/files/zk/conf/log4j.properties.eyJGaWxlT2JqZWN0TmFtZSI6ImxvZzRqLnByb3BlcnRpZXMiLCJUaW1lU3RhbXAiOjE2NDUyMTYzMDQxMjU0NDYyNjAsIkNvdW50ZXIiOjE0NSwiTWV0YVZlcnNpb24iOiJ2MSJ9 ZK_SERVER_HEAP=4096 TG_TOKEN=oV3FBbH8rf3WDnzNZB0fdlFviRsNFCet PATH=/home/tigergraph/tigergraph/app/3.0.5/.syspre/bin:/home/tigergraph/tigergraph/app/3.0.5/.syspre/usr/sbin:/home/tigergraph/tigergraph/app/3.0.5/.syspre/sbin:/home/tigergraph/tigergraph/app/3.0.5/.syspre/usr/bin:/home/tigergraph/tigergraph/app/3.0.5/.syspre/usr/lib/jvm/java-openjdk/bin:/home/tigergraph/tigergraph/app/3.0.5/cmd:/home/tigergraph/tigergraph/app/cmd:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games LC_ALL=en_US.UTF-8 LANG=en_US.UTF-8 JVMFLAGS=-Djute.maxbuffer=8388608 ZOO_LOG_DIR=/home/tigergraph/tigergraph/log/zk TG_PROC_TAG=eyJFeGVjdXRhYmxlU3RhdHVzIjp7IkV4ZWN1dGFibGVJZCI6IlpLIzEiLCJQaWQiOi0xLCJTdGFydFRzIjoiMCIsIkVuZFRzIjoiMCIsIlN0YXRlIjoiSW5pdCIsIkV4aXRDb2RlIjotM30sIlN0YXJ0RXhlY3V0YWJsZVJlcXVlc3QiOnsiRXhlY3V0YWJsZUlkIjoiWksjMSIsIkV4ZWN1dGFibGVQYXRoIjoiL2hvbWUvdGlnZXJncmFwaC90aWdlcmdyYXBoL2FwcC8zLjAuNS96ay9iaW4vemtTZXJ2ZXIuc2giLCJXb3JrRGlyIjoiIiwiRW52IjpbIkpWTUZMQUdTPS1EanV0ZS5tYXhidWZmZXI9ODM4ODYwOCIsIlpPT19MT0dfRElSPS9ob21lL3RpZ2VyZ3JhcGgvdGlnZXJncmFwaC9sb2cvemsiLCJaS19MT0c0Sl9PUFRTPS1EbG9nNGouY29uZmlndXJhdGlvbj1maWxlOkB6ay9jb25mL2xvZzRqLnByb3BlcnRpZXMiLCJaS19TRVJWRVJfSEVBUD00MDk2Il0sIkFyZ3MiOlsic3RhcnQtZm9yZWdyb3VuZCIsIkB6ay9jb25mL3pvby5jZmciXSwiQ29uZmlnRmlsZU9iamVjdE1ldGFzIjpbeyJGaWxlT2JqZWN0S2V5IjoidGcuY2ZnIiwiRmlsZU9iamVjdFR5cGUiOiJWZXJzaW9uZWRGaWxlIn0seyJGaWxlT2JqZWN0S2V5IjoiemsvY29uZi96b28uY2ZnIiwiRmlsZU9iamVjdFR5cGUiOiJWZXJzaW9uZWRGaWxlIn0seyJGaWxlT2JqZWN0S2V5IjoiemsvY29uZi9sb2c0ai5wcm9wZXJ0aWVzIiwiRmlsZU9iamVjdFR5cGUiOiJWZXJzaW9uZWRGaWxlIn1dLCJTdGRvdXRMb2ciOnsiTG9nUGF0aCI6Ii9ob21lL3RpZ2VyZ3JhcGgvdGlnZXJncmFwaC9sb2cvemsvWksjMS5vdXQiLCJMb2dDb25maWciOnsiTG9nRmlsZU1heFNpemVNQiI6MTAwLCJMb2dGaWxlTWF4RHVyYXRpb2
5EYXkiOjAsIkxvZ1JvdGF0aW9uRmlsZU51bWJlciI6MTAwLCJMb2dMZXZlbCI6IklORk8ifX0sIlN0ZGVyckxvZyI6eyJMb2dQYXRoIjoiL2hvbWUvdGlnZXJncmFwaC90aWdlcmdyYXBoL2xvZy96ay9aSyMxLm91dCIsIkxvZ0NvbmZpZyI6eyJMb2dGaWxlTWF4U2l6ZU1CIjoxMDAsIkxvZ0ZpbGVNYXhEdXJhdGlvbkRheSI6MCwiTG9nUm90YXRpb25GaWxlTnVtYmVyIjoxMDAsIkxvZ0xldmVsIjoiSU5GTyJ9fSwiU3BhbklkIjoiW2ludm9rZXJdQDE2NDUyMTcxNTc4NzU2NDY0OTg6c2VydmljZS1zdGFydCIsIkF1dG9SZXN0YXJ0IjpmYWxzZSwiV2FpdERlYWRUaW1lb3V0TVMiOiIwIiwiUHJlQWN0aW9ucyI6W3siRXhlY3V0YWJsZUlkIjoiWksjMTpjcmVhdGVfZGF0YV9mb2xkZXIiLCJFeGVjdXRhYmxlUGF0aCI6Ii9ob21lL3RpZ2VyZ3JhcGgvdGlnZXJncmFwaC9hcHAvMy4wLjUvemsvYmluL3prX2NyZWF0ZV9kYXRhX2ZvbGRlci5zaCIsIldvcmtEaXIiOiIiLCJFbnYiOltdLCJBcmdzIjpbIi9ob21lL3RpZ2VyZ3JhcGgvdGlnZXJncmFwaC9kYXRhL3prL3ZlcnNpb24tMiJdLCJDb25maWdGaWxlT2JqZWN0TWV0YXMiOltdLCJTdGRvdXRMb2ciOm51bGwsIlN0ZGVyckxvZyI6bnVsbCwiU3BhbklkIjoiW2ludm9rZXJdQDE2NDUyMTcxNTc4NzU2NDY0OTg6c2VydmljZS1zdGFydCIsIkF1dG9SZXN0YXJ0IjpmYWxzZSwiV2FpdERlYWRUaW1lb3V0TVMiOiItMSIsIlByZUFjdGlvbnMiOltdLCJQb3N0QWN0aW9ucyI6W10sIkRyeVJ1biI6ZmFsc2V9XSwiUG9zdEFjdGlvbnMiOlt7IkV4ZWN1dGFibGVJZCI6IlpLIzE6Y2hlY2tfcmVhZHkiLCJFeGVjdXRhYmxlUGF0aCI6Ii9ob21lL3RpZ2VyZ3JhcGgvdGlnZXJncmFwaC9hcHAvMy4wLjUvemsvYmluL3prX3Bvc3Quc2giLCJXb3JrRGlyIjoiIiwiRW52IjpbXSwiQXJncyI6WyIxOTk5OSIsIjYwIl0sIkNvbmZpZ0ZpbGVPYmplY3RNZXRhcyI6W10sIlN0ZG91dExvZyI6bnVsbCwiU3RkZXJyTG9nIjpudWxsLCJTcGFuSWQiOiJbaW52b2tlcl1AMTY0NTIxNzE1Nzg3NTY0NjQ5ODpzZXJ2aWNlLXN0YXJ0IiwiQXV0b1Jlc3RhcnQiOmZhbHNlLCJXYWl0RGVhZFRpbWVvdXRNUyI6IjYwMDAwIiwiUHJlQWN0aW9ucyI6W10sIlBvc3RBY3Rpb25zIjpbXSwiRHJ5UnVuIjpmYWxzZX1dLCJEcnlSdW4iOnRydWV9LCJSZXNvbHZlZENtZCI6eyJQYXRoIjoiL2hvbWUvdGlnZXJncmFwaC90aWdlcmdyYXBoL2FwcC8zLjAuNS96ay9iaW4vemtTZXJ2ZXIuc2giLCJXb3JrRGlyIjoiIiwiQXJncyI6WyJzdGFydC1mb3JlZ3JvdW5kIiwiL2hvbWUvdGlnZXJncmFwaC90aWdlcmdyYXBoL2RhdGEvZmlsZXMvemsvY29uZi96b28uY2ZnLmV5SkdhV3hsVDJKcVpXTjBUbUZ0WlNJNklucHZieTVqWm1jaUxDSlVhVzFsVTNSaGJYQWlPakUyTkRVeU1UWXpNRFF4TkRVNU1EUXhNamNzSWtOdmRXNTBaWElpT2pVek15d2lUV1YwWVZabGNuTnBiMjRpT2lKMk1TSjkiXSwiRW52IjpbIkpWTUZMQUdTPS1EanV0ZS5tYXhidW
ZmZXI9ODM4ODYwOCIsIlpPT19MT0dfRElSPS9ob21lL3RpZ2VyZ3JhcGgvdGlnZXJncmFwaC9sb2cvemsiLCJaS19MT0c0Sl9PUFRTPS1EbG9nNGouY29uZmlndXJhdGlvbj1maWxlOi9ob21lL3RpZ2VyZ3JhcGgvdGlnZXJncmFwaC9kYXRhL2ZpbGVzL3prL2NvbmYvbG9nNGoucHJvcGVydGllcy5leUpHYVd4bFQySnFaV04wVG1GdFpTSTZJbXh2WnpScUxuQnliM0JsY25ScFpYTWlMQ0pVYVcxbFUzUmhiWEFpT2pFMk5EVXlNVFl6TURReE1qVTBORFl5TmpBc0lrTnZkVzUwWlhJaU9qRTBOU3dpVFdWMFlWWmxjbk5wYjI0aU9pSjJNU0o5IiwiWktfU0VSVkVSX0hFQVA9NDA5NiJdfSwiQXV0b1Jlc3RhcnQiOmZhbHNlLCJWZXJzaW9uIjoidjEiLCJDcmVhdGVkVGltZXN0YW1wTlMiOjE2NDUyMTcxNTc4ODA4Njc3NzF9 /home/tigergraph/tigergraph/app/3.0.5/zk/bin/zkServer.sh start-foreground /home/tigergraph/tigergraph/data/files/zk/conf/zoo.cfg.eyJGaWxlT2JqZWN0TmFtZSI6Inpvby5jZmciLCJUaW1lU3RhbXAiOjE2NDUyMTYzMDQxNDU5MDQxMjcsIkNvdW50ZXIiOjUzMywiTWV0YVZlcnNpb24iOiJ2MSJ9
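The long suffixes in the dry-run output (after log4j.properties. and zoo.cfg., and the value of TG_PROC_TAG) appear to be base64-encoded JSON metadata, and decoding them can make the output easier to read. The sketch below decodes the suffix of the zoo.cfg path shown above. decode_suffix is a hypothetical helper, and it assumes a base64 utility that accepts -d (as in GNU coreutils), which may differ on other platforms.

```shell
# Hypothetical helper: decode the base64 text that follows the last "."
# in a versioned config file path and print it on its own line.
decode_suffix() {
  suffix=${1##*.}
  printf '%s' "$suffix" | base64 -d
  echo
}

# Path copied from the dry-run output above.
decode_suffix "/home/tigergraph/tigergraph/data/files/zk/conf/zoo.cfg.eyJGaWxlT2JqZWN0TmFtZSI6Inpvby5jZmciLCJUaW1lU3RhbXAiOjE2NDUyMTYzMDQxNDU5MDQxMjcsIkNvdW50ZXIiOjUzMywiTWV0YVZlcnNpb24iOiJ2MSJ9"
```

The decoded JSON names the file object (zoo.cfg) together with a timestamp, a counter, and a metadata version, so the suffix only identifies which generated copy of the config file is in use.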

Using the command at the end of the dry-run output, the binary can be started directly while tailing the ZooKeeper logs. This revealed the following log messages and stack trace:

2022-02-18 20:49:17,666 [myid:] - INFO  [main:QuorumPeerConfig@135] - Reading configuration from: /home/tigergraph/tigergraph/data/files/zk/conf/zoo.cfg.eyJGaWxlT2JqZWN0TmFtZSI6Inpvby5jZmciLCJUaW1lU3RhbXAiOjE2NDUyMTYzMDQxNDU5MDQxMjcsIkNvdW50ZXIiOjUzMywiTWV0YVZlcnNpb24iOiJ2MSJ9
2022-02-18 20:49:17,674 [myid:] - INFO  [main:QuorumPeerConfig@387] - clientPortAddress is 0.0.0.0:19999
2022-02-18 20:49:17,675 [myid:] - INFO  [main:QuorumPeerConfig@391] - secureClientPort is not set
2022-02-18 20:49:17,679 [myid:1] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 10
2022-02-18 20:49:17,679 [myid:1] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 1
2022-02-18 20:49:17,681 [myid:1] - WARN  [main:QuorumPeerMain@125] - Either no config or no quorum defined in config, running  in standalone mode
2022-02-18 20:49:17,681 [myid:1] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2022-02-18 20:49:17,685 [myid:1] - INFO  [PurgeTask:FileTxnSnapLog@115] - zookeeper.snapshot.trust.empty : false
2022-02-18 20:49:17,686 [myid:1] - INFO  [main:ManagedUtil@45] - Log4j 1.2 jmx support found and enabled.
2022-02-18 20:49:17,697 [myid:1] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2022-02-18 20:49:17,698 [myid:1] - INFO  [main:QuorumPeerConfig@135] - Reading configuration from: /home/tigergraph/tigergraph/data/files/zk/conf/zoo.cfg.eyJGaWxlT2JqZWN0TmFtZSI6Inpvby5jZmciLCJUaW1lU3RhbXAiOjE2NDUyMTYzMDQxNDU5MDQxMjcsIkNvdW50ZXIiOjUzMywiTWV0YVZlcnNpb24iOiJ2MSJ9
2022-02-18 20:49:17,699 [myid:1] - INFO  [main:QuorumPeerConfig@387] - clientPortAddress is 0.0.0.0:19999
2022-02-18 20:49:17,699 [myid:1] - INFO  [main:QuorumPeerConfig@391] - secureClientPort is not set
2022-02-18 20:49:17,699 [myid:1] - INFO  [main:ZooKeeperServerMain@117] - Starting server
2022-02-18 20:49:17,699 [myid:1] - INFO  [main:FileTxnSnapLog@115] - zookeeper.snapshot.trust.empty : false
2022-02-18 20:49:17,706 [myid:1] - INFO  [main:Environment@109] - Server environment:zookeeper.version=3.5.8-f439ca583e70862c3068a1f2a7d4d068eec33315, built on 05/04/2020 15:07 GMT
2022-02-18 20:49:17,707 [myid:1] - INFO  [main:Environment@109] - Server environment:host.name=ip-10-41-18-239.us-west-1.compute.internal
2022-02-18 20:49:17,707 [myid:1] - INFO  [main:Environment@109] - Server environment:java.version=1.8.0_171
2022-02-18 20:49:17,707 [myid:1] - INFO  [main:Environment@109] - Server environment:java.vendor=Oracle Corporation
2022-02-18 20:49:17,708 [myid:1] - INFO  [main:Environment@109] - Server environment:java.home=/home/tigergraph/tigergraph/app/3.0.5/.syspre/usr/lib/jvm/java-8-openjdk-amd64-1.8.0.171/jre
2022-02-18 20:49:17,708 [myid:1] - INFO  [main:Environment@109] - Server environment:java.class.path=/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../zookeeper-server/target/classes:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../build/classes:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../zookeeper-server/target/lib/*.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../build/lib/*.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/zookeeper-jute-3.5.8.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/zookeeper-3.5.8.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/slf4j-log4j12-1.7.25.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/slf4j-api-1.7.25.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-transport-native-unix-common-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-transport-native-epoll-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-transport-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-resolver-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-handler-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-common-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-codec-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/netty-buffer-4.1.48.Final.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/log4j-1.2.17.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/json-simple-1.1.1.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jline-2.11.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jetty-util-9.4.24.v20191120.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jetty-servlet-9.4.24.v20191120.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jetty-server-9.4.24.v20191120.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jetty-security-9.4.24.v20191120.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jetty-
io-9.4.24.v20191120.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jetty-http-9.4.24.v20191120.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/javax.servlet-api-3.1.0.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jackson-databind-2.10.3.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jackson-core-2.10.3.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/jackson-annotations-2.10.3.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/commons-cli-1.2.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../lib/audience-annotations-0.5.0.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../zookeeper-*.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../zookeeper-server/src/main/resources/lib/*.jar:/home/tigergraph/tigergraph/app/3.0.5/zk/bin/../conf:
2022-02-18 20:49:17,708 [myid:1] - INFO  [main:Environment@109] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2022-02-18 20:49:17,708 [myid:1] - INFO  [main:Environment@109] - Server environment:java.io.tmpdir=/tmp
2022-02-18 20:49:17,709 [myid:1] - INFO  [main:Environment@109] - Server environment:java.compiler=<NA>
2022-02-18 20:49:17,709 [myid:1] - INFO  [main:Environment@109] - Server environment:os.name=Linux
2022-02-18 20:49:17,709 [myid:1] - INFO  [main:Environment@109] - Server environment:os.arch=amd64
2022-02-18 20:49:17,709 [myid:1] - INFO  [main:Environment@109] - Server environment:os.version=5.4.0-1065-aws
2022-02-18 20:49:17,709 [myid:1] - INFO  [main:Environment@109] - Server environment:user.name=tigergraph
2022-02-18 20:49:17,709 [myid:1] - INFO  [main:Environment@109] - Server environment:user.home=/home/tigergraph
2022-02-18 20:49:17,710 [myid:1] - INFO  [main:Environment@109] - Server environment:user.dir=/home/tigergraph/tigergraph/app/3.0.5/zk/bin
2022-02-18 20:49:17,710 [myid:1] - INFO  [main:Environment@109] - Server environment:os.memory.free=102MB
2022-02-18 20:49:17,710 [myid:1] - INFO  [main:Environment@109] - Server environment:os.memory.max=3641MB
2022-02-18 20:49:17,710 [myid:1] - INFO  [main:Environment@109] - Server environment:os.memory.total=113MB
2022-02-18 20:49:17,712 [myid:1] - INFO  [main:ZooKeeperServer@938] - minSessionTimeout set to 4000
2022-02-18 20:49:17,712 [myid:1] - INFO  [main:ZooKeeperServer@947] - maxSessionTimeout set to 1000000
2022-02-18 20:49:17,712 [myid:1] - INFO  [main:ZooKeeperServer@166] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 1000000 datadir /home/tigergraph/tigergraph/data/zk/version-2 snapdir /home/tigergraph/tigergraph/data/zk/version-2
2022-02-18 20:49:17,721 [myid:1] - INFO  [main:ServerCnxnFactory@135] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2022-02-18 20:49:17,723 [myid:1] - INFO  [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 1 selector thread(s), 8 worker threads, and 64 kB direct buffers.
2022-02-18 20:49:17,727 [myid:1] - INFO  [main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:19999
2022-02-18 20:49:17,740 [myid:1] - INFO  [main:ZKDatabase@117] - zookeeper.snapshotSizeFactor = 0.33
2022-02-18 20:49:17,742 [myid:1] - INFO  [main:FileSnap@83] - Reading snapshot /home/tigergraph/tigergraph/data/zk/version-2/snapshot.1922d
2022-02-18 20:49:17,774 [myid:1] - ERROR [main:Util@211] - Last transaction was partial.
2022-02-18 20:49:17,850 [myid:1] - ERROR [main:ZooKeeperServerMain@83] - Unexpected exception, exiting abnormally
java.io.IOException: Unreasonable length = -195744000
        at org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:146)
        at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:111)
        at org.apache.zookeeper.server.persistence.Util.readTxnBytes(Util.java:205)
        at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:684)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:294)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.lambda$restore$0(FileTxnSnapLog.java:229)
        at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:253)
        at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
        at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:290)
        at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:450)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:764)
        at org.apache.zookeeper.server.ServerCnxnFactory.startup(ServerCnxnFactory.java:98)
        at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:144)
        at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:106)
        at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:64)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:128)
        at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
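Because the startup log is dominated by INFO lines, it can help to print it only from the first ERROR-level line onward. This is a minimal sketch under stated assumptions: zk_errors_onward is a hypothetical helper, and it assumes the log4j line layout seen above, where the level appears as " - ERROR ".

```shell
# Hypothetical helper: print a ZooKeeper log from the first ERROR-level
# line to the end, so the exception is not buried in startup INFO lines.
zk_errors_onward() {
  awk '/ - ERROR /{found=1} found' "$1"
}
```

Pass it the path of the ZooKeeper log file; the dry-run output above sets ZOO_LOG_DIR to /home/tigergraph/tigergraph/log/zk. Applied to the log above, the output would begin at the "Last transaction was partial." line and include the java.io.IOException and the stack trace.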

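The stack trace shows ZooKeeper failing while replaying a transaction log (FileTxnLog) after reading the snapshot, which points to a problem with the ZooKeeper snapshot or log files. Using find, I was able to locate these files. Generically, the path will be: $(gadmin config get System.DataRoot)/zk/version-2/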

Given that the disk space had run out, it seemed likely that one of these files had been corrupted.

Workaround

Find and delete the call queries

Solution

Remove the most recent transaction log file (ZooKeeper names these log.<zxid>) from the ZooKeeper directory discussed above. After doing this, the services started successfully.
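Rather than deleting the file outright, it may be safer to move it aside so that it can be restored if needed. The sketch below is an illustration under stated assumptions, not a TigerGraph procedure: quarantine_latest_txn_log and the backup directory are hypothetical, it assumes ZooKeeper's usual log.<zxid> naming for transaction logs in version-2, and it picks the newest file by modification time. It should only be run while ZooKeeper is stopped.

```shell
# Hypothetical helper: move the most recently modified log.* file from a
# ZooKeeper version-2 directory into a backup directory and print the
# original path of the file that was moved.
quarantine_latest_txn_log() {
  data_dir=$1
  backup_dir=$2
  # ls -t sorts newest first; assumes file names contain no newlines.
  latest=$(ls -td "$data_dir"/log.* 2>/dev/null | head -n 1)
  if [ -z "$latest" ]; then
    echo "no transaction log found in $data_dir" >&2
    return 1
  fi
  mkdir -p "$backup_dir" && mv "$latest" "$backup_dir"/ && echo "$latest"
}

# Example (the backup path is an arbitrary choice):
# quarantine_latest_txn_log "$(gadmin config get System.DataRoot)/zk/version-2" /tmp/zk-txnlog-backup
```

Moving only the newest file leaves the snapshot and older logs in place, which matches the fix described above. Any transactions recorded only in the removed file are lost.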