
Availability

Infrastructure

How to check if Infra is working as expected?
How to monitor and setup alerts? Other debugging tools?
Solutions to common problems and next steps

Backbone services

Database

DB monitoring, alerting and debugging guidelines

Kafka

Kafka migration documentation

Overview

This guide describes how to migrate an older Apache Kafka deployment (v2.3.0) to v3.6.0. The most significant change in the new deployment is the adoption of KRaft as the controller, which removes the previous dependence on ZooKeeper.

Steps

The first step is to stop receiving the requests from nginx-ingress.
Make sure the consumers have consumed all the Kafka data, i.e. the consumer lag is zero.
Next, take a backup of the old Kafka volume snapshots.
Now, deploy the latest version of Kafka using the below commands.

git clone https://github.com/egovernments/DIGIT-DevOps.git
cd DIGIT-DevOps
git checkout unified-env
cd deploy-as-code/deployer
export KUBECONFIG=<path_to_your_kubeconfig>
kubectl config current-context
go run main.go deploy -e <env_file_name> kafka-kraft
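
Before cutting over, the "all data is consumed" check from the steps above can be approximated by summing the consumer lag on the old cluster. The following is a minimal sketch, assuming the kafka-consumer-groups.sh tool that ships with Kafka is available in the old broker pod and that its --describe output contains a LAG column. The function name total_lag is hypothetical.

```shell
#!/bin/bash
# total_lag: reads `kafka-consumer-groups.sh --describe` output on stdin
# and prints the sum of the LAG column. Rows whose lag is "-" are skipped.
total_lag() {
    awk '
        # Header row: remember which column is named LAG.
        { for (i = 1; i <= NF; i++) if ($i == "LAG") { col = i; next } }
        # Data row: add the lag when it is a plain number.
        col && $col ~ /^[0-9]+$/ { sum += $col }
        END { print sum + 0 }
    '
}

# Example (placeholders must be replaced; run where the Kafka CLI tools exist):
# kafka-consumer-groups.sh --bootstrap-server <old_kafka_broker>:9092 \
#     --describe --group <consumer_group> | total_lag
```

A result of 0 for every consumer group suggests that nothing is left unconsumed on the old cluster.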

If you want to customize the values of the new Kafka helm chart (e.g. storage_size, namespace), the chart is available at the path below.

https://github.com/egovernments/DIGIT-DevOps/tree/unified-env/deploy-as-code/helm/charts/backbone-services/kafka-kraft

Once the kafka-kraft helm chart is deployed and all the Kafka pods are running, set the kafka-brokers value in egov-configmap to "release-name-kafka-controller-headless.kafka-kraft:9092".

https://github.com/egovernments/DIGIT-DevOps/blob/unified-env/deploy-as-code/helm/environments/egov-demo.yaml#L27

After updating the kafka-brokers value in the configmap, restart all the pods that use Kafka so that they pick up the new value.
The last step is to start the nginx-ingress service to again receive the requests.
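
The restart step above can be sketched as follows. This is a blunt approach, assuming it is acceptable to restart every Deployment in the affected namespaces rather than only the ones that use Kafka. The function name restart_namespaces is hypothetical, and the egov namespace in the example is an assumption based on the configmap namespace used later in this guide.

```shell
#!/bin/bash
# restart_namespaces: triggers a rolling restart of every Deployment
# in each namespace passed as an argument.
restart_namespaces() {
    local ns
    for ns in "$@"; do
        echo "Restarting deployments in namespace: ${ns}"
        # Without a resource name, kubectl restarts all Deployments in the namespace.
        kubectl rollout restart deployment -n "${ns}" || return 1
    done
}

# Example:
# restart_namespaces egov
```

StatefulSets are not covered by this sketch; they would need a separate kubectl rollout restart statefulset call.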

Kafka Connect

Upgrading the Kafka Connect docker image to add an additional connector

Overview

This page provides the steps to follow for upgrading Kafka Connect.

Steps

The base image (confluentinc/cp-kafka-connect) includes the Confluent Platform with Kafka Connect pre-installed, offering a robust foundation for building, deploying, and managing connectors in a distributed environment.
To extend the functionality of the base image, add connectors such as elasticsearch-sink-connector and build a new docker image.
Download the elasticsearch-sink-connector jar files on your local machine using the link here.
Create a Dockerfile based on the below sample code.

FROM confluentinc/cp-kafka-connect:latest
RUN mkdir /usr/share/java/kafka-connect-elasticsearch
COPY confluentinc-kafka-connect-elasticsearch-<version>/lib  /usr/share/java/kafka-connect-elasticsearch
COPY confluentinc-kafka-connect-elasticsearch-<version>/etc  /etc/kafka-connect-elasticsearch

Run the below command to build the docker image.

docker build -t cp-kafka-connect-image:<version_tag> .

Run the below command to rename the docker image.

docker tag cp-kafka-connect-image:<version_tag> egovio/cp-kafka-connect:<version_tag>

Push the image to the dockerhub using the below command.

docker push egovio/cp-kafka-connect:<version_tag>

Replace the image tag in the kafka-connect helm chart values.yaml and redeploy kafka-connect.
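
After the redeploy, it can be useful to confirm that the new connector was actually picked up. The sketch below checks the response of Kafka Connect's GET /connector-plugins REST endpoint for a connector class. It assumes the REST API is reachable on port 8083 (the Kafka Connect default) and that the Elasticsearch sink registers as io.confluent.connect.elasticsearch.ElasticsearchSinkConnector. The function name has_connector_plugin is hypothetical.

```shell
#!/bin/bash
# has_connector_plugin: reads the JSON body of GET /connector-plugins on stdin
# and succeeds if the given connector class name appears in it.
has_connector_plugin() {
    local class="$1"
    grep -qF "\"${class}\""
}

# Example (replace the placeholder host):
# curl -s http://<kafka_connect_host>:8083/connector-plugins \
#     | has_connector_plugin io.confluent.connect.elasticsearch.ElasticsearchSinkConnector \
#     && echo "Elasticsearch sink connector is installed"
```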

Elastic search

ElasticSearch Direct Upgrade

Overview

Unlike rolling upgrades, direct upgrades involve migrating from an older version to a newer one in a single coordinated operation.

This comprehensive guide outlines the step-by-step process for deploying an Elasticsearch 8.11.3 cluster with enhanced security features. The document not only covers the initial deployment of the cluster but also includes instructions for seamlessly migrating data from an existing Elasticsearch cluster to the new one, allowing for a direct upgrade.

Steps

Clone the DIGIT-DevOps repo and check out the digit-lts-go branch.

git clone https://github.com/egovernments/DIGIT-DevOps.git
cd DIGIT-DevOps
git checkout digit-lts-go
code .

If you want to change the elasticsearch cluster configuration (for example, the namespace), the helm charts for elasticsearch are at the paths provided below. Security is enabled for elasticsearch in these charts. To disable it, set the environment variable xpack.security.enabled to false in the helm chart's statefulset template.

cd deploy-as-code/helm/backbone-services/elasticsearch-master
cd deploy-as-code/helm/backbone-services/elasticsearch-data

Deploy the Elastic Search Cluster using the below commands.

cd deploy-as-code/deployer
export KUBECONFIG=<path_to_kubeconfig>
kubectl config current-context
go run main.go deploy -e <env_file_name> elasticsearch-master
go run main.go deploy -e <env_file_name> elasticsearch-data

Check the pods status using the below command.

kubectl get pods -n <elasticsearch_namespace>
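
Rather than reading the pod list by eye, the check can be scripted. A minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, ...). The function name not_running_pods is hypothetical.

```shell
#!/bin/bash
# not_running_pods: reads `kubectl get pods --no-headers` output on stdin and
# prints the name of every pod that is not Running or not fully ready.
# The exit status is 0 only when no such pod is found.
not_running_pods() {
    awk '
        { split($2, ready, "/") }                       # READY column, e.g. 1/1
        $3 != "Running" || ready[1] != ready[2] { print $1; bad = 1 }
        END { exit bad }
    '
}

# Example:
# kubectl get pods -n <elasticsearch_namespace> --no-headers | not_running_pods \
#     && echo "All pods are running and ready"
```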

Once all pods are running, execute the below commands inside the playground pod to dump data from the old elasticsearch cluster and restore it to the new elasticsearch cluster.

#!/bin/bash
# Copy this script, replace the Elasticsearch URLs and authentication credentials, and save it on your local machine as es-dump.sh

# Elasticsearch cluster information
ELASTICSEARCH_OLD_URL="<old_elasticsearch_url>"
ELASTICSEARCH_NEW_URL="<new_elasticsearch_url>"

# Authentication credentials
USERNAME="elastic"
PASSWORD="<es_pwd>"

#  Replace Elasticsearch cluster URL in elasticsearch_url 
ELASTICSEARCH_URL="http://elasticsearch-data-v1.es-cluster:9200"
# Pattern of indices to exclude from the dump
EXCLUDE_INDEX_PATTERN="jaeger|monitor|kibana|fluentbit"
# Provide backup directory
BACKUP_DIR="backup"
# File in which the list of dumped indices is saved
INDICES_OUTPUT="elasticsearch-indexes.txt"

# Collect the names of all indices that do not match the exclude pattern
mapfile -t INDICES < <(curl -s "http://${ELASTICSEARCH_OLD_URL}/_cat/indices" | grep -v -E "(${EXCLUDE_INDEX_PATTERN})" | awk '{print $3}')

printf "%s\n" "${INDICES[@]}" > "$INDICES_OUTPUT"

# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"

# Loop through each index and perform export
for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_mapping_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=http://${ELASTICSEARCH_OLD_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=mapping"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} mapping completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_data_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=http://${ELASTICSEARCH_OLD_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=data"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_mapping_backup.json"
    
    # Process the mapping file to remove unsupported parameters

    PROCESSED_FILE="${BACKUP_DIR}/${INDEX}_mapping_processed.json"

    jq 'del(.mappings._default_, .mappings._meta, .mappings.dynamic_templates, .mappings.dynamic, .mappings.general) | .mappings = .mappings["_doc"]' "${OUTPUT_FILE}" > "${PROCESSED_FILE}"

    # Print the contents of the processed file for debugging

    echo "Contents of ${PROCESSED_FILE}:"

    cat "${PROCESSED_FILE}"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${PROCESSED_FILE} \
        --output=https://${USERNAME}:${PASSWORD}@${ELASTICSEARCH_NEW_URL}/${INDEX} \
        --type=mapping"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Restoring of index ${INDEX} mapping completed successfully."
    else
        echo "Error Restoring index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_data_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${OUTPUT_FILE} \
        --output=https://${USERNAME}:${PASSWORD}@${ELASTICSEARCH_NEW_URL}/${INDEX} \
        --type=data"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Restoring of index ${INDEX} data completed successfully."
    else
        echo "Error Restoring index ${INDEX}."
    fi
done

kubectl get pods -n playground
kubectl cp <path_to_script_in_your_machine>/es-dump.sh playground/<playground_pod_name>:<path_in_playground_pod>/es-dump.sh

# Open a shell in the playground pod
kubectl exec -it <playground_pod_name> -n playground -- bash

# Inside the pod, make the script executable and run it to dump and restore the data
cd <path_to_script_inside_playground_pod>
chmod +x es-dump.sh
./es-dump.sh

Using the above script, you can take the data dump from the old cluster and restore it in the new elasticsearch in a single command.
After restoring the data successfully in the new elasticsearch cluster, check the cluster health and document count using the below command.

# Enter into elasticsearch pod
kubectl exec -it <elasticsearch_data_pod_name> -n <elasticsearch_namespace> -- bash

# To check cluster health 
curl -X GET "<elasticsearch_url>:9200/_cat/health?v=true&pretty"

# To check documents count and indices status
curl <new_elasticsearch_url>:9200/_cat/indices?v
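
The document counts of the two clusters can also be compared mechanically. The sketch below assumes you have saved the output of _cat/indices?h=index,docs.count from each cluster into a text file (one "index count" pair per line). The function name compare_doc_counts and the file names are hypothetical.

```shell
#!/bin/bash
# compare_doc_counts: compares two "index docs.count" listings and prints
# every index that is missing from the new cluster or has a different count.
# Usage: compare_doc_counts <old_counts_file> <new_counts_file>
compare_doc_counts() {
    local old_file="$1" new_file="$2"
    awk '
        # First file on the command line: counts from the new cluster.
        FILENAME == ARGV[1] { new_count[$1] = $2; next }
        # Second file: counts from the old cluster, checked one by one.
        !($1 in new_count)  { print $1 ": missing in new cluster"; bad = 1; next }
        new_count[$1] != $2 { print $1 ": old=" $2 " new=" new_count[$1]; bad = 1 }
        END { exit bad }
    ' "${new_file}" "${old_file}"
}

# Example:
# curl -s "<old_elasticsearch_url>:9200/_cat/indices?h=index,docs.count" > old-counts.txt
# curl -s "<new_elasticsearch_url>:9200/_cat/indices?h=index,docs.count" > new-counts.txt
# compare_doc_counts old-counts.txt new-counts.txt && echo "Document counts match"
```

Indices excluded from the dump (jaeger, monitor, kibana, fluentbit) are expected to be reported as missing.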

The deployment and the data restore are now complete. Update es_url and indexer_url in the egov-config configmap using the below command.

kubectl edit configmap egov-config --namespace egov

Restart all the pods that depend on elasticsearch so that they pick up the new elasticsearch URL.

Elastic Search Rolling Upgrade

Overview

This page provides comprehensive documentation and instructions for implementing a rolling upgrade strategy for your Elasticsearch cluster.

Steps

Note: Some downtime is expected during the rolling upgrade. Before you begin, take an elasticdump of the Elasticsearch data by running the script provided below in the playground pod.

Copy the below script and save it as es-dump.sh. Replace the elasticsearch URL and the indices names in the script.

#!/bin/bash
#es-dump.sh

# Replace the placeholder with your Elasticsearch cluster URL (keep the http:// scheme so that elasticdump treats it as a URL)
ELASTICSEARCH_URL="http://<elasticsearch URL>:9200"
# Pattern of indices to exclude from the dump
EXCLUDE_INDEX_PATTERN="jaeger|monitor|kibana|fluentbit"
# Provide backup directory
BACKUP_DIR="backup"
# File in which the list of dumped indices is saved
INDICES_OUTPUT="elasticsearch-indexes.txt"

# Collect the names of all indices that do not match the exclude pattern
mapfile -t INDICES < <(curl -s "${ELASTICSEARCH_URL}/_cat/indices" | grep -v -E "(${EXCLUDE_INDEX_PATTERN})" | awk '{print $3}')

printf "%s\n" "${INDICES[@]}" > "$INDICES_OUTPUT"

# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"

# Loop through each index and perform export
for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_mapping_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${ELASTICSEARCH_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=mapping"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} mapping completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_data_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${ELASTICSEARCH_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=data \
        --timeout=300000 \
        --limit=10000 \
        --skip-existing"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done

Run the below commands in the terminal.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n playground
kubectl cp <path_to_script_in_your_machine>/es-dump.sh playground/<playground_pod_name>:<path_in_playground_pod>/es-dump.sh

Now, run the below command inside the playground pod.

# Open a shell in the playground pod and run the script that dumps the Elasticsearch data
kubectl exec -it <playground_pod_name> -n playground -- bash
cd <path_to_script_inside_playground_pod>
chmod +x es-dump.sh
./es-dump.sh

# When playground pod restarts the data will be lost. So, to store data in your local machine run below command
 kubectl cp playground/<playground_pod_name>:/backup <path_to_store_in_local>/backup
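
Before starting the upgrade, it is worth confirming that the dump is complete. A minimal sketch, assuming the file naming used by es-dump.sh (<index>_mapping_backup.json and <index>_data_backup.json) and the index list it writes to elasticsearch-indexes.txt. The function name verify_backup is hypothetical.

```shell
#!/bin/bash
# verify_backup: for every index listed in the index file, checks that a
# non-empty mapping dump and data dump exist in the backup directory.
# Usage: verify_backup <index_list_file> <backup_dir>
verify_backup() {
    local index_file="$1" backup_dir="$2" index kind missing=0
    while IFS= read -r index; do
        [ -n "${index}" ] || continue            # skip blank lines
        for kind in mapping data; do
            if [ ! -s "${backup_dir}/${index}_${kind}_backup.json" ]; then
                echo "Missing or empty: ${index}_${kind}_backup.json"
                missing=1
            fi
        done
    done < "${index_file}"
    return "${missing}"
}

# Example:
# verify_backup elasticsearch-indexes.txt backup && echo "Backup looks complete"
```

An index that holds no documents may be reported here because its data dump can be empty.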

Rolling upgrade from v6.6.2 to v7.17.15

Steps

List the elasticsearch pods and enter into any of the elasticsearch pod shells.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n es-cluster
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

Disable shard allocation: You can avoid racing the clock by disabling the allocation of replicas before shutting down data nodes.

Stop non-essential indexing and perform a synced flush: While you can continue indexing during the upgrade, shard recovery is much faster if you temporarily stop non-essential indexing and perform a synced flush.

Run the below curl commands inside the elasticsearch data pod.

# Replace elasticsearch url
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
'

curl -X POST "<elasticsearch_url>:9200/_flush/synced?pretty"

Scale down the replica count of elasticsearch master and data from 3 to 0.

kubectl get statefulsets -n es-cluster
kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=0
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=0

Edit the StatefulSet of elasticsearch master: replace the elasticsearch image tag 6.6.2 with 7.17.15, remove the deprecated environment variables, and add the compatible ones. Both sets of environment variables are listed below.

# Deprecated environment variables
- env:
  - name: discovery.zen.minimum_master_nodes
    value: "2"
  - name: discovery.zen.ping.unicast.hosts
    value: elasticsearch-master-v1
  - name: node.data
    value: "false"
  - name: node.ingest
    value: "false"
  - name: node.master
    value: "true"
  - name: gateway.expected_master_nodes
    value: "2"
  - name: gateway.expected_data_nodes
    value: "1"
  - name: gateway.recover_after_time
    value: 5m
  - name: gateway.recover_after_master_nodes
    value: "2"
  - name: gateway.recover_after_data_nodes
    value: "1"
    
# Compatible environment variables
- env:
  - name: cluster.initial_master_nodes
    value: elasticsearch-master-v1-0,elasticsearch-master-v1-1,elasticsearch-master-v1-2 
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: master

Edit elasticsearch-master values.yaml file

# values.yaml

ClusterName: "elasticsearch"
nodeGroup: master-v1

Edit the StatefulSet of elasticsearch data: replace the elasticsearch image tag 6.6.2 with 7.17.15, remove the deprecated environment variables, and add the compatible ones.

# Deprecated environment variables
- env:
  - name: discovery.zen.ping.unicast.hosts
    value: elasticsearch-master-v1
  - name: node.data
    value: "true"
  - name: node.ingest
    value: "true"
  - name: node.master
    value: "false"
  - name: gateway.expected_master_nodes
    value: "2"
  - name: gateway.expected_data_nodes
    value: "1"
  - name: gateway.recover_after_time
    value: 5m
  - name: gateway.recover_after_master_nodes
    value: "2"
  - name: gateway.recover_after_data_nodes
    value: "1"
  - name: ingest.geoip.downloader.enabled
    value: "false"
    
# Compatible environment variables
- env: 
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: data,ingest

Edit elasticsearch-data values.yaml file.

# values.yaml

ClusterName: "elasticsearch"
nodeGroup: "data-v1"

After making the changes, scale up the statefulsets of elasticsearch data and master.

kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=3
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=3

After all pods are in running state, re-enable shard allocation and check cluster health.

# Enter into elasticsearch pod
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

#Run below curl commands
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
'

curl -X GET "<elasticsearch_url>:9200/_cat/health?v=true&pretty"
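
Shard recovery can take a while after allocation is re-enabled, so the health check may need to be repeated. A minimal polling sketch, assuming curl is available where it runs and the cluster is reachable without authentication. The function name wait_for_green and the default attempt and delay values are assumptions.

```shell
#!/bin/bash
# wait_for_green: polls _cat/health until the cluster status is green,
# or gives up after the given number of attempts.
# Usage: wait_for_green <elasticsearch_url> [attempts] [delay_seconds]
wait_for_green() {
    local url="$1" attempts="${2:-60}" delay="${3:-10}" status i
    for ((i = 1; i <= attempts; i++)); do
        # h=status limits the output to the status column only.
        status=$(curl -s "${url}/_cat/health?h=status" | tr -d '[:space:]')
        echo "Attempt ${i}: cluster status is '${status:-unreachable}'"
        [ "${status}" = "green" ] && return 0
        sleep "${delay}"
    done
    return 1
}

# Example:
# wait_for_green "<elasticsearch_url>:9200" 60 10 || echo "Cluster did not turn green in time"
```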

You have successfully upgraded the elasticsearch cluster from v6.6.2 to v7.17.15 :)

Reindexing the indices

After successfully upgrading elasticsearch, use the below script to reindex the indices that were created in v6.6.2 or earlier.

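Copy the below script and save it as es-reindex.sh. Replace the elasticsearch URL in the script. The script reads the index names from elasticsearch-indexes.txt, the file written by es-dump.sh, so run it from the directory that contains that file.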

#!/bin/bash

ELASTICSEARCH_URL="<Elasticsearch URL>:9200"
TMP="_tmp"

FILENAME="elasticsearch-indexes.txt"
INDICES=()
while IFS= read -r index; do
    INDICES+=("$index")
done < "$FILENAME"

# Reindex every index listed in the file
for INDEX in "${INDICES[@]}"; do
    sleep 5
    echo -e "Reindex process starting for index: $INDEX\n"
    tmp_index=$INDEX${TMP}
    echo "Starting reindexing elastic data from original index:$INDEX to temporary index:$tmp_index"
    output=$(curl -X POST "${ELASTICSEARCH_URL}/_reindex" --max-time 3600 -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": "'"$INDEX"'"
      },
      "dest": {
        "index": "'"$tmp_index"'"
      }
    }
    ')
    sleep 5
    echo -e "Reindexing completed from original index:$INDEX to temporary index:$tmp_index with output: $output\n"
    echo -e "Deleting $INDEX\n"
    output=$(curl -X DELETE "${ELASTICSEARCH_URL}/$INDEX")
    echo -e "$INDEX deleted with status: $output\n"
    echo "Starting reindexing elastic data from temporary index:$tmp_index to original index:$INDEX"
    output=$(curl -X POST "${ELASTICSEARCH_URL}/_reindex" --max-time 3600 -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": "'"$tmp_index"'"
      },
      "dest": {
        "index": "'"$INDEX"'"
      }
    }
    ')
    echo -e "Reindexing completed from temporary index:$tmp_index to original index:$INDEX with output: $output\n"
    echo -e "Deleting $tmp_index\n"
    output=$(curl -X DELETE "${ELASTICSEARCH_URL}/$tmp_index")
    echo -e "$tmp_index deleted with status: $output\n\n\n"
done

Run the below commands in the terminal.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n playground
kubectl cp <path_to_script_in_your_machine>/es-reindex.sh playground/<playground_pod_name>:<path_in_playground_pod>/es-reindex.sh

Now, run the below command inside the playground pod.

# Open a shell in the playground pod and run the script that reindexes the Elasticsearch indices
kubectl exec -it <playground_pod_name> -n playground -- bash
cd <path_to_script_inside_playground_pod>
chmod +x es-reindex.sh
./es-reindex.sh
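
The reindex script deletes the original index without checking whether the first reindex call succeeded. One possible guard, sketched below, inspects the _reindex response before any delete. It assumes the compact (non-pretty) JSON that the script's curl calls receive, and the function name reindex_ok is hypothetical.

```shell
#!/bin/bash
# reindex_ok: reads a _reindex response body on stdin and succeeds only if
# the request did not time out and reported no failures.
reindex_ok() {
    local response
    response=$(cat)
    [[ "${response}" == *'"timed_out":false'* && "${response}" == *'"failures":[]'* ]]
}

# Example use inside the loop, before the DELETE call:
# if ! echo "$output" | reindex_ok; then
#     echo "Reindex of $INDEX failed, keeping the original index"
#     continue
# fi
```

An empty response, for example when curl hits its --max-time limit, also counts as a failure with this check.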

NOTE: Before upgrading to v8.11.3, delete the jaeger indices, because their mapping is not supported in v8.11.3, and make sure every index created before v7.17.15 has been reindexed. If indices created in v6.6.2 or earlier are still present, the upgrade from v7.17.15 to v8.11.3 may fail.

Rolling upgrade from v7.17.15 to v8.11.3 with security disabled

Steps

List the elasticsearch pods and enter into any of the elasticsearch pod shells.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n es-cluster
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

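Disable shard allocation: You can avoid racing the clock by disabling the allocation of replicas before shutting down data nodes.

Stop non-essential indexing and perform a synced flush: While you can continue indexing during the upgrade, shard recovery is much faster if you temporarily stop non-essential indexing and perform a synced flush. Synced flush is deprecated since Elasticsearch 7.6 and removed in 8.0, so run it while the cluster is still on v7.17.15. On that version, a regular flush (POST _flush) has the same effect.

Run the below curl commands inside the elasticsearch data pod.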

# Replace elasticsearch url
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
'

curl -X POST "<elasticsearch_url>:9200/_flush/synced?pretty"

Scale down the replica count of elasticsearch master and data from 3 to 0.

kubectl get statefulsets -n es-cluster
kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=0
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=0

Edit the StatefulSet of elasticsearch master: replace the elasticsearch image tag 7.17.15 with 8.11.3 and set the compatible environment variables listed below. For a rolling upgrade from v7.17.15 to v8.11.3 there are no deprecated environment variables to remove.

# Compatible environment variables
# Security is disabled for elasticsearch, by default security is enabled.
- env:
  - name: cluster.initial_master_nodes
    value: elasticsearch-master-v1-0,elasticsearch-master-v1-1,elasticsearch-master-v1-2 
  - name: xpack.security.enabled
    value: "false"
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: master

Edit the StatefulSet of elasticsearch data: replace the elasticsearch image tag 7.17.15 with 8.11.3 and set the compatible environment variables listed below.

# Compatible environment variables
# security is disabled for elasticsearch, by default security is enabled.
- env:
  - name: cluster.initial_master_nodes
    value: elasticsearch-master-v1-0,elasticsearch-master-v1-1,elasticsearch-master-v1-2 
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: data,ingest
  - name: xpack.security.enabled
    value: "false"

After making the changes, scale up the statefulsets of elasticsearch data and master.

kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=3
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=3

After all pods are in running state, re-enable shard allocation and check cluster health.

# Enter into elasticsearch pod
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

#Run below curl commands
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
'

curl -X GET "<elasticsearch_url>:9200/_cat/health?v=true&pretty"
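
Besides cluster health, it can help to confirm that every node is actually running the new version. The sketch below parses the output of _cat/nodes?h=name,version (one "name version" pair per line). The function name all_nodes_on_version is hypothetical.

```shell
#!/bin/bash
# all_nodes_on_version: reads "node_name version" lines on stdin and prints
# every node that is not on the expected version.
# The exit status is 0 only when all nodes match.
# Usage: ... | all_nodes_on_version <expected_version>
all_nodes_on_version() {
    awk -v want="$1" '
        $2 != want { print $1 " is on " $2; bad = 1 }
        END { exit bad }
    '
}

# Example:
# curl -s "<elasticsearch_url>:9200/_cat/nodes?h=name,version" \
#     | all_nodes_on_version 8.11.3 && echo "All nodes are on 8.11.3"
```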

Core services

Monitoring how-to
Debugging
Fixing/escalating

DIGIT apps

Monitor, debug, fix

DSS dashboard

ElasticSearch Direct Upgrade

Overview

Unlike rolling upgrades, direct upgrades involve migrating from an older version to a newer one in a single coordinated operation.

Steps

Clone the DIGIT-DevOps repo and checkout to the branch <branch_name>.

git clone https://github.com/egovernments/DIGIT-DevOps.git
git checkout digit-lts-go
code .

If you want to make any changes to the elasticsearch cluster like namespaces etc. You'll find the helm chart for elastic search in the path provided below. In the below chart, security is enabled for elasticsearch. If you want to disable the security, please set the environment variable xpack.security.enabled as false in the helmchart statefulset template.

cd deploy-as-code/helm/backbone-services/elasticsearch-master
cd deploy-as-code/helm/backbone-services/elasticsearch-data

Deploy the Elastic Search Cluster using the below commands.

cd deploy-as-code/deployer
export KUBECONFIG=<path_to_kubeconfig>
kubectl config current-context
go run main.go deploy -e <env_file_name> elasticsearch-master
go run main.go deploy -e <env_file_name> elasticsearch-master

Check the pods status using the below command.

kubectl get pods -n <elasticsearch_namespace>

Once all pods are running, execute the below commands inside the playground pod to dump data from the old elasticsearch cluster and restore it to the new elasticsearch cluster.

# Copy the script and replace elasticsearch url's and authentication credentials and save it in your local
#!/bin/bash

# Elasticsearch cluster information
ELASTICSEARCH_OLD_URL="<old_elasticsearch_url>"
ELASTICSEARCH_NEW_URL="<new_elasticsearch_url>"

# Authentication credentials
USERNAME="elastic"
PASSWORD="<es_pwd>

#  Replace Elasticsearch cluster URL in elasticsearch_url 
ELASTICSEARCH_URL="http://elasticsearch-data-v1.es-cluster:9200"
# Provide the indices to take dump
EXCLUDE_INDEX_PATTERN="jaeger|monitor|kibana|fluentbit"
# Provide backup directory
BACKUP_DIR="backup"
# Provide indices output file
IDICES_OUTPUT="elasticsearch-indexes.txt"

mapfile -t INDICES < <(curl -s ${ELASTICSEARCH_OLD_URL}/_cat/indices | grep -v -E "(${EXCLUDE_INDEX_PATTERN})" | awk '{print $3}')

printf "%s\n" "${INDICES[@]}" > $IDICES_OUTPUT

# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"

# Loop through each index and perform export
for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_mapping_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=http://${ELASTICSEARCH_OLD_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=mapping"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} mapping completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_data_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=http://${ELASTICSEARCH_OLD_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=data"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_mapping_backup.json"
    
    # Process the mapping file to remove unsupported parameters

    PROCESSED_FILE="${BACKUP_DIR}/${INDEX}_mapping_processed.json"

    jq 'del(.mappings._default_, .mappings._meta, .mappings.dynamic_templates, .mappings.dynamic, .mappings.general) | .mappings = .mappings["_doc"]' "${INPUT_FILE}" > "${PROCESSED_FILE}"

    # Print the contents of the processed file for debugging

    echo "Contents of ${PROCESSED_FILE}:"

    cat "${PROCESSED_FILE}"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${PROCESSED_FILE} \
        --output=https://${USERNAME}:${PASSWORD}@${ELASTICSEARCH_NEW_URL}/${INDEX} \
        --type=mapping"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Restoring of index ${INDEX} mapping completed successfully."
    else
        echo "Error Restoring index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_data_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${OUTPUT_FILE} \
        --output=https://${USERNAME}:${PASSWORD}@${ELASTICSEARCH_NEW_URL}/${INDEX} \
        --type=data"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Restoring of index ${INDEX} data completed successfully."
    else
        echo "Error Restoring index ${INDEX}."
    fi
done

kubectl get pods -n playground
kubectl cp <path_to_script_in_your_machine>/es-dump.sh playground/<playground_name>:<path_in_playground_pod>/es-dump.sh

# Execute into the playground pod shell and run the below command
kubectl exec -it <playground_pod_name> -n playground  bash

# Run the script which takes dump of your elasticsearch data using below command
cd <path_to_script_inside_playground_pod>
./es-dump.sh

Using the above script, you can take the data dump from the old cluster and restore it in the new elasticsearch in a single command.
After restoring the data successfully in the new elasticsearch cluster, check the cluster health and document count using the below command.

# Enter into elasticsearch pod
kubectl exec -it <elasticsearch_data_pod_name> -n <elasticsearch_namespace>  bash

# To check cluster health 
curl -X GET "<elasticsearch_url>:9200/_cat/health?v=true&pretty"

# To check documents count and indices status
curl <new_elasticsearch_url>:9200/_cat/indices?v

Now the deployment and restoring the data are completed successfully. It's time to change the es_url and indexer_url in egov-config configmap using the below command.

kubectl edit configmap egov-config --namespace egov

Restart all the pods which have a dependency on elasticsearch to pick a new elasticsearch_url.

Elastic Search Rolling Upgrade

Overview

This page provides comprehensive documentation and instructions for implementing a rolling upgrade strategy for your Elasticsearch cluster.

Steps

Copy the below script and save it as es-dump.sh. Replace the elasticsearch URL and the indices names in the script.

#!/bin/bash
#es-dump.sh

#  Replace Elasticsearch cluster URL in elasticsearch_url 
ELASTICSEARCH_URL="<elasticsearch URL>:9200"
# Provide the indices to take dump
EXCLUDE_INDEX_PATTERN="jaeger|monitor|kibana|fluentbit"
# Provide backup directory
BACKUP_DIR="backup"
# Provide indices output file
IDICES_OUTPUT="elasticsearch-indexes.txt"

mapfile -t INDICES < <(curl -s http://<elasticsearch URL>:9200/_cat/indices | grep -v -E "(${EXCLUDE_INDEX_PATTERN})" | awk '{print $3}')

printf "%s\n" "${INDICES[@]}" > $IDICES_OUTPUT

# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"

# Loop through each index and perform export
for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_mapping_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${ELASTICSEARCH_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=mapping"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} mapping completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done

for INDEX in "${INDICES[@]}"; do
    OUTPUT_FILE="${BACKUP_DIR}/${INDEX}_data_backup.json"

    # Build the elasticdump command
    ELASTICDUMP_CMD="elasticdump \
        --input=${ELASTICSEARCH_URL}/${INDEX} \
        --output=${OUTPUT_FILE} \
        --type=data
        --timeout=300000
        --limit 10000
        --skip-existing"

    # Execute the elasticdump command
    $ELASTICDUMP_CMD

    # Check if the elasticdump command was successful
    if [ $? -eq 0 ]; then
        echo "Backup of index ${INDEX} completed successfully."
    else
        echo "Error backing up index ${INDEX}."
    fi
done
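
Before moving on to the upgrade, it may be worth confirming that every index listed in elasticsearch-indexes.txt has both dump files. The following is a minimal sketch, not part of the original procedure: the function name missing_backups is hypothetical, and it assumes the file naming used by es-dump.sh above (<index>_mapping_backup.json and <index>_data_backup.json).

```shell
#!/bin/bash
# Hypothetical helper: report dump files that are missing or empty.
# $1: file with one index name per line (elasticsearch-indexes.txt)
# $2: backup directory written by es-dump.sh
missing_backups() {
    local index_file="$1" backup_dir="$2" index kind
    while IFS= read -r index; do
        # Skip blank lines in the index list
        [ -z "$index" ] && continue
        for kind in mapping data; do
            # -s is true only when the file exists and is not empty
            if [ ! -s "${backup_dir}/${index}_${kind}_backup.json" ]; then
                echo "${index} (${kind})"
            fi
        done
    done < "$index_file"
}

# Usage inside the playground pod, from the directory that holds the script:
#   missing_backups elasticsearch-indexes.txt backup
```

An empty output means every expected file is present. An index that holds no documents may legitimately produce an empty data file, so treat a reported data entry as a prompt to check that index rather than as a definite failure.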

Run the below commands in the terminal.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n playground
kubectl cp <path_to_script_in_your_machine>/es-dump.sh playground/<playground_pod_name>:<path_in_playground_pod>/es-dump.sh

Now, open a shell in the playground pod and run the script.

# Run the script, which takes a dump of your elasticsearch data
kubectl exec -it <playground_pod_name> -n playground -- bash
cd <path_to_script_inside_playground_pod>
chmod +x es-dump.sh
./es-dump.sh

# The data is lost when the playground pod restarts. To keep a copy, run the below command from your local machine
kubectl cp playground/<playground_pod_name>:<path_to_script_inside_playground_pod>/backup <path_to_store_in_local>/backup

Rolling upgrade from v6.6.2 to v7.17.15

Steps

List the elasticsearch pods and open a shell in any one of them.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n es-cluster
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

Disable shard allocation: When a data node shuts down, the cluster waits only a short time before it starts re-replicating that node's shards elsewhere. Disabling the allocation of replicas before shutting down the data nodes means you do not have to race the clock.

Stop non-essential indexing and perform a synced flush: You can continue indexing during the upgrade, but shard recovery is much faster if you temporarily stop non-essential indexing and perform a synced flush.

Run the below curl commands inside the elasticsearch data pod.

# Replace elasticsearch url
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
'

curl -X POST "<elasticsearch_url>:9200/_flush/synced?pretty"
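
To confirm that the allocation setting took effect before scaling anything down, the cluster settings can be read back. The following is a minimal sketch under stated assumptions: the function name allocation_mode is hypothetical, and it expects the response of the cluster settings API requested with flat_settings=true, so that the key appears as one dotted name.

```shell
#!/bin/bash
# Hypothetical helper: read a cluster settings response on stdin and print
# the value of cluster.routing.allocation.enable (empty if it is not set).
allocation_mode() {
    sed -n 's/.*"cluster\.routing\.allocation\.enable"[[:space:]]*:[[:space:]]*"\([a-z_]*\)".*/\1/p' \
        | head -n 1
}

# Usage inside the elasticsearch data pod:
#   curl -s "<elasticsearch_url>:9200/_cluster/settings?flat_settings=true" | allocation_mode
```

After the PUT request above, the helper is expected to print primaries. An empty result means the setting is not present in the response.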

Scale down the replica count of elasticsearch master and data from 3 to 0.

kubectl get statefulsets -n es-cluster
kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=0
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=0

Edit the StatefulSet of elasticsearch master: replace the docker image, remove the deprecated environment variables and add the compatible ones. Change the elasticsearch image tag from 6.6.2 to 7.17.15. The deprecated and compatible environment variables are listed below.

# Deprecated environment variables
- env:
  - name: discovery.zen.minimum_master_nodes
    value: "2"
  - name: discovery.zen.ping.unicast.hosts
    value: elasticsearch-master-v1
  - name: node.data
    value: "false"
  - name: node.ingest
    value: "false"
  - name: node.master
    value: "true"
  - name: gateway.expected_master_nodes
    value: "2"
  - name: gateway.expected_data_nodes
    value: "1"
  - name: gateway.recover_after_time
    value: 5m
  - name: gateway.recover_after_master_nodes
    value: "2"
  - name: gateway.recover_after_data_nodes
    value: "1"
    
# Compatible environment variables
- env:
  - name: cluster.initial_master_nodes
    value: elasticsearch-master-v1-0,elasticsearch-master-v1-1,elasticsearch-master-v1-2 
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: master

Edit the elasticsearch-master values.yaml file.

# values.yaml

ClusterName: "elasticsearch"
nodeGroup: master-v1

Edit the StatefulSet of elasticsearch data: replace the docker image, remove the deprecated environment variables and add the compatible ones. Change the elasticsearch image tag from 6.6.2 to 7.17.15.

# Deprecated environment variables
- env:
  - name: discovery.zen.ping.unicast.hosts
    value: elasticsearch-master-v1
  - name: node.data
    value: "true"
  - name: node.ingest
    value: "true"
  - name: node.master
    value: "false"
  - name: gateway.expected_master_nodes
    value: "2"
  - name: gateway.expected_data_nodes
    value: "1"
  - name: gateway.recover_after_time
    value: 5m
  - name: gateway.recover_after_master_nodes
    value: "2"
  - name: gateway.recover_after_data_nodes
    value: "1"
  - name: ingest.geoip.downloader.enabled
    value: "false"
    
# Compatible environment variables
- env: 
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: data,ingest

Edit elasticsearch-data values.yaml file.

# values.yaml

ClusterName: "elasticsearch"
nodeGroup: "data-v1"
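
Before scaling back up, it can help to check that none of the settings listed as deprecated above are still present in the edited StatefulSets. The following is a minimal sketch, not part of the original procedure: the names DEPRECATED_SETTINGS and leftover_settings are hypothetical, and the list is simply the union of the deprecated variables shown for master and data.

```shell
#!/bin/bash
# Settings listed as deprecated in the master and data StatefulSets above.
DEPRECATED_SETTINGS='discovery.zen.minimum_master_nodes
discovery.zen.ping.unicast.hosts
node.data
node.ingest
node.master
gateway.expected_master_nodes
gateway.expected_data_nodes
gateway.recover_after_time
gateway.recover_after_master_nodes
gateway.recover_after_data_nodes
ingest.geoip.downloader.enabled'

# Hypothetical helper: read a StatefulSet manifest (YAML) on stdin and print
# each deprecated setting that still appears as a "name:" entry.
leftover_settings() {
    grep -oE 'name:[[:space:]]*[A-Za-z0-9_.]+' \
        | sed -E 's/name:[[:space:]]*//' \
        | grep -Fx -- "$DEPRECATED_SETTINGS" \
        | sort -u
}

# Usage from your terminal:
#   kubectl get statefulset <elasticsearch_master> -n es-cluster -o yaml | leftover_settings
#   kubectl get statefulset <elasticsearch_data> -n es-cluster -o yaml | leftover_settings
```

An empty output for both StatefulSets suggests the removal step is complete. The check matches names only; it does not validate the values of the new variables.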

After making the changes, scale up the statefulsets of elasticsearch data and master.

kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=3
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=3

After all pods are in running state, re-enable shard allocation and check cluster health.

# Open a shell in an elasticsearch pod
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

# Run the below curl commands
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
'

curl -X GET "<elasticsearch_url>:9200/_cat/health?v=true&pretty"
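
Shard recovery can take a while after allocation is re-enabled, so the health check usually has to be repeated. The following is a minimal polling sketch under stated assumptions: the function name wait_for_green is hypothetical, and the command passed to it is expected to print only the cluster status, for example the _cat/health endpoint restricted to the status column with h=status.

```shell
#!/bin/bash
# Hypothetical helper: poll a status command until it prints "green".
# $1: maximum number of attempts
# $2: seconds to wait between attempts
# remaining arguments: command that prints the cluster status
wait_for_green() {
    local attempts="$1" delay="$2" i=1 status
    shift 2
    while [ "$i" -le "$attempts" ]; do
        # Strip the trailing newline and any padding from the response
        status=$("$@" 2>/dev/null | tr -d '[:space:]')
        echo "attempt ${i}: status=${status:-unknown}"
        [ "$status" = "green" ] && return 0
        # Do not sleep after the final attempt
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Usage inside the elasticsearch pod (up to 60 checks, 10 seconds apart):
#   wait_for_green 60 10 curl -s "<elasticsearch_url>:9200/_cat/health?h=status"
```

The function returns a non-zero status if the cluster never reports green within the given attempts. A cluster that stays yellow is still assigning replicas, so inspect it before continuing.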

You have successfully upgraded the elasticsearch cluster from v6.6.2 to v7.17.15 :)

Reindexing the Indices

After the upgrade succeeds, reindex every index that was created in v6.6.2 or earlier using the below script.

Copy the below script and save it as es-reindex.sh. Replace the elasticsearch URL in the script. The script reads the index names from elasticsearch-indexes.txt, the file written by es-dump.sh, so run it from the directory that contains that file.

#!/bin/bash

ELASTICSEARCH_URL="<Elasticsearch URL>:9200"
TMP="_tmp"

FILENAME="elasticsearch-indexes.txt"
INDICES=()
while IFS= read -r index; do
    INDICES+=("$index")
done < "$FILENAME"

# Reindex every index listed in the file
for INDEX in "${INDICES[@]}"; do
    sleep 5
    echo -e "Reindex process starting for index: $INDEX\n"
    tmp_index=$INDEX${TMP}
    echo "Starting reindexing elastic data from original index:$INDEX to temporary index:$tmp_index"
    output=$(curl -X POST "${ELASTICSEARCH_URL}/_reindex" --max-time 3600 -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": "'"$INDEX"'"
      },
      "dest": {
        "index": "'"$tmp_index"'"
      }
    }
    ')
    sleep 5
    echo -e "Reindexing completed from original index:$INDEX to temporary index:$tmp_index with output: $output\n"
    echo -e "Deleting $INDEX\n"
    output=$(curl -X DELETE "${ELASTICSEARCH_URL}/$INDEX")
    echo -e "$INDEX deleted with status: $output\n"
    echo "Starting reindexing elastic data from temporary index:$tmp_index to original index:$INDEX"
    output=$(curl -X POST "${ELASTICSEARCH_URL}/_reindex" --max-time 3600 -H 'Content-Type: application/json' -d'
    {
      "source": {
        "index": "'"$tmp_index"'"
      },
      "dest": {
        "index": "'"$INDEX"'"
      }
    }
    ')
    echo -e "Reindexing completed from temporary index:$tmp_index to original index:$INDEX with output: $output\n"
    echo -e "Deleting $tmp_index\n"
    output=$(curl -X DELETE "${ELASTICSEARCH_URL}/$tmp_index")
    echo -e "$tmp_index deleted with status: $output\n\n\n"
done
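
The script above deletes each index right after the reindex call returns, without checking that the copy is complete. One possible safeguard is to compare document counts before each delete. The following is a minimal sketch under stated assumptions: the function names doc_count and counts_match are hypothetical, and the inputs are the JSON responses of the _count API for the two indices.

```shell
#!/bin/bash
# Hypothetical helper: read a _count API response on stdin and print the count.
doc_count() {
    sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: succeed only when both responses carry the same count.
# $1: _count response of the source index
# $2: _count response of the destination index
counts_match() {
    local src dst
    src=$(printf '%s' "$1" | doc_count)
    dst=$(printf '%s' "$2" | doc_count)
    # An error response has no count, so it never matches
    [ -n "$src" ] && [ "$src" = "$dst" ]
}

# Usage inside the loop of es-reindex.sh, before the first DELETE:
#   src=$(curl -s "${ELASTICSEARCH_URL}/${INDEX}/_count")
#   dst=$(curl -s "${ELASTICSEARCH_URL}/${tmp_index}/_count")
#   counts_match "$src" "$dst" || { echo "Count mismatch for $INDEX, skipping"; continue; }
```

Newly written documents become countable only after a refresh, so a mismatch right after the reindex call may be a timing issue rather than data loss. Refreshing the destination index before counting should avoid that.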

Run the below commands in the terminal.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n playground
kubectl cp <path_to_script_in_your_machine>/es-reindex.sh playground/<playground_pod_name>:<path_in_playground_pod>/es-reindex.sh

Now, open a shell in the playground pod and run the script.

# Run the script, which reindexes the indices of your elasticsearch cluster
kubectl exec -it <playground_pod_name> -n playground -- bash
cd <path_to_script_inside_playground_pod>
chmod +x es-reindex.sh
./es-reindex.sh

Rolling upgrade from v7.17.15 to v8.11.3 (security disabled)

Steps

List the elasticsearch pods and open a shell in any one of them.

export KUBECONFIG=<path_to_your_kubeconfig>
kubectl get pods -n es-cluster
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

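Disable shard allocation: When a data node shuts down, the cluster waits only a short time before it starts re-replicating that node's shards elsewhere. Disabling the allocation of replicas before shutting down the data nodes means you do not have to race the clock.

Stop non-essential indexing and perform a synced flush: You can continue indexing during the upgrade, but shard recovery is much faster if you temporarily stop non-essential indexing and perform a synced flush. On v7.17.15 the synced flush endpoint still works, but it has been deprecated since v7.6, where a plain POST to _flush has the same effect.

Run the below curl commands inside the elasticsearch data pod.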

# Replace elasticsearch url
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": "primaries"
  }
}
'

curl -X POST "<elasticsearch_url>:9200/_flush/synced?pretty"

Scale down the replica count of elasticsearch master and data from 3 to 0.

kubectl get statefulsets -n es-cluster
kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=0
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=0

Edit the StatefulSet of elasticsearch master: replace the docker image and add the compatible environment variables. Change the elasticsearch image tag from 7.17.15 to 8.11.3. The compatible environment variables are listed below. If you are following a rolling upgrade, no environment variables are deprecated between v7.17.15 and v8.11.3.

# Compatible environment variables
# Security is enabled by default in elasticsearch 8.x, so it is disabled explicitly here.
# Environment variable values must be strings, so "false" is quoted.
- env:
  - name: cluster.initial_master_nodes
    value: elasticsearch-master-v1-0,elasticsearch-master-v1-1,elasticsearch-master-v1-2
  - name: xpack.security.enabled
    value: "false"
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: master

Edit the StatefulSet of elasticsearch data: replace the docker image and add the compatible environment variables. Change the elasticsearch image tag from 7.17.15 to 8.11.3.

# Compatible environment variables
# Security is enabled by default in elasticsearch 8.x, so it is disabled explicitly here.
# Environment variable values must be strings, so "false" is quoted.
- env:
  - name: cluster.initial_master_nodes
    value: elasticsearch-master-v1-0,elasticsearch-master-v1-1,elasticsearch-master-v1-2
  - name: discovery.seed_hosts
    value: elasticsearch-master-v1-headless
  - name: node.roles
    value: data,ingest
  - name: xpack.security.enabled
    value: "false"

After making the changes, scale up the statefulsets of elasticsearch data and master.

kubectl scale statefulsets <elasticsearch_master> -n es-cluster --replicas=3
kubectl scale statefulsets <elasticsearch_data> -n es-cluster --replicas=3

After all pods are in running state, re-enable shard allocation and check cluster health.

# Open a shell in an elasticsearch pod
kubectl exec -it <elasticsearch_data_pod_name> -n es-cluster -- bash

# Run the below curl commands
curl -X PUT "<elasticsearch_url>:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
  "persistent": {
    "cluster.routing.allocation.enable": null
  }
}
'

curl -X GET "<elasticsearch_url>:9200/_cat/health?v=true&pretty"
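
A healthy cluster does not by itself show that every node runs the new image. One way to check is to list the version reported by each node. The following is a minimal sketch under stated assumptions: the function name nodes_not_on_version is hypothetical, and the input is the output of the _cat/nodes endpoint restricted to the name and version columns.

```shell
#!/bin/bash
# Hypothetical helper: read "name version" lines on stdin and print every
# node whose version differs from the expected one.
# $1: expected version, for example 8.11.3
nodes_not_on_version() {
    awk -v want="$1" 'NF >= 2 && $2 != want { print $1 " " $2 }'
}

# Usage inside the elasticsearch pod:
#   curl -s "<elasticsearch_url>:9200/_cat/nodes?h=name,version" | nodes_not_on_version 8.11.3
```

An empty output means every listed node reports the expected version. Nodes that have not joined the cluster do not appear in the list, so compare the number of lines returned by _cat/nodes with the expected replica count as well.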

You have successfully upgraded the elasticsearch cluster from v7.17.15 to v8.11.3.