DIGIT services deployment on the Azure cloud platform
Make sure you have your Azure account with the necessary credentials.
All DIGIT services are packaged using Helm charts. Install Helm using the link: Installing Helm
kubectl is a CLI to connect to the Kubernetes cluster on your machine
Install CURL for making API calls
Install Visual Studio Code IDE for better code visualization/editing capabilities
Install Postman to run DIGIT bootstrap scripts
Install Terraform to provision infrastructure on Azure
Clone the DIGIT-DevOps repo and check out the Azure branch
Change to the remote-state directory inside the sample-azure directory
Log in to Azure using the below command in the terminal:
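A typical login, assuming the Azure CLI is already installed:
az login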
Update the variables in variables.tf file
Run the below commands to create the resource group, storage account and container:
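A sketch of the commands; the names (digit-rg, digitstatestorage, tfstate) and the region are placeholders, substitute your own values from variables.tf:
az group create --name digit-rg --location centralindia
az storage account create --name digitstatestorage --resource-group digit-rg --location centralindia --sku Standard_LRS
az storage container create --name tfstate --account-name digitstatestorage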
Copy the storage account name and change to the sample-azure directory
Open main.tf file and update the below placeholder details
Create client-id and client-secret with necessary permissions
Open variables.tf file - update the variables and run the below commands
Note the db_name and server_name
Fetch the kubeconfig using the below command. This will automatically store your kubeconfig in the .kube folder.
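A likely command, assuming placeholder resource group and cluster names taken from your variables.tf:
az aks get-credentials --resource-group <resource_group_name> --name <aks_cluster_name>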
Check the kubeconfig and pods by running the below commands
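For example:
kubectl config current-context
kubectl get pods --all-namespaces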
Change to the environments directory and open egov-demo.yaml
Update the below configurations in egov-demo.yaml
Open the egov-demo-secrets.yaml file and update db details and private key
Generate SSH key pairs (Use either method (a) or method (b)) to update the private key.
a. Using the online website (not recommended for production setup. To be only used for demo setups): https://8gwifi.org/sshfunctions.jsp
b. Using OpenSSL:
openssl genpkey -algorithm RSA -out private_key.pem
openssl rsa -pubout -in private_key.pem -out public_key.pem
Add the public key to your GitHub account (reference: https://www.youtube.com/watch?v=9C7_jBn9XJ0&ab_channel=AOSNote )
Change to the deployer directory
Run the below command to deploy nginx-ingress
Check the pods once all services are deployed successfully
Run the below command to get the load balancer id
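A typical way to read it off the ingress controller service (the service name and namespace may differ in your setup):
kubectl get svc nginx-ingress-controller -n egov -o jsonpath='{.status.loadBalancer.ingress[0].ip}'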
Copy the load balancer id and add it to your domain provider against your domain name.
DIGIT Automation on AWS
Following are the pre-requisites and installation steps for setting up DIGIT on AWS:
Install Golang:
For Linux: Follow the official instructions to install Golang on Linux.
For Windows: Download the installer and follow the installation instructions.
For Mac: Download the installer and follow the installation instructions.
Install Helm - DIGIT services are packaged with Helm charts
Install kubectl - the CLI to connect to the Kubernetes cluster from your machine
Install cURL - for making API calls
Install Visual Studio Code - for better code visualization/editing capabilities
Install Postman - to run DIGIT bootstrap scripts
Install Terraform - to provision infrastructure on AWS
Install AWS CLI and AWS IAM Authenticator
Once you have installed all these pre-requisites, you are ready to set up DIGIT and its services.
To provision infrastructure and set up DIGIT, follow the steps below:
Clone the DIGIT-DevOps repository:
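For example:
git clone https://github.com/egovernments/DIGIT-DevOps.git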
Navigate to the cloned repository and checkout the release-1.28-kubernetes branch:
cd DIGIT-DevOps
git checkout release-1.28-kubernetes
Check if correct credentials are configured using the command:
aws configure list
b. Using OpenSSL:
openssl genpkey -algorithm RSA -out private_key.pem
openssl rsa -pubout -in private_key.pem -out public_key.pem
Open input.yaml file in vscode. You can use the below code to directly open it in VS code:
code infra-as-code/terraform/sample-aws/input.yaml
If the command does not work, you can manually open the file in VS Code. Once the file is open, fill in the inputs. (In case you are not using VS Code, you can open it in any editor of your choice.)
Fill in the inputs as per the regex mentioned in the comments.
Open egov-demo-secrets.yaml and add the DB password (line 5), the flywayPassword (line 7) and the private key.
code config-as-code/environments/egov-demo-secrets.yaml
Make sure the DB password and flywayPassword are the same. The private key has to be added inside the git-sync key against the ssh key (line 37).
Go to infra-as-code/terraform/sample-aws and run init.go script to enrich different files based on input.yaml.
cd infra-as-code/terraform/sample-aws
go run ../scripts/init.go
Navigate to the remote-state folder and run terraform to create a S3 bucket and DynamoDB.
cd remote-state
terraform init
terraform plan
terraform apply
Navigate back to sample-aws folder and run terraform to provision infrastructure for DIGIT.
cd ..
terraform init
terraform plan
terraform apply
(Add the same DB password which you added in egov-demo-secrets.yaml when prompted after running terraform apply.)
Execute the following command to generate a kubeConfig file and update the volumeIds, DB URL, and other relevant details in the egov-demo.yaml file.
terraform output -json | go run ../scripts/envYAMLUpdater.go
Run the export KUBECONFIG command shown on terminal. (Note: The exact command to run will be printed on terminal. It will be something like this: export KUBECONFIG=<LOCAL_KUBECONFIGPATH> )
Run the digit-installer.go script to install DIGIT using the following command:
cd ../../../deploy-as-code/deployer
go run digit_installer.go
Once the deployment is done get the CNAME of the nginx-ingress-controller:
kubectl get svc nginx-ingress-controller -n egov -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
Follow the steps below to set up seed data:
Port-forward user pod using the following command -
kubectl port-forward <egov_user_pod> 8080:8080 -n egov
Hit super_user_creation cURL. This will create a super user with username as GRO and password as eGov@4321.
Open the accessToken_generation cURL. The credentials have already been populated. Change "{{YOUR_DOMAIN_NAME}}" placeholder to the domain name defined in input.yaml file while provisioning and hit the cURL.
In the response, you will get an "access_token" field. Highlight this value, right-click on it and set it as the global "token" value.
Execute the rainmaker common, rainmaker locality, rainmaker PGR localization and PGR workflow cURLs, changing the "{{YOUR_DOMAIN_NAME}}" placeholder to the domain name defined in the input.yaml file, to set up the localization and workflow seed data.
Follow the steps below to destroy the cluster once the demo is done:
Delete the nginx-ingress-controller service in the egov namespace using the below command and navigate to the infra-as-code/terraform/sample-aws directory:
kubectl delete svc nginx-ingress-controller -n egov
cd ../../infra-as-code/terraform/sample-aws
terraform destroy
Run the Terraform destroy command to delete the cluster.
To destroy the remote state bucket, first set the lifecycle value to false in the main.tf file in the remote-state folder:
lifecycle {
  prevent_destroy = false
}
After making this change, go to the AWS console and empty the S3 bucket associated with the remote state.
Once the bucket is emptied, you can proceed to destroy the remote state bucket using the Terraform destroy command.
Provision infra for DIGIT on AWS using Terraform
The Amazon Elastic Kubernetes Service (EKS) is one of the AWS services for deploying, managing, and scaling any distributed and containerized workloads. Here we can provision the EKS cluster on AWS from the ground up in an automated way (infra-as-code) using Terraform, and then deploy the DIGIT services config-as-code using Helm.
Know about EKS:
Know what is terraform:
Find the pre-requisites for deploying DIGIT platform services on AWS
The Amazon Elastic Kubernetes Service (EKS) is one of the AWS services for deploying, managing and scaling any distributed and containerized workloads. Here we can provision the EKS cluster on AWS from the ground up using Terraform (infra-as-code) and then deploy the DIGIT platform services as config-as-code using Helm.
Know about EKS:
Know what is terraform:
An AWS account with admin access to provision the EKS service. You can always subscribe to a free AWS account to learn the basics and try, but there is a limit to what is offered for free. For this demo, you need a commercial subscription to the EKS service. If you want to try it for a day or two, it might cost you about Rs 500 - 1000.
Note: Post the demo (for eGov internal folks only) - request AWS access for 4 hrs. Time-bound access to eGov's training AWS account is available upon request, subject to the available number of slots per day.
Install kubectl (any version) on the local machine - it helps in interacting with the Kubernetes cluster.
Install Helm - this helps package the services along with the configurations, environments, secrets, etc. into Kubernetes manifests.
Install tfswitch to run different versions of Terraform on the machine. tfswitch supports multiple Terraform versions on the same machine and you can toggle between the desired versions.
Please refer to tfswitch documentation for different platforms. Terraform version 0.14.10 can be installed directly as well.
High-level overview of DIGIT deployment
DIGIT is an open-source, customizable platform that lends itself to extensibility. New modules can be built on top of the platform to suit new use cases, or existing modules can be modified or replaced. To enable this, in addition to deploying DIGIT, a CI/CD pipeline should be set up. CI/CD pipelines enable the end user to automate and simplify the build/deploy process.
DIGIT comes with configurable "CI as code", "Deploy as code", etc., which can be utilized to set up the pipelines and deploy new modules. More on that in the steps below.
Note: Changing the DIGIT code has implications for upgrades. That is, you may not be able to upgrade to the latest version of DIGIT depending on the changes that have been made. New modules are generally not a problem for upgrades.
Work in Progress
DIGIT Core consists of 25+ services that use Postgres, Apache Kafka, Elastic Search etc. It is recommended to deploy DIGIT on a server-grade machine. Installing DIGIT on laptops is not recommended as the free memory requirements might not be met.
Follow the resources listed below to Install DIGIT.
Steps to setup the AWS account for deployment
Follow the details below to set up your AWS account before you proceed with the DIGIT deployment.
Install AWS CLI on your local machine so that you can use AWS CLI commands to provision and manage the cloud resources on your account.
Install AWS IAM Authenticator - it helps you authenticate your connection from your local machine so that you can deploy DIGIT services.
Complete DIGIT installation step-by-step Instructions across various Infra types like public & private clouds
The quickstart would have helped you get your hands dirty and build the Kubernetes cluster on a local/single VM instance - which you can consider either for local development or to understand the details involved in infra and deployment.
However, DIGIT is a platform meant to run at scale. Depending on the scale and performance, running DIGIT in production requires advanced capabilities like HA, DRS, autoscaling, resiliency, etc. All these capabilities are supported by commercial clouds like AWS, Google, Azure, VMware, OpenStack, etc. and also by private clouds like NIC and a few SDC-implemented clouds. These cloud providers offer Kubernetes-as-a-managed-service that makes the entire infra setup and management seamless and automated, like infra-as-code and config-as-code.
Know the basics of Kubernetes:
Know the commands
Know kubernetes manifests:
Know how to manage env values, secrets of any service deployed in kubernetes
Know how to port forward to a pod running inside k8s cluster and work locally
Know sops to secure your keys/creds:
Unlike quickstart, full installation requires state/user-specific configurations ready before proceeding with the deployment.
You need to have a fully qualified DNS (URL) - (Should not be a dummy)
Persistent storage depends on the cloud you are using for Kafka, ES, etc.
Either a standalone or a hosted PostgreSQL DB, v11.x or above
GeoLocation provider configs (Google Location API), SMS Gateway, Payment Gateway, etc.
The newly created user must have access to the MDMS and config forked repo.
Choose your cloud and follow the instructions to set up a Kubernetes cluster before moving on to deployment.
Post-infra setup (Kubernetes Cluster), the deployment involves 2 stages and 2 modes. Check out the stages first and then the modes. As part of a sample exercise, we will deploy the PGR module. However, deployment steps are similar. The pre-requisites have to be configured accordingly.
Each service's global and local env variables
Number of replicas/scale of individual services (Depending on whether dev or prod)
MDMS, config repos (Master data, ULB, tenant details, users, etc)
SMS gateway, email gateway, payment gateway
GMap key (In case you are using Google Map services in your PGR, PT, TL, etc)
S3 Bucket for Filestore
URL/DNS on which the DIGIT will be exposed
SSL certificate for the above URL
End-points configs (Internal/external)
Stage 2: Run the digit_setup deployment script and simply answer the questions that it asks.
All done, wait and watch for 10 min, you'll have the DIGIT setup completed and the application will be running on the given URL.
Essentially, DIGIT deployment means that we need to generate Kubernetes manifests for each individual service. We use a tool called Helm, which is an easy, effective and customizable packaging and deployment solution. Depending on where and in which environment you initiate the deployment, there are 2 modes in which you can deploy.
From local machine - whatever we are trying in this sample exercise so far.
Post-deployment - the application is now accessible from the configured domain.
To try out PGR employee login - Create a sample tenant, city, and user to log in and assign an LME employee role using the seed script.
By now we have successfully completed the DIGIT setup on the cloud. Use the URL that you mentioned in your env.yaml Eg: https://mysetup.digit.org and create a grievance to ensure the PGR module deployed is working fine. Refer to the product documentation below for the steps.
Credentials:
Citizen: You can use your default mobile number (9999999999) to sign in using the default Mobile OTP 123456.
Employee: Username: GRO and password: eGov@4321
Post grievance creation and assignment of the same to LME, capture the screenshot of the same and share it to ensure your setup is working fine.
After validating the PGR functionality, share the API response of the following request to assess the correctness of the DIGIT PGR deployment.
Finally, clean up the DIGIT setup if you wish, using the following command. This will delete the entire cluster and other cloud resources that were provisioned for the DIGIT setup.
To destroy previously-created infrastructure with Terraform, run the command below:
The ELB is not deployed via Terraform. The ELB is created at deployment time by the Kubernetes Ingress setup. It has to be deleted manually by deleting the ingress service.
kubectl delete deployment nginx-ingress-controller -n <namespace>
kubectl delete svc nginx-ingress-controller -n <namespace>
Note: Namespace can be one of egov or jenkins.
Delete S3 buckets manually from the AWS console and also verify if ELB got deleted.
Run terraform destroy.
All done, we have successfully created infra on the cloud, deployed DIGIT, bootstrapped DIGIT, performed a transaction on PGR and finally destroyed the cluster.
Steps to prepare the deployment configuration file
It's important to prepare a global deployment configuration yaml file that contains all necessary user-specific custom values like URL, gateways, persistent storage ids, DB details etc.
Know the basics of Kubernetes:
Know the commands
Know kubernetes manifests:
Know how to manage env values, secrets of any service deployed in kubernetes
Know how to port forward to a pod running inside k8s cluster and work locally
Know sops to secure your keys/creds:
Post-Kubernetes cluster setup, the deployment has 2 stages. As part of this sample exercise, we deploy PGR and show the various configurations required. The deployment steps are similar for all other modules too; only the prerequisites differ depending on the features, like SMS gateway, payment gateway, etc.
Navigate to the following file in your local machine from the previously cloned DevOps git repository.
root@ip:/# git clone -b release https://github.com/egovernments/DIGIT-DevOps
Step 2: After cloning the repo, cd into the DIGIT-DevOps folder and type the "code ." command. This opens the visual editor with all the files from the DIGIT-DevOps repo.
Here you need to replace the following as per your values
SMS gateway to receive OTP, transaction mobile notification, etc.
MDMS, Config repo URL, here is where you provide master data, tenants and various user/role access details.
GMap key for the location service
Payment gateway, in case you use PT, TL, etc
Step 4: Update your credentials and sensitive data in the secret file as per your details.
SOPS expects an encryption key that it uses to encrypt/decrypt the specified plain text and keep the details secured. There are a couple of options you can use to generate the encryption key:
Step 5: Important: Fork the following repos, which contain the master data and default configs (master data, ULB, tenant details, users, etc.), to your respective GitHub organization account. You will customize these later as per your specific implementation.
New GitHub users should be given access to the earlier forked repos.
Step 6: Update the deployment configs for the below as per your specification:
Number of replicas/scale of each individual service (Depending on whether dev or prod load)
You must update sms gateway, email gateway, and payment gateway details for the notification and payment gateway services, etc.
Update the config, MDMS github repos wherever marked
Update GMap key (In case you are using Google Map services in your PGR, PT, TL, etc)
URL/DNS on which the DIGIT will be exposed.
SSL certificate for the above URL.
Any specific endpoints configs (Internal/external)
This page details the steps to deploy the core platform services and reference applications.
The steps here can be used to deploy:
DIGIT core platform services
Public Grievance & Redressal module
Trade Licence module
Property Tax module
Water & Sewerage module etc.
DIGIT uses Golang (v1.13.3 required) automated scripts to deploy the builds onto Kubernetes - on AWS, Azure or SDC.
All DIGIT services are packaged using helm charts
kubectl is a CLI to connect to the Kubernetes cluster from your machine
Install cURL for making API calls
Install Visual Studio Code IDE for better code/configuration editing capabilities
Install Postman to run DIGIT bootstrap scripts
Once all the deployments configs are ready, run the command given below. Input the necessary details as prompted on the screen and the interactive installer will take care of the rest.
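The exact command is not shown here; a likely invocation, mirroring the installer used in the AWS steps above, is:
cd deploy-as-code/deployer
go run digit_installer.go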
All done, wait and watch for 10 min. The DIGIT setup is complete, and the application will run on the URL.
Note:
If you do not have your domain yet, you can edit the host file entries and map the nginx-ingress-service load balancer id like below
When you find it, add the following lines to the host file, save and close it.
aws-load-balancer-id digit.try.com
You can now test the DIGIT application status in the command prompt/terminal using the command below.
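For example, to check that all pods are running in the egov namespace:
kubectl get pods -n egov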
Note: Initially, pgr-services would be in the CrashLoopBackOff state, but after performing the post-deployment steps the pgr-services will start running.
After deploying your environment config into the EC2 cluster, we have to add the security group ID of the instance to RDS.
Follow the steps below:
Go to the AWS console and search for EC2.
Click on Instances and select the Instance ID which you created.
Scroll down and go to Security and copy the Security Group ID. It starts with sg-xxxxxxxxxxxxxxxxx.
In the search bar, search for RDS and then go to Databases. Choose the db you had created.
Scroll down for Security Group rules and click any one sg. It redirects to another tab.
Scroll down and click on Edit inbound rules. Click on Add rule.
Change the Type to PostgreSQL, paste the copied sg-xxxxxxxxxxxxxxxxx in the custom source field, and click on Save rules.
git clone
Make sure that the above command reflects the set AWS credentials. Proceed once the details are confirmed. (Refer to the AWS documentation in case of any doubts on how to set the credentials.)
Generate ssh key pairs using either method (a) or method (b). a. Using an online website (not recommended for production setups; to be used only for demo setups):
Add the public key to your GitHub account.
The output of this will be something like this. Add the CNAME to your domain provider against your domain name.
Import the provided Postman collection.
5. Run tfswitch and it will show a list of Terraform versions. Scroll down and select version 0.14.10 for the infra-as-code (IaC) to provision cloud resources as code. This provides the desired resource graph and also helps destroy the cluster in one go.
When you have the command line access configured, everything is set for you to proceed with Terraform to provision the DIGIT infra.
Access to MDMS repository for master data like Roles, Access, Actions, tenants, etc. Sample repo view .
Access to Configs repository like persister, searcher configs etc. Sample repo view .
Create a GitHub user account. This should be different from the repo-forked GitHub organization account. Once the user account is ready, generate the ssh authentication key and add it to the account.
Fork both the MDMS and Config repos into your GitHub account.
Stage 1: Prepare an <env>.yaml file - you can provide any name to this file. The file has the following configurations, and this env file needs to be in line with your cluster name.
Credentials, secrets (you need to encrypt these using sops and create an <env>-secret.yaml separately)
Advanced: From a CI/CD system like Jenkins - depending on how you want to set up your CI/CD and your expertise, the steps vary. Find out how we have set up CI/CD on Jenkins so that pipelines are created automatically without any manual intervention.
Run the port-forwarding of the egov-user service from the Kubernetes cluster to your localhost. This gives you direct access to the egov-user service, and you can now interact with the API directly.
Ensure you have Postman installed to run the following seed data API. If not, install Postman on your local machine.
In case the ELB is not deleted, you need to delete the ELB from the AWS console.
Step 1: Clone the following repo (if not already done as part of the infra setup) to your machine; you may need to install git first.
Step 3: Update the deployment config file with your details. You can use the following template.
Important: Add your domain name, which you want to use for accessing DIGIT. (Do not use a dummy domain.)
Important: As per your cloud provider, uncomment the related backbone services (Kafka, ZooKeeper, Elasticsearch, etc.) and comment out the others. Also add the volume_ids/diskURI/iqn and zone/diskName/targetPortal that you got as a Terraform output or from the SDC team for these backbone services (Kafka, ZooKeeper, Elasticsearch, etc.).
Credentials, secrets (you need to encrypt these using sops and create a separate secrets file)
Option 1: Generate PGP keys
Option 2: Use AWS KMS when you want to use the AWS cloud provider.
Once you generate your encryption key, create a .sops.yaml configuration file under the /helm directory of the cloned repo to define which keys are used for which specific file. Refer to the SOPS documentation for more info.
Note: For demo purposes, you can use the sample configuration as it is without the sops configuration, but make sure you update your specific details like Git SSH, URL, etc. When you decide to push these configurations into any git or public space, please make sure you follow the sops configuration mentioned in this article to encrypt your secrets.
Fork both the MDMS and Config repos into your GitHub organization account.
Once you fork the repos into your GitHub organization account, create a GitHub user account, generate an ssh authentication key and add the public key to the GitHub account.
Add the ssh private key that you generated in the previous step to the environment secrets file under the git-sync section.
Modify the services' git-sync repo and branch with your forked repo and branch in the environment config file.
Create one private S3 bucket for the filestore and one public bucket for logos. Add the bucket details respectively, and create an IAM user with access to the S3 buckets. Add the IAM user details to the environment secrets file.
Run the egov-deployer golang script from the deploy-as-code directory of the cloned DIGIT-DevOps repo.
If you have a GoDaddy account or similar with DNS record edit access, you can map the load balancer ID to the desired DNS. Create a record mapping the load balancer ID to the domain.
DIGIT Deployment
High-level overview of DIGIT deployment - steps to follow
If you have any questions please write to us.
Make sure to use the appropriate discussion category and labels to address the issues better.
The pre-requisites for deploying on Azure
The Azure Kubernetes Service (AKS) is one of the Azure services for deploying, managing and scaling any distributed and containerized workloads, here we can provision the AKS cluster on Azure from the ground up using terraform (infra-as-code) and then deploy the DIGIT platform services as config-as-code using Helm.
Know about AKS: https://www.youtube.com/watch?v=i5aALhhXDwc&ab_channel=DevOpsCoach
Know what is terraform: https://youtu.be/h970ZBgKINg
Azure subscription: If you don't have an Azure subscription, create a free account before you begin.
Install Azure CLI
Configure Terraform: Follow the directions in the article, Terraform and configure access to Azure
Azure service principal: Follow the directions in the Create the service principal section in the article, Create an Azure service principal with Azure CLI. Take note of the values for the appId, displayName, password, and tenant.
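A minimal sketch using the Azure CLI (the service principal name is a placeholder):
az ad sp create-for-rbac --name digit-sp
# The output contains the appId, displayName, password and tenant values referred to above.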
Install kubectl on your local machine which helps you interact with the Kubernetes cluster.
Install Helm that helps you package the services along with the configurations, environments, secrets, etc into Kubernetes manifests.
Steps to setup CI/CD on SDC
Kubespray is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for generic OS/Kubernetes cluster configuration management tasks. Kubespray provides:
a highly available cluster
composable attributes
support for most popular Linux distributions
continuous-integration tests
Fork the repos below to your GitHub Organization account
Go lang (version 1.13.X)
Install kubectl on your local machine to interact with the Kubernetes cluster.
Install Helm to help package the services along with the configurations, environment, secrets, etc into Kubernetes manifests.
One Bastion machine to run Kubespray
HA-PROXY machine which acts as a load balancer with a public IP (CPU: 2 cores, memory: 4 GB)
One machine which acts as a master node (CPU: 2 cores, memory: 4 GB)
One machine which acts as a worker node (CPU: 8 cores, memory: 16 GB)
iSCSI volumes for persistent volumes (quantity: 2)
kaniko-cache-claim: 10 GB
Jenkins home: 100 GB
Kubernetes nodes
Ubuntu 18.04
SSH
Privileged user
Python
Run and follow instructions on all nodes.
Ansible needs Python to be installed on all the machines.
apt-get update && apt-get install python3-pip -y
All the machines should be in the same network with ubuntu or centos installed.
An ssh key should be generated on the Bastion machine and copied to all the servers that are part of your inventory.
Generate the ssh key:
ssh-keygen -t rsa
Copy over the public key to all nodes.
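For example, assuming an ubuntu user and placeholder node IPs:
ssh-copy-id ubuntu@<node-ip>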
Clone the official repository
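Assuming the upstream Kubespray repository:
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray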
Install dependencies from requirements.txt
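For example:
pip3 install -r requirements.txt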
Create Inventory
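A typical command, copying the sample inventory shipped with Kubespray:
cp -rfp inventory/sample inventory/mycluster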
where mycluster is the custom configuration name. Replace with whatever name you would like to assign to the current cluster.
Create inventory using an inventory generator.
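A sketch using Kubespray's inventory builder, assuming placeholder node IPs:
declare -a IPS=(10.0.0.1 10.0.0.2 10.0.0.3)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}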
Once it runs, an inventory file (hosts.yaml) is generated under inventory/mycluster/.
Review and change parameters under inventory/mycluster/group_vars
Deploy Kubespray with Ansible Playbook - run the playbook as Ubuntu
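A typical invocation from the Kubespray directory:
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml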
The --become option is required, for example for writing SSL keys in /etc/, installing packages and interacting with various system daemons.
Note: without --become the playbook will fail to run!
The Kubernetes cluster will be created with three masters and four nodes through the above process.
The kubeconfig will be generated in the .kube folder. The cluster can be accessed via this kubeconfig.
Install the haproxy package on the HAProxy machine that is allocated for the proxy:
sudo apt-get install haproxy -y
IPs need to be whitelisted as per the requirements in the config.
sudo vim /etc/haproxy/haproxy.cfg
iSCSI volumes will be provided by the SDC team as per the requisition, and the same can be used for statefulsets.
Refer to the doc here.
DIGIT Quickstart is recommended to jump-start with minimal DIGIT services to get a sense of the various installation steps and system requirements.
Deploy DIGIT using Kubespray
Kubespray is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for generic OS/Kubernetes cluster configuration management tasks. Kubespray provides:
a highly available cluster
composable attributes
support for most popular Linux distributions
continuous-integration tests
Before we can get started, we need a few prerequisites to be in place. This is what we are going to need:
A host with Ansible installed. Click here to learn more about Ansible. Find the Ansible installation details here.
You should also set up an SSH key pair to authenticate to the Kubernetes nodes without using a password. This permits Ansible to perform optimally.
Few servers/hosts/VMs to serve as our targets to deploy Kubernetes. I am using Ubuntu 18.04, and my servers each have 4GB RAM and 2vCPUs. This is fine for my testing purposes, which I use to try out new things using Kubernetes. You need to be able to SSH into each of these nodes as root using the SSH key pair I mentioned above.
The above will do the following:
Create a new Linux User Account for use with Kubernetes on each node
Install Kubernetes and containers on each node
Configure the Master node
Join the Worker nodes to the new cluster
Ansible needs Python to be installed on all the machines.
apt-get update && apt-get install python3-pip -y
All the machines should be in the same network with Ubuntu or Centos installed.
An ssh key should be generated on the Bastion machine and copied to all the servers that are part of your inventory.
Generate the ssh key:
ssh-keygen -t rsa
Copy over the public key to all nodes.
Clone the official repository
Install dependencies from requirements.txt
Create Inventory
where mycluster is the custom configuration name. Replace with whatever name you would like to assign to the current cluster.
Create inventory using an inventory generator.
Once it runs, an inventory file (hosts.yaml) is generated under inventory/mycluster/.
Review and change parameters under inventory/mycluster/group_vars
Deploy Kubespray with Ansible Playbook - run the playbook as Ubuntu
The --become option is required, for example for writing SSL keys in /etc/, installing packages and interacting with various system daemons.
Note: without --become the playbook will fail to run!
The Kubernetes cluster will be created with three masters and four nodes using the above process.
The kubeconfig will be generated in the .kube folder. The cluster can be accessed via this kubeconfig.
Install the haproxy package on the HAProxy machine that is allocated for the proxy:
sudo apt-get install haproxy -y
IPs need to be whitelisted as per the requirements in the config.
sudo vim /etc/haproxy/haproxy.cfg
iSCSI volumes will be provided by the SDC team as per the requisition, and the same can be used for statefulsets.
Pre-requisites for deployment on SDC
Check the hardware and software pre-requisites for deployment on SDC.
Kubernetes nodes
Ubuntu 18.04
SSH
Privileged user
Python
This tutorial will walk you through how to set up CI/CD.
Terraform helps you build a graph of all your resources and parallelizes the creation or modification of any non-dependent resources. Thus, Terraform builds infrastructure as efficiently as possible while providing the operators with clear insight into the dependencies on the infrastructure.
Fork the repos below to your GitHub Organization account
Go lang (version 1.13.X)
AWS account with admin access to provision EKS Service. Try subscribing to a free AWS account to learn the basics. There is a limit on what is offered as free. This demo requires a commercial subscription to the EKS service. The cost for a one or two days trial might range between Rs 500-1000. (Note: Post the demo, for the internal folks, eGov will provide a 2-3 hrs time-bound access to eGov's AWS account based on the request and the available number of slots per day).
Install kubectl on your local machine to interact with the Kubernetes cluster.
Install Helm to help package the services along with the configurations, environment, secrets, etc. into Kubernetes manifests.
Install terraform version (0.14.10) for the Infra-as-code (IaC) to provision cloud resources as code and with desired resource graph. It also helps destroy the cluster in one go.
Install AWS CLI on your local machine so that you can use AWS CLI commands to provision and manage the cloud resources on your account.
Install AWS IAM Authenticator to help authenticate your connection from your local machine and deploy DIGIT services.
Use the AWS IAM User credentials provided for the Terraform (Infra-as-code) to connect to the AWS account and provision the cloud resources.
You will receive a Secret Access Key and Access Key ID. Save the keys.
Open the terminal and run the command given below. The AWS CLI is already installed and the credentials are saved. (Provide the credentials, leave the region and output format blank).
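A sketch of the interaction (the keys shown are placeholders; leave region and output format blank as noted):
aws configure
# AWS Access Key ID [None]: <your-access-key-id>
# AWS Secret Access Key [None]: <your-secret-access-key>
# Default region name [None]:
# Default output format [None]: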
The above creates the credentials file on your machine at ~/.aws/credentials.
Before we provision the cloud resources, we need to understand and be sure about what resources need to be provisioned by terraform to deploy CI/CD.
The following is the resource graph that we are going to provision using terraform in a standard way so that every time and for every environment, the infra is the same.
EKS Control Plane (Kubernetes master)
Work node group (VMs with the estimated number of vCPUs, Memory)
EBS Volumes (Persistent volumes)
VPCs (Private networks)
Users to access, deploy and read-only
Ideally, one would write the terraform script from scratch using this doc.
Here we have already written the terraform script that provisions the production-grade DIGIT Infra and can be customized with the specified configuration.
Clone the DIGIT-DevOps GitHub repo. The terraform script to provision the EKS cluster is available in this repo. The structure of the files is given below.
Here, you will find the main.tf under each of the modules that have the provisioning definition for resources like EKS cluster, storage, etc. All these are modularized and react as per the customized options provided.
Example:
VPC Resources -
VPC
Subnets
Internet Gateway
Route Table
EKS Cluster Resources -
IAM Role to allow EKS service to manage other AWS services
EC2 Security Group to allow networking traffic with the EKS cluster
EKS Cluster
EKS Worker Nodes Resources -
IAM role allowing Kubernetes actions to access other AWS services
EC2 Security Group to allow networking traffic
Data source to fetch the latest EKS worker AMI
AutoScaling launch configuration to configure worker instances
AutoScaling group to launch worker instances
Storage Module -
Configuration in this directory creates EBS volume and attaches it together.
The following main.tf will create an S3 bucket to store the state of every execution, in order to keep track.
The following main.tf contains the detailed resource definitions that need to be provisioned.
Dir: DIGIT-DevOps/Infra-as-code/terraform/egov-cicd
Define your configurations in variables.tf and provide the environment-specific cloud requirements. The same terraform template can be used to customize the configurations.
Following are the values that you need to mention in the following files. The blank ones will prompt for inputs during execution.
We have covered what the terraform script does, the resources graph that it provisions and what custom values should be given with respect to the selected environment.
Now, run the terraform scripts to provision the infra required to Deploy DIGIT on AWS.
Use the 'cd' command to change to the following directory and run the following commands. Check the output.
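For example, from the repository root (the directory matches the one noted above):
cd DIGIT-DevOps/Infra-as-code/terraform/egov-cicd
terraform init
terraform plan
terraform apply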
After successful execution, the following resources get created and can be verified by the command "terraform output".
s3 bucket: to store terraform state
Network: VPC, security groups
IAM user auth: using keybase to create the admin, deployer and read-only users
Use the URL https://keybase.io/ to create your own PGP key. This creates both public and private keys on the machine, upload the public key into the keybase account that you have just created, give a name to it and ensure that you mention that in your terraform. This allows you to encrypt sensitive information.
Example: Create a user keybase. This is "egovterraform" in the case of eGov. Upload the public key here - https://keybase.io/egovterraform/pgp_keys.asc
Use this portal to Decrypt the secret key. To decrypt the PGP message, upload the PGP Message, PGP Private Key and Passphrase.
EKS cluster: with master(s) & worker node(s).
Storage(s): for es-master, es-data-v1, es-master-infra, es-data-infra-v1, zookeeper, kafka, kafka-infra.
Use this link to fetch the kubeconfig file from EKS. This enables you to connect to the cluster from your local machine and deploy DIGIT services to the cluster.
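A typical way to fetch it with the AWS CLI, assuming placeholder cluster name and region:
aws eks update-kubeconfig --name <cluster_name> --region <region>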
Finally, verify that you are able to connect to the cluster by running the command below:
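For example:
kubectl config current-context
kubectl get nodes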
Voila! All set - now you can deploy Jenkins.
Post infra setup (Kubernetes Cluster), we start with deploying the Jenkins and kaniko-cache-warmer.
Sub domain to expose CI/CD URL
GitHub Oauth App (this provides you with the clientId, clientSecret)
Under Authorization callback URL, enter the below URL (replace <domain_name> with your domain): https://<domain_name>/securityRealm/finishLogin
Generate a new ssh key for the above user (this provides the ssh public and private keys)
Add the earlier created ssh public key to GitHub user account
Add ssh private key to the gitReadSshPrivateKey
With the previously created GitHub user, generate a personal read-only access token.
Docker hub account details (username and password)
SSL certificate for the sub-domain
Prepare an <ci.yaml> master config file and <ci-secrets.yaml>. Name this file as desired. It has the following configurations:
credentials, secrets (you need to encrypt using sops and create a ci-secret.yaml separately)
Add subdomain name in ci.yaml
Check and add your project specific ci-secrets.yaml details (like github Oauth app clientId, clientSecret, gitReadSshPrivateKey, gitReadAccessToken, dockerConfigJson, dockerUsername and dockerPassword)
To create a Jenkins namespace mark this flag true
Add your environment-specific kubeconfigs under kubConfigs like https://github.com/egovernments/DIGIT-DevOps/blob/release/config-as-code/environments/ci-demo-secrets.yaml#L50
KubeConfig environment name and deploymentJobs name from ci.yaml should be the same
Update the CIOps and DIGIT-DevOps repo ssh url with the forked repo's ssh url.
Make sure earlier created github users have read-only access to the forked DIGIT-DevOps and CIOps repos.
SSL certificate for the sub-domain.
Update the DOCKER_NAMESPACE with your docker hub organization name.
Update the repo name "egovio" with your docker hub organization name in buildPipeline.groovy
Remove the below env:
Jenkins is launched. You can access the same through your sub-domain configured in ci.yaml.
DIGIT is a container-based platform orchestrated on Kubernetes. This page discusses some key security practices to protect the infrastructure.
Security is always a difficult subject to approach, whether due to lack of experience or because it is hard to know when the level of security is right for what you have to secure.
Security is a major concern when it comes to government systems and infra. As an architect, we can consider that working with technically educated people (engineers, experts) and tools (systems, frameworks, IDE) should prevent key VAPT issues.
However, it is quite difficult to prevent determined people of various kinds from trying to hack the systems.
Each release contains not only bug fixes but also new security measures; to take advantage of them, we recommend working with the newest stable version.
Updates and support can become harder the further behind you fall, so plan your updates at least once a quarter. Using a managed Kubernetes solution provider can significantly simplify updates.
Use RBAC (Role-Based Access Control) to regulate who can access what and with which rights. RBAC is usually enabled by default in Kubernetes version 1.6 and later (or later for a few providers), but if you have upgraded since then and did not change the configuration, you should double-check your settings.
However, enabling RBAC is not enough; it still must be used effectively. In the general case, rights to the whole cluster (cluster-wide) should be avoided, giving preference to rights in specific namespaces. Avoid giving anyone cluster administrator privileges even for debugging; it is much safer to grant only the rights that are necessary, and only when needed.
If the application requires access to the Kubernetes API, create separate service accounts and give them the minimum set of rights required for each use case. This approach is far better than granting excessive privileges to the default account in the namespace.
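A minimal sketch with kubectl, assuming placeholder names (my-namespace, my-app):
# dedicated service account instead of the namespace default
kubectl create serviceaccount my-app -n my-namespace
# role limited to reading pods
kubectl create role pod-reader --verb=get,list,watch --resource=pods -n my-namespace
# bind the role to the service account only
kubectl create rolebinding my-app-pod-reader --role=pod-reader --serviceaccount=my-namespace:my-app -n my-namespace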
Creating separate namespaces is vital as the first level of component isolation. It is much easier to manage security settings, for instance network policies, when different types of workloads are deployed in separate namespaces.
A good practice to limit the potential consequences of compromise is to run workloads with sensitive data on a dedicated set of machines. This approach reduces the risk of a less secure application accessing the application with sensitive data running in the same container executable environment or on the same host.
For example, the Kubelet of a compromised node usually has access to the contents of secrets only if they are mounted on pods that are scheduled to be executed on the same node. If important secrets can be found on multiple cluster nodes, the attacker has more opportunities to get them.
Separation can be done using node pools (in the cloud or on-premises), as well as Kubernetes controlling mechanisms, such as namespaces, taints, tolerations, and others.
Sensitive metadata, for instance Kubelet administrative credentials, can be stolen or used with malicious intent to escalate privileges in a cluster. For example, a recent find within Shopify's bug bounty programme showed in detail how a user could exceed their authority by obtaining metadata from a cloud provider using specially generated data for one of the microservices.
The GKE metadata concealment function changes the cluster deployment mechanism in a way that avoids this problem, and we recommend using it until a permanent solution is implemented.
Network policies allow you to control network access into and out of containerized applications. To use them, you must have a network provider with support for this resource. For managed Kubernetes solution providers, such as Google Kubernetes Engine (GKE), support will need to be enabled.
Once everything is ready, start with simple default network policies, for example blocking (by default) traffic from other namespaces.
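A minimal sketch of such a default policy, applied with kubectl to a placeholder namespace (my-namespace); it denies all incoming traffic to pods in that namespace until more specific policies allow it:
kubectl apply -n my-namespace -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF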
Pod Security Policy sets the default values used to start workloads in the cluster. Consider defining a policy and enabling the Pod Security Policy admission controller: the instructions for these steps vary depending on the cloud provider or deployment model used.
In the beginning, you might want to disable the NET_RAW capability in containers to protect yourself from certain types of spoofing attacks.
To improve host security, you can follow these steps:
Ensure that the host is securely and correctly configured. One way is to use CIS Benchmarks; many products have an auto-checker that automatically checks the system for compliance with these standards.
Monitor the network availability of important ports. Ensure that the network is blocking access to the ports used by the Kubelet, including 10250 and 10255. Consider restricting access to the Kubernetes API server, with the exception of trusted networks. In clusters that did not require authentication and authorization for the Kubelet API, attackers have used access to such ports to launch cryptocurrency miners.
Minimize administrative access to Kubernetes hosts. Access to cluster nodes should, in principle, be limited: for debugging and solving other problems, you can, as a rule, do without direct access to the node.
Make sure that audit logs are enabled and that you are monitoring for the occurrence of unusual or unwanted API calls in them, especially in the context of any authorization failures — such entries will have a message with the “Forbidden” status. Authorization failures can mean that an attacker is trying to take advantage of the credentials obtained.
Managed solution providers (including GKE) provide access to this data in their interfaces and can help you set up notifications in case of authorization failures.
Follow these guidelines for a more secure Kubernetes cluster. Remember that even after the cluster is configured securely, you need to ensure security in other aspects of the configuration and operation of containers. To improve the security of the technology stack, study the tools that provide a central system for managing deployed containers, constantly monitoring and protecting containers and cloud-native applications.
This section contains a list of documents elaborating on the key concepts aiding the deployment of the DIGIT platform
Steps to bootstrap DIGIT
Post-deployment, the application can now be accessed from the configured domain. This page provides the bootstrapping steps.
To try out employee login, let us create a sample tenant, city, and user to log in and assign the LME employee role through the seed script.
Perform the kubectl port-forwarding of the egov-user service running from the Kubernetes cluster to your localhost. This provides access to egov-user service and allows users to interact with the API directly.
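For example (the pod name is a placeholder; look it up with kubectl get pods -n egov):
kubectl port-forward <egov-user-pod-name> 8080:8080 -n egov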
2. Seed the sample data
Ensure Postman is installed to run the following seed data API. If not, install Postman on your local machine.
Import the following Postman collection into Postman and run it. It contains the seed data that creates sample test users and localisation data.
Execute the below commands to test your local machine's Kubernetes operations through kubectl.
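For example:
kubectl config current-context
kubectl get nodes
kubectl get pods -n egov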
You have successfully completed the DIGIT infra and deployment setup, and installed the DIGIT PGR module.
Use the below link in the browser -
Use the below credentials to login into the complaint section
Username: GRO
Password: eGov@4321
City: CITYA
By now, we have successfully completed the DIGIT setup on the cloud. Use the URL you mentioned in your env.yaml (e.g. https://mysetup.digit.org) and create a grievance to ensure the PGR module deployed is working fine. Refer to the product documentation below for the steps.
Credentials:
Citizen: You can use your default mobile number (9999999999) to sign in using the default Mobile OTP 123456.
Employee: Username: GRO and password: eGov@4321
After creating a grievance and assigning it to the LME, capture a screenshot and share it to confirm your setup is working fine. After validating the PGR functionality, share the API response of the following request to assess the correctness of the DIGIT PGR deployment.
Follow the steps below to clean up the DIGIT setup, if required. This will delete the entire cluster and other cloud resources that were provisioned for the DIGIT Setup.
Run the command below to destroy previously-created infrastructure using Terraform -
The ELB is not deployed via Terraform. The ELB is created at deployment time by the Kubernetes Ingress setup. It has to be deleted manually by deleting the ingress service.
kubectl delete deployment nginx-ingress-controller -n <namespace>
kubectl delete svc nginx-ingress-controller -n <namespace>
Note: The namespace can be either egov or jenkins.
Delete S3 buckets manually from the AWS console and also verify if the ELB got deleted.
In case the ELB is not deleted, you need to delete ELB from the AWS console.
Run terraform destroy.
All done, we have successfully created infra on the cloud, deployed DIGIT, bootstrapped DIGIT, performed a transaction on PGR and finally destroyed the cluster.
Deployment on SDC
Running Kubernetes on-premise gives a cloud-native experience on SDC when it comes to deploying DIGIT.
Whether States have their own on-premise data centre or have decided to forego the various managed cloud solutions, there are a few things one should know when getting started with on-premise K8s.
One should be familiar with Kubernetes: the control plane consists of the kube-apiserver, kube-scheduler, kube-controller-manager and an etcd datastore. For managed cloud solutions like Google's Kubernetes Engine (GKE) or Azure's Kubernetes Service (AKS), it also includes the cloud-controller-manager. This is the component that connects the cluster to external cloud services to provide networking, storage, authentication, and other support features.
To successfully deploy a bespoke Kubernetes cluster and achieve a cloud-like experience on SDC, one needs to replicate all the same features you get with a managed solution. At a high level, this means that we probably want to:
Automate the deployment process
Choose a networking solution
Choose the right storage solution
Handle security and authentication
The subsequent sections look at each of these challenges individually, and provide enough of a context required to help in getting started.
Using a tool like Ansible can make deploying Kubernetes clusters on-premise trivial.
When deciding to manage your own Kubernetes clusters, we need to set up a few proofs-of-concept (PoC) clusters to learn how everything works, perform performance and conformance tests, and try out different configuration options.
After this phase, automating the deployment process is an important if not necessary step to ensure consistency across any clusters you build. For this, you have a few options, but the most popular are:
kubeadm: a low-level tool that helps you bootstrap a minimum viable Kubernetes cluster that conforms to best practices
kubespray: an Ansible playbook that helps deploy production-ready clusters
If you are already using Ansible, Kubespray is a great option. Otherwise, we recommend writing automation around kubeadm using your preferred playbook tool after using it a few times. This will also increase your confidence in and knowledge of Kubernetes.
Since there are many DIGIT services and the development code is part of various git repos, one needs to understand the concept of cicd-as-service which is open-sourced. This page guides you through the process of creating a CI/CD pipeline.
The initial steps for integrating any new service/app to the CI/CD are discussed below.
Once the desired service is ready for integration: decide the service name, type of service, and if DB migration is required or not. While you commit the source code of the service to the git repository, the following file should be added with the relevant details which are mentioned below:
Build-config.yml – It is present under the build directory in the repository
This file contains the below details used for creating the automated Jenkins pipeline job for the newly created service.
While integrating a new service/app, the above content needs to be added to the build-config.yml file of that app repository. For example: to onboard a new service called egov-test, the build-config.yml should be added as mentioned below.
If a job requires multiple images to be created (DB Migration) then it should be added as below,
Note - If a new repository is created then the build-config.yml is created under the build folder and the config values are added to it.
The git repository URL is then added to the Job Builder parameters
When the Jenkins Job Builder is executed, the CI pipeline gets created automatically based on the above details in build-config.yml. E.g. the egov-test job is created in the builds/DIGIT-OSS/core-services folder in Jenkins, since the build-config is edited under core-services and it should be on the master branch. Once the pipeline job is created, it can be executed for any feature branch with build parameters specifying the branch to be built (master or feature branch).
As a result of the pipeline execution, the respective app/service docker image is built and pushed to the Docker repository.
On the repo, provide read-only access to the GitHub users (created during the CI/CD deployment).
The Jenkins CI pipeline is configured and managed 'as code'.
Job Builder – Job Builder is a Generic Jenkins job which creates the Jenkins pipeline automatically which is then used to build the application, create the docker image of it and push the image to the Docker repository. The Job Builder job requires the git repository URL as a parameter. It clones the respective git repository and reads the build/build-config.yml file for each git repository and uses it to create the service build job.
Check and add your repo ssh URL in ci.yaml
If the git repository ssh URL is available, build the Job-Builder Job.
If the git repository URL is not available, check and add the same team.
The services are deployed and managed on a Kubernetes cluster in cloud platforms like AWS, Azure, GCP, OpenStack, etc. Here, we use helm charts to manage and generate the Kubernetes manifest files and use them for further deployment to the respective Kubernetes cluster. Each service is created as charts which have the below-mentioned files.
Note: The steps below are only for the introduction and implementation of new services.
To deploy a new service, you need to create a new helm chart for it (refer to the above example). The chart should be created under the charts/helm directory in the DIGIT-DevOps repository.
If you are going to introduce a new module with the help of multiple services, we suggest you create a new directory with your module name.
Example:
You can refer to the existing helm chart structure here
This chart can also be modified further based on user requirements.
The deployment of manifests to the Kubernetes cluster is made very simple and easy. There are Jenkins Jobs for each state and are environment-specific. We need to provide the image name or the service name for the respective Jenkins deployment job.
The deployment Jenkins job internally performs the following operations:
Reads the image name or the service name given and finds the chart that is specific to it.
Generates the Kubernetes manifests files from the chart using the helm template engine.
Execute the deployment manifest with the specified docker image(s) to the Kubernetes cluster.
The release chart helps deploy the product-specific modules in one click.
This section of the document walks you through the details of how to prepare a new release chart for existing products.
Git
Install Visual Studio Code IDE for better code visualization/editing capabilities
Clone the following DIGIT-DevOps repo, where we have all the release charts for you to refer to.
Create a new release version of the below products.
Select your product and copy the previous release version file and rename it with your new version.
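For example, assuming digit as the product and v2.6 as the previous release:
cd DIGIT-DevOps/config-as-code/product-release-charts/digit
cp dependancy_chart-digit-v2.6.yaml dependancy_chart-digit-<your_release_version>.yaml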
The above command copies dependancy_chart-digit-v2.6.yaml and renames it with your new release version.
Note: replace <your_release_version> with your new release version.
Navigate to the release file on your local machine. Open the file using Visualstudio or any other file editor.
Update the release version "v2.6" with your new release version.
Update the modules(core, business, utilities, m_pgr, m_property-tax,..etc) service images with new release service images.
Add new modules
name - add your module name in the "m_demo" format, i.e. "m" means module and "demo" would be your module name
dependencies - add your module dependencies (name of other modules)
services - add your module-specific new service images
This section of the document walks you through the details of how to prepare a new release chart for new products.
Git
GitHub Organization Account
Install Visual Studio Code IDE for better code visualization/editing capabilities
When you have a new product to introduce, you can follow the below steps to create the release chart for a new product.
eGov partners can follow the below steps:
Fork the DIGIT-DevOps repo to your GitHub organization account
Clone the forked DIGIT-DevOps repo to your local machine
git clone --branch release https://github.com/<your_organization_account_name>/DIGIT-DevOps.git
Note: replace <your_organization_account_name> with your GitHub organization account name.
Navigate to the product-release-charts folder and create a new folder with your product name:
cd DIGIT-DevOps/config-as-code/product-release-charts
mkdir <new_product_name>
Note: replace <new_product_name> with your new product name.
Create a new release chart file in the above-created product folder:
touch dependancy_chart-<new_product_name>-<release_version>.yaml
Open your release chart file dependancy_chart-<new_product_name>-<release_version>.yaml and start preparing it as mentioned in the below release template.
eGov users can follow the below steps:
Clone the DIGIT-DevOps repo to your local machine
git clone --branch release https://github.com/egovernments/DIGIT-DevOps.git
Navigate to the product-release-charts folder and create a new folder with your product name:
cd DIGIT-DevOps/config-as-code/product-release-charts
mkdir <new_product_name>
Note: replace <new_product_name> with your new product name
Create a new release chart file in the above-created product folder:
touch dependancy_chart-<new_product_name>-<release_version>.yaml
Open your release chart file dependancy_chart-<new_product_name>-<release_version>.yaml and start preparing it as mentioned in the below release template.
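As a starting point, the release chart for a new product could be laid out roughly as below. All names and versions are placeholders; use an existing product's release chart in the repository as the authoritative reference for the expected keys.

```yaml
# dependancy_chart-<new_product_name>-<release_version>.yaml - illustrative skeleton only
version: <release_version>
modules:
  - name: m_demo               # placeholder module
    dependencies: []
    services:
      - "demo-service:v1.0.0"  # placeholder service image
```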
There are multiple options available to deploy the solution to the cloud. Here we provide the steps to use terraform Infra-as-code.
Before we provision the cloud resources, we need to understand and be sure about what resources need to be provisioned by terraform to deploy DIGIT. The following picture shows the various key components (EKS, worker nodes, PostgreSQL DB, EBS volumes, load balancer).
Considering the above deployment architecture, the following is the resource graph that we are going to provision using terraform in a standard way so that every time and for every env, it'll have the same infra.
EKS Control Plane (Kubernetes Master)
Worker node group (VMs with the estimated number of vCPUs, memory)
EBS Volumes (Persistent Volumes)
RDS (PostgreSQL)
VPCs (Private network)
Users for access - deploy and read-only
(Optional) Create your own keybase key before you run the terraform
Use https://keybase.io/ to create your own PGP key. This creates both the public and private keys on your machine. Upload the public key to the keybase account that you have just created, give it a name, and ensure that you reference that name in your terraform configuration. This allows all sensitive information to be encrypted.
For example, in the eGov case the keybase user "egovterraform" was created and its public key uploaded here - https://keybase.io/egovterraform/pgp_keys.asc
You can use this portal to decrypt your secret key. To decrypt a PGP message, upload the PGP message, the PGP private key and the passphrase.
Fork the DIGIT-DevOps repository into your organization account using the GitHub web portal. Make sure to add the right users to the repository. Clone the forked DIGIT-DevOps repository (not the egovernments one). Navigate to the sample-aws directory, which contains the sample AWS infra provisioning script.
The sample-aws terraform script is provided as a helper/guide. An experienced DevOps can choose to modify or customize this as per the organization's infra needs.
Create Terraform backend to specify the location of the backend Terraform state file on S3 and the DynamoDB table used for the state file locking. This step is optional. S3 buckets have to be created outside of the Terraform script.
The remote state is simply storing that state file remotely, rather than on your local filesystem. In an enterprise project and/or if Terraform is used by a team, it is recommended to set up and use the remote state.
In the sample-aws/remote-state/main.tf file, specify the S3 bucket that will store all the Terraform states of the execution for tracking.
The terraform script once executed performs all of the below infrastructure setups.
Amazon EKS requires subnets in at least two different availability zones.
Create AWS VPC (Virtual Private Cloud).
Create two public and two private Subnets in different availability zones.
Create an Internet Gateway to provide internet access for services within VPC.
Create NAT Gateway in public subnets. It is used in private subnets to allow services to connect to the internet.
Create Routing Tables and associate subnets with them. Add required routing rules.
Create Security Groups and associate subnets with them. Add required routing rules.
EKS cluster setup
The main.tf inside the sample-aws folder contains the detailed resource definitions that need to be provisioned; please have a look at it.
Navigate to the directory: DIGIT-DevOps/Infra-as-code/terraform/sample-aws. Configurations are defined in variables.tf and provide the environment-specific cloud requirements.
Following are the values that you need to replace in the following files. You will be prompted for inputs for the blank ones during execution.
1. cluster_name - provide your EKS cluster name here.
2. availability_zones - This is a comma-separated list. If you would like your infra to have a multi-AZ setup, please provide multiple zones here. If you provide a single zone, all infra will be provisioned within that zone. For example:
3. bucket_name - if you've created a special S3 bucket to store Terraform state.
4. dbname - Any DB name of your choice. Note that this CANNOT have hyphens or other special characters. Underscore is permitted. Example: digit_test
All other variables are default and can be modified if the admin is knowledgeable about it.
5. In the providers.tf file in the same directory, modify the "profile" variable to point to the AWS profile that was created in Step 3.
Make sure your AWS session tokens are up to date in the ~/.aws/credentials file.
Before running Terraform, make sure to clean up .terraform.lock.hcl, .terraform, terraform.tfstate files if you are starting from scratch.
Once you have finished declaring the resources, you can deploy all resources.
Let's begin by running the terraform scripts to provision the infra required to deploy DIGIT on AWS.
First, cd into the following directory and run the following command to create the remote state.
Once the remote state is created, you are ready to provision DIGIT infra. Please run the following commands:
Important:
DB password will be asked for in the application stage. Please remember the password you have provided. It should be at least 8 characters long. Otherwise, RDS provisioning will fail.
The output of the apply command will be displayed on the console. Store this in a file somewhere. Values from this file will be used in the next step of deployment.
2. Use this link to get the kubeconfig file for the EKS cluster. The region code is the default region provided in the availability zones in variables.tf, e.g. ap-south-1. The EKS cluster name should also have been filled in variables.tf.
3. Finally, verify that you are able to connect to the cluster by running the following command
At this point, your basic infra has been provisioned. Please move to the next step to install DIGIT.
To destroy previously-created infrastructure with Terraform, run the command below:
The ELB is not deployed via Terraform; it is created at deployment time by the Kubernetes Ingress setup. It has to be deleted manually by deleting the ingress service.
kubectl delete deployment nginx-ingress-controller -n <namespace>
kubectl delete svc nginx-ingress-controller -n <namespace>
Note: Namespace can be one of egov or jenkins.
Delete S3 buckets manually from the AWS console and also verify if ELB got deleted.
If the ELB is not deleted, delete it from the AWS console.
Run terraform destroy.
Sometimes all artefacts that are associated with a deployment cannot be deleted through Terraform. For example, RDS instances might have to be deleted manually. It is recommended to log in to the AWS management console and look through the infra to delete any remnants.
Provision infra for DIGIT on Azure using Terraform
Azure Kubernetes Service (AKS) manages your hosted Kubernetes environment. AKS allows you to deploy and manage containerized applications without container orchestration expertise. AKS also enables you to do many common maintenance operations without taking your app offline. These operations include provisioning, upgrading, and scaling resources on demand.
To deploy the solution to the cloud there are several ways that we can choose. In this case, we will use terraform Infra-as-code.
Terraform is an open-source infrastructure as code (IaC) software tool that allows DevOps engineers to programmatically provision the physical resources an application requires to run.
Infrastructure as code is an IT practice that manages an application's underlying IT infrastructure through programming. This approach to resource allocation allows developers to logically manage, monitor and provision resources -- as opposed to requiring that an operations team manually configure each required resource.
Terraform users define and enforce infrastructure configurations by using a JSON-like configuration language called HCL (HashiCorp Configuration Language). HCL's simple syntax makes it easy for DevOps teams to provision and re-provision infrastructure across multiple clouds and on-premises data centres.
Before we provision the cloud resources, we need to understand and be sure about what resources need to be provisioned by Terraform to deploy DIGIT. The following picture shows the various key components. (AKS, Node Pools, Postgres DB, Volumes, Load Balancer)
Ideally, one would write the terraform script from scratch using this doc.
Here we have already written the terraform script that one can reuse/leverage that provisions the production-grade DIGIT Infra and can be customized with the user-specific configuration.
Clone the following DIGIT-DevOps repository, which has all the sample terraform scripts available for you to leverage.
2. Change the main.tf according to your requirements.
3. Declare the variables in variables.tf
Save the file and exit the editor.
4. Create a Terraform output file (output.tf) and paste the following code into the file.
Once you have finished declaring the resources, you can deploy all resources.
terraform init: initializes a working directory containing Terraform configuration files.
terraform plan: creates an execution plan, which lets you preview the changes that Terraform plans to make to your infrastructure.
terraform apply: executes the actions proposed in a Terraform plan to create or update infrastructure.
After the complete creation, you can see resources in your Azure account.
Now we know what the terraform script does, the resources graph that it provisions and what custom values should be given with respect to your environment. The next step is to begin to run the terraform scripts to provision infra required to Deploy DIGIT on Azure.
Use the cd command to move into the following directory, then run the following commands one by one and watch the output closely.
The Kubernetes tools can be used to verify the newly created cluster.
Once Terraform Apply execution is complete, it generates the Kubernetes configuration file or you can get it from Terraform state.
Use the below command to get kubeconfig. It will automatically store your kubeconfig in .kube folder.
3. Verify the health of the cluster.
The details of the worker nodes should reflect a status of Ready for all.
Use a “Resource Request” and a “Resource Limit” when defining how many resources a container within a pod should receive.
Containerising applications and running them on Kubernetes does not mean we can forget all about resource utilization. Our thought process may have changed because we can easily scale out our application as demand increases. We need to consider frequently how our containers might fight with each other for resources. Resource requests and limits can be used to help stop the “noisy neighbour” problem in a Kubernetes Cluster.
To put things simply, a resource request specifies the minimum amount of resources a container needs to successfully run. Thought of in another way, this is a guarantee from Kubernetes that you’ll always have this amount of either CPU or Memory allocated to the container.
Why would you worry about the minimum amount of resources guaranteed to a pod? Well, it's to help prevent one container from using up all the node’s resources and starving the other containers from CPU or memory. For instance, if I had two containers on a node, one container could request 100% of that node's processor. Meanwhile, the other container would likely not be working very well because the processor is being monopolized by its “noisy neighbour”.
What a resource request can do is ensure that at least a small part of that processor’s time is reserved for both containers. This way, if there is resource contention, each pod will have a guaranteed, minimum amount of resources with which to still function.
As you might guess, a resource limit is the maximum amount of CPU or memory that can be used by a container. The limit represents the upper bounds of how much CPU or memory that a container within a pod can consume in a Kubernetes cluster, regardless of whether or not the cluster is under resource contention.
Limits prevent containers from taking up more resources on the cluster than you’re willing to let them.
As a general rule, all containers should have a request for memory and CPU before deploying to a cluster. This will ensure that if resources are running low, your container can still do the minimum amount of work to stay in a healthy state until the resources free up again (hopefully).
Limits are often used in conjunction with requests to create a “guaranteed pod”. This is where the request and limit are set to the same value. In that situation, the container will always have the same amount of CPU available to it, no more or less.
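For instance, a container spec along these lines (a minimal sketch with placeholder names, image and values) sets the request and limit to the same value to get that guaranteed behaviour:

```yaml
# Minimal sketch of a "guaranteed" container: requests and limits are identical
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo          # placeholder name
spec:
  containers:
    - name: app
      image: nginx               # placeholder image
      resources:
        requests:
          cpu: "500m"            # 0.5 CPU core guaranteed to the container
          memory: "256Mi"
        limits:
          cpu: "500m"            # the container can never exceed 0.5 CPU core
          memory: "256Mi"
```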
At this point, you may be thinking about adding a high “request” value to make sure you have plenty of resources available for your container. This might sound like a good idea but have dramatic consequences for scheduling on the Kubernetes cluster. If you set a high CPU request, for example, 2 CPUs, then your pod will ONLY be able to be scheduled on Kubernetes nodes that have 2 full CPUs available that aren’t reserved by other pods’ requests. In the example below, the 2 vCPU pods couldn’t be scheduled on the cluster. However, if you were to lower the “request” amount to say 1 vCPU, it could.
Let us try out using a CPU limit on a pod and see what happens when we try to request more CPU than we’re allowed to have. Before we set the limit though, let us look at a pod with a single container under normal conditions. I’ve deployed a resource consumer container in my cluster and by default, you can see that I am using 1m CPU(cores) and 6 Mi(bytes) of memory.
NOTE: CPU is measured in millicores, so 1000m = 1 CPU core. Memory is measured in mebibytes (Mi).
Ok, now that we have seen the “no-load” state, let us add some CPU load by making a request to the pod. Here, we increased the CPU usage on the container to 400 millicores.
After the metrics start coming in, you can see that we got roughly 400m used on the container as you’d expect to see.
Now we have deleted the container and will edit the deployment manifest so that it has a limit on CPU.
After redeploying the container and again increasing the CPU load to 400m, we can see that the container is throttled to 300m instead. We have effectively “limited” the resources the container could consume from the cluster.
Next, we deployed two pods into the Kubernetes cluster and those pods are on the same worker node for a simple example of contention. We have got a guaranteed pod that has 1000m CPU set as a limit but also as a request. The other pod is unbounded, meaning there is no limit on how much CPU it can utilize.
After the deployment, each pod is really not using any resources as you can see here.
We make a request to increase the load on the non-guaranteed pod.
And if we look at the container's resources you can see that even though the container wants to use a 2000m CPU, it is actually using a 1000m CPU only. The reason for this is that the guaranteed pod is guaranteed a 1000m CPU, whether it is actively using that CPU or not.
Kubernetes uses resource requests to set a minimum amount of resources for a given container so that it can be used if it needs it. You can also set a resource limit to set the maximum amount of resources a pod can utilize.
Taking these two concepts and using them together can ensure that your critical pods always have the resources that they need to stay healthy. They can also be configured to take advantage of shared resources within the cluster.
Be careful setting resource requests too high so your Kubernetes scheduler can still schedule these pods. Good luck!
Watch the video below to access information on how to deploy DIGIT services on Kubernetes and prepare deployment manifests for various services along with their configurations, secrets, etc. It also discusses the maintenance of environment-specific changes.
This section contains architectural details about DIGIT deployment. It discusses the various activities in a sequence of steps to provision required infra and deploy DIGIT.
Every code commit is well-reviewed and squash-merged to branches through pull requests.
Merges trigger the CI pipeline, which ensures code quality, vulnerability assessments and CI tests before building the artefacts.
Artefacts are version-controlled using semantic versioning based on the nature of the change.
After successful CI, Jenkins bakes the Docker Images with the versioned artefacts and pushes the baked Docker image to Docker Registry.
Deployment Pipeline pulls the built Image and pushes it to the corresponding environment.
As all the DIGIT services are containerized and deployed on Kubernetes, we need to prepare deployment manifests. The same can be found here.
DIGIT has built helm charts using the standard helm approach to ease managing the service-specific configs, customisations, switch/toggle, secrets, etc.
A Golang-based deployment script reads the values from the helm chart templates and deploys them into the cluster.
Each env will have one master yaml template that will have the definition of all the services to be deployed, and their dependencies like Config, Env, Secrets, DB Credentials, Persistent Volumes, Manifest, Routing Rules, etc.
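As a rough illustration only, an environment master yaml ties these pieces together along the lines sketched below. The key names here are assumptions for illustration; the authoritative keys are defined by the charts and the existing environment files in the DIGIT-DevOps repo.

```yaml
# Illustrative fragment of an environment master yaml (key names are placeholders,
# not the authoritative schema - see the existing environment files in DIGIT-DevOps)
global:
  domain: demo.example.org      # environment-specific domain name
cluster-configs:
  namespaces:
    create: true                # assumed flag for creating namespaces
demo-service:                   # hypothetical per-service override block
  replicas: 1
  memory_limits: 512Mi          # image tags, env vars, resources, secrets references, etc.
```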
In Kubernetes, an Ingress is an object that allows access to your Kubernetes services from outside the Kubernetes cluster. You configure access by creating a collection of rules that define which inbound connections reach which services.
This lets you consolidate your routing rules into a single resource. For example, you might want to send requests to example.com/api/v1/ to an api-v1 service, and requests to example.com/api/v2/ to the api-v2 service. With an Ingress, you can easily set this up without creating a bunch of LoadBalancers or exposing each service on the Node.
An API object that manages external access to the services in a cluster, typically HTTP. Ingress may provide load balancing, SSL termination and name-based virtual hosting.
For clarity, this guide defines the following terms:
Node: A worker machine in Kubernetes, part of a cluster.
Cluster: A set of Nodes that run containerized applications managed by Kubernetes. For this example, and in most common Kubernetes deployments, nodes in the cluster are not part of the public internet.
Edge router: A router that enforces the firewall policy for your cluster. This could be a gateway managed by a cloud provider or a physical piece of hardware.
Cluster network: A set of links, logical or physical, that facilitate communication within a cluster according to the Kubernetes networking model.
Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
An Ingress may be configured to give Services externally-reachable URLs, load balance traffic, terminate SSL / TLS, and offer name based virtual hosting. An Ingress controller is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.
An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer.
You must have an ingress controller to satisfy an Ingress. Only creating an Ingress resource has no effect.
You may need to deploy an Ingress controller such as ingress-nginx. You can choose from a number of Ingress controllers.
Ideally, all Ingress controllers should fit the reference specification. In reality, the various Ingress controllers operate slightly differently.
An Ingress resource example:
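A minimal Ingress resource (the name, path, backend service and port below are placeholders) looks like this:

```yaml
# Minimal Ingress example (names, paths and ports are placeholders)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: minimal-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /   # controller-specific annotation
spec:
  rules:
    - http:
        paths:
          - path: /testpath
            pathType: Prefix
            backend:
              service:
                name: test      # backend Service name
                port:
                  number: 80    # backend Service port
```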
As with all other Kubernetes resources, an Ingress needs apiVersion, kind, and metadata fields. The name of an Ingress object must be a valid DNS subdomain name. For general information about working with config files, see deploying applications, configuring containers, managing resources. Ingress frequently uses annotations to configure some options depending on the Ingress controller, an example of which is the rewrite-target annotation. Different Ingress controllers support different annotations. Review the documentation for your choice of Ingress controller to learn which annotations are supported.
The Ingress spec has all the information needed to configure a load balancer or proxy server. Most importantly, it contains a list of rules matched against all incoming requests. Ingress resource only supports rules for directing HTTP(S) traffic.
Each HTTP rule contains the following information:
An optional host. In this example, no host is specified, so the rule applies to all inbound HTTP traffic through the IP address specified. If a host is provided (for example, foo.bar.com), the rules apply to that host.
A list of paths (for example, /testpath), each of which has an associated backend defined with a service.name and a service.port.name or service.port.number. Both the host and path must match the content of an incoming request before the load balancer directs traffic to the referenced Service.
A backend is a combination of Service and port names as described in the Service doc or a custom resource backend by way of a CRD. HTTP (and HTTPS) requests to the Ingress that matches the host and path of the rule are sent to the listed backend.
A default backend is often configured in an Ingress controller to service any requests that do not match a path in the spec.
Learn about the Ingress API
Learn about Cert-manager
An overview of the various probes that can be set up to ensure service deployment and service availability are verified automatically.
Kubernetes determines the state of a service based on readiness, liveness, and startup probes to detect and deal with unhealthy situations. It may happen that the application needs to initialize some state, make database connections, or load data before handling application logic. The gap in time between when the application is actually ready and when Kubernetes thinks it is ready becomes an issue when the deployment begins to scale and unready applications receive traffic and send back 500 errors.
Many developers assume that basic pod setup is adequate, especially when the application inside the pod is configured with daemon process managers (e.g. PM2 for Node.js). However, since Kubernetes deems a pod healthy and ready for requests as soon as all the containers start, the application may receive traffic before it is actually ready.
Kubernetes supports readiness and liveness probes for versions ≤ 1.15. Startup probes were added in 1.16 as an alpha feature and graduated to beta in 1.18 (WARNING: 1.16 deprecated several Kubernetes APIs. Use this to check for compatibility).
All the probes have the following parameters:
initialDelaySeconds: number of seconds to wait before initiating liveness or readiness probes
periodSeconds: how often to check the probe
timeoutSeconds: number of seconds before marking the probe as timing out (failing the health check)
successThreshold: minimum number of consecutive successful checks for the probe to pass
failureThreshold: number of retries before marking the probe as failed. For liveness probes, this will lead to the pod restarting. For readiness probes, this will mark the pod as unready.
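Put together, these parameters appear on a probe definition roughly as follows (the endpoint, port and values are placeholders):

```yaml
# Illustrative liveness probe showing the common parameters (placeholder values)
livenessProbe:
  httpGet:
    path: /healthz            # placeholder endpoint
    port: 8080
  initialDelaySeconds: 10     # wait 10s after container start before the first probe
  periodSeconds: 5            # probe every 5s
  timeoutSeconds: 2           # fail the check if no response within 2s
  successThreshold: 1         # one success marks the probe as passing
  failureThreshold: 3         # three consecutive failures restart the pod (liveness)
```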
Readiness probes are used to let Kubelet know when the application is ready to accept new traffic. If the application needs some time to initialize the state after the process has started, configure the readiness probe to tell Kubernetes to wait before sending new traffic. A primary use case for readiness probes is directing traffic to deployments behind a service.
One important thing to note with readiness probes is that they run during the pod’s entire lifecycle. This means that readiness probes run not only at startup but repeatedly for as long as the pod is running. This is to deal with situations where the application is temporarily unavailable (i.e. loading large data, waiting on external connections). In this case, we don’t necessarily want to kill the application but rather wait for it to recover. Readiness probes are used to detect this scenario and not send traffic to these pods until they pass the readiness check again.
Liveness probes are used to restart unhealthy containers. The Kubelet periodically pings the liveness probe, determines the health, and kills the pod if it fails the liveness check.
Liveness checks can help the application recover from a deadlock situation. Without liveness checks, Kubernetes deems a deadlocked pod healthy since the underlying process continues to run from Kubernetes’s perspective. By configuring the liveness probe, the Kubelet can detect that the application is in a bad state and restarts the pod to restore availability.
Startup probes are similar to readiness probes but only executed at startup. They are optimized for slow-starting containers or applications with unpredictable initialization processes. With readiness probes, we can configure the initialDelaySeconds to determine how long to wait before probing for readiness. Now consider an application that occasionally needs to download large amounts of data or do an expensive operation at the start of the process. Since initialDelaySeconds is a static number, we are forced to always assume the worst-case scenario (or extend the failureThreshold, which may affect long-running behaviour) and wait for a long time even when the application does not need to carry out long-running initialization steps. With startup probes, we can instead configure failureThreshold and periodSeconds to model this uncertainty better. For example, setting failureThreshold to 15 and periodSeconds to 5 means the application will get 15 x 5 = 75s to start up before it fails.
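That scenario could be modelled with a startup probe like the sketch below (the endpoint and port are placeholders):

```yaml
# Illustrative startup probe: up to 15 x 5 = 75s allowed for slow initialization
startupProbe:
  httpGet:
    path: /healthz        # placeholder endpoint
    port: 8080
  failureThreshold: 15    # allow up to 15 failed attempts
  periodSeconds: 5        # probe every 5 seconds
```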
Now that we understand the different types of probes, we can examine the three different ways to configure each probe.
The Kubelet sends an HTTP GET request to an endpoint and checks for a 2xx or 3xx response. You can reuse an existing HTTP endpoint or set up a lightweight HTTP server for probing purposes (e.g. an Express server with a /healthz endpoint).
HTTP probes take in additional parameters:
host: hostname to connect to (default: pod’s IP)
scheme: HTTP (default) or HTTPS
path: path on the HTTP/S server
httpHeaders: custom headers if you need header values for authentication, CORS settings, etc.
port: name or number of the port to access the server
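An HTTP probe using these parameters might look like the following sketch (the path, port and header values are placeholders):

```yaml
# Illustrative HTTP GET readiness probe with the optional parameters (placeholder values)
readinessProbe:
  httpGet:
    scheme: HTTP
    path: /healthz
    port: 8080
    httpHeaders:
      - name: X-Custom-Header   # placeholder header for auth/CORS scenarios
        value: probe
  periodSeconds: 10
```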
If you just need to check whether or not a TCP connection can be made, you can specify a TCP probe. The pod is marked healthy if it can establish a TCP connection. Using a TCP probe may be useful for a gRPC or FTP server where HTTP calls may not be suitable.
Finally, a probe can be configured to run a shell command. The check passes if the command returns with exit code 0; otherwise, the pod is marked as unhealthy. This type of probe may be useful if it is not desirable to expose an HTTP server/port or if it is easier to check initialization steps via command (e.g. check if a configuration file has been created, run a CLI command).
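Minimal sketches of these two probe types follow (the port and command are placeholders):

```yaml
# TCP probe: the pod is healthy if a TCP connection can be established on the port
livenessProbe:
  tcpSocket:
    port: 5432                               # placeholder port, e.g. a gRPC/FTP/DB service
  periodSeconds: 10

# Command probe: healthy if the command exits with code 0
readinessProbe:
  exec:
    command: ["cat", "/tmp/config-ready"]    # placeholder check for an init artifact
  periodSeconds: 10
```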
The exact parameters for the probes depend on your application, but here are some general best practices to get started:
For older (≤ 1.15) Kubernetes clusters, use a readiness probe with an initial delay to deal with the container startup phase (use p99 times for this). But make this check lightweight since the readiness probe will execute throughout the entire lifecycle of the pod. We don’t want the probe to time out because the readiness check takes a long time to compute.
For newer (≥ 1.16) Kubernetes clusters, use a startup probe for applications with unpredictable or variable startup times. The startup probe may share the same endpoint (e.g. /healthz) as the readiness and liveness probes, but set its failureThreshold higher than the other probes to account for longer start times, while keeping more reasonable times to failure for the liveness and readiness checks.
Readiness and liveness probes may share the same endpoint if the readiness probes aren’t used for other signalling purposes. If there’s only one pod (i.e. using a Vertical Pod Autoscaler), set the readiness probe to address the startup behaviour and use the liveness probe to determine health. In this case, marking the pod unhealthy means downtime.
Readiness checks can be used in various ways to signal system degradation. For example, if the application loses connection to the database, readiness probes may be used to temporarily block new requests and allow the system to reconnect. It can also be used to load balance work to other pods by marking busy pods as not ready.
In short, well-defined probes generally lead to better resilience and availability. Be sure to observe the startup times and system behaviour to tweak the probe settings as the applications change.
Finally, given the importance of Kubernetes probes, you can use a Kubernetes resource analysis tool to detect missing probes. These tools can be run against existing clusters or be baked into the CI/CD process to automatically reject workloads without properly configured resources.
Once the cluster is ready and healthy you can start deploying backbones services.
Deploy the configuration and deployment for the following services list:
Backbone (Redis, ZooKeeper-v2, Kafka-v2, elasticsearch-data-v1, elasticsearch-client-v1, elasticsearch-master-v1)
Gateway (Zuul, nginx-ingress-controller)
Understanding of VM Instances, LoadBalancers, SecurityGroups/Firewalls, nginx, DB Instances, and data volumes.
Experience with Kubernetes, Docker, Jenkins, helm, golang, Infra-as-code.
Deploy the configuration and deployment for the backbone services:
Modify the global domain and set namespaces create to true
Modify the below-mentioned changes for each backbone service:
Eg. For Kafka-v2
If you are using AWS as a cloud provider, change the respective volume ids and zones. (You will get the volume ids and zone details from either a remote state bucket or from the AWS portal).
Eg. Kafka-v2
If you are using the Azure cloud provider, change the diskName and diskUri. (You will get the disk details either from the remote state bucket or from the Azure portal.)
Eg. Kafka-v2
If you are using iSCSI, change the targetPortal and iqn (see the illustrative pattern below).
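The exact keys come from the kafka-v2 chart and your environment file; purely as an illustration (every key name, id and path below is a placeholder, not the chart's actual schema), the volume overrides follow this kind of pattern:

```yaml
# Illustrative only - use the real key names from the kafka-v2 chart/environment file
kafka-v2:
  persistence:
    aws:                                  # AWS: EBS volume ids and their zones
      - volumeId: vol-0abc1234def567890
        zone: ap-south-1a
    azure:                                # Azure: managed disk name and URI
      diskName: kafka-disk-0
      diskUri: /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Compute/disks/kafka-disk-0
    iscsi:                                # iSCSI: target portal and IQN
      targetPortal: 10.0.0.10:3260
      iqn: iqn.2022-01.org.example:kafka-0
```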
Deploy the backbone services using the go command
Modify the “dev” environment name with your respective environment name.
Flags:
e --- Environment name
p --- Print the manifest
c --- Enable Cluster Configs
Check the status of pods
: a resource analysis tool with a nice dashboard that can also be used as a validating webhook or CLI tool.
: a static code analysis tool that works with Helm, Kustomize, and standard YAML files.
: read-only utility tool that scans Kubernetes clusters and reports potential issues with configurations.
Clone the git repo. Copy the existing environment and secrets files and rename them with the new environment name (eg. <environment>.yaml and <environment>-secrets.yaml).