Find the pre-requisites for deploying DIGIT platform services on AWS
The Amazon Elastic Kubernetes Service (EKS) is one of the AWS services for deploying, managing and scaling any distributed and containerized workloads. Here we can provision the EKS cluster on AWS from the ground up using terraform (infra-as-code) and then deploy the DIGIT platform services as config-as-code using Helm.
Know about EKS: https://www.youtube.com/watch?v=SsUnPWp5ilc
Know what is terraform: https://youtu.be/h970ZBgKINg
AWS account with admin access to provision EKS Service. You can always subscribe to a free AWS account to learn the basics and try, but there is a limit to what is offered as free. For this demo, you need a commercial subscription to the EKS service. If you want to try for a day or two, it might cost you about Rs 500 - 1000.
Note (for eGov internal folks only): post the demo, request AWS access for 4 hours. Time-bound access to eGov's training AWS account is available upon request, subject to the number of slots available per day.
Install kubectl (any version) on the local machine - it helps in interaction with the Kubernetes cluster.
Install Helm - this helps package the services along with the configurations, environments, secrets, etc into Kubernetes manifests.
Please refer to tfswitch documentation for different platforms. Terraform version 0.14.10 can be installed directly as well.
5. Run tfswitch and it will show a list of terraform versions. Scroll down and select terraform version (0.14.10) for the Infra-as-code (IaC) to provision cloud resources as code. This provides the desired resource graph and also helps destroy the cluster in one go.
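For reference, a minimal sketch of selecting the version with tfswitch and verifying it (assuming tfswitch and terraform are on your PATH):
tfswitch              # pick 0.14.10 from the interactive list
terraform -v          # should report Terraform v0.14.10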
Steps to setup the AWS account for deployment
Follow the details below to set up your AWS account before you proceed with the DIGIT deployment.
Install AWS CLI on your local machine so that you can use AWS CLI commands to provision and manage the cloud resources on your account.
Install AWS IAM Authenticator - it helps you authenticate your connection from your local machine so that you can deploy DIGIT services.
When you have the command line access configured, everything is set for you to proceed with the terraform to provision the DIGIT Infra-as-code.
Provision infra for DIGIT on AWS using Terraform
The Amazon Elastic Kubernetes Service (EKS) is one of the AWS services for deploying, managing, and scaling any distributed and containerized workloads. Here we provision the EKS cluster on AWS from the ground up in an automated way (infra-as-code) using terraform, and then deploy the DIGIT services config-as-code using Helm.
Know about EKS: https://www.youtube.com/watch?v=SsUnPWp5ilc
Know what is terraform: https://youtu.be/h970ZBgKINg
There are multiple options available to deploy the solution to the cloud. Here we provide the steps to use terraform Infra-as-code.
Before we provision the cloud resources, we need to understand and be sure about what resources need to be provisioned by terraform to deploy DIGIT. The following picture shows the various key components. (EKS, Worker Nodes, PostGres DB, EBS Volumes, Load Balancer).
Considering the above deployment architecture, the following is the resource graph that we are going to provision using terraform in a standard way so that every time and for every env, it'll have the same infra.
EKS Control Plane (Kubernetes Master)
Worker node group (VMs with the estimated number of vCPUs, Memory)
EBS Volumes (Persistent Volumes)
RDS (PostGres)
VPCs (Private network)
Users to access, deploy and read-only
(Optional) Create your own keybase key before you run the terraform
Fork the DIGIT-DevOps repository into your organization account using the GitHub web portal. Make sure to add the right users to the repository. Clone the forked DIGIT-DevOps repository (not the egovernments one) and navigate to the sample-aws directory, which contains the sample AWS infra provisioning script.
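A hedged sketch of cloning the fork and moving into the sample directory (replace <your-org> with your GitHub organization name):
git clone https://github.com/<your-org>/DIGIT-DevOps.git
cd DIGIT-DevOps/Infra-as-code/terraform/sample-aws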
The sample-aws terraform script is provided as a helper/guide. An experienced DevOps engineer can modify or customize it as per the organization's infra needs.
Create Terraform backend to specify the location of the backend Terraform state file on S3 and the DynamoDB table used for the state file locking. This step is optional. S3 buckets have to be created outside of the Terraform script.
The remote state is simply storing that state file remotely, rather than on your local filesystem. In an enterprise project and/or if Terraform is used by a team, it is recommended to set up and use the remote state.
The terraform script, once executed, performs all of the infrastructure setup below.
Amazon EKS requires subnets in at least two different availability zones.
Create AWS VPC (Virtual Private Cloud).
Create two public and two private Subnets in different availability zones.
Create an Internet Gateway to provide internet access for services within VPC.
Create NAT Gateway in public subnets. It is used in private subnets to allow services to connect to the internet.
Create Routing Tables and associate subnets with them. Add required routing rules.
Create Security Groups and associate subnets with them. Add the required ingress/egress rules.
EKS cluster setup
Navigate to the directory DIGIT-DevOps/Infra-as-code/terraform/sample-aws. Configurations are defined in variables.tf; provide the environment-specific cloud requirements there.
Following are the values that you need to replace in the files below. The blank ones will prompt for input during execution.
cluster_name - provide your EKS cluster name here.
availability_zones - This is a comma-separated list. If you would like your infra to have a multi-AZ setup, provide multiple zones here. If you provide a single zone, all infra will be provisioned within that zone. For example: ap-south-1a,ap-south-1b
3. bucket_name - if you've created a special S3 bucket to store Terraform state.
4. dbname - Any DB name of your choice. Note that this CANNOT have hyphens or other special characters. Underscore is permitted. Example: digit_test
All other variables have defaults and can be modified if the admin understands their impact.
5. In the providers.tf file in the same directory, modify the "profile" variable to point to the AWS profile that was created in Step 3.
Make sure your AWS session tokens are up to date in the ~/.aws/credentials file.
Before running Terraform, clean up the .terraform.lock.hcl, .terraform, and terraform.tfstate files if you are starting from scratch.
Once you have finished declaring the resources, you can deploy all resources.
Let's run the terraform scripts to provision the infra required to deploy DIGIT on AWS.
First, cd into the following directory and run the following command to create the remote state.
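A minimal sketch, assuming the sample-aws layout referenced in this guide:
cd DIGIT-DevOps/Infra-as-code/terraform/sample-aws/remote-state
terraform init
terraform apply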
Once the remote state is created, you are ready to provision DIGIT infra. Please run the following commands:
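For example, from the sample-aws directory (one level above remote-state):
cd ..
terraform init
terraform plan
terraform apply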
Important:
DB password will be asked for in the application stage. Please remember the password you have provided. It should be at least 8 characters long. Otherwise, RDS provisioning will fail.
The output of the apply command will be displayed on the console. Store this in a file somewhere. Values from this file will be used in the next step of deployment.
3. Finally, verify that you are able to connect to the cluster by running the following command
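A hedged example of fetching the kubeconfig and checking connectivity (cluster name and region are the values you set in variables.tf):
aws eks update-kubeconfig --name <cluster_name> --region <region>
kubectl get nodes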
At this point, your basic infra has been provisioned. Please move to the next step to install DIGIT.
To destroy previously-created infrastructure with Terraform, run the command below:
The ELB is not deployed via Terraform. The ELB is created at deployment time by the Kubernetes Ingress setup and has to be deleted manually by deleting the ingress service.
kubectl delete deployment nginx-ingress-controller -n <namespace>
kubectl delete svc nginx-ingress-controller -n <namespace>
Note: Namespace can be one of egov or jenkins.
Delete S3 buckets manually from the AWS console and also verify if ELB got deleted.
Run terraform destroy.
Sometimes all artefacts that are associated with a deployment cannot be deleted through Terraform. For example, RDS instances might have to be deleted manually. It is recommended to log in to the AWS management console and look through the infra to delete any remnants.
Steps to prepare the deployment configuration file
It's important to prepare a global deployment configuration yaml file that contains all necessary user-specific custom values like URL, gateways, persistent storage ids, DB details etc.
Know the basics of Kubernetes:
Know the commands
Know kubernetes manifests:
Know how to manage env values, secrets of any service deployed in kubernetes
Know how to port forward to a pod running inside k8s cluster and work locally
Know sops to secure your keys/creds:
After the Kubernetes cluster setup, the deployment has two stages. As part of this sample exercise, we deploy the PGR module and show the various configurations required. The deployment steps are similar for all other modules; only the prerequisites differ depending on features such as the SMS gateway, payment gateway, etc.
Navigate to the following file in your local machine from the previously cloned DevOps git repository.
root@ip:/# git clone -b release https://github.com/egovernments/DIGIT-DevOps
Step 2: After cloning the repo, cd into the DIGIT-DevOps folder and run the "code ." command. This opens the visual editor with all the files from the DIGIT-DevOps repo.
Here you need to replace the following as per your values
SMS gateway to receive OTP, transaction mobile notification, etc.
MDMS, Config repo URL, here is where you provide master data, tenants and various user/role access details.
GMap key for the location service
Payment gateway, in case you use PT, TL, etc
Step 4: Update your credentials and sensitive data in the secret file as per your details.
SOPS expects an encryption key that it uses to encrypt/decrypt the specified plain text and keep the details secure. There are a couple of options you can use to generate the encryption key.
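For instance, a PGP/GPG key can be generated locally as sketched below (key details are illustrative):
gpg --full-generate-key                      # follow the prompts to create the key pair
gpg --list-secret-keys --keyid-format LONG   # note the fingerprint for use with sops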
Step 5: Important: Fork the following repos, which contain the master data and default configs (master data, ULB, tenant details, users, etc.) that you will later customize for your specific implementation, into your respective GitHub organization account.
New GitHub users should be given access to the previously forked repos.
Step 6: Update the deployment configs for the below as per your specification:
Number of replicas/scale of each individual service (Depending on whether dev or prod load)
You must update sms gateway, email gateway, and payment gateway details for the notification and payment gateway services, etc.
Update the config, MDMS github repos wherever marked
Update GMap key (In case you are using Google Map services in your PGR, PT, TL, etc)
URL/DNS on which the DIGIT will be exposed.
SSL certificate for the above URL.
Any specific endpoints configs (Internal/external)
If you have any questions please write to us.
Make sure to use the appropriate discussion category and labels to address the issues better.
Complete DIGIT installation step-by-step Instructions across various Infra types like public & private clouds
The quickstart would have helped you get your hands dirty and build the Kubernetes cluster on a local/single VM instance - which you can consider for either local development or to understand the details involved in infra and deployment.
However, DIGIT is a platform, and depending on the scale and performance required, running DIGIT in production demands advanced capabilities like HA, DR, autoscaling, resiliency, etc. All these capabilities are supported by commercial clouds like AWS, Google, Azure, VMware, OpenStack, etc., and also by private clouds like NIC and a few SDC-implemented clouds. These cloud providers offer Kubernetes-as-a-managed-service, which makes the entire infra setup and management seamless and automated through infra-as-code and config-as-code.
Know the basics of Kubernetes:
Know the commands
Know kubernetes manifests:
Know how to manage env values, secrets of any service deployed in kubernetes
Know how to port forward to a pod running inside k8s cluster and work locally
Know sops to secure your keys/creds:
Unlike quickstart, full installation requires state/user-specific configurations ready before proceeding with the deployment.
You need to have a fully qualified DNS (URL) - (Should not be a dummy)
Persistent storage depends on the cloud you are using for Kafka, ES, etc.
Either a standalone or a hosted PostGres DB above v11.x
GeoLocation provider configs (Google Location API), SMS Gateway, Payment Gateway, etc.
The newly created user must have access to the MDMS and config forked repo.
Choose your cloud and follow the instructions to set up a Kubernetes cluster before moving on to deployment.
Post-infra setup (Kubernetes Cluster), the deployment involves 2 stages and 2 modes. Check out the stages first and then the modes. As part of a sample exercise, we will deploy the PGR module. However, deployment steps are similar. The pre-requisites have to be configured accordingly.
Each service's global and local env variables
Number of replicas/scale of individual services (Depending on whether dev or prod)
MDMS, config repos (Master data, ULB, tenant details, users, etc)
SMS gateway, email gateway, payment gateway
GMap key (In case you are using Google Map services in your PGR, PT, TL, etc)
S3 Bucket for Filestore
URL/DNS on which the DIGIT will be exposed
SSL certificate for the above URL
End-points configs (Internal/external)
Stage 2: Run the digit_setup deployment script and simply answer the questions that it asks.
All done, wait and watch for 10 min, you'll have the DIGIT setup completed and the application will be running on the given URL.
Essentially, DIGIT deployment means that we need to generate Kubernetes manifests for each individual service. We use a tool called Helm, which is an easy, effective and customizable packaging and deployment solution. Depending on where and in which environment you initiate the deployment, there are two deployment modes.
From local machine - whatever we are trying in this sample exercise so far.
Post-deployment - the application is now accessible from the configured domain.
To try out PGR employee login - Create a sample tenant, city, and user to log in and assign an LME employee role using the seed script.
By now we have successfully completed the DIGIT setup on the cloud. Use the URL that you mentioned in your env.yaml Eg: https://mysetup.digit.org and create a grievance to ensure the PGR module deployed is working fine. Refer to the product documentation below for the steps.
Credentials:
Citizen: You can use your default mobile number (9999999999) to sign in using the default Mobile OTP 123456.
Employee: Username: GRO and password: eGov@4321
Post grievance creation and assignment of the same to LME, capture the screenshot of the same and share it to ensure your setup is working fine.
After validating the PGR functionality, share the API response of the following request to assess the correctness of the DIGIT PGR deployment.
Finally, clean up the DIGIT setup if you wish, using the following command. This will delete the entire cluster and other cloud resources that were provisioned for the DIGIT setup.
To destroy previously-created infrastructure with Terraform, run the command below:
The ELB is not deployed via Terraform. The ELB is created at deployment time by the Kubernetes Ingress setup and has to be deleted manually by deleting the ingress service.
kubectl delete deployment nginx-ingress-controller -n <namespace>
kubectl delete svc nginx-ingress-controller -n <namespace>
Note: Namespace can be one of egov or jenkins.
Delete S3 buckets manually from the AWS console and also verify if ELB got deleted.
Run terraform destroy.
All done, we have successfully created infra on the cloud, deployed DIGIT, bootstrapped DIGIT, performed a transaction on PGR and finally destroyed the cluster.
This page details the steps to deploy the core platform services and reference applications.
The steps here can be used to deploy:
DIGIT core platform services
Public Grievance & Redressal module
Trade Licence module
Property Tax module
Water & Sewerage module etc.
DIGIT uses Golang (v1.13.3 required) automated scripts to deploy the builds onto Kubernetes.
All DIGIT services are packaged using helm charts
kubectl is a CLI to connect to the Kubernetes cluster from your machine
Install Postman for making API calls
Install an IDE such as Visual Studio Code for better code/configuration editing capabilities
Install Golang to run the DIGIT bootstrap scripts
Once all the deployments configs are ready, run the command given below. Input the necessary details as prompted on the screen and the interactive installer will take care of the rest.
All done, wait and watch for 10 min. The DIGIT setup is complete, and the application will run on the URL.
Note:
If you do not have your domain yet, you can edit the host file entries and map the nginx-ingress-service load balancer id like below
When you find it, add the following lines to the host file, save and close it.
aws-load-balancer-id digit.try.com
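If helpful, a hedged sketch of looking up the ingress load balancer address before editing the host file (namespace assumed to be egov; /etc/hosts needs an IP, so resolve the ELB hostname first):
kubectl get svc nginx-ingress-controller -n egov    # note the EXTERNAL-IP / ELB hostname
nslookup <aws-load-balancer-id>                     # resolve the ELB hostname to an IP
sudo vim /etc/hosts                                 # add: <resolved-ip> digit.try.com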
You can now test the DIGIT application status in the command prompt/terminal using the command below.
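A minimal check, assuming the services are deployed in the egov namespace:
kubectl get pods -n egov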
Note: Initially, pgr-services will be in the CrashLoopBackOff state; after performing the post-deployment steps, pgr-services will start running.
After deploying your environment config to the cluster, add the security group ID of the EC2 worker instance to the RDS security group rules.
Follow the steps below:
Go to the AWS console and search for EC2.
Click on Instances and select the Instance ID which you created.
Scroll down and go to Security and copy the Security Group ID. It starts with sg-xxxxxxxxxxxxxxxxx.
In the search bar, search for RDS and then go to Databases. Choose the db you had created.
Scroll down for Security Group rules and click any one sg. It redirects to another tab.
Scroll down and click on Edit inbound rules. Click on Add rule.
Change the Type to PostgreSQL, paste the copied sg-xxxxxxxxxxxxxxxxx in the Custom source field, and click on Save rules.
Annexures:
Use this URL to create a keybase key; this will create both public and private keys on your machine. Upload the public key into the keybase account that you have just created, give it a name, and ensure that you mention it in your terraform. This allows all sensitive information to be encrypted.
Example: in eGov's case, a keybase user "egovterraform" needs to be created and the public key uploaded here.
You can use this to decrypt your secret key. To decrypt the PGP message, upload the PGP message, the PGP private key and the passphrase.
In the sample-aws/remote-state/main.tf file, specify the s3 bucket to store all the states of the execution to keep track.
The main.tf inside the sample-aws folder contains the detailed resource definitions that need to be provisioned; please have a look at it.
2. Use this link to get the kubeconfig file for the cluster. The region code is the default region provided in the availability zones in variables.tf, e.g. ap-south-1. The EKS cluster name should also have been filled in variables.tf.
In case the ELB is not deleted, you need to delete the ELB from the AWS console.
Step 1: Clone the following repo (if not already done as part of the infra setup); you may need to install git and then run the clone command on your machine.
Step 3: Update the deployment config file with your details, you can use the following template
Important: Add the domain name that you want to use for accessing DIGIT. (Do not use a dummy domain.)
Important: As per your cloud provider, uncomment the related backbone services (Kafka, Zookeeper, Elasticsearch, etc.) and comment out the others. Also add the volume_ids/diskURI/iqn and zone/diskName/targetPortal values, which you got as terraform output or from the SDC team, for those backbone services.
credentials, secrets (you need to encrypt these using sops and create a <env>-secrets.yaml separately)
Option 1: Generate PGP keys
Option 2: Use AWS KMS when you want to use the AWS cloud provider.
Once you generate your encryption key, create a .sops.yaml configuration file under the /helm directory of the cloned repo to define which keys are used for which specific files. Refer to the SOPS documentation for more info.
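Once the .sops.yaml is in place, encryption and decryption can be sketched as below (file names and the PGP fingerprint are illustrative):
sops --encrypt --pgp <pgp_fingerprint> <env>-secrets.yaml > <env>-secrets.enc.yaml
sops --decrypt <env>-secrets.enc.yaml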
Note: For demo purposes, you can use the as it is without sops configuration, but make sure you update your specific details like Git SSH, URL etc. When you decide to push these configurations into any git or public space, please make sure you follow the sops configuration mentioned in this article to encrypt your secrets.
Fork both the MDMS and the configs repos into your GitHub organization account.
Once you fork the repos into your GitHub organization account, create a GitHub user account and generate an SSH authentication key pair (public and private keys).
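A minimal sketch of generating the key pair (the key file name and comment are illustrative):
ssh-keygen -t rsa -b 4096 -C "git-sync" -f ~/.ssh/digit_gitsync
# upload ~/.ssh/digit_gitsync.pub to the GitHub user account; the private key is used in the next step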
Add the SSH private key that you generated in the previous step to the secrets file under the git-sync section.
Modify the services' git-sync repo and branch with your forked repo and branch in the environment config file.
Create one private S3 bucket for Filestore and one public bucket for logos. Add the bucket details respectively, and create an IAM user with access to the S3 buckets. Add the IAM user details to the secrets file.
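A hedged sketch of creating the buckets with the AWS CLI (bucket names and region are illustrative; the public-read setup for the logos bucket is done separately via a bucket policy or the console):
aws s3 mb s3://<filestore-bucket-name> --region ap-south-1
aws s3 mb s3://<logos-bucket-name> --region ap-south-1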
Access to MDMS repository for master data like Roles, Access, Actions, tenants, etc. Sample repo view .
Access to Configs repository like persister, searcher configs etc. Sample repo view .
Create a GitHub user account. This should be different from the repo-forked GitHub organization account. Once the user account is ready, generate the SSH authentication keys (public and private).
Fork both the MDMS and the configs repos into your GitHub account.
Stage 1: Prepare an <env>.yaml file; you can provide any name to this file. The file has the following configurations, and this env file needs to be in line with your cluster name.
credentials, secrets (you need to encrypt these using sops and create a <env>-secret.yaml separately)
Advanced: From a CI/CD system like Jenkins - depending on how you want to set up your CI/CD and your expertise, the steps vary. Find out how we have set up CI/CD on Jenkins so that the pipelines are created automatically without any manual intervention.
Run the kubectl port-forward of the egov-user service from the Kubernetes cluster to your localhost. This gives you direct access to the egov-user service, and you can now interact with the API directly.
Ensure you have Postman installed to run the following seed data API. If not, install Postman on your local machine.
In case the ELB is not deleted, you need to delete the ELB from the AWS console.
Run the egov-deployer golang script from the cloned DIGIT-DevOps repository.
If you have a GoDaddy account or similar and DNS record edit access, you can map the load balancer ID to the desired DNS. Create a record (typically a CNAME) with the load balancer ID and domain.
Pre-requisites for deployment on SDC
Check the hardware and software pre-requisites for deployment on SDC.
Kubernetes nodes
Ubuntu 18.04
SSH
Privileged user
Python
Deploy DIGIT using Kubespray
Kubespray is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for generic OS/Kubernetes cluster configuration management tasks. Kubespray provides:
a highly available cluster
composable attributes
support for most popular Linux distributions
continuous-integration tests
Before we can get started, we need a few prerequisites to be in place. This is what we are going to need:
A host with Ansible installed. Click here to learn more about Ansible. Find the Ansible installation details here.
You should also set up an SSH key pair to authenticate to the Kubernetes nodes without using a password. This permits Ansible to perform optimally.
Few servers/hosts/VMs to serve as our targets to deploy Kubernetes. I am using Ubuntu 18.04, and my servers each have 4GB RAM and 2vCPUs. This is fine for my testing purposes, which I use to try out new things using Kubernetes. You need to be able to SSH into each of these nodes as root using the SSH key pair I mentioned above.
The above will do the following:
Create a new Linux User Account for use with Kubernetes on each node
Install Kubernetes and containers on each node
Configure the Master node
Join the Worker nodes to the new cluster
Ansible needs Python to be installed on all the machines.
apt-get update && apt-get install python3-pip -y
All the machines should be in the same network with Ubuntu or Centos installed.
ssh key should be generated from the Bastion machine and must be copied to all the servers part of your inventory.
Generate the ssh key ssh-keygen -t rsa
Copy over the public key to all nodes.
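For example (user and node IP are illustrative):
ssh-copy-id -i ~/.ssh/id_rsa.pub <user>@<node-ip>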
Clone the official repository
Install dependencies from requirements.txt
Create Inventory
where mycluster is the custom configuration name. Replace with whatever name you would like to assign to the current cluster.
Create inventory using an inventory generator.
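A hedged sketch of the typical Kubespray commands for the steps above (the sample IPs are illustrative):
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray
sudo pip3 install -r requirements.txt
cp -rfp inventory/sample inventory/mycluster
declare -a IPS=(10.0.0.1 10.0.0.2 10.0.0.3)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}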
Once it runs, you can see an inventory file that looks like the below:
Review and change parameters under inventory/mycluster/group_vars
Deploy Kubespray with Ansible Playbook - run the playbook as Ubuntu
The option --become is required, for example, for writing SSL keys in /etc/, installing packages and interacting with various system daemons.
Note: Without --become, the playbook will fail to run!
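The playbook run typically looks like the sketch below (inventory path as created above; remote user assumed to be ubuntu):
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root -u ubuntu cluster.yml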
Kubernetes cluster will be created with three masters and four nodes using the above process.
The kubeconfig will be generated in the .kube folder. The cluster can be accessed via this kubeconfig.
Install the haproxy package on the machine allocated for the proxy.
sudo apt-get install haproxy -y
IPs need to be whitelisted as per the requirements in the config.
sudo vim /etc/haproxy/haproxy.cfg
iSCSI volumes will be provided by the SDC team as per the requisition, and the same can be used for statefulsets.
The pre-requisites for deploying on Azure
The Azure Kubernetes Service (AKS) is one of the Azure services for deploying, managing and scaling any distributed and containerized workloads. Here we can provision the AKS cluster on Azure from the ground up using terraform (infra-as-code) and then deploy the DIGIT platform services as config-as-code using Helm.
Know about AKS: https://www.youtube.com/watch?v=i5aALhhXDwc&ab_channel=DevOpsCoach
Know what is terraform: https://youtu.be/h970ZBgKINg
Azure subscription: If you don't have an Azure subscription, create a free account before you begin.
Install Azure CLI
Configure Terraform: Follow the directions in the article, Terraform and configure access to Azure
Azure service principal: Follow the directions in the Create the service principal section in the article, Create an Azure service principal with Azure CLI. Take note of the values for the appId, displayName, password, and tenant.
Install kubectl on your local machine which helps you interact with the Kubernetes cluster.
Install Helm that helps you package the services along with the configurations, environments, secrets, etc into Kubernetes manifests.
Steps to setup CI/CD on SDC
Kubespray is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for generic OS/Kubernetes cluster configuration management tasks. Kubespray provides:
a highly available cluster
composable attributes
support for most popular Linux distributions
continuous-integration tests
Fork the repos below to your GitHub Organization account
Go lang (version 1.13.X)
Install kubectl on your local machine to interact with the Kubernetes cluster.
Install Helm to help package the services along with the configurations, environment, secrets, etc into Kubernetes manifests.
One Bastion machine to run Kubespray
HA-PROXY machine which acts as a load balancer with Public IP. (CPU: 2Core , Memory: 4Gb)
one machine which acts as a master node. (CPU: 2Core , Memory: 4Gb)
one machine which acts as a worker node. (CPU: 8Core , Memory: 16Gb)
ISCSI volumes for persistence volume. (number of quantity: 2 )
kaniko-cache-claim:- 10Gb
Jenkins home:- 100Gb
Kubernetes nodes
Ubuntu 18.04
SSH
Privileged user
Python
Run and follow instructions on all nodes.
Ansible needs Python to be installed on all the machines.
apt-get update && apt-get install python3-pip -y
All the machines should be in the same network with Ubuntu or CentOS installed.
ssh key should be generated from the Bastion machine and must be copied to all the servers part of your inventory.
Generate the ssh key ssh-keygen -t rsa
Copy over the public key to all nodes.
Clone the official repository
Install dependencies from requirements.txt
Create Inventory
where mycluster is the custom configuration name. Replace with whatever name you would like to assign to the current cluster.
Create inventory using an inventory generator.
Once it runs, you can see an inventory file that looks like the below:
Review and change parameters under inventory/mycluster/group_vars
Deploy Kubespray with Ansible Playbook - run the playbook as Ubuntu
The option --become is required, for example, for writing SSL keys in /etc/, installing packages and interacting with various system daemons.
Note: Without --become, the playbook will fail to run!
Kubernetes cluster will be created with three masters and four nodes with the above process.
The kubeconfig will be generated in the .kube folder. The cluster can be accessed via this kubeconfig.
Install the haproxy package on the machine allocated for the proxy.
sudo apt-get install haproxy -y
IPs need to be whitelisted as per the requirements in the config.
sudo vim /etc/haproxy/haproxy.cfg
iSCSI volumes will be provided by the SDC team as per the requisition, and the same can be used for statefulsets.
Refer to the doc here.
DIGIT is a container-based platform orchestrated on Kubernetes. This page discusses some key security practices to protect the infrastructure.
On this page:
Security is always a difficult subject to approach, either because of a lack of experience or because it is hard to know when the level of security is right for what you have to secure.
Security is a major concern when it comes to government systems and infra. As architects, we can assume that working with technically educated people (engineers, experts) and tools (systems, frameworks, IDEs) should prevent key VAPT issues.
However, it is quite difficult to deter the various categories of people who try to hack the systems.
Each release contains not only bug fixes but also new security measures; to take advantage of them, we recommend working with the newest stable version.
Updates can get harder the further you fall behind releases, so plan your updates at least once a quarter. Using a managed Kubernetes solution provider can significantly simplify updates.
Use RBAC (Role-Based Access Control) to regulate who can access what and with which rights. RBAC is usually enabled by default in Kubernetes version 1.6 and later (later for some providers), but if you have been upgrading since then and did not change the configuration, you ought to double-check your settings.
However, enabling RBAC is not enough - it still must be used effectively. In the general case, rights to the whole cluster (cluster-wide) should be avoided, giving preference to rights in specific namespaces. Avoid giving someone cluster administrator privileges even for debugging - it is much safer to grant only the rights that are necessary, and only when needed.
If the application requires access to the Kubernetes API, create separate service accounts and give them the minimum set of rights required for each use case. This approach is far better than giving an excessive amount of privilege to the default account in the namespace.
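As an illustration (service account, role and namespace names are hypothetical), a dedicated service account with minimal read-only rights could be created like this:
kubectl create serviceaccount app-reader -n egov
kubectl create role pod-reader --verb=get,list,watch --resource=pods -n egov
kubectl create rolebinding app-reader-binding --role=pod-reader --serviceaccount=egov:app-reader -n egov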
Creating separate namespaces is vital as the first level of component isolation. It is much easier to regulate security settings - for instance, network policies - when different types of workloads are deployed in separate namespaces.
A good practice to limit the potential consequences of a compromise is to run workloads with sensitive data on a dedicated set of machines. This approach reduces the risk of a less secure application accessing the application with sensitive data running in the same container executable environment or on the same host.
For example, a kubelet of a compromised node usually has access to the contents of secrets only if they are mounted on pods that are scheduled to be executed on the same node. If important secrets can be found on multiple cluster nodes, the attacker will have more opportunities to obtain them.
Separation can be done using node pools (in the cloud or on-premises), as well as Kubernetes controlling mechanisms, such as namespaces, taints, tolerations, and others.
Sensitive metadata - for instance, kubelet administrative credentials - can be stolen or used with malicious intent to escalate privileges in a cluster. For example, a recent find within Shopify's bug bounty showed in detail how a user could exceed authority by receiving metadata from a cloud provider using specially generated data for one of the microservices.
The GKE metadata concealment function changes the mechanism for deploying the cluster in such a way that avoids this problem. We recommend using it until a permanent solution is implemented.
Network Policies — allow you to control access to the network in and out of containerized applications. To use them, you must have a network provider with support for such a resource. For managed Kubernetes solution providers such as Google Kubernetes Engine (GKE), support will need to be enabled.
Once everything is ready, start with simple default network policies — for example, blocking (by default) traffic from other namespaces.
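A minimal default-deny ingress policy could be applied as sketched below (namespace is illustrative); it blocks all incoming traffic to pods in that namespace until explicit allow policies are added:
kubectl apply -n egov -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
EOF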
Pod Security Policy sets the default values used to start workloads in the cluster. Consider defining a policy and enabling the Pod Security Policy admission controller: the instructions for these steps vary depending on the cloud provider or deployment model used.
In the beginning, you might want to disable the NET_RAW capability in containers to protect yourself from certain types of spoofing attacks.
To improve host security, you can follow these steps:
Ensure that the host is securely and correctly configured. One way is CIS Benchmarks; Many products have an auto checker that automatically checks the system for compliance with these standards.
Monitor the network availability of important ports. Ensure that the network blocks access to the ports used by the kubelet, including 10250 and 10255. Consider restricting access to the Kubernetes API server, except from trusted networks. In clusters that did not require authentication and authorization in the kubelet API, attackers have used access to such ports to launch cryptocurrency miners.
Minimize administrative access to Kubernetes hosts. Access to cluster nodes should, in principle, be limited: for debugging and solving other problems, as a rule, you can do without direct access to the node.
Make sure that audit logs are enabled and that you are monitoring for the occurrence of unusual or unwanted API calls in them, especially in the context of any authorization failures — such entries will have a message with the “Forbidden” status. Authorization failures can mean that an attacker is trying to take advantage of the credentials obtained.
Managed solution providers (including GKE) provide access to this data in their interfaces and can help you set up notifications in case of authorization failures.
Follow these guidelines for a more secure Kubernetes cluster. Remember that even after the cluster is configured securely, you need to ensure security in other aspects of the configuration and operation of containers. To improve the security of the technology stack, study the tools that provide a central system for managing deployed containers, constantly monitoring and protecting containers and cloud-native applications.
This section contains a list of documents elaborating on the key concepts aiding the deployment of the DIGIT platform
Deployment on SDC
Running Kubernetes on-premise gives a cloud-native experience on SDC when it comes to deploying DIGIT.
Whether States have their own on-premise data centre or have decided to forego the various managed cloud solutions, there are a few things one should know when getting started with on-premise K8s.
One should be familiar with Kubernetes: the control plane consists of the kube-apiserver, kube-scheduler, kube-controller-manager and an etcd datastore. For managed cloud solutions like Google's Kubernetes Engine (GKE) or Azure's Kubernetes Service (AKS), it also includes the cloud-controller-manager. This is the component that connects the cluster to external cloud services to provide networking, storage, authentication, and other support features.
To successfully deploy a bespoke Kubernetes cluster and achieve a cloud-like experience on SDC, one needs to replicate all the same features you get with a managed solution. At a high level, this means that we probably want to:
Automate the deployment process
Choose a networking solution
Choose the right storage solution
Handle security and authentication
The subsequent sections look at each of these challenges individually and provide enough context to help in getting started.
Using a tool like Ansible can make deploying Kubernetes clusters on-premise trivial.
When deciding to manage your own Kubernetes clusters, we need to set up a few proofs-of-concept (PoC) clusters to learn how everything works, perform performance and conformance tests, and try out different configuration options.
After this phase, automating the deployment process is an important if not necessary step to ensure consistency across any clusters you build. For this, you have a few options, but the most popular are:
kubeadm: a low-level tool that helps you bootstrap a minimum viable Kubernetes cluster that conforms to best practices
kubespray: an Ansible playbook that helps deploy production-ready clusters
If you are already using Ansible, Kubespray is a great option; otherwise, we recommend writing automation around kubeadm using your preferred playbook tool after using it a few times. This will also increase your confidence and knowledge of Kubernetes.
Steps to bootstrap DIGIT
Post-deployment, the application can now be accessed from the configured domain. This page provides the bootstrapping steps.
To try out employee login, let us create a sample tenant, city, and user to log in and assign the LME employee role through the seed script.
Perform the kubectl port-forwarding of the egov-user service running from the Kubernetes cluster to your localhost. This provides access to egov-user service and allows users to interact with the API directly.
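A minimal sketch, assuming the service runs in the egov namespace and listens on port 8080 (adjust the ports to your setup):
kubectl port-forward svc/egov-user 8080:8080 -n egov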
2. Seed the sample data
Ensure Postman is installed to run the following seed data API. If not, install Postman on your local machine.
Import the following Postman collection into the Postman and run it. This contains the seed data that enables sample test users and localisation data.
Execute the below commands to test your local machine's Kubernetes operations through kubectl.
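For example, the following quick checks confirm kubectl is talking to the right cluster (namespace assumed to be egov):
kubectl config current-context
kubectl get nodes
kubectl get pods -n egov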
You have successfully completed the DIGIT Infra, deployment setup and installed a DIGIT - PGR module.
Use the below link in the browser -
Use the below credentials to login into the complaint section
Username: GRO
Password: eGov@4321
City: CITYA
By now, we have successfully completed the DIGIT setup on the cloud. Use the URL you mentioned in your env.yaml, e.g. https://mysetup.digit.org, and create a grievance to ensure the PGR module deployed is working fine. Refer to the product documentation below for the steps.
Credentials:
Citizen: You can use your default mobile number (9999999999) to sign in using the default Mobile OTP 123456.
Employee: Username: GRO and password: eGov@4321
After creating a grievance and assigning it to the LME, capture a screenshot and share it to ensure your setup is working fine. After validating the PGR functionality, share the API response of the following request to assess the correctness of the DIGIT PGR deployment.
Follow the steps below to clean up the DIGIT setup, if required. This will delete the entire cluster and other cloud resources that were provisioned for the DIGIT Setup.
Run the command below to destroy previously-created infrastructure using Terraform -
The ELB is not deployed via Terraform. The ELB is created at deployment time by the Kubernetes Ingress setup and has to be deleted manually by deleting the ingress service.
kubectl delete deployment nginx-ingress-controller -n <namespace>
kubectl delete svc nginx-ingress-controller -n <namespace>
Note: Namespace can be one of egov or jenkins.
Delete S3 buckets manually from the AWS console and also verify if the ELB got deleted.
In case the ELB is not deleted, you need to delete ELB from the AWS console.
Run terraform destroy.
All done, we have successfully created infra on the cloud, deployed DIGIT, bootstrapped DIGIT, performed a transaction on PGR and finally destroyed the cluster.
To deploy the solution to the cloud there are several ways that we can choose. In this case, we will use terraform Infra-as-code.
Terraform is an open-source infrastructure as code (IaC) software tool that allows DevOps engineers to programmatically provision the physical resources an application requires to run.
Infrastructure as code is an IT practice that manages an application's underlying IT infrastructure through programming. This approach to resource allocation allows developers to logically manage, monitor and provision resources -- as opposed to requiring that an operations team manually configure each required resource.
Terraform users define and enforce infrastructure configurations by using a JSON-like configuration language called HCL (HashiCorp Configuration Language). HCL's simple syntax makes it easy for DevOps teams to provision and re-provision infrastructure across multiple clouds and on-premises data centres.
Before we provision the cloud resources, we need to understand and be sure about what resources need to be provisioned by Terraform to deploy DIGIT. The following picture shows the various key components. (AKS, Node Pools, Postgres DB, Volumes, Load Balancer)
Ideally, one would write the terraform script from scratch using this doc.
Here we have already written the terraform script that one can reuse/leverage that provisions the production-grade DIGIT Infra and can be customized with the user-specific configuration.
Clone the following DIGIT-DevOps repo, where all the sample terraform scripts are available for you to leverage.
2. Change the main.tf according to your requirements.
3. Declare the variables in variables.tf
Save the file and exit the editor.
4. Create a Terraform output file (output.tf) and paste the following code into the file.
Once you have finished declaring the resources, you can deploy all resources.
terraform init: this command is used to initialize a working directory containing Terraform configuration files.
terraform plan: this command creates an execution plan, which lets you preview the changes that Terraform plans to make to your infrastructure.
terraform apply: this command executes the actions proposed in a Terraform plan to create or update infrastructure.
After the complete creation, you can see resources in your Azure account.
Now we know what the terraform script does, the resources graph that it provisions and what custom values should be given with respect to your environment. The next step is to begin to run the terraform scripts to provision infra required to Deploy DIGIT on Azure.
Use the cd command to move into the following directory, run the following commands one by one, and watch the output closely.
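A hedged sketch of the workflow (the exact sample folder name depends on the cloned repo layout):
cd DIGIT-DevOps/Infra-as-code/terraform/<azure-sample-folder>
terraform init
terraform plan
terraform apply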
The Kubernetes tools can be used to verify the newly created cluster.
Once Terraform Apply execution is complete, it generates the Kubernetes configuration file or you can get it from Terraform state.
Use the below command to get kubeconfig. It will automatically store your kubeconfig in .kube folder.
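For example (resource group and cluster name are the ones you defined in your terraform variables):
az aks get-credentials --resource-group <resource_group_name> --name <cluster_name>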
3. Verify the health of the cluster. The details of the worker nodes should reflect the status as Ready for all.
Provision infra for DIGIT on Azure using Terraform
Azure Kubernetes Service (AKS) manages your hosted Kubernetes environment. AKS allows you to deploy and manage containerized applications without container orchestration expertise. AKS also enables you to do many common maintenance operations without taking your app offline. These operations include provisioning, upgrading, and scaling resources on demand.
Since there are many DIGIT services and the development code is part of various git repos, one needs to understand the concept of cicd-as-service which is open-sourced. This page guides you through the process of creating a CI/CD pipeline.
The initial steps for integrating any new service/app to the CI/CD are discussed below.
Once the desired service is ready for integration: decide the service name, type of service, and if DB migration is required or not. While you commit the source code of the service to the git repository, the following file should be added with the relevant details which are mentioned below:
Build-config.yml – It is present under the build directory in the repository
This file contains the below details used for creating the automated Jenkins pipeline job for the newly created service.
While integrating a new service/app, the above content needs to be added to the build-config.yml file of that app repository. For example: to onboard a new service called egov-test, the build-config.yml should be added as mentioned below.
If a job requires multiple images to be created (DB Migration) then it should be added as below,
Note - If a new repository is created then the build-config.yml is created under the build folder and the config values are added to it.
The git repository URL is then added to the Job Builder parameters
When the Jenkins Job => job builder is executed, the CI Pipeline gets created automatically based on the above details in build-config.yml. Eg: egov-test job is created in the builds/DIGIT-OSS/core-services folder in Jenkins since the “build-config is edited under core-services” And it should be the “master” branch. Once the pipeline job is created, it can be executed for any feature branch with build parameters - specifying the branch to be built (master or feature branch).
As a result of the pipeline execution, the respective app/service docker image is built and pushed to the Docker repository.
On the repo, provide read-only access to the GitHub users (created during CI/CD deployment).
The Jenkins CI pipeline is configured and managed 'as code'.
Job Builder – Job Builder is a Generic Jenkins job which creates the Jenkins pipeline automatically which is then used to build the application, create the docker image of it and push the image to the Docker repository. The Job Builder job requires the git repository URL as a parameter. It clones the respective git repository and reads the build/build-config.yml file for each git repository and uses it to create the service build job.
Check and add your repo ssh URL in ci.yaml
If the git repository ssh URL is available, build the Job-Builder Job.
If the git repository URL is not available, check and add the same team.
The services are deployed and managed on a Kubernetes cluster in cloud platforms like AWS, Azure, GCP, OpenStack, etc. Here, we use helm charts to manage and generate the Kubernetes manifest files and use them for further deployment to the respective Kubernetes cluster. Each service is created as charts which have the below-mentioned files.
Note: The steps below are only for the introduction and implementation of new services.
To deploy a new service, you need to create a new helm chart for it( refer to the above example). The chart should be created under the charts/helm directory in the DIGIT-DevOps repository.
If you are going to introduce a new module with the help of multiple services, we suggest you create a new Directory with your module name.
Example:
You can refer to the existing helm chart structure here
This chart can also be modified further based on user requirements.
The deployment of manifests to the Kubernetes cluster is made very simple and easy. There are Jenkins Jobs for each state and are environment-specific. We need to provide the image name or the service name for the respective Jenkins deployment job.
The deployment Jenkins job internally performs the following operations:
Reads the image name or the service name given and finds the chart that is specific to it.
Generates the Kubernetes manifests files from the chart using the helm template engine.
Execute the deployment manifest with the specified docker image(s) to the Kubernetes cluster.
This tutorial will walk you through how to set up CI/CD.
Terraform helps you build a graph of all your resources and parallelizes the creation or modification of any non-dependent resources. Thus, Terraform builds infrastructure as efficiently as possible while providing the operators with clear insight into the dependencies on the infrastructure.
Fork the repos below to your GitHub Organization account
Go lang (version 1.13.X)
AWS account with admin access to provision EKS Service. Try subscribing to a free AWS account to learn the basics; there is a limit on what is offered for free. This demo requires a commercial subscription to the EKS service. The cost for a one or two-day trial might range between Rs 500-1000. (Note: Post the demo, for the internal folks, eGov will provide 2-3 hrs of time-bound access to eGov's AWS account based on the request and the available number of slots per day.)
Install kubectl on your local machine to interact with the Kubernetes cluster.
Install Helm to help package the services along with the configurations, environment, secrets, etc into Kubernetes manifests.
Install terraform version 0.14.10 for the Infra-as-code (IaC) to provision cloud resources as code and with the desired resource graph. It also helps destroy the cluster in one go.
Install AWS CLI on your local machine so that you can use AWS CLI commands to provision and manage the cloud resources on your account.
Install AWS IAM Authenticator to help authenticate your connection from your local machine and deploy DIGIT services.
Use the credentials provided for Terraform to connect to the AWS account and provision the cloud resources.
You will receive a Secret Access Key and Access Key ID. Save the keys.
Open the terminal and run the command given below (ensure the AWS CLI is installed and the keys from the previous step are at hand). Provide the credentials, and leave the region and output format blank.
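A minimal sketch of the prompt flow:
aws configure
# AWS Access Key ID [None]: <your access key id>
# AWS Secret Access Key [None]: <your secret access key>
# Default region name [None]:        (leave blank)
# Default output format [None]:      (leave blank)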
The above creates the following file on your machine: ~/.aws/credentials (e.g. /Users/<username>/.aws/credentials on macOS).
Before we provision the cloud resources, we need to understand and be sure about what resources need to be provisioned by terraform to deploy CI/CD.
The following is the resource graph that we are going to provision using terraform in a standard way so that every time and for every environment, the infra is the same.
EKS Control Plane (Kubernetes master)
Worker node group (VMs with the estimated number of vCPUs, Memory)
EBS Volumes (Persistent volumes)
VPCs (Private networks)
Users to access, deploy and read-only
Here we have already written the terraform script that provisions the production-grade DIGIT Infra and can be customized with the specified configuration.
Here, you will find the main.tf under each of the modules that have the provisioning definition for resources like EKS cluster, storage, etc. All these are modularized and react as per the customized options provided.
Example:
VPC Resources -
VPC
Subnets
Internet Gateway
Route Table
EKS Cluster Resources -
IAM Role to allow EKS service to manage other AWS services
EC2 Security Group to allow networking traffic with the EKS cluster
EKS Cluster
EKS Worker Nodes Resources -
IAM role allowing Kubernetes actions to access other AWS services
EC2 Security Group to allow networking traffic
Data source to fetch the latest EKS worker AMI
AutoScaling launch configuration to configure worker instances
AutoScaling group to launch worker instances
Storage Module -
Configuration in this directory creates EBS volume and attaches it together.
The following main.tf will create the s3 bucket to store all the states of the execution to keep track.
The following main.tf contains the detailed resource definitions that need to be provisioned.
Dir: DIGIT-DevOps/Infra-as-code/terraform/egov-cicd
Define your configurations in variables.tf and provide the environment-specific cloud requirements. The same terraform template can be used to customize the configurations.
Following are the values that you need to mention in the files below. The blank ones will prompt for input during execution.
We have covered what the terraform script does, the resources graph that it provisions and what custom values should be given with respect to the selected environment.
Use the cd command to change to the following directory and run the following commands. Check the output.
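A minimal sketch, assuming the directory referenced above:
cd DIGIT-DevOps/Infra-as-code/terraform/egov-cicd
terraform init
terraform plan
terraform apply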
After successful execution, the following resources get created and can be verified by the command "terraform output".
s3 bucket: to store terraform state
Network: VPC, security groups
IAM user auth: using keybase to create the admin, deployer, and read-only users
EKS cluster: with master(s) & worker node(s).
Storage(s): for es-master, es-data-v1, es-master-infra, es-data-infra-v1, zookeeper, kafka, kafka-infra.
Finally, verify that you are able to connect to the cluster by running the command below:
Voila! All set, and now you can deploy Jenkins.
Post infra setup (Kubernetes Cluster), we start with deploying the Jenkins and kaniko-cache-warmer.
Sub domain to expose CI/CD URL
Under the Authorization callback URL, enter the URL below (replace <domain_name> with your domain): https://<domain_name>/securityRealm/finishLogin
SSL certificate for the sub-domain
Make sure earlier created github users have read-only access to the forked DIGIT-DevOps and CIOps repos.
Remove the below env:
Remove the below volumeMounts:
Remove the below volume:
Jenkins is launched. You can access the same through your sub-domain configured in ci.yaml.
The release chart helps deploy the product-specific modules in one click.
This section of the document walks you through the details of how to prepare a new release chart for existing products.
Git
IDE Code for better code visualization/editing capabilities
Clone the following repo, where we have all the release charts for you to refer to.
Create a new release version of the below products.
Select your product and copy the previous release version file and rename it with your new version.
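A hedged sketch (product folder and versions are illustrative; run from the cloned DIGIT-DevOps repo):
cd DIGIT-DevOps/config-as-code/product-release-charts/<product_name>
cp dependancy_chart-digit-v2.6.yaml dependancy_chart-digit-<your_release_version>.yaml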
The above command ensures the dependancy_chart-digit-v2.6.yaml is copied and renamed to your new release version.
Note: replace <your_release_version> with your new release version.
Navigate to the release file on your local machine. Open the file using Visualstudio or any other file editor.
Update the release version "v2.6" with your new release version.
Update the modules' (core, business, utilities, m_pgr, m_property-tax, etc.) service images with the new release service images.
Add new modules
name - add your module name in the "m_demo" format, i.e. "m" means module and "demo" would be your module name
dependencies - add your module dependencies (name of other modules)
services - add your module-specific new service images
This section of the document walks you through the details of how to prepare a new release chart for new products.
Git
GitHub Organization Account
When you have a new product to introduce, you can follow the below steps to create the release chart for a new product.
eGov partners can follow the below steps:
Clone the forked DIGIT-DevOps repo to your local machine
git clone --branch release https://github.com/<your_organization_account_name>/DIGIT-DevOps.git
Note: replace this <your_organization_account_name> with your github organization account name.
Navigate to the product-release-charts folder and create a new folder with your product name:
cd DIGIT-DevOps/config-as-code/product-release-charts
mkdir <new_product_name>
Note: replace <new_product_name> with your new product name.
Create a new release chart file in the above-created product folder:
touch dependancy_chart-<new_product_name>-<release_version>.yaml
1. Open your release chart file dependancy_chart-<new_product_name>-<release_version>.yaml and start preparing it as mentioned in the release template below.
eGov users can follow the below steps:
Clone the forked DIGIT-DevOps repo to your local machine
git clone --branch release https://github.com/egovernments/DIGIT-DevOps.git
Navigate to the product-release-charts folder and create a new folder with your product name:
cd DIGIT-DevOps/config-as-code/product-release-charts
mkdir <new_product_name>
Note: replace <new_product_name> with your new product name
Create a new release chart file in the above-created product folder:
touch dependancy_chart-<new_product_name>-<release_version>.yaml
Open your release chart file dependancy_chart-<new_product_name>-<release_version>.yaml and start preparing it as per the release template shown above.
Watch the video below to learn how to deploy DIGIT services on Kubernetes and how to prepare deployment manifests for various services along with their configurations, secrets, etc. It also discusses the maintenance of environment-specific changes.
Learn how to use a “Resource Request” and a “Resource Limit” when defining how many resources a container within a pod should receive.
Containerising applications and running them on Kubernetes does not mean we can forget all about resource utilisation. Even though we can easily scale out an application as demand increases, we still need to consider how our containers might compete with each other for resources. Resource requests and limits can be used to help prevent the “noisy neighbour” problem in a Kubernetes cluster.
To put things simply, a resource request specifies the minimum amount of resources a container needs to successfully run. Thought of in another way, this is a guarantee from Kubernetes that you’ll always have this amount of either CPU or Memory allocated to the container.
Why would you worry about the minimum amount of resources guaranteed to a pod? Well, it's to help prevent one container from using up all the node’s resources and starving the other containers from CPU or memory. For instance, if I had two containers on a node, one container could request 100% of that node's processor. Meanwhile, the other container would likely not be working very well because the processor is being monopolized by its “noisy neighbour”.
What a resource request can do, is to ensure that at least a small part of that processor’s time is reserved for both containers. This way if there is resource contention, each pod will have a guaranteed, minimum amount of resources in which to still function.
As you might guess, a resource limit is the maximum amount of CPU or memory that can be used by a container. The limit represents the upper bounds of how much CPU or memory that a container within a pod can consume in a Kubernetes cluster, regardless of whether or not the cluster is under resource contention.
Limits prevent containers from taking up more resources on the cluster than you’re willing to let them.
As a general rule, all containers should have a request for memory and CPU before deploying to a cluster. This will ensure that if resources are running low, your container can still do the minimum amount of work to stay in a healthy state until the resources free up again (hopefully).
Limits are often used in conjunction with requests to create a “guaranteed pod”. This is where the request and limit are set to the same value. In that situation, the container will always have the same amount of CPU available to it, no more or less.
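For instance, a container spec along these lines (the values are illustrative) produces a guaranteed pod, because the request and the limit are identical:

  resources:
    requests:
      cpu: "1000m"
      memory: "256Mi"
    limits:
      cpu: "1000m"
      memory: "256Mi"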
At this point, you may be thinking about adding a high “request” value to make sure you have plenty of resources available for your container. This might sound like a good idea, but it can have dramatic consequences for scheduling on the Kubernetes cluster. If you set a high CPU request, for example, 2 CPUs, then your pod will ONLY be able to be scheduled on Kubernetes nodes that have 2 full CPUs available that aren’t reserved by other pods’ requests. In the example below, the 2 vCPU pods couldn’t be scheduled on the cluster. However, if you were to lower the “request” amount to say 1 vCPU, it could.
Let us try out using a CPU limit on a pod and see what happens when we try to consume more CPU than we are allowed. Before we set the limit, let us look at a pod with a single container under normal conditions. We have deployed a resource consumer container in the cluster and, by default, you can see that it is using 1m CPU and 6Mi of memory.
NOTE: CPU is measured in millicores, so 1000m = 1 CPU core. Memory is measured in mebibytes (Mi).
Ok, now that we have seen the “no-load” state, let us add some CPU load by making a request to the pod. Here, we increased the CPU usage on the container to 400 millicores.
After the metrics start coming in, you can see that we got roughly 400m used on the container as you’d expect to see.
Next, we delete the container and edit the deployment manifest so that it has a limit on CPU.
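The relevant fragment of the edited manifest might look like this (the container name is illustrative; the 300m value matches the throttling described next):

  containers:
    - name: resource-consumer   # container name is illustrative
      resources:
        limits:
          cpu: "300m"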
After redeploying the container and again increasing the CPU load to 400m, we can see that the container is throttled to 300m instead. We have effectively “limited” the resources the container could consume from the cluster.
Next, we deployed two pods into the Kubernetes cluster and those pods are on the same worker node for a simple example of contention. We have got a guaranteed pod that has 1000m CPU set as a limit but also as a request. The other pod is unbounded, meaning there is no limit on how much CPU it can utilize.
After the deployment, each pod is really not using any resources as you can see here.
We make a request to increase the load on the non-guaranteed pod.
And if we look at the container's resources, you can see that even though the container wants to use 2000m CPU, it is actually using only 1000m. The reason is that the other pod is guaranteed 1000m CPU, whether or not it is actively using it.
Kubernetes uses resource requests to set a minimum amount of resources for a given container so that it can be used if it needs it. You can also set a resource limit to set the maximum amount of resources a pod can utilize.
Taking these two concepts and using them together can ensure that your critical pods always have the resources that they need to stay healthy. They can also be configured to take advantage of shared resources within the cluster.
Be careful not to set resource requests too high, so that the Kubernetes scheduler can still schedule these pods. Good luck!
This section contains architectural details about DIGIT deployment. It discusses the various activities in a sequence of steps to provision required infra and deploy DIGIT.
Every code commit is reviewed and squash-merged to branches through pull requests.
Trigger the CI Pipeline that ensures code quality, vulnerability assessments, and CI tests before building the artefacts.
Artefacts are version-controlled using semantic versioning, based on the nature of the change.
After successful CI, Jenkins bakes the Docker Images with the versioned artefacts and pushes the baked Docker image to Docker Registry.
Deployment Pipeline pulls the built Image and pushes it to the corresponding environment.
DIGIT has built helm charts using the standard helm approach to ease managing the service-specific configs, customisations, switch/toggle, secrets, etc.
A Golang-based deployment script reads the values from the helm chart templates and deploys them into the cluster.
Each env will have one master yaml template that will have the definition of all the services to be deployed, and their dependencies like Config, Env, Secrets, DB Credentials, Persistent Volumes, Manifest, Routing Rules, etc.
In Kubernetes, an Ingress is an object that allows access to your Kubernetes services from outside the Kubernetes cluster. You configure access by creating a collection of rules that define which inbound connections reach which services.
This lets you consolidate your routing rules into a single resource. For example, you might want to send requests to example.com/api/v1/ to an api-v1 service, and requests to example.com/api/v2/ to the api-v2 service. With an Ingress, you can easily set this up without creating a bunch of LoadBalancers or exposing each service on the Node.
An API object that manages external access to the services in a cluster, typically HTTP. Ingress may provide load balancing, SSL termination and name-based virtual hosting.
For clarity, this guide defines the following terms:
Node: A worker machine in Kubernetes, part of a cluster.
Cluster: A set of Nodes that run containerized applications managed by Kubernetes. For this example, and in most common Kubernetes deployments, nodes in the cluster are not part of the public internet.
Edge router: A router that enforces the firewall policy for your cluster. This could be a gateway managed by a cloud provider or a physical piece of hardware.
Ideally, all Ingress controllers should fit the reference specification. In reality, the various Ingress controllers operate slightly differently.
An Ingress resource example:
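A minimal Ingress along the lines of the upstream Kubernetes documentation (the names and the rewrite annotation are illustrative):

  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: minimal-ingress
    annotations:
      nginx.ingress.kubernetes.io/rewrite-target: /
  spec:
    rules:
      - http:
          paths:
            - path: /testpath
              pathType: Prefix
              backend:
                service:
                  name: test
                  port:
                    number: 80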
Each HTTP rule contains the following information:
A list of paths (for example, /testpath), each of which has an associated backend defined with a service.name and a service.port.name or service.port.number. Both the host and path must match the content of an incoming request before the load balancer directs traffic to the referenced Service.
A default backend is often configured in an Ingress controller to service any requests that do not match a path in the spec.
Once the cluster is ready and healthy, you can start deploying the backbone services.
Deploy the configuration and deployment for the following services list:
Backbone (Redis, ZooKeeper-v2, Kafka-v2, elasticsearch-data-v1, elasticsearch-client-v1, elasticsearch-master-v1)
Gateway (Zuul, nginx-ingress-controller)
Understanding of VM Instances, LoadBalancers, SecurityGroups/Firewalls, nginx, DB Instances, and data volumes.
Experience with Kubernetes, Docker, Jenkins, helm, golang, Infra-as-code.
Deploy the configuration and deployment for the backbone services:
Modify the global domain and set namespaces create to true, for example:
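A hedged sketch of what these settings might look like in the environment file (the key names are illustrative and may not match the exact DIGIT-DevOps schema):

  global:
    domain: <your_domain_name>
  namespaces:
    create: true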
Make the below-mentioned changes for each backbone service:
Eg. For Kafka-v2
If you are using AWS as a cloud provider, change the respective volume ids and zones. (You will get the volume ids and zone details from either a remote state bucket or from the AWS portal).
Eg. Kafka-v2
If you are using the Azure cloud provider, change the diskName and diskUri. (You will get the disk details from either the remote state bucket or from the Azure portal.)
Eg. Kafka-v2
If you are using iSCSI, change the targetPortal and iqn.
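The fields referred to above correspond to the standard Kubernetes volume sources; a hedged sketch of each, with placeholder values (the surrounding DIGIT helm chart keys are omitted):

  # AWS EBS
  awsElasticBlockStore:
    volumeID: vol-0abc1234def567890
    fsType: ext4
  # Azure disk
  azureDisk:
    diskName: <disk_name>
    diskURI: <disk_uri>
  # iSCSI
  iscsi:
    targetPortal: 10.0.2.15:3260
    iqn: iqn.2001-04.com.example:storage.kube.sys1
    lun: 0
    fsType: ext4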
Deploy the backbone services using the go command
Replace the “dev” environment name with your respective environment name.
Flags:
e --- Environment name
p --- Print the manifest
c --- Enable Cluster Configs
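A hedged example of invoking the deployer (the sub-command and the way the services are listed are assumptions; only the -e, -p, and -c flags come from the list above):

  go run main.go deploy -e dev -c 'backbone'
  # add -p to only print the rendered manifests without applying them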
Check the status of pods
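For example:

  kubectl get pods --all-namespaces
  # or for a specific namespace (the namespace name is a placeholder)
  kubectl get pods -n backbone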
An overview of the various probes that we can set up so that service deployment and service availability are ensured automatically.
Determining the state of a service based on readiness, liveness, and startup probes helps detect and deal with unhealthy situations. It may happen that the application needs to initialize some state, make database connections, or load data before handling application logic. This gap between when the application is actually ready and when Kubernetes thinks it is ready becomes an issue when the deployment begins to scale and unready applications receive traffic and send back 500 errors.
Many developers assume that basic pod setup is adequate, especially when the application inside the pod is configured with daemon process managers (e.g. PM2 for Node.js). However, since Kubernetes deems a pod healthy and ready for requests as soon as all its containers start, the application may receive traffic before it is actually ready.
Kubernetes supports readiness and liveness probes for versions ≤ 1.15. Startup probes were added in 1.16 as an alpha feature and graduated to beta in 1.18 (WARNING: 1.16 deprecated several Kubernetes APIs. Use this to check for compatibility).
All the probes have the following parameters:
initialDelaySeconds: number of seconds to wait before initiating liveness or readiness probes
periodSeconds: how often to check the probe
timeoutSeconds: number of seconds before marking the probe as timing out (failing the health check)
successThreshold: minimum number of consecutive successful checks for the probe to pass
failureThreshold: number of retries before marking the probe as failed. For liveness probes, this will lead to the pod restarting. For readiness probes, this will mark the pod as unready.
Readiness probes are used to let Kubelet know when the application is ready to accept new traffic. If the application needs some time to initialize the state after the process has started, configure the readiness probe to tell Kubernetes to wait before sending new traffic. A primary use case for readiness probes is directing traffic to deployments behind a service.
One important thing to note with readiness probes is that it runs during the pod’s entire lifecycle. This means that readiness probes will run not only at startup but repeatedly throughout as long as the pod is running. This is to deal with situations where the application is temporarily unavailable (i.e. loading large data, waiting on external connections). In this case, we don’t want to necessarily kill the application but wait for it to recover. Readiness probes are used to detect this scenario and not send traffic to these pods until it passes the readiness check again.
Liveness probes are used to restart unhealthy containers. The Kubelet periodically pings the liveness probe, determines the health, and kills the pod if it fails the liveness check.
Liveness checks can help the application recover from a deadlock situation. Without liveness checks, Kubernetes deems a deadlocked pod healthy since the underlying process continues to run from Kubernetes’s perspective. By configuring the liveness probe, the Kubelet can detect that the application is in a bad state and restarts the pod to restore availability.
Startup probes are similar to readiness probes but are only executed at startup. They are optimized for slow-starting containers or applications with unpredictable initialization processes. With readiness probes, we can configure initialDelaySeconds to determine how long to wait before probing for readiness. Now consider an application that occasionally needs to download large amounts of data or do an expensive operation at the start of the process. Since initialDelaySeconds is a static number, we are forced to always assume the worst-case scenario (or extend the failureThreshold, which may affect long-running behaviour) and wait for a long time even when the application does not need to carry out long-running initialization steps. With startup probes, we can instead configure failureThreshold and periodSeconds to model this uncertainty better. For example, setting failureThreshold to 15 and periodSeconds to 5 means the application will get 15 x 5 = 75s to start up before it fails.
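A hedged startup probe sketch matching the numbers above (the /healthz path and port are placeholders):

  startupProbe:
    httpGet:
      path: /healthz
      port: 8080
    failureThreshold: 15
    periodSeconds: 5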
Now that we understand the different types of probes, we can examine the three different ways to configure each probe.
The Kubelet sends an HTTP GET request to an endpoint and checks for a 2xx or 3xx response. You can reuse an existing HTTP endpoint or set up a lightweight HTTP server for probing purposes (e.g. an Express server with a /healthz endpoint). An example httpGet probe is shown after the parameter list below.
HTTP probes take in additional parameters:
host: hostname to connect to (default: pod’s IP)
scheme: HTTP (default) or HTTPS
path: path on the HTTP/S server
httpHeaders: custom headers if you need header values for authentication, CORS settings, etc
port: name or number of the port to access the server
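A hedged example of an httpGet readiness probe using these parameters (the path, port, and header are placeholders):

  readinessProbe:
    httpGet:
      path: /healthz
      port: 8080
      httpHeaders:
        - name: X-Custom-Header
          value: probe
    initialDelaySeconds: 10
    periodSeconds: 10
    timeoutSeconds: 2
    successThreshold: 1
    failureThreshold: 3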
If you just need to check whether or not a TCP connection can be made, you can specify a TCP probe. The pod is marked healthy if it can establish a TCP connection. Using a TCP probe may be useful for a gRPC or FTP server where HTTP calls may not be suitable.
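For example, a hedged TCP liveness probe (the port is a placeholder):

  livenessProbe:
    tcpSocket:
      port: 9092
    initialDelaySeconds: 15
    periodSeconds: 20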
Finally, a probe can be configured to run a shell command. The check passes if the command returns with exit code 0; otherwise, the pod is marked as unhealthy. This type of probe may be useful if it is not desirable to expose an HTTP server/port or if it is easier to check initialization steps via command (e.g. check if a configuration file has been created, run a CLI command).
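A hedged exec probe sketch (the file being checked is illustrative):

  livenessProbe:
    exec:
      command:
        - cat
        - /tmp/app-initialized
    initialDelaySeconds: 5
    periodSeconds: 10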
The exact parameters for the probes depend on your application, but here are some general best practices to get started:
For older (≤ 1.15) Kubernetes clusters, use a readiness probe with an initial delay to deal with the container startup phase (use p99 times for this). But make this check lightweight since the readiness probe will execute throughout the entire lifecycle of the pod. We don’t want the probe to time out because the readiness check takes a long time to compute.
For newer (≥ 1.16) Kubernetes clusters, use a startup probe for applications with unpredictable or variable startup times. The startup probe may share the same endpoint (e.g. /healthz) as the readiness and liveness probes, but set its failureThreshold higher than the other probes to account for longer start times, while keeping a more reasonable time to failure for the liveness and readiness checks.
Readiness and liveness probes may share the same endpoint if the readiness probes aren’t used for other signalling purposes. If there’s only one pod (i.e. using a Vertical Pod Autoscaler), set the readiness probe to address the startup behaviour and use the liveness probe to determine health. In this case, marking the pod unhealthy means downtime.
Readiness checks can be used in various ways to signal system degradation. For example, if the application loses connection to the database, readiness probes may be used to temporarily block new requests and allow the system to reconnect. It can also be used to load balance work to other pods by marking busy pods as not ready.
In short, well-defined probes generally lead to better resilience and availability. Be sure to observe the startup times and system behaviour to tweak the probe settings as the applications change.
Finally, given the importance of Kubernetes probes, you can use a Kubernetes resource analysis tool to detect missing probes. These tools can be run against existing clusters or be baked into the CI/CD process to automatically reject workloads without properly configured resources.
Ideally, one would write the terraform script from scratch using this .
Clone the . The terraform script to provision the EKS cluster is available in this repo. The structure of the files is given below.
Now, run the terraform scripts to provision the infra required to .
Use the URL to . This creates both public and private keys on the machine. Upload the public key into the account that you have just created, give it a name, and ensure that you mention it in your terraform. This allows you to encrypt sensitive information.
Example: Create a user keybase. This is "egovterraform" in the case of eGov. Upload the public key here -
Use this to Decrypt the secret key. To decrypt the PGP message, upload the PGP Message, PGP Private Key and Passphrase.
Use this link to to fetch the kubeconfig file. This enables you to connect to the cluster from your local machine and deploy DIGIT services to the cluster.
GitHub (this provides you with the , )
(this provides the ssh public and private keys)
Add the earlier created ssh public key to GitHub user
Add ssh private key to the
With previously created
(username and password)
Prepare an < and <>. Name this file as desired. It has the following configurations:
credentials, secrets (you need to encrypt using and create a ci-secret.yaml separately)
Add subdomain name in
Check and add your project specific details (like github Oauth app , , , , , and )
To create a Jenkins namespace mark this true
Add your environment-specific kubconfigs under kubConfigs like
KubeConfig name and deploymentJobs name from should be the same
Update the and repo ssh url with the forked repo's ssh url.
Update the with your docker hub organization name.
Update the repo name "egovio" with your docker hub organization name in
Update the cache repo name
An IDE such as Visual Studio Code for better code visualization/editing capabilities
Fork the repo to your GitHub organization account
As all the DIGIT services are containerized and deployed on Kubernetes, we need to prepare deployment manifests. The same can be found .
Cluster network: A set of links, logical or physical, that facilitate communication within a cluster according to the Kubernetes .
Service: A Kubernetes that identifies a set of Pods using selectors. Unless mentioned otherwise, Services are assumed to have virtual IPs only routable within the cluster network.
exposes HTTP and HTTPS routes from outside the cluster to within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
An Ingress may be configured to give Services externally-reachable URLs, load balance traffic, terminate SSL / TLS, and offer name based virtual hosting. An is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.
An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type or .
You must have an to satisfy an Ingress. Only creating an Ingress resource has no effect.
You may need to deploy an Ingress controller such as . You can choose from a number of .
As with all other Kubernetes resources, an Ingress needs apiVersion, kind, and metadata fields. The name of an Ingress object must be a valid . For general information about working with config files, see , , . Ingress frequently uses annotations to configure some options depending on the Ingress controller, an example of which is the . Different support different annotations. Review the documentation for your choice of Ingress controller to learn which annotations are supported.
The Ingress has all the information needed to configure a load balancer or proxy server. Most importantly, it contains a list of rules matched against all incoming requests. Ingress resource only supports rules for directing HTTP(S) traffic.
An optional host. In this example, no host is specified, so the rule applies to all inbound HTTP traffic through the IP address specified. If a host is provided (for example, ), the rules apply to that host.
A backend is a combination of Service and port names as described in the or a by way of a . HTTP (and HTTPS) requests to the Ingress that matches the host and path of the rule are sent to the listed backend.
Learn about the
Learn about
Clone the git repo . Copy existing and with new environment name (eg..yaml and-secrets.yaml)
: a resource analysis tool with a nice dashboard that can also be used as a validating webhook or CLI tool.
: a static code analysis tool that works with Helm, Kustomize, and standard YAML files.
: read-only utility tool that scans Kubernetes clusters and reports potential issues with configurations.