All content on this page by eGov Foundation is licensed under a Creative Commons Attribution 4.0 International License.

Readiness & Liveness

An overview of the probes that can be set up so that Kubernetes automatically verifies service deployment and maintains service availability.


Last updated 1 year ago

Probes Overview

Kubernetes determines the state of a service using readiness, liveness, and startup probes in order to detect and deal with unhealthy situations. An application may need to initialize some state, make database connections, or load data before it can handle application logic. This gap between when the application is actually ready and when Kubernetes thinks it is ready becomes an issue when the deployment begins to scale, because unready applications receive traffic and send back 500 errors.

Many developers assume that a basic pod setup is adequate, especially when the application inside the pod is configured with daemon process managers (e.g. PM2 for Node.js). However, since Kubernetes deems a pod healthy and ready for requests as soon as all the containers start, the application may receive traffic before it is actually ready.

Kubernetes Probes

Kubernetes supports readiness and liveness probes in all versions, including 1.15 and earlier. Startup probes were added in 1.16 as an alpha feature and graduated to beta in 1.18 (WARNING: Kubernetes 1.16 stopped serving several deprecated API versions. Use the migration guide to check for compatibility).

All the probes have the following parameters:

  • initialDelaySeconds: number of seconds to wait after the container starts before initiating the probe

  • periodSeconds: how often to run the probe

  • timeoutSeconds: number of seconds before marking the probe as timing out (failing the health check)

  • successThreshold: minimum number of consecutive successful checks for the probe to pass after a failure

  • failureThreshold: number of consecutive failed checks before marking the probe as failed. For liveness probes, this will lead to the container restarting. For readiness probes, this will mark the pod as unready.
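As a minimal sketch of how the parameters above fit together, the manifest below sets each of them explicitly on a readiness probe. The pod name, image, port, and /healthz path are placeholders rather than anything defined by DIGIT; the values shown are the Kubernetes defaults except for initialDelaySeconds.

```yaml
# Hypothetical pod: the name, image, port, and /healthz path are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: probe-params-demo
spec:
  containers:
    - name: app
      image: example.org/my-service:1.0   # placeholder image
      ports:
        - containerPort: 8080
      readinessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10  # wait 10s after the container starts (default 0)
        periodSeconds: 10        # run the check every 10s (default 10)
        timeoutSeconds: 1        # an attempt fails after 1s without a response (default 1)
        successThreshold: 1      # one success marks the container ready (default 1)
        failureThreshold: 3      # three consecutive failures mark it unready (default 3)
```

With these values, a container that stops responding is marked unready after roughly 3 x 10 = 30s of consecutive failures.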

Readiness Probes

Readiness probes are used to let Kubelet know when the application is ready to accept new traffic. If the application needs some time to initialize the state after the process has started, configure the readiness probe to tell Kubernetes to wait before sending new traffic. A primary use case for readiness probes is directing traffic to deployments behind a service.

One important thing to note about readiness probes is that they run during the pod’s entire lifecycle. This means that readiness probes run not only at startup but repeatedly for as long as the pod is running. This is to deal with situations where the application is temporarily unavailable (e.g. loading large data, waiting on external connections). In this case, we don’t necessarily want to kill the application but rather wait for it to recover. Readiness probes are used to detect this scenario and stop sending traffic to these pods until they pass the readiness check again.
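The traffic-gating behaviour described above can be sketched with a Deployment behind a Service. All names, labels, the image, and the /healthz endpoint are hypothetical and only illustrate the shape of the configuration.

```yaml
# Hypothetical names, labels, image, and ports; adjust to the real service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo-api
  template:
    metadata:
      labels:
        app: demo-api
    spec:
      containers:
        - name: demo-api
          image: example.org/demo-api:1.0   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz        # assumed health endpoint
              port: 8080
            periodSeconds: 5        # keeps running for the pod's whole lifetime
            failureThreshold: 2     # about 10s of failures marks the pod unready
---
apiVersion: v1
kind: Service
metadata:
  name: demo-api
spec:
  selector:
    app: demo-api                   # only ready pods with this label receive traffic
  ports:
    - port: 80
      targetPort: 8080
```

A pod that fails the readiness check is removed from the Service endpoints but is not restarted; it starts receiving traffic again once the check passes.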

Liveness Probes

Liveness probes are used to restart unhealthy containers. The Kubelet periodically runs the liveness probe and, if the check fails, kills the container, which is then restarted according to the pod’s restart policy.

Liveness checks can help the application recover from a deadlock situation. Without liveness checks, Kubernetes deems a deadlocked pod healthy since the underlying process continues to run from Kubernetes’s perspective. By configuring the liveness probe, the Kubelet can detect that the application is in a bad state and restart it to restore availability.
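A liveness check of this kind might look like the sketch below. The pod name, image, and /healthz endpoint are placeholders; the endpoint is assumed to stop responding when the application deadlocks.

```yaml
# Hypothetical pod: the name, image, and /healthz endpoint are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: liveness-demo
spec:
  restartPolicy: Always            # default; the container is restarted after it is killed
  containers:
    - name: app
      image: example.org/my-service:1.0   # placeholder image
      ports:
        - containerPort: 8080
      livenessProbe:
        httpGet:
          path: /healthz           # assumed to hang or fail when the app is deadlocked
          port: 8080
        periodSeconds: 10          # check every 10s
        timeoutSeconds: 2          # a hung request counts as a failure after 2s
        failureThreshold: 3        # restart after 3 consecutive failures
```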

Startup Probes

Startup probes are similar to readiness probes but are only executed at startup. They are optimized for slow-starting containers or applications with unpredictable initialization processes. While a startup probe is configured, liveness and readiness checks are held off until it succeeds.

With readiness probes, we can configure initialDelaySeconds to determine how long to wait before probing for readiness. Now consider an application that occasionally needs to download large amounts of data or perform an expensive operation at the start of the process. Since initialDelaySeconds is a static number, we are forced to always plan for the worst-case scenario (or to raise failureThreshold, which may affect long-running behaviour) and wait for a long time even when the application does not need to carry out long-running initialization steps. With startup probes, we can instead configure failureThreshold and periodSeconds to model this uncertainty better. For example, setting failureThreshold to 15 and periodSeconds to 5 means the application gets 15 x 5 = 75s to start up before it is considered failed.
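The startup budget from the example (failureThreshold of 15 and periodSeconds of 5, giving 15 x 5 = 75s) can be sketched as follows. The pod name, image, and /healthz endpoint are placeholders, and the liveness probe is included only to show that it can keep a tighter failure window once startup has succeeded.

```yaml
# Hypothetical pod: the name, image, and /healthz endpoint are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: startup-demo
spec:
  containers:
    - name: app
      image: example.org/slow-starter:1.0   # placeholder image
      ports:
        - containerPort: 8080
      startupProbe:
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 5       # check every 5s during startup
        failureThreshold: 15   # up to 15 x 5 = 75s before the container is killed
      livenessProbe:           # only starts running after the startup probe succeeds
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3    # about 30s to failure once the app is running
```

An application that starts quickly passes the startup probe on an early attempt, so it does not pay the full 75s on every start.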

Configuring Probe Actions

Now that we understand the different types of probes, we can examine the three distinct ways to configure each probe.

HTTP

The Kubelet sends an HTTP GET request to an endpoint and checks for a 2xx or 3xx response. You can reuse an existing HTTP endpoint or set up a lightweight HTTP server for probing purposes (e.g. an Express server with a /healthz endpoint).

HTTP probes take in additional parameters:

  • host: hostname to connect to (default: pod’s IP)

  • scheme: HTTP (default) or HTTPS

  • path: path on the HTTP/S server

  • httpHeaders: custom headers if you need header values for authentication, CORS settings, etc.

  • port: name or number of the port to access the server

# Liveness check: GET /healthz on port 8080 must return a 2xx or 3xx status
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
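The additional HTTP parameters listed above can be combined as in the sketch below. The pod name, image, port name, and the X-Health-Check header are hypothetical and chosen only for illustration.

```yaml
# Hypothetical pod: the name, image, port name, and header are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: http-probe-demo
spec:
  containers:
    - name: app
      image: example.org/my-service:1.0   # placeholder image
      ports:
        - name: https                     # named port referenced by the probe
          containerPort: 8443
      readinessProbe:
        httpGet:
          scheme: HTTPS                   # default is HTTP
          path: /healthz
          port: https                     # port name instead of a number
          httpHeaders:
            - name: X-Health-Check        # hypothetical custom header
              value: kubelet
```

The host field is left out so that the probe connects to the pod's own IP, which is the default.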

TCP

To check whether or not a TCP connection can be made, you can specify a TCP probe. The pod is marked healthy if the Kubelet can establish a TCP connection to the specified port. Using a TCP probe may be useful for a gRPC or FTP server where HTTP calls may not be suitable.

# Readiness check: the FTP control port (21) must accept a TCP connection
readinessProbe:
  tcpSocket:
    port: 21
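For the gRPC case mentioned above, a TCP probe might be sketched as follows. The pod name, image, and port 50051 are assumptions for illustration; the probe only confirms that the port accepts connections, not that the gRPC service behind it is healthy.

```yaml
# Hypothetical pod: the name, image, and port are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: tcp-probe-demo
spec:
  containers:
    - name: grpc-server
      image: example.org/grpc-server:1.0   # placeholder image
      ports:
        - name: grpc
          containerPort: 50051
      livenessProbe:
        tcpSocket:
          port: grpc             # succeeds if a TCP connection can be opened
        initialDelaySeconds: 15  # give the server time to bind the port
        periodSeconds: 10
```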

Command

Finally, a probe can be configured to run a shell command. The check passes if the command returns with exit code 0; otherwise, the pod is marked as unhealthy. This type of probe may be useful if it is not desirable to expose an HTTP server/port or if it is easier to check initialization steps via command (e.g. check if a configuration file has been created, run a CLI command).

# Readiness check: passes only while `vault status` exits with code 0
readinessProbe:
  exec:
    command: ["/bin/sh", "-ec", "vault status -tls-skip-verify"]
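The file-based check mentioned above (waiting for a file to be created) can be sketched with a small busybox pod. The pod name and the /tmp/ready path are hypothetical; the container simulates initialization by creating the file after 20 seconds.

```yaml
# Hypothetical pod: the name and the /tmp/ready marker file are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: exec-probe-demo
spec:
  containers:
    - name: app
      image: busybox
      # Simulate initialization: create the marker file after 20s, then keep running.
      args: ["/bin/sh", "-c", "sleep 20; touch /tmp/ready; sleep 3600"]
      readinessProbe:
        exec:
          command: ["cat", "/tmp/ready"]   # exit code 0 only once the file exists
        periodSeconds: 5
```

The pod stays unready while cat fails on the missing file and becomes ready on the first check after the file appears.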

Best Practices

The exact parameters for the probes depend on your application, but here are some general best practices to get started:

  • For older (≤ 1.15) Kubernetes clusters, use a readiness probe with an initial delay to deal with the container startup phase (use p99 times for this). But make this check lightweight since the readiness probe will execute throughout the entire lifecycle of the pod. We don’t want the probe to time out because the readiness check takes a long time to compute.

  • For newer (≥ 1.16) Kubernetes clusters, use a startup probe for applications with unpredictable or variable startup times. The startup probe may share the same endpoint (e.g. /healthz) as the readiness and liveness probes, but set its failureThreshold higher than that of the other probes. This accounts for longer start times while keeping a reasonable time to failure for the liveness and readiness checks.

  • Readiness and liveness probes may share the same endpoint if the readiness probes aren’t used for other signalling purposes. If there’s only one pod (e.g. when using a Vertical Pod Autoscaler), set the readiness probe to address the startup behaviour and use the liveness probe to determine health. In this case, marking the pod unhealthy means downtime.

  • Readiness checks can be used in various ways to signal system degradation. For example, if the application loses connection to the database, readiness probes may be used to temporarily block new requests and allow the system to reconnect. They can also be used to load balance work to other pods by marking busy pods as not ready.
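Taken together, the practices above might translate into a container with all three probes sharing one endpoint, as in this sketch. The pod name, image, /healthz path, and all thresholds are illustrative assumptions and should be tuned from observed startup times.

```yaml
# Hypothetical pod: the name, image, endpoint, and thresholds are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: combined-probes-demo
spec:
  containers:
    - name: app
      image: example.org/my-service:1.0   # placeholder image
      ports:
        - containerPort: 8080
      startupProbe:              # generous budget for variable startup times
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 5
        failureThreshold: 30     # up to 30 x 5 = 150s to start
      readinessProbe:            # lightweight; runs for the pod's whole lifetime
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 5
        failureThreshold: 2      # stop routing traffic after about 10s of failures
      livenessProbe:             # tighter time to failure than the startup probe
        httpGet:
          path: /healthz
          port: 8080
        periodSeconds: 10
        failureThreshold: 3      # restart after about 30s of failures
```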

In short, well-defined probes generally lead to better resilience and availability. Be sure to observe the startup times and system behaviour to tweak the probe settings as the applications change.

Tools

Considering the significance of Kubernetes probes, you can utilize a Kubernetes resource analysis tool to identify any missing probes. These tools can be executed against existing clusters or integrated into the CI/CD pipeline to automatically reject workloads that don't have properly configured resources.

  • Polaris: a resource analysis tool with a nice dashboard that can also be used as a validating webhook or CLI tool.

  • Kube-score: a static code analysis tool that works with Helm, Kustomize, and standard YAML files.

  • Popeye: a read-only utility tool that scans Kubernetes clusters and reports potential issues with configurations.