Pros and Cons of migrating to Sunbird RC
Digital Registries must ensure the following
Single Source of Truth
Data Privacy
Non Repudiation
Verification
Audit/History
Exchange using Open Standards
Slow Moving Data
Data may be classified depending on its rate of change.
Very Slow Changing Data may be called Master Data. In DIGIT, all Master Data is stored in a single service called the Master Data Registry. Examples of Master Data are Property Type, Property Usage, etc.
Slow changing data that form the basis for various transactions are stored in registries, e.g. Property, User, Employee, Trade License, etc.
Transaction Data is fast changing, e.g. payments for Property Tax.
Sunbird RC contains a set of frameworks to enable you to rapidly build next generation electronic registries and verifiable credentials including attestation and verification flows.
We need to evaluate whether DIGIT should migrate its registries to Sunbird RC.
This document serves as a briefing and overview of the core architecture and components of the platform for a new or unfamiliar developer. It seeks to address the what, why, and how of the platform at the time of writing. It is also meant to be a collaborative exercise, written by newbies for newbies, with future developers adding their own insights and learnings to this resource to have it grow with the platform over time.
This is NOT a technical reference or documentation. It is intended for orientation and will be written in natural language wherever possible. It is also limited in its scope to the general architecture of the back end, with little regard to how the systems necessarily converge to provide product solutions.
By the end of this document, you will be able to completely comprehend the following paragraph. It will equip you to understand the terminology, the tools, the features, and the implicit assumptions therein as well as provide you with solid grounded reasoning on why the architecture is the way that it is. The paragraph is an elevator pitch of the platform architecture, and it looks something like this:
In brief, the platform stack uses nginx servers with Zuul gateways to host Spring Boot microservices stored in Docker containers managed using Kubernetes. The servers rely on Kafka data streams to provide them with data that is indexed in ElasticSearch, and persisted in PostgreSQL databases.
Here’s what you need to know.
Definition: nginx (pronounced “Engine X”) is a web server designed to serve dynamic HTTP content fast. It serves 32% of all active websites on the internet as of 2019, making it the world’s most popular web server.
A server in this context is a computer on a network that holds some form of content and provides it when needed i.e. “serves” it.
Functionality: Nginx uses a modular event-driven architecture to handle requests asynchronously, rather than through threads. “Event-driven” means it performs actions as a reaction to things happening in its environment (such as requests for information, or changes in values), as opposed to constantly staying in action to function (which is what threading does).
Why nginx: Nginx is substantially faster than Apache at a fraction of the processor cost. The narrow scope of a microservice server means its configuration can be highly specialized, making it more efficient than a feature-rich server that would do more but run slower.
Because the platform microservices are all HTTP driven, a server that is optimized for fast dynamic HTTP processing makes logical sense.
Definition: Zuul is an open-source API gateway service developed and provided by Netflix. An API Gateway is a service that manages access control to a server that is hosting an API, which means that it can handle things like service requests that involve sending and receiving program operation-specific data and parameters and is custom-built for that purpose.
Functionality: Zuul acts as a proxy, accepting all incoming API requests and authenticating them before delegating them to the microservice in question. This means that whenever an app or a product is requesting or calling a microservice, it is actually connecting to Zuul first, rather than directly to the server. Once Zuul okays the request, it hands off to the server.
Why Zuul: Zuul provides two benefits: it acts as a wrapper on the internal mechanics of the microservices, meaning that any internal functionality concerns are irrelevant to any external clients. It also simplifies the server gateway and access system, allowing for a single configuration of authentication protocols to suffice for every deployed microservice. In the absence of a common gateway, authentication would have to be individually defined on every server access point, which would be tedious and redundant.
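For orientation only, here is a minimal sketch of what a Zuul "pre" filter looks like with Spring Cloud Netflix. The header name and the rejection logic are illustrative assumptions, not the platform's actual authentication flow.

```java
import com.netflix.zuul.ZuulFilter;
import com.netflix.zuul.context.RequestContext;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;

// A "pre" filter runs before the request is routed to the target microservice.
@Component
public class AuthPreFilter extends ZuulFilter {

    @Override
    public String filterType() {
        return "pre"; // run before routing
    }

    @Override
    public int filterOrder() {
        return 1; // relative order among pre filters
    }

    @Override
    public boolean shouldFilter() {
        return true; // apply to every request
    }

    @Override
    public Object run() {
        RequestContext ctx = RequestContext.getCurrentContext();
        // "auth-token" is an assumed header name, used here purely for illustration.
        String token = ctx.getRequest().getHeader("auth-token");
        if (token == null || token.isEmpty()) {
            // Reject unauthenticated requests before they ever reach a microservice.
            ctx.setSendZuulResponse(false);
            ctx.setResponseStatusCode(HttpStatus.UNAUTHORIZED.value());
        }
        return null; // Zuul ignores the return value
    }
}
```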
Definition: Spring is an application framework for Java, or the environment in which a Java application runs. Spring Boot is an opinionated instance of the Spring framework, which means that it is automatically preconfigured in the way most Java application frameworks tend to be configured on average.
Functionality: The opinionated configuration of Spring Boot means that a developer does not need to be spending time and resources to install the libraries and dependencies required for a specific Java application. They are all present at the time of deployment and only highly specialized dependencies need to be installed after the fact.
Why Spring: Because the platform consists of a large number of microservices, each with very simply defined functionality, it is unlikely that the kind of highly specialized dependencies that would make a non-opinionated configuration necessary will be required. Therefore, an opinionated instance that includes all the commonly required dependencies by design is an ideal match for the framework requirements of a project such as this.
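To make "opinionated" concrete, here is roughly what a Spring Boot microservice skeleton looks like; the class name and endpoint are placeholders, not taken from the platform.

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// @SpringBootApplication pulls in the opinionated auto-configuration described above:
// an embedded web server, JSON serialization and sensible defaults, with no manual wiring.
@SpringBootApplication
@RestController
public class DemoServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoServiceApplication.class, args);
    }

    // A minimal HTTP endpoint; a real service would delegate to business logic here.
    @GetMapping("/health")
    public String health() {
        return "UP";
    }
}
```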
Definition: Kafka is a real-time data streaming service. It allows other systems to subscribe or publish to a data stream (a sequence of data that updates asynchronously in real-time).
Functionality: Kafka acts as the backbone of the server architecture, handling data transfer between the databases and the microservices, as well as other platform entities that require access to data and functionality elements. It creates streams of information that services and network entities can either publish or subscribe to.
Why Kafka (or why Data Streaming): Data streaming in general, and Kafka in particular, address an important aspect of microservice architecture design. Inter-service communication plays a larger role in the functionality of such architecture over traditional service architectures, and being able to reliably and efficiently provide data to all the microservices active at a given time during runtime is essential to the platform working as intended.
With streaming, services that need data can request it independent of each other without affecting the functionality of others (a key advantage of a pub/sub model) and the data can be reliably expected to be up-to-date. With distributed streaming infrastructures like Kafka, scaling up to accommodate larger and more complex microservice deployments also becomes easier.
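As a sketch of the pub/sub idea, the snippet below publishes a message to a Kafka topic using the standard Java client. The broker address, topic name and payload are assumptions for illustration only.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class EventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // broker address is an assumption
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish an event to a topic; any service subscribed to this topic receives it
            // without the publisher needing to know who the consumers are.
            producer.send(new ProducerRecord<>("demo-events", "key-1", "{\"status\":\"CREATED\"}"));
        }
    }
}
```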
Definition: ElasticSearch is a search engine. It provides text-based search functionality across an indexed database.
Functionality: ElasticSearch allows for searching across all kinds of documents, including specifically schema-less JSON objects. It is quasi-real time, allows its database indices to be sharded (horizontally partitioned) with shard-level replication as well as distributed computation and storage.
ElasticSearch is complemented by Logstash, a data collection and logging system, and Kibana, an analytics and visualization dashboard. These three tools, combined with Beats, a lightweight data shipper (not used in this architecture), are collectively named the Elastic Stack.
Why the Elastic Stack: The Elastic stack is self-contained and highly functional, ideal for the “just works” configuration that is needed for scalable systems. Specifically, ElasticSearch is a more efficient method of searching the database since the query runtime is faster on indexed Elastic than on indexed relational databases. It works in tandem with the slower but more robust relational database to provide faster data access.
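For illustration, here is a hedged sketch of a full-text query using ElasticSearch's Java High Level REST Client; the index and field names are made up and not taken from the platform.

```java
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class SearchExample {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {

            // Full-text search over an index; "demo-index" and "description" are illustrative names.
            SearchRequest request = new SearchRequest("demo-index");
            request.source(new SearchSourceBuilder()
                    .query(QueryBuilders.matchQuery("description", "water connection")));

            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            response.getHits().forEach(hit -> System.out.println(hit.getSourceAsString()));
        }
    }
}
```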
Definition: PostgreSQL is an open-source relational database management system that originated as the POSTGRES project at the University of California, Berkeley, built by the team behind the earlier Ingres database.
Functionality: PostgreSQL is a fully functional RDBMS that is market competitive with other open source and proprietary database management tools. A full list of the features it offers would be slightly redundant to add to this document, but it could be introduced at a later date.
Why PostgreSQL: PostgreSQL has one real advantage over other open-source RDBMSs in that it is slightly faster: MySQL will, on average, run slower on certain specific query patterns and corner cases. Furthermore, there is a consensus in the platform development community that a move to PostgreSQL is inevitable in all but the most legacy of systems. Non-Postgres systems at large scale are only really being maintained because migration would be too resource-intensive to be worthwhile.
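As a small example of how a service talks to PostgreSQL, here is a plain JDBC sketch; the connection details, table and column names are placeholders, and the PostgreSQL JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PostgresExample {
    public static void main(String[] args) throws Exception {
        // The JDBC URL format is standard for PostgreSQL; database and credentials are placeholders.
        String url = "jdbc:postgresql://localhost:5432/demo_db";
        try (Connection conn = DriverManager.getConnection(url, "demo_user", "demo_password");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT id, name FROM demo_table WHERE name ILIKE ?")) {
            stmt.setString(1, "%water%"); // parameterized query avoids SQL injection
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                }
            }
        }
    }
}
```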
Definition: Docker is virtualization software that creates lightweight virtual environments called containers in which programs can be run with their own unique configuration of libraries, dependencies, and setups. Because all Docker containers run on one OS kernel, they are less resource intensive than virtual machines (which instantiate a new OS for every virtualization).
Functionality: Docker uses Linux functionality like cgroups (which allows for compartmentalizing hardware resources) and namespaces to isolate the containers without having to create a new instance of the kernel for every virtualization.
Docker containers are also ephemeral: a container exists only for as long as the app or service running inside it needs to perform its task, after which it is cleaned up.
Why Docker: Virtualization and containers are advantageous for a distributed scaled system because of the ease of configuration for individual microservice functionality. This in turn lowers the size of the resultant code base, as well as allows for constant delivery (since the entire stack does not need to be taken down to instantiate a new container for a new microservice).
Definition: Kubernetes is an open-source container orchestration platform that allows for automating the container deployment, maintenance, and scaling process.
Functionality: Kubernetes consolidates containers into pods, which are groups of containers guaranteed to be hosted in a single location and can share resources. These pods are then organized into services, where the containers are all intended to interact with each other. These are deployed in Kubernetes Nodes on the API server architecture, which are accessed by clients via the Kube-proxy interface.
Why Kubernetes: By design, Kubernetes and by extension the container architecture it facilitates, meet a lot of the concerns and requirements of microservice architectures. Over time, as the system complexity increases, the automation of container management means that the service can be scaled and managed without hindering functionality, provided the core design is consistent with the problem it is attempting to solve.
In brief, the platform stack uses nginx servers with Zuul gateways to host Spring Boot microservices stored in Docker containers managed using Kubernetes. The servers rely on Kafka data streams to provide them with data that is indexed in ElasticSearch, and persisted in PostgreSQL databases.
Now that you have read the document, you should be better equipped to understand what that means, as well as the raison d’être for its current state. You should also be cognizant of the context in which the platform functions, and the nature of the solutions it is capable of providing.
Most importantly, you are now ready to jump into the technical documentation and be able to put it in perspective with the system at large, while being able to focus on the specific aspect with which you are concerned.
Still doesn’t make sense? Feels like something is missing. Is everything in this document wrong and bad and you can’t believe someone actually wrote this stuff out? Don’t worry! This is a collaborative effort, and your contribution will be most welcome. Ping the author(s), leave a comment, or better yet, edit the document yourself and keep improving it. The more the better.
Over time, this document is intended to help any new team members become familiar and capable with the platform, and anything you learn or design that is worth adding to their knowledge should be added here.
If you’re good to go, however, then get in touch with your team and they will let you know what is next.
DIGIT analytics enable -
administrators to view the dashboard based on which they can take day-to-day planning and operational decisions
citizens to view and assess how the city administration is delivering services to them
employees to identify immediate areas of focus so that they can direct their efforts accordingly
analysts and researchers to access data in a format that enables them to analyse data rapidly and provide deep insights
In order to enable the above in a scalable, secure and reliable manner, the DIGIT platform needs to ensure -
transaction data is extracted, transformed and made available in an analytical datastore in a timely manner
privacy issues are addressed as data is moved to the analytical data store
as the transaction data structure is modified, the extract and transform programs continue to work seamlessly
users will have the ability to extend the transformation to suit their needs. Data may need to be transformed multiple times to address reporting and analytical requirements.
user can design and modify dashboards as per their requirements
users can access data only based on their role
raw as well as analytical datasets are made available through open data APIs for analysts and researchers
anomalies detected are bubbled up to the right users at the right time
users can perform descriptive, diagnostic, predictive and prescriptive analytics seamlessly
real-time scenarios, e.g. IoT, can be catered to by the platform.
Sunbird cQube (https://cqube.sunbird.org/) is something we should look at to see how it fits into our requirements.
Microservices and Low Code No Code architectures have evolved to address very different sets of software engineering challenges. With the advent of information technology and especially after its explosion in the post internet era, two major problems emerged.
1. Scale Problem - How to design cost-effective, evolvable and reliable systems that can scale to meet the requirements of millions of users?
2. Speed Problem - How to accelerate the development of software?
To address the scaling problem, technology companies and systems designers created design concepts and technologies like hardware virtualization (cloud), containers (e.g. Docker), Service Orientation (e.g. API First approach), Asynchronous Processing (e.g. Queues) etc. These technologies and design concepts eventually were put together into microservices based architecture and are now used to develop large scalable systems.
To address the speed problem of software development, early engineers focused on automatic code generation using CASE tools. CASE tools aimed to generate “high quality, defect free, maintainable software”. The key idea was to use software design models like ER diagrams and data flow diagrams as input and then generate code from these diagrams. Several of these tools became popular in the 90s. The main problem with these tools was that when programmers made changes to the generated code, the source model would go out of sync and become unusable. This limited the adoption of CASE tools to the initial phases of projects. Similar concepts are used even today by many developers to generate boilerplate code instead of coding everything from scratch.
In parallel to CASE tools, 4GLs (Fourth Generation Languages) and RAD (Rapid Application Development) also began trending. 4GL was first introduced by James Martin in 1981 in his book “Application Development without Programmers”. 4GL languages focused on higher-level constructs like information rather than bits and bytes: databases, reports, workflows, GUIs (graphical user interfaces), etc., and were accompanied by drag-and-drop form designers. Soon, people realized that programming in these higher-level constructs has limited applicability (due to lower expressivity), and the resulting programs are harder to refactor. Most 4GLs were focused on traditional Windows-based applications; the internet moved user interfaces to HTML, which made 4GL languages less relevant. Most companies who bet on 4GL rebranded themselves around Business Process Management (BPM) or Rules Engines, and the 4GL trend faded away.
The other trend was Rapid Application Development (RAD). The early software development process was adopted from civil engineering and followed an extremely rigid waterfall model: requirements, then design, then build, then deployment, with no back and forth accepted. RAD changed that by allowing feedback loops between the various stages of development, which let developers incorporate learning from one phase into the previous phase - so, basically, some back and forth was allowed. Using CASE tools for design and for generating working models fit quite well with this approach.
Low Code No Code environments trace their roots to CASE, 4GL and RAD. The principles are the same - model-driven design, automatic code generation, and visual programming. Their benefits and limitations are also the same. They can generate simple applications quite fast; however, these applications inherently suffer from low expressivity, extensibility, evolvability and scalability. The platform providers hide behind “Low Code” by providing the ability to write code within the designer, and as the complexity of applications increases, developers end up writing more and more code to incorporate the required functionality. Given that no standards exist for LCNC platforms today, the underlying models and pieces of code are all stored in proprietary formats. This creates significant vendor lock-in.
Given the need to build a large number of simple applications and the shortage of software engineering talent, a combination of backend microservices with a low code no code front end may be the way forward. This would enable scalable systems to be delivered at speed. However, with the increasing diversity of channels - web, mobile, chat, voice, kiosks, social media, etc. - the bar for low code no code platforms has been raised significantly. A low code no code platform that can integrate/orchestrate backend microservices and enable digital service delivery through a wide number of channels is the need of the hour.
At the same time, while microservices based architectures have been very successful in addressing the issues around scale and maintainability, they have led to increasing complexity of deployment and operations. A plethora of tools is emerging to address these issues, and a DevOps engineer needs to be aware of these tools to be able to deploy and manage microservices.
Cocreation Platform
Digit Low Code No Code will enable citizens, government employees and partners to rapidly compose new solutions on top of the platform using a visual editor; knowledge of coding languages is not required. The premise is that such a platform will not only expedite development but also make it easier for everyone to create new applications. Especially for governments across the world trying to digitize their services and processes, low code no code can lead to significant acceleration.
Enter any government office and one will see loads of forms. To avail a service or apply for a scheme, one needs to fill one of these forms, attach relevant documents and submit it at the counter. Depending on the nature of the application, it is routed from one officer to another till it’s registered in a registry. Then an appropriate certificate is issued that enables the citizen to access the service or benefit from the scheme.
Today many of these forms are being digitised through discrete applications which are often poorly written, difficult to maintain and expensive to modify.
We are proposing to build an open source low code no code platform which will allow government employees, vendors and citizens to design applications using a visual application designer. Behind the scenes, the designer will emit an application model based on an open application modelling language or specification. The model will be registered with an application runtime that will bring up the appropriate application model based on the user action: it will display the appropriate interface, e.g. a form, or execute the appropriate workflow. The data will be stored in an electronic registry in a secure, private and immutable manner. If required, a digital certificate will be issued which can be verified online.
Building such a low code no code based CoCreation environment on open application specifications will unlock this space and enable government organisations to digitise these processes rapidly. It will ease access, remove inefficiency, increase observability and simplify maintenance & upgrades. It will enable governments to adopt these technologies without being locked into vendors' proprietary platforms and infrastructure.
Building an open collaboration environment around these open low code no code environments will enable governments, citizens and businesses, including startups, to collaborate and co-create new services and rapidly evolve them to meet the needs of citizens. The underlying open specifications and the accompanying open source implementation will enable multiple startups to innovate and compete in building better platforms. This will create a new digital ecosystem of players.
Typically, a low code no code platform has the following components:
1. User Interface or Interaction Designer and User Interface or Interaction Runtime Engine
2. Process or WorkFlow Designer and Process or Workflow Runtime Engine
3. Reports Designer and Reports Engine
Users are able to use the designers to specify user interfaces, flows and reports. This results in well defined specifications which are stored in the file system or a database. These specifications are used by the runtimes to display the UI, orchestrate the flow and generate the reports.
Depending on the scenario, users can start by designing the entities, e.g. Order, and then generate the forms/views on the entity. Or users can design the form (e.g. Google Forms) and the entities get generated in the background. The associated CRUD APIs for these entities are also generated.
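One way to picture such "generated" CRUD APIs is a single generic endpoint keyed by entity type, sketched below with Spring. The paths, the in-memory store and the omission of validation against the stored specification are simplifying assumptions, not how any particular platform implements this.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

// A single generic controller standing in for per-entity generated CRUD APIs: the entity type
// comes from the path, and the payload would normally be validated against the stored specification.
@RestController
@RequestMapping("/entities")
public class GenericEntityController {

    // In-memory store used as a placeholder for the real registry/database.
    private final Map<String, Map<String, Map<String, Object>>> store = new ConcurrentHashMap<>();

    @PostMapping("/{entityType}")
    public Map<String, Object> create(@PathVariable("entityType") String entityType,
                                      @RequestBody Map<String, Object> payload) {
        String id = UUID.randomUUID().toString();
        payload.put("id", id);
        store.computeIfAbsent(entityType, t -> new ConcurrentHashMap<>()).put(id, payload);
        return payload;
    }

    @GetMapping("/{entityType}/{id}")
    public Map<String, Object> get(@PathVariable("entityType") String entityType,
                                   @PathVariable("id") String id) {
        return store.getOrDefault(entityType, Map.of()).get(id);
    }
}
```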
The advent of schema-free (NoSQL) databases and support for storage of JSON objects in relational databases has enabled storage of both the specifications and the entities whose schema those specifications define.
The creation, updation and deletion of these entities generate events that can trigger the workflows specified by the workflow designer. These workflows are typically a chain of event-condition-actions: each event contains a reference to the entity on which the conditions, typically if-then-else rules, are applied. If the conditions are met, the actions are executed, which can mean creating, updating or deleting another entity, or calling an API or external service that can assign work to or notify users to take appropriate actions.
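A minimal sketch of such an event-condition-action rule is shown below; the event name, threshold and notification action are invented purely for illustration.

```java
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// A rule pairs an event name with a condition (an if-then check on the entity)
// and an action to run when the condition holds.
record EcaRule(String onEvent,
               Predicate<Map<String, Object>> condition,
               Consumer<Map<String, Object>> action) {

    void apply(String event, Map<String, Object> entity) {
        if (onEvent.equals(event) && condition.test(entity)) {
            action.accept(entity);
        }
    }
}

class WorkflowDemo {
    public static void main(String[] args) {
        // "When an application entity is created with amount > 10000, notify an approver."
        EcaRule rule = new EcaRule(
                "ENTITY_CREATED",
                entity -> ((Number) entity.getOrDefault("amount", 0)).doubleValue() > 10_000,
                entity -> System.out.println("Notify approver for " + entity.get("id")));

        rule.apply("ENTITY_CREATED", Map.of("id", "APP-001", "amount", 25_000));
    }
}
```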
As entities are modified, the changed information is pushed into a message queue and then into an analytical datastore. The reports specified by the reports designer execute against this datastore (which is typically optimised for fast reads, e.g. ElasticSearch).
A case for federated architecture
COVID accelerated the use of digital payments infrastructure by various national and sub-national governments to enable Direct Benefit Transfers. While this helped provide support for millions across the world, a new challenge is emerging which requires due attention. In order to make direct transfers, governments need to identify the beneficiaries for the various schemes, based on scheme-specific criteria. The data required for determining eligibility may include land holdings, electricity usage, vehicle ownership, financial transactions, age, gender, caste, etc. These records currently reside in the respective departments, but many state governments are running initiatives to pull the data into a centralised database. Over time they have seeded all these databases with Aadhar and are now in a position to correlate this data to formulate a comprehensive profile of every citizen. While the objective of such databases is to identify eligible beneficiaries, there are several challenges in these initiatives that need to be thought through.
Single Source of Truth - The respective departments are the legal “registrars” of the respective attributes e.g. Vehicle records are owned by the Road Transport Department and so on. If data is being pushed into the central database, the ownership of ensuring the data is up to date should reside with the respective departments. The system must be designed in a manner to ensure that the most recent record is used to determine the eligibility criteria.
Security - Creating such a centralised database will make it a high risk asset and will require substantial investments in security to ensure adequate protection.
Privacy - Several questions around privacy arise which need to be addressed, e.g. will citizens have visibility into the attributes that are being stored and used for eligibility determination, is there a process for them to raise correction requests, what mechanisms are in place to limit the purpose for which these databases are used, can citizens opt out of such a database, etc.
Anomaly Detection - Since this database will be used for beneficiary eligibility, it will be a target for fraud. Mechanisms need to be put in place to detect anomalies e.g. population stability indexes must be computed and compared to ensure no large scale changes in the database are happening to enable inclusion in a specific scheme.
To address the above concerns, designers of these systems must consider a federated services architecture rather than centralised databases. Instead of pulling all the data into a central database, it may be possible to implement a centralised “Beneficiary Eligibility” service which in turn calls the respective departments’ “Beneficiary Eligibility” services, each of which returns a “Yes/No” answer. A scheme system queries the centralised beneficiary eligibility API by sending one or multiple records to it; the service then calls the respective department systems to check beneficiary eligibility in their respective databases and reverts with a result.
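A rough sketch of such a centralised eligibility service is shown below. The department endpoints, query parameters and the "all departments must say yes" aggregation policy are assumptions for illustration only; a real design would define these in the scheme rules.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

// A central eligibility service fans the query out to department services and aggregates
// their yes/no answers, without ever copying the underlying departmental records.
public class BeneficiaryEligibilityService {

    // Hypothetical department endpoints; in practice these would come from a service registry.
    private static final List<String> DEPARTMENT_APIS = List.of(
            "https://transport.example.gov/eligibility",
            "https://revenue.example.gov/eligibility");

    private final HttpClient http = HttpClient.newHttpClient();

    public boolean isEligible(String beneficiaryId, String schemeId) throws Exception {
        for (String api : DEPARTMENT_APIS) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create(api + "?beneficiaryId=" + beneficiaryId + "&schemeId=" + schemeId))
                    .GET()
                    .build();
            // Each department answers only "yes" or "no" against its own registry.
            HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
            if (!"yes".equalsIgnoreCase(response.body().trim())) {
                return false; // one possible policy: any "no" makes the beneficiary ineligible
            }
        }
        return true;
    }
}
```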
A federated architecture as above ensures the legal registrars of the data are the ones enabled to determine eligibility, rather than transferring control of such an activity to a central department. It preserves a single source of truth, namely the legal registrars of the respective data, and there is no escalation of security and privacy risks beyond what already exists in these databases.
Work in Progress
Application Schema
Entity Schema
Entity Attribute Schema
Access
EntityView
Group
Tab
Field
Condition
Draft - Work in Progress
Governments provide multiple services to their citizens in areas such as education, health, food, law and order, and energy, among others. To do so, revenue is collected via taxes on income, property, and sales, as well as through payment on services such as water and electricity. In addition, various targeted welfare programs or schemes such as direct benefit transfers to weaker sections of the society, or food distribution at lower prices are launched by governments. To deliver these services and schemes, the government has to interact with vendors from different industries.
Most countries also have a federal structure, where responsibilities are distributed between national, sub-national, and local governments - which are further bifurcated into departments and sub-departments - to deliver services and collect revenue. The departments, in turn, are categorised by geographical locations into zones, districts, blocks, and villages, to facilitate smooth delivery of services/programs across the country.
In India, for example, the national government has 40-plus ministries and 20-plus independent bodies. Every ministry has three to five departments, and each consists of five sub-departments. There are 36 sub-national governments: 28 states and 8 union territories. Each has its own departments and sub-departments. The snapshot from the local government directory website (https://lgdirectory.gov.in) below gives the numbers of districts, sub-districts, blocks, villages and local bodies.
Many of these ministries, departments, sub-departments interact (exchange information) and transact (exchange money) to facilitate the delivery of services and programs run by the government. One can only try to understand how complex these interactions might get.
In the digital age, the interactions at the government level are undergoing rapid transformation. Applications are helping digitize the interactions, automate tasks and coordinate the flow of information too. While this does deliver the benefits of automation, it also ends up hard coding these interactions into software, locking the associated data in closed databases that are run on non-scalable hardware architecture. As each department builds these applications, the complex interactions within the ecosystem get encoded into various applications resulting in fragmented and siloed data. Data updates and access to real-time data becomes a challenge for the end-users of the services/programs. Consequently, citizens and vendors have to run from pillar to post to update or make changes in the data. Administrators, on the other hand, struggle to get an integrated view of the data on time, resulting in delayed or incorrect decisions.
It becomes the citizen's and the vendor's responsibility to keep data updated in these departmental applications - they have to run from pillar to post filling forms, attaching proofs and standing in long lines to submit applications. At the same time, administrators struggle to get an integrated view of the data at the right time, and are thus forced to either delay decisions or make decisions based on gut.
To address these concerns, departments start integrating these siloed departmental applications, and many of them also build out a multitude of web and mobile apps for citizens and vendors. (The design of many of these applications does not take diversity, access and infrastructure issues into consideration, which widens the "digital divide" - but that is another important issue that needs separate attention and will not be discussed here.)
As these applications start getting integrated with each other, the interactions get further etched into software code and proprietary data exchange mechanisms designed by software engineers. Over a period of time, making changes to these data exchange mechanisms requires cascading changes across several departmental applications, which becomes a very expensive proposition and a program management nightmare - leading to multiple failures in technology implementations. As implementation failures increase, no government officer is willing to take the risk of initiating change. The entire ecosystem gravitates towards a sub-optimal equilibrium.
It's important to understand that the problem does not lie in the technology but in the architecture, i.e. in the design of how the interactions are encoded - siloed applications encapsulating fragmented databases, using proprietary data exchange formats, running on non-scalable hardware. A platform-based approach tries to address these challenges.
In the 1890s, several large cities like London and New York were debating the "Great Manure Crisis". As cities grew, the number of horse carriages increased. London had 50,000 horses, each generating 15-35 pounds of manure a day. This led to several issues around health, land for stables, food for the horses, etc. An article in The Times, London, in 1894 stated: "In 50 years, every street in London will be buried under nine feet of manure". The problem seemed so insurmountable that many proclaimed that urban civilisation was dead. By 1912, this seemingly insurmountable problem had completely disappeared: automobiles powered by internal combustion engines, built by 400+ manufacturers, had replaced the horse carriages. Today, London has 2.6M registered cars.
The internal combustion engine (ICE) is an example of a platform building block. It solved the pivotal problem of converting fuel into motion. This unlocked the space and accelerated innovation: 400+ automobile manufacturers raced to build multiple end solutions, e.g. cars, trucks, ships, manufacturing plants, etc. The ICE transformed how automobiles were built, which reshaped how roads were built, which in turn reshaped how cities were built.
Platforms are a set of highly reusable building blocks with high complementarity. Each building block solves a key pivotal problem in a manner that it can be reused for building multiple solutions.
Platforms are powerful and have unintended consequences, e.g. automobiles led to an increase in accidents, obesity and the consumption of fossil fuels. Hence it is important that platforms are built in the open and co-created by involving stakeholders from across the ecosystem - government, business and citizens. Appropriate policy interventions should go hand in hand to accelerate adoption of the platform as well as to ensure its ill effects are stemmed.
Digital Platforms
Today we are surrounded by Digital Platforms like Google Search, Facebook, WhatsApp, Amazon, Uber. These are powerful platforms that solve pivotal problems and facilitate digital interactions & transactions for the participants in their ecosystem. Similarly, there are platforms like Aadhar, UPI, Sunbird, Digit Urban at different levels of maturity that are trying to solve pivotal problems, unlock the respective spaces they operate in and enable ecosystems to build solutions on top of these platforms.
Closed platforms are like walled gardens. The businesses that own these platforms are gatekeepers and define the rules of engagement on these platforms, driven by their business goals. The rules are defined to ensure the value of these platforms accrues to the businesses that own and run them. The rules of engagement for open platforms are shaped by an open collaborative process where everyone is free to participate. To ensure open platforms evolve in a coherent manner, a proper governance body is needed - to evolve the standards, ensure openness of the building blocks, and ensure free and fair distribution of value.
Digital Building Blocks
Digital Building Blocks that make up platforms can come in various forms. Some of them are listed below.
Protocols and Formats for Data Exchange e.g. SMTP, HTTP, HTML etc.
Shared Registries e.g. Aadhar
Data Exchange Platforms e.g. UPI
Shared Services e.g. Payment, Collection etc.
Protocols and data exchange formats like SMTP enable seamless exchange of information. Anyone can participate in the ecosystem by building or installing an open source system, e.g. an email server or webserver; one doesn't need permission from, or to pay hefty amounts to, gatekeepers. Similarly, services provided by shared registries, like eKYC on Aadhar, can be accessed as long as one has the appropriate permission from the citizen.
The illustration below graphically summarizes all the above points about digital platforms and demonstrates how platforms can unlock new possibilities.
When building digital platforms, especially for enabling interactions between government and citizens, it is important that certain key principles be applied to ensure adoption and evolution, and to avoid unintended consequences.
Open - We have already talked about how digital platforms are powerful and hence need to be open, to ensure value is distributed across the actors of the ecosystem rather than concentrated to benefit a few. This requires that these platforms be built using an open process, with open standards, technology, APIs and data principles.
Unbundled/Modular - To ensure high reuse, the building blocks must be unbundled into small, modular and well defined microservices. Instead of trying to pack complexity into one large integrated solution, e.g. an ERP, it is key that the problem space be broken down into smaller modular building blocks that can be assembled and also evolved independently.
Federated - The architecture of the platform must ensure value accrues to all stakeholders of the system. Traditional centralized application architectures tend to concentrate power in the hands of those who control the data. Special care must be taken to ensure the platform does not create information flows that create an imbalance of power within the federated structure of the government.
Security - The data stored on these platforms is of very high value and will be subject to continuous attacks. It is imperative that the highest standards of security be applied to data at rest and in transit.
Privacy - Governments deal with a lot of private data about citizens. The platform architecture must enforce the privacy of individuals wherever possible and enable solution developers to enhance that privacy.
Minimum - Even though high reuse is of high importance for platforms, a platform must store only the minimum data and provide only the functionality that is fit for purpose.
Scalable - Given the scale of government, it is imperative that the platform be designed to scale to sub-national and national levels.
Governments are starting to recognize the power of digital platforms - on one hand, they are trying to control the unintended consequences of large closed digital platforms, and on the other hand, they are trying to build digital platforms to accelerate the attainment of developmental goals.
Digital Public Goods Alliance
Initiatives like the Digital Public Goods Alliance (https://digitalpublicgoods.net/) have been set up as a "multi-stakeholder initiative with a mission to accelerate the attainment of the sustainable development goals in low- and middle-income countries by facilitating the discovery, development, use of, and investment in digital public goods."
Digital platforms are powerful and we are seeing some early successes; however, it is important to highlight that these are still early days. Platforms require sustained cooperation and cocreation amongst multiple stakeholders to design, develop, implement and sustain.
Any ecosystem consists of various stakeholders interacting with each other. Information and communication technologies are disintermediating these interactions. As interactions get encoded in technology, we have an opportunity to rethink them. Digital platforms - through shared data registries, open protocols and common services - unbundle ecosystems, make information available and provide an opportunity to rearchitect the interactions in the ecosystem. Solution designers who build on top of the platform then at least have a chance to innovate and rebuild solutions that ensure value and benefit accrue to all stakeholders. Development of multiple such solutions will unlock existing ecosystems that are stuck in sub-optimal equilibriums (be it health, finance, education, etc.).
Platforms themselves are only part of the solution; the reimagination of these future possibilities still needs to be done, and solutions will need to be built on top of these platforms. To make the point clearer: platforms, like internal combustion engines, make new solutions (e.g. cars, trucks) possible - they create powerful opportunities, but they are not "silver bullets".
DevSecOps is the philosophy of integrating security practices within the DevOps pipeline
As we scale DIGIT to a core platform and leverage the same across multiple product streams, concerns about platform security, services and the underlying Kubernetes infrastructure have increased. How do we adapt security practices for a containerized hybrid cloud environment? Security needs to be declarative, built-in, and automated; apps need to be natively more secure; and security needs to shift left in the application life cycle, whereas standard security practices start only after the application is deployed.
By developing security as code, we strive to create awesome products and services, provide insights directly to developers, and generally favour iteration over trying to always come up with the best answer before every release and deployment.
We will not simply rely on scanners and reports to make code better. We will attack products and services like an outsider to defend what we've created. We will learn the loopholes, look for weaknesses, and we will work to provide remediation actions instead of long lists of problems.
We will not wait for our organizations to fall victim to mistakes and attackers. We will not settle for finding what is already known; instead, we will look for anomalies yet to be detected. We will strive to be better partners by upholding platform values.
Best practices for automating security checks and remediation.
Hardening the container and Kubernetes infrastructure and workloads.
Detecting and responding to runtime threats with sustained efforts.
Integrate security scanners for containers: This should be part of the process for adding containers to the registry.
Automate security testing in the CI process: This includes running security static analysis tools as part of builds, as well as scanning any pre-built container images for known security vulnerabilities as they are pulled into the build pipeline.
Add automated tests for security capabilities into the acceptance test process: Automate input validation tests, as well as verification of authentication and authorization features.
Automate security updates, such as patches for known vulnerabilities: Do this via the DevOps pipeline. It should eliminate the need for admins to log into production systems while creating a documented and traceable change log.
Automate system and service configuration management capabilities: This allows for compliance with security policies and the elimination of manual errors. Audit and remediation should be automated as well.
Security starts with engineering; understand that developers are engineers whereas hackers are reverse engineers.
Encourage good security hygiene in engineering.
Continuous assessments and compliance checks.
Real-time threat alerting across apps and services.
Enable developers to drive iterative security changes.
Secure the CI/CD pipeline.
Release in small and frequent batches.
Embed code analysis into Q/A.
Use tools to detect that private keys or API information are not pushed on the Version Control.
Empower teams to improve security practices and make changes.
Quick review and approval process.
Changes must leave an audit trail.
Meet compliance requirements.
Enforce operational and security hygiene.
Establish strict password policies.
Audit everything from code pushes, pipelines and compliances.
Monitor systems for bad behaviour.
Monitor apps and services to detect and alert on threats.
Instrument services to identify compromises.
Built-in real-time alerting and controls.
Develop Ansible playbooks and response scenarios for IT and Security.
Conduct vulnerability scans and practices.
Conduct periodic scans of product build.
Code reviews and penetration tests.
Establish remediation SLAs.
Transform the team into security ninjas.
Participate in industry conferences.
Invest in security certifications.
Educate employees on security risks.
Prepare teams for incident response.
IDE Plugins — IDE extensions that can work like spellcheck and help to avoid basic mistakes at the earliest stage of coding (IDE is a place/program where devs write their code for those who don’t know). The most popular ones are probably DevSkim, JFrog Eclipse, and Snyk.
Pre-Commit Hooks — Tools from this category prevent us from committing sensitive information like credentials into the code management platform. There are some open-source options available, like git-hound, git-secrets, and repo-supervisor.
Secrets Management Tools allow us to control which service has access to which password specifically. Big players like AWS, Microsoft, and Google have their own solutions in this space, but we will use a cloud-provider-agnostic tool when multi-cloud or hybrid-cloud is in place.
Static Application Security Testing (SAST) is about checking source code (when the app is not running). There are many free & commercial tools in the space, as the category is over a decade old. Unfortunately, they often result in a lot of false positives and can’t be applied to all coding languages. What’s worse is that they take hours (or even days) to run, so the best practice is to do incremental code tests during the weekdays and scan the whole code during the weekend.
Source Composition Analysis (SCA) tools are straightforward — they look at libraries that we use in our project and flag the ones with known vulnerabilities. There are dozens of them on the market, and they are sometimes offered as a feature of different products — e.g. GitHub.
Dynamic Application Security Testing (DAST) is the next one in the security chain, and the first one testing running applications (not the source code as SAST — we can read about other differences here). It provides fewer false positives than SAST but is similarly time-consuming.
Interactive Application Security Testing (IAST) combines SAST and DAST elements by placing an agent inside the application and performing real-time analysis anywhere in the development process. As a result, the test covers both the source code and all the other external elements like libraries and APIs (this wasn’t possible with SAST or DAST, so the outcomes are more accurate). However, this kind of testing can have an adverse impact on the performance of the app.
Secure infrastructure as code — As containers are gaining popularity, they become an object of interest for malware producers. Therefore we need to scan Docker images that are downloaded from public repositories, and tools like Clair will highlight any potential vulnerabilities.
Compliance as code tools will turn compliance rules and policy requirements into automated tests. To make it possible dev teams need to translate human-readable rules received from non-tech people into code, and compliance-as-a-code tools should do the rest (point out where we are breaking the rules or block updates if they are not in line with the policies).
Runtime application self-protection (RASP) allows applications to run continuous security checks and react to attacks in real time by getting rid of the attacker (e.g. closing their session) and alerting the team about the attack. Similarly to IAST, it can hurt app performance. It is the fourth testing category shown in the pipeline (after SAST, DAST, and IAST), and we should have at least two of them in the stack.
Web Application Firewall (WAF) lets us define specific network rules for a web application and filter, monitor, and block HTTP traffic to and from a web service when it corresponds to known patterns of attacks e.g. SQL injection. All big cloud providers like Google, AWS and Microsoft have got their WAF, but there are also specialised companies like Cloudflare, Imperva and Wallarm, for example.
Monitoring tools — as mentioned in a DevOps guide, monitoring is a crucial part of the DevOps manifesto. DevSecOps takes it to the next level and covers not only things like downtime but also security threats.
Chaos engineering. Tools from this category allow us to test the app under different scenarios and patch holes before problems emerge. “Breaking things on purpose is preferable to being surprised when things break,” as Mathias Lafeldt from Gremlin put it.
Vulnerability management — these tools help identify the holes in the security systems. They classify weaknesses by the potential impact of malicious attacks taking advantage of them, so that one can focus on fixing the most dangerous ones first. Some of the tools might come with add-ons that automatically fix found bugs. This category is full of open-source solutions.
All content on this website by eGov Foundation is licensed under a Creative Commons Attribution 4.0 International License.
Application Schema
ID
Name
Entities - Array of Entity

Entity Schema
ID
Name
Attributes - Array of Attribute
Accesses - Array of Access

Entity Attribute Schema
ID
Name
Type
Required
MaxLength
MinLength
MaxValue
MinValue
DefaultValue
PossibleValues

Access
Role - Type of Access, e.g. Owner, Editor, Viewer, Commenter

EntityView
ID
Name
DisplayAs
ViewType - e.g. Form, Table, List, Chat
EntityID
Groups - Array of Group. Optional
Tabs - Array of Tab. Optional
Fields - Array of Field
Conditions - Array of Condition

Group
ID
Name

Tab
ID
Name

Field
ID
EntityAttributeID - The ID of the entity attribute to which the field is mapped
DisplayAs
GroupID - Optional. The ID of the Group to which the field belongs
TabID - Optional. The ID of the Tab to which the field belongs
Help - Help to be displayed on the field
IsPII - Is the field Personally Identifiable Information

Condition
ID
Target1FieldID - The ID of the target field whose value needs to be checked
ConditionType - e.g. Equals, Not Equals, Greater Than, Less Than, etc.
Target2Type - Type of Target 2, e.g. Field or Value
TargetValue - Fixed value to compare with
Target2FieldID - The ID of the field whose value the Target 1 field value needs to be compared with
Action - The action to be performed if the condition is met, e.g. Hide or Show, Required or Optional, Disable or Enable
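To make the draft specification above concrete, the sketch below models it as plain Java records. The Java types chosen for each attribute (strings for IDs and enumerated values, boxed numbers for optional limits) are assumptions, since the draft does not specify them.

```java
import java.util.List;

// A sketch of the draft specification as plain data types; the names mirror the definitions above.
record Application(String id, String name, List<Entity> entities) {}

record Entity(String id, String name, List<EntityAttribute> attributes, List<Access> accesses) {}

record EntityAttribute(String id, String name, String type, boolean required,
                       Integer maxLength, Integer minLength,
                       Double maxValue, Double minValue,
                       Object defaultValue, List<Object> possibleValues) {}

record Access(String role, String typeOfAccess) {}   // e.g. Owner, Editor, Viewer, Commenter

record EntityView(String id, String name, String displayAs,
                  String viewType,                    // e.g. Form, Table, List, Chat
                  String entityId, List<Group> groups, List<Tab> tabs,
                  List<Field> fields, List<Condition> conditions) {}

record Group(String id, String name) {}

record Tab(String id, String name) {}

record Field(String id, String entityAttributeId, String displayAs,
             String groupId, String tabId, String help, boolean isPii) {}

record Condition(String id, String target1FieldId,
                 String conditionType,                // e.g. Equals, Not Equals, Greater Than
                 String target2Type,                  // Field or Value
                 Object targetValue, String target2FieldId,
                 String action) {}                    // e.g. Hide or Show, Required or Optional
```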