Configure CRS Data Loader

Overview

Purpose

The CRS DataLoader is a Jupyter Notebook-based ETL (Extract, Transform, Load) system designed to load master data into the DIGIT CRS after deployment. It enables implementation teams to configure tenants, boundaries, departments, complaint types, and employees through a structured, validated workflow.

Key Components

Component
File
Description

Jupyter Interface

DataLoader.ipynb

User-facing notebook interface

Core ETL Logic

unified_loader.py

Main processing engine (2,919 lines)

Schema Validator

mdms_validator.py

MDMS schema validation (659 lines)

Configuration

.env.example

Environment configuration template

Templates

templates/

Generated Excel templates for data entry

Technology Stack

  • Frontend: Jupyter Notebook (Python kernel)

  • REST APIs: DIGIT microservices (MDMS, Boundary, HRMS, Localisation, Filestore)

  • Data Format: Excel (XLSX) → JSON → API

  • Validation: 3-layer validation (Excel, MDMS Schema, API)

Repository Setup

# Clone repository

# Navigate to DataLoader directory

# Launch Jupyter Notebook

# Access at: http://localhost:8888/tree

Architecture Overview

System Architecture

Data Loading Phases

Phase
Name
Purpose
Dependencies

Phase 0

Authentication

OAuth2 token acquisition

Admin credentials

Phase 1

Tenant & Branding

Create tenants and branding info

Authentication

Phase 2

Boundary Management

Create administrative boundaries

Tenant ID

Phase 3

Common Masters

Departments, Designations, Complaint Types

Tenant ID

Phase 4

Employee Creation

Bulk employee creation with roles

All previous phases

Phase 0: Authentication Flow

Overview

Before any data upload, the system must authenticate with the DIGIT gateway using the OAuth2 Password Grant flow.

Authentication Flow

API Details

Endpoint: POST /user/oauth/token

Headers:

Content-Type: application/x-www-form-urlencoded

Authorization: Basic ZWdvdi11c2VyLWNsaWVudDo= (Base64: egov-user-client:)

Request Body (with sample data):

username: DEV_SUPER_ADMIN

password: eGov@123

userType: EMPLOYEE

tenantId: pg

scope: read

grant_type: password

Response would look like below

Use the tenant ID, username and password values according to your configuration.

Code Implementation

Key Points

  • OAuth2 Password Grant flow is used

  • Client credentials are hardcoded: egov-user-client with an empty secret

  • Token is valid for 24 hours (86400 seconds)

  • auth_token is included in ALL subsequent API calls via RequestInfo


Phase 1: Tenant & Branding Flow

Overview

Phase 1 creates tenant records (cities/ULBs) and their branding information (logos, banners, languages).

Process Flow

Excel Template Structure

Sheet 1: Tenant Information

Column Name
Description
Mandatory
Example

Tenant Display Name

Name of ULB/State

Yes

City A ULB

Tenant Code

Unique tenant code (admin only)

Yes

PG.CITYA

Tenant Type

State or City

Yes

City

Logo File Path

Public URL of logo

Yes

https://.../logo.png

Latitude

Geographic latitude

No

30.7333

Longitude

Geographic longitude

No

76.7794

City Name

City name

No

City A

District Name

District name

No

District X

Address

Registered address

No

Main Road, Ward 12

Tenant Website

Official website URL

No

https://citya.gov.in

Sheet 2: Branding Details

Column Name
Description
Mandatory
Example

Banner URL

Application banner URL

No

https://.../banner.png

Logo URL

Main logo URL

Yes

https://.../logo.png

Logo URL (White)

White background logo

Yes

https://.../logo-white.png

State Logo

State emblem URL

No

https://.../state-logo.png

Data Transformation

Excel Input:

Tenant Display Name: City A ULB

Tenant Code: PG.CITYA

Tenant Type: City

Logo File Path: https://example.com/logo.png

Transformed API Payload:

API Calls in Phase 1

#
API Endpoint
Purpose
Method

1

/mdms-v2/v2/_create/tenant.tenants

Create tenant

POST

2

/localization/messages/v1/_upsert

Create localizations

POST

3

/mdms-v2/v2/_create/common-masters.StateInfo

Create branding

POST

Phase 2: Boundary Management Flow

Overview

Phase 2 creates hierarchical administrative boundaries (City → Zone → Ward → Block → Locality) for each tenant. This phase involves asynchronous processing via Kafka.

Process Flow

Hierarchy Definition

Typical Hierarchy Structure:

Level
Boundary Type
Parent Type

1

City

null

2

Zone

City

3

Ward

Zone

4

Block

Ward

5

Locality

Block

Create Hierarchy API:

Endpoint: POST /boundary-service/boundary-hierarchy-definition/_create

Request Body:

Boundary Template Structure

Column
Description
Mandatory
Example

Country

Country name

Yes

India

State

State name

Yes

Punjab

City

City name

Yes

City A

Zone

Zone name

No

Zone 1

Ward

Ward name/number

No

Ward 12

Block

Sub-ward division

No

Block A

Locality

Area/Locality name

No

Shastri Nagar

Code

Unique boundary code

Yes

PG_CITYA_ZONE1

Parent Code

Parent boundary code

Required (except Level 1)

PG_CITYA

API Calls in Phase 2

#
API Endpoint
Purpose
Method

1

/boundary-service/boundary-hierarchy-definition/_search

Search hierarchies

POST

2

/boundary-service/boundary-hierarchy-definition/_create

Create hierarchy

POST

3

/boundary-management/v1/_generate

Generate template

POST

4

/boundary-management/v1/_generate-search

Poll template status

POST

5

/filestore/v1/files/url

Get filestore URL

GET

6

/filestore/v1/files

Upload Excel

POST

7

/boundary-management/v1/_process

Process boundaries

POST


Phase 3: Common Masters Flow

Overview

Phase 3 creates departments, designations, and complaint types with auto-code generation and localisation.

Process Flow

Excel Template Structure

Sheet 1: Department & Designation Master

Column Name
Description
Mandatory
Example

Department Name*

Name of department

Yes

WATER DEPARTMENT

Designation Name*

Designation name

Yes

Executive Engineer

Sheet 2: Complaint Type Master

Column Name
Description
Mandatory
Example

Complaint Type*

Main complaint category

Yes

Water not coming

Complaint Sub Type*

Sub-category

Yes

tap broken

Department Name*

Must match Sheet 1

Yes

WATER DEPARTMENT

Resolution Time (Hours)*

SLA in hours

Yes

1

Search Words

Keywords for search

Yes

water tab,water supply

Autogeneration of Code

  • Department Code Generation:

Input (Excel): Output (API):

Department Name*
code

WATER DEPARTMENT

DEPT_1

SANITATION

DEPT_2

ELECTRICITY

DEPT_3

  • Designation Code Generation:

Department Name*
Designation Name*
Designation code
Department Code

WATER DEPARTMENT

engineer

DESIG_01

DEPT_1

WATER DEPARTMENT

manager

DESIG_02

DEPT_1

SANITIATION

supervisor

DESIG_03

DEPT_2

  • Service Code Generation for Complaints subtype

Complaint sub type*
ServiceCode

Tap broken

TAPBROKEN

Water not coming

WATERNOTCOMING

Street light not working

STREETLIGHTNOTWORKING

API Calls in Phase 3

#
API Endpoint
Purpose
Method

1

/mdms-v2/v2/_create/common-masters.Department

Create department

POST

2

/mdms-v2/v2/_create/common-masters.Designation

Create designation

POST

3

/mdms-v2/v2/_create/SERVICEDEFS.ServiceDefinitions

Create complaint type

POST

4

/localization/messages/v1/_upsert

Create localizations

POST

Relationship Rules

Rule
Description

1

Complaint Type → Department Name must exist in Department Sheet

2

Departments must be created before designations and complaints

3

Designation uses exact department name from Department Sheet

4

All mandatory fields must be filled


Phase 4: Employee Creation Flow

Overview

Phase 4 generates a dynamic employee template with live MDMS data, then bulk-creates employees with role and jurisdiction assignments.

Process Flow

Employee Template Structure

Column Name
Description
Mandatory
Data Type
Example

User Name*

Full name

Yes

Text

John Doe

Mobile Number*

10-digit unique

Yes

Numeric

9876543210

Password

Login password

No

Text

MyPass@123

Department Name*

From Phase 3

Yes

Dropdown

WATER DEPARTMENT

Designation Name*

From Phase 3

Yes

Dropdown

engineer

Role Names*

Comma-separated

Yes

Multi-select

GRO,EMPLOYEE

Gender

Male/Female/Other

No

Dropdown

Male

Employee Status

Employment status

No

Dropdown

EMPLOYED

Employee Type

Type of employee

No

Dropdown

PERMANENT

Assignment From Date*

Start date

Yes

DD-MM-YYYY

05-09-2024

Date of Appointment*

Joining date

Yes

DD-MM-YYYY

20-06-2024

Hierarchy Type

Admin hierarchy

No

Text

ADMIN

Boundary Type

Boundary type

No

Text

City

Boundary Code

Boundary code

No

Text

PG.CITYA

System-Generated Fields

Attribute
Logic
Example

Employee Code

Name uppercase, spaces → underscores

JOHN_DOE

Tenant ID

From upload session

pg.citya

User UUID

Generated by HRMS

1fda5623-448a-...

User Type

Always "EMPLOYEE"

EMPLOYEE

Active Flag

Always true

true

Data Transformation Example

Excel Input:

Transformed API Payload:

API Calls in Phase 4

#
API Endpoint
Purpose
Method

1

/mdms-v2/v2/_search (Departments)

Fetch departments

POST

2

/mdms-v2/v2/_search (Designations)

Fetch designations

POST

3

/mdms-v2/v2/_search (Roles)

Fetch roles

POST

4

/boundary-service/boundary/_search

Fetch boundaries

POST

5

/mdms-v2/v2/_search (GenderType)

Fetch gender types

POST

6

/mdms-v2/v2/_search (EmployeeStatus)

Fetch employee status

POST

7

/mdms-v2/v2/_search (EmployeeType)

Fetch employee types

POST

8

/mdms-v2/v2/_create/ACCESSCONTROL-ROLES.roles

Create missing roles

POST

9

/egov-hrms/employees/_create

Create employee

POST


Validation Framework

Status Tracking

After processing, each row gets status columns:

Column
Description
Values

_STATUS

Processing result

SUCCESS, EXISTS, FAILED

_STATUS_CODE

HTTP response code

201, 400, 409, 500

_ERROR_MESSAGE

Detailed error description

Error text

Colour Coding

Color
Status
Meaning

Green

SUCCESS

Record created successfully

Yellow

EXISTS

Record already exists (duplicate)

Red

FAILED

Validation or processing error


Error Handling

Common Error Messages

Error Message
Reason
Resolution

Department not found

Incorrect spelling or missing from Phase 3

Verify department name

Mobile number already exists

Duplicate in system

Use unique mobile number

Role not found

Invalid role code

Check valid role codes

Invalid date format

Not in DD-MM-YYYY format

Correct date format

Boundary code invalid

Not matching Phase 2 hierarchy

Verify boundary code

Parent boundary code not found

Invalid parent reference

Check parent code exists

Record already exists

Duplicate entry

Skip or update existing

Data Dependencies

Refer to the chart below to understand the data dependency at each phase


Complete API Reference

All APIs Summary (24 APIs)

#
Phase
API Endpoint
Purpose
Method

1

0

/user/oauth/token

OAuth authentication

POST

2

1

/mdms-v2/v2/_create/tenant.tenants

Create tenant

POST

3

1

/localization/messages/v1/_upsert

Create localizations

POST

4

1

/mdms-v2/v2/_create/common-masters.StateInfo

Create branding

POST

5

2

/boundary-service/boundary-hierarchy-definition/_search

Search hierarchies

POST

6

2

/boundary-service/boundary-hierarchy-definition/_create

Create hierarchy

POST

7

2

/boundary-management/v1/_generate

Generate template

POST

8

2

/boundary-management/v1/_generate-search

Poll template status

POST

9

2

/filestore/v1/files/url

Get filestore URL

GET

10

2

/filestore/v1/files

Upload Excel

POST

11

2

/boundary-management/v1/_process

Process boundaries

POST

12

3

/mdms-v2/v2/_create/common-masters.Department

Create department

POST

13

3

/mdms-v2/v2/_create/common-masters.Designation

Create designation

POST

14

3

/mdms-v2/v2/_create/SERVICEDEFS.ServiceDefinitions

Create complaint type

POST

15

4

/mdms-v2/v2/_search (Departments)

Fetch departments

POST

16

4

/mdms-v2/v2/_search (Designations)

Fetch designations

POST

17

4

/mdms-v2/v2/_search (Roles)

Fetch roles

POST

18

4

/boundary-service/boundary/_search

Fetch boundaries

POST

19

4

/mdms-v2/v2/_search (GenderType)

Fetch gender types

POST

20

4

/mdms-v2/v2/_search (EmployeeStatus)

Fetch employee status

POST

21

4

/mdms-v2/v2/_search (EmployeeType)

Fetch employee types

POST

22

4

/mdms-v2/v2/_create/ACCESSCONTROL-ROLES.roles

Create roles

POST

23

4

/egov-hrms/employees/_create

Create employee

POST

24

All

/mdms-v2/schema/v1/_search

Fetch MDMS schema

POST

Common RequestInfo Structure

All APIs use standard RequestInfo:

Common Response Codes

Code
Status
Description

200

OK

Search/fetch successful

201

Created

Record created successfully

400

Bad Request

Validation error

409

Conflict

Duplicate record (already exists)

500

Internal Server Error

Server-side error

Quick Reference Commands

# Clone repository

# Checkout data loader branch

# Navigate to DataLoader

# Check Jupyter version

# Start Jupyter Notebook

# Or use Jupyter Lab

Last updated

Was this helpful?