
# DIGIT: Internal Datamart Deployment Steps

This page describes how to set up the environment and run the script that produces a fresh copy of the required datamart CSV file.

### Steps for setting up the environment for running the script <a href="#steps-for-setting-up-the-environment-for-running-the-script" id="steps-for-setting-up-the-environment-for-running-the-script"></a>

**(One Time Setup)**

1. **Install kubectl**

Step 1: Follow the Kubernetes documentation to install and configure kubectl. Useful links: [Kubernetes Installation Doc](https://kubernetes.io/docs/tasks/tools/install-kubectl/), [Kubernetes Ubuntu Installation](https://matthewpalmer.net/kubernetes-app-developer/articles/install-kubernetes-ubuntu-tutorial.html)

After installing, run the following command to check the version installed on your system: `kubectl version`

Step 2: Install aws-iam-authenticator\
[Installing aws-iam-authenticator - Amazon EKS](https://docs.aws.amazon.com/eks/latest/userguide/install-aws-iam-authenticator.html)

Step 3: After installing, you need access to the cluster of a particular environment.

* Go to the `$HOME/.kube` folder: `cd $HOME/.kube`
* Open the config file: `gedit config`
* Replace its content with the content of the environment cluster config file provided to you (the config file will be attached), then save the file.
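The config replacement in Step 3 can also be scripted. The following is a minimal sketch, not part of the original procedure: the helper name `install_kubeconfig`, the `config.bak` backup file, and the demo file names are all hypothetical. Keeping a backup is an added precaution, assuming you may want to switch back to a previous cluster later.

```python
import shutil
import tempfile
from pathlib import Path


def install_kubeconfig(provided: Path, kube_dir: Path) -> Path:
    """Copy `provided` to kube_dir/config, backing up any existing config first."""
    kube_dir.mkdir(parents=True, exist_ok=True)
    target = kube_dir / "config"
    if target.exists():
        # Hypothetical safety step: keep the previous cluster config as config.bak.
        shutil.copy2(target, kube_dir / "config.bak")
    shutil.copy2(provided, target)
    return target


# Demonstration in a throwaway directory so nothing real is overwritten.
# On a real machine kube_dir would be Path.home() / ".kube".
demo_root = Path(tempfile.mkdtemp())
demo_kube = demo_root / ".kube"
demo_kube.mkdir()
(demo_kube / "config").write_text("old-config\n")
provided_file = demo_root / "env-cluster-config"  # stand-in for the attached config file
provided_file.write_text("new-config\n")

installed = install_kubeconfig(provided_file, demo_kube)
print(installed.read_text().strip())
print((demo_kube / "config.bak").read_text().strip())
```
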

**2. Exec into the pod**

`kubectl exec --stdin --tty playground-584d866dcc-cr5zf -n playground -- /bin/bash`

(*Replace the pod name depending on which data you want. Refer to Table 1.2 for more information.*)

**3. Install Python and check that it installed correctly**

`apt install python3.8`\
`python3.8 --version`

**4. Install pip and check that it installed correctly**

`apt install python3-pip`\
`pip3 --version`

**5. Install psycopg2 and pandas**

`pip3 install psycopg2-binary pandas`

Note: If this doesn't work, upgrade pip with `pip3 install --upgrade pip` and then run the command in step 5 again.

### Steps for running the script <a href="#steps-for-setting-up-the-environment-for-running-the-script.1" id="steps-for-setting-up-the-environment-for-running-the-script.1"></a>

**(Every time you want a datamart with the latest data available in the pods)**

**1. Send the Python script to the pod**

`tar cf - /home/priyanka/Desktop/mcollect.py | kubectl exec -i -n playground playground-584d866dcc-cr5zf -- tar xf - -C /tmp`

Note: Replace the file path (`/home/priyanka/Desktop/mcollect.py`) with your own file path (`/home/user_name/Desktop/script_name.py`).

Note: Replace the pod name depending on which data you want. (Refer to Table 1.2 for more information on pod names.)
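To make the substitutions in the notes above less error-prone, the command can be assembled from its parts. This is a minimal sketch; the helper names `send_script_command` and `path_inside_pod` are hypothetical. The second helper reflects the fact that tar removes the leading `/` from member names, so the script ends up below `/tmp` under its original directory structure.

```python
import shlex
from pathlib import PurePosixPath


def send_script_command(local_script: str, pod: str, namespace: str = "playground") -> str:
    """Build the tar | kubectl exec pipeline that copies a script into /tmp on the pod."""
    return (
        f"tar cf - {shlex.quote(local_script)} | "
        f"kubectl exec -i -n {shlex.quote(namespace)} {shlex.quote(pod)} -- tar xf - -C /tmp"
    )


def path_inside_pod(local_script: str) -> str:
    """Return where the script lands in the pod: /tmp plus the path without its leading '/'."""
    return str(PurePosixPath("/tmp") / local_script.lstrip("/"))


cmd = send_script_command("/home/priyanka/Desktop/mcollect.py", "playground-584d866dcc-cr5zf")
print(cmd)
print(path_inside_pod("/home/priyanka/Desktop/mcollect.py"))
```

The sketch only prints the command; it does not run kubectl, so it can be tried safely before touching a cluster.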

**2. Exec into the pod**

`kubectl exec --stdin --tty playground-584d866dcc-cr5zf -n playground -- /bin/bash`

(Note: Replace the pod name depending on which data you want, as in `kubectl exec --stdin --tty <your_pod_name> -n playground -- /bin/bash`. Refer to Table 1.2 for more information.)

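**3. Move into the `/tmp` directory and then into the directory your script was in**

`cd /tmp`\
`cd home/priyanka/Desktop`

The second path is relative because tar strips the leading `/` when archiving, so the script is extracted under `/tmp/home/...`. For your own script: `cd home/<your_username>/Desktop`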

**4. List the files there**

`ls`

(The Python script file should be present here. Refer to Table 1.1 for the list of script file names for each module.)

**5. Run the Python script file**

`python3 ws.py`

(The name of the Python script file changes depending on the module. Refer to Table 1.1 for the list of script file names for each module.)

**6. Outside the pod shell, from your home directory, run this command to copy the CSV file or files to your desired location**

`kubectl cp playground/playground-584d866dcc-cr5zf:/tmp/mcollectDatamart.csv /home/priyanka/Desktop/mcollectDatamart.csv`

(The CSV file names for each module are listed in Table 1.1 below.)

**7. The CSV file is now ready to use.**
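Some modules produce more than one CSV file, so step 6 may need to be repeated. The following sketch generates one `kubectl cp` command per file from the names in Table 1.1. The helper name `copy_commands` is hypothetical, and the sketch assumes every script writes its CSV to `/tmp` in the pod, which the page only shows for mCollect.

```python
from pathlib import PurePosixPath

# CSV file names per module, copied from Table 1.1.
DATAMART_FILES = {
    "PT": ["ptDatamart.csv"],
    "W&S": ["waterDatamart.csv", "sewerageDatamart.csv"],
    "PGR": ["pgrDatamart.csv"],
    "mCollect": ["mcollectDatamart.csv"],
    "TL": ["tlDatamart.csv", "tlrenewDatamart.csv"],
    "Fire Noc": ["fnDatamart.csv"],
    "OBPS (Bpa)": ["bpaDatamart.csv"],
}


def copy_commands(module, pod, dest_dir, namespace="playground"):
    """Return one `kubectl cp` argument list per CSV file the module produces."""
    commands = []
    for name in DATAMART_FILES[module]:
        # Assumption: the script leaves its output in /tmp inside the pod.
        source = f"{namespace}/{pod}:/tmp/{name}"
        target = str(PurePosixPath(dest_dir) / name)
        commands.append(["kubectl", "cp", source, target])
    return commands


for command in copy_commands("W&S", "playground-584d866dcc-cr5zf", "/home/priyanka/Desktop"):
    print(" ".join(command))
```

Each argument list could be passed to `subprocess.run` on a machine with cluster access; here the commands are only printed.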

### Jupyter vs Excel for Data Analysis <a href="#jupyter-vs-excel-for-data-analysis" id="jupyter-vs-excel-for-data-analysis"></a>

| Jupyter | Excel |
| ------- | ----- |
| <p>Jupyter is <strong>command-based</strong>.</p><p>It takes some time to get used to.</p> | **Ease of use** through the graphical user interface (**GUI**). Learning formulas is fairly easy. |
| Jupyter requires Python for data analysis, hence a **steeper learning curve**. | **Negligible** prior knowledge is required. |
| Equipped to handle large amounts of data **quickly**, with the bonus of easy **access to databases** such as Postgres and MySQL, where the actual data is stored. | <p>Excel can only handle so much data. Scaling becomes difficult and messy.</p><p><strong>More Data = Slower Results</strong></p> |
| <p><strong>Summary:</strong></p><p>Python is harder to learn because you have to download many packages and set up the correct development environment on your computer. However, it provides a big leg up when working with big data and creating repeatable, automatable analyses and in-depth visualizations.</p> | <p><strong>Summary:</strong></p><p>Excel is best for small, one-time analyses or for creating basic visualizations quickly. Thanks to its GUI, it is relatively easy to become an intermediate user without much experience.</p> |

#### How to install and configure jupyter to analyze the datamart <a href="#how-to-install-and-configure-jupyter-to-analyze-the-datamart" id="how-to-install-and-configure-jupyter-to-analyze-the-datamart"></a>

Watch this video:

[How To Install Jupyter Notebook In Ubuntu Linux](https://www.youtube.com/watch?v=Yg9AkozItTU)

OR

Follow these steps:

**(One Time Setup)**

**1. Install Python and check that it installed correctly**

`apt install python3.8`\
`python3.8 --version`

**2. Install pip and check that it installed correctly**

`apt install python3-pip`\
`pip3 --version`

**3. Install Jupyter**

`pip3 install notebook`

**(Whenever you want to run Jupyter Notebook)**

**1. Start Jupyter Notebook**

`jupyter notebook`

**2. To open a new notebook**

New -> Python3 notebook

**3. To open an existing notebook**

Select File -> Open

Go to the directory where your sample notebook is.

Select that notebook (for example, sample.ipynb)

**Opening an existing notebook**

![](/files/-MgiGlCJDEPjWFpzYybq)

After opening

![](/files/-MgiGlCLA0iccU5p11_8)
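Once a notebook is open, a first cell usually loads the datamart CSV and checks its shape. The sketch below uses only the Python standard library so that it runs before any extra packages are installed; in practice `pandas.read_csv` is the more common choice in a notebook. The file `sampleDatamart.csv`, its column names, and its rows are made up for illustration and do not reflect the real datamart schema.

```python
import csv
import tempfile
from collections import Counter
from pathlib import Path

# Stand-in for a real datamart file; columns and rows are hypothetical.
sample = Path(tempfile.mkdtemp()) / "sampleDatamart.csv"
sample.write_text("id,status\n1,ACTIVE\n2,ACTIVE\n3,INACTIVE\n")


def summarize(csv_path, column):
    """Return (row count, column names, value counts for `column`)."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return len(rows), reader.fieldnames, Counter(row[column] for row in rows)


count, columns, by_status = summarize(sample, "status")
print(count, columns, dict(by_status))
```

To use it on a real file, point `summarize` at the CSV copied out of the pod and pass one of that file's actual column names.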

### Table 1.1 - File Names for each Module <a href="#table-1.1-file-names-for-each-module" id="table-1.1-file-names-for-each-module"></a>

| Module Name    | Script File Name (With Links)                                                                         | Datamart CSV File Name | Additional Datamart CSV File Name |
| -------------- | ----------------------------------------------------------------------------------------------------- | ---------------------- | --------------------------------- |
| **PT**         | [pt.py](https://github.com/egovernments/utilities/blob/Rain-3035/datamart/property-tax/pt.py)         | ptDatamart.csv         |                                   |
| **W\&S**       | [ws.py](https://github.com/egovernments/utilities/blob/Rain-3035/datamart/water-and-sewerage/ws.py)   | waterDatamart.csv      | sewerageDatamart.csv              |
| **PGR**        | [pgr.py](https://github.com/egovernments/utilities/blob/Rain-3035/datamart/pgr/pgr.py)                | pgrDatamart.csv        |                                   |
| **mCollect**   | [mcollect.py](https://github.com/egovernments/utilities/blob/Rain-3035/datamart/mcollect/mcollect.py) | mcollectDatamart.csv   |                                   |
| **TL**         | [tl.py](https://github.com/egovernments/utilities/blob/Rain-3035/datamart/trade-license/tl.py)        | tlDatamart.csv         | tlrenewDatamart.csv               |
| **Fire Noc**   | [fn.py](https://github.com/egovernments/utilities/blob/Rain-3035/datamart/firenoc/fn.py)              | fnDatamart.csv         |                                   |
| **OBPS (Bpa)** | [bpa.py](https://github.com/egovernments/utilities/tree/Rain-3035/datamart/obps/bpa)                  | bpaDatamart.csv        |                                   |

### Table 1.2 - Pod Names for each Module <a href="#table-1.2-pod-names-for-each-module" id="table-1.2-pod-names-for-each-module"></a>

| Module Name    | Pod Name                    | Description                         |
| -------------- | --------------------------- | ----------------------------------- |
| **PT**         | playground-865db67c64-tfdrk | Punjab Prod Data in UAT Environment |
| **W\&S**       | playground-584d866dcc-cr5zf | QA Data                             |
| **PGR**        | Local Data                  | Data Dump                           |
| **mCollect**   | playground-584d866dcc-cr5zf | QA Data                             |
| **TL**         | playground-584d866dcc-cr5zf | QA Data                             |
| **Fire Noc**   | playground-584d866dcc-cr5zf | QA Data                             |
| **OBPS (Bpa)** | playground-584d866dcc-cr5zf | QA Data                             |
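The table can be turned into a small lookup so the right pod is used for each module. This is a minimal sketch with a hypothetical helper name, `exec_command`. Pod names of this form typically change whenever a pod is recreated, so the values below may be stale; `kubectl get pods -n playground` lists the current names.

```python
# Pod names copied from Table 1.2. PGR uses a local data dump, so it has no pod.
MODULE_PODS = {
    "PT": "playground-865db67c64-tfdrk",
    "W&S": "playground-584d866dcc-cr5zf",
    "PGR": None,
    "mCollect": "playground-584d866dcc-cr5zf",
    "TL": "playground-584d866dcc-cr5zf",
    "Fire Noc": "playground-584d866dcc-cr5zf",
    "OBPS (Bpa)": "playground-584d866dcc-cr5zf",
}


def exec_command(module, namespace="playground"):
    """Return the kubectl exec argument list for a module, or None if it has no pod."""
    pod = MODULE_PODS[module]
    if pod is None:
        return None
    return ["kubectl", "exec", "--stdin", "--tty", pod, "-n", namespace, "--", "/bin/bash"]


print(exec_command("PT"))
print(exec_command("PGR"))
```
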

> [![Creative Commons License](https://i.creativecommons.org/l/by/4.0/80x15.png)](http://creativecommons.org/licenses/by/4.0/) *All content on this page by* [*eGov Foundation*](https://egov.org.in/) *is licensed under a* [*Creative Commons Attribution 4.0 International License*](http://creativecommons.org/licenses/by/4.0/)*.*

