DraceniY/Toubib

Toubib API Connector

A program that feeds patient data from the Toubib API into a DuckDB database and keeps it synchronized using an EL (Extract, Load) approach.

Project Structure

.
├── Dockerfile
├── Makefile
├── README.md
├── data
│   └── database.db
├── docker-compose.yml
├── requirements.txt
├── src
│   ├── config.ini
│   ├── connector.py
│   ├── init_db.py
│   ├── settings.py
│   └── sql_queries.py
├── sync.log
├── test
│   └── test_regression.py
└── test_config.ini

How to run the project

Run the following commands in a terminal.

1. Build Docker Image

make build

2. Initialize Database

make init-db

3. Run Connector

make run

4. Synchronize manually

make sync-patients

5. Run periodically, every 5 minutes

make periodic

6. Verify Results

make show-db

Additional information

  • Change Log: All modifications are tracked in the patient_change_log table
  • Regression Tests: Run them with docker run --rm -v ./data:/data toubib_connector python test/test_regression.py

Database Schema

patients table

  • id (INTEGER PRIMARY KEY)
  • email (VARCHAR NOT NULL)
  • first_name (VARCHAR NOT NULL)
  • last_name (VARCHAR NOT NULL)
  • date_of_birth (DATE NOT NULL)
  • created_at (TIMESTAMP NOT NULL)
  • updated_at (TIMESTAMP NOT NULL)
  • total_visits (INTEGER NOT NULL)

patient_change_log table

  • id (BIGINT NOT NULL)
  • patient_id (INTEGER NOT NULL)
  • event_type (VARCHAR NOT NULL)
  • event_timestamp (TIMESTAMP NOT NULL)
  • old_data (JSON)
  • new_data (JSON)
  • created_at (TIMESTAMP DEFAULT CURRENT_TIMESTAMP)
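
The two tables above could be created with DDL along the following lines. This is only a sketch: the actual contents of init_db.py and sql_queries.py are not shown here, `create_schema` is a hypothetical helper, and the standard-library sqlite3 module stands in for DuckDB purely so the snippet runs without extra dependencies. Column names and types are copied from the lists above.

```python
import sqlite3

# Column definitions mirror the schema lists above.
PATIENTS_DDL = """
CREATE TABLE IF NOT EXISTS patients (
    id            INTEGER PRIMARY KEY,
    email         VARCHAR NOT NULL,
    first_name    VARCHAR NOT NULL,
    last_name     VARCHAR NOT NULL,
    date_of_birth DATE NOT NULL,
    created_at    TIMESTAMP NOT NULL,
    updated_at    TIMESTAMP NOT NULL,
    total_visits  INTEGER NOT NULL
)
"""

CHANGE_LOG_DDL = """
CREATE TABLE IF NOT EXISTS patient_change_log (
    id              BIGINT NOT NULL,
    patient_id      INTEGER NOT NULL,
    event_type      VARCHAR NOT NULL,
    event_timestamp TIMESTAMP NOT NULL,
    old_data        JSON,
    new_data        JSON,
    created_at      TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
"""


def create_schema(con):
    """Create both tables on any connection that exposes execute()."""
    con.execute(PATIENTS_DDL)
    con.execute(CHANGE_LOG_DDL)


# Stand-in connection (assumption): the project itself targets DuckDB,
# where the connection would come from duckdb.connect(...) instead.
con = sqlite3.connect(":memory:")
create_schema(con)
print([row[0] for row in con.execute("SELECT name FROM sqlite_master ORDER BY name")])
```

Because both statements use IF NOT EXISTS, calling `create_schema` a second time leaves existing data untouched.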

Algorithm

  1. Data Fetching: The connector uses pagination to fetch all patients from the API
  2. Upsert Logic: Uses DuckDB's INSERT ... ON CONFLICT DO UPDATE to insert patients with a new ID and update those whose ID already exists
  3. Change Tracking: All changes are logged for audit purposes
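
The three steps above can be sketched roughly as follows. This is not the code in connector.py. The names `fetch_all_patients`, `fetch_page`, `sync_patient` and `log_id`, the event types "INSERT" and "UPDATE", the page-number paging scheme, and the sample patients are all assumptions made for illustration. The standard-library sqlite3 module stands in for DuckDB so the sketch runs without extra dependencies; the ON CONFLICT (id) DO UPDATE statement has the same shape in both.

```python
import json
import sqlite3
from datetime import datetime, timezone
from itertools import count

PATIENT_COLUMNS = ("id", "email", "first_name", "last_name",
                   "date_of_birth", "created_at", "updated_at", "total_visits")
_cols = ", ".join(PATIENT_COLUMNS)
_marks = ", ".join("?" for _ in PATIENT_COLUMNS)
_updates = ", ".join(f"{c} = excluded.{c}" for c in PATIENT_COLUMNS if c != "id")
UPSERT_SQL = (f"INSERT INTO patients ({_cols}) VALUES ({_marks}) "
              f"ON CONFLICT (id) DO UPDATE SET {_updates}")


def fetch_all_patients(fetch_page, page_size=100):
    """Step 1: collect pages until a short or empty page signals the end.

    fetch_page(page, page_size) is a hypothetical callable wrapping the HTTP call.
    """
    patients, page = [], 1
    while True:
        batch = fetch_page(page, page_size)
        patients.extend(batch)
        if len(batch) < page_size:
            return patients
        page += 1


def sync_patient(con, patient, log_id):
    """Steps 2 and 3: upsert one patient, then record the change."""
    row = con.execute(f"SELECT {_cols} FROM patients WHERE id = ?",
                      (patient["id"],)).fetchone()
    old = dict(zip(PATIENT_COLUMNS, row)) if row else None
    con.execute(UPSERT_SQL, tuple(patient[c] for c in PATIENT_COLUMNS))
    event = "UPDATE" if old else "INSERT"  # assumed event names
    con.execute(
        "INSERT INTO patient_change_log "
        "(id, patient_id, event_type, event_timestamp, old_data, new_data) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (log_id, patient["id"], event, datetime.now(timezone.utc).isoformat(),
         json.dumps(old, default=str) if old else None,
         json.dumps(patient, default=str)))
    return event


def make_patient(pid, visits):
    """Build a made-up patient record for the demo."""
    return {"id": pid, "email": f"patient{pid}@example.com", "first_name": "Test",
            "last_name": f"Patient{pid}", "date_of_birth": "1990-01-01",
            "created_at": "2025-01-01T00:00:00", "updated_at": "2025-01-01T00:00:00",
            "total_visits": visits}


FAKE_API = [make_patient(i, i) for i in (1, 2, 3)]


def fake_fetch_page(page, page_size):
    """Stand-in for the real API: slice the in-memory list by page number."""
    start = (page - 1) * page_size
    return FAKE_API[start:start + page_size]


con = sqlite3.connect(":memory:")  # stand-in for a DuckDB connection
con.execute("CREATE TABLE patients (id INTEGER PRIMARY KEY, email VARCHAR NOT NULL, "
            "first_name VARCHAR NOT NULL, last_name VARCHAR NOT NULL, "
            "date_of_birth DATE NOT NULL, created_at TIMESTAMP NOT NULL, "
            "updated_at TIMESTAMP NOT NULL, total_visits INTEGER NOT NULL)")
con.execute("CREATE TABLE patient_change_log (id BIGINT NOT NULL, "
            "patient_id INTEGER NOT NULL, event_type VARCHAR NOT NULL, "
            "event_timestamp TIMESTAMP NOT NULL, old_data JSON, new_data JSON, "
            "created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)")

log_ids = count(1)  # assumed way of numbering log rows
patients = fetch_all_patients(fake_fetch_page, page_size=2)
events = [sync_patient(con, p, next(log_ids)) for p in patients]
changed = dict(patients[0], total_visits=5)
second = sync_patient(con, changed, next(log_ids))
print(events, second)
```

This sketch logs every upsert, including ones where nothing changed. A real connector would probably compare the old and new rows first and skip the log entry when they match.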

Copyright

Yasmine Draceni 2025

About

Automated Healthcare Data Pipeline: Toubib API → DuckDB ETL System
