DLR-SC/RAG-for-Earth-Observation

RAG for Earth Observation

Repository Structure

This repository is structured in four main parts:

  • backend/ contains the main application logic and provides the FastAPI server used by the frontend.
  • frontend/ contains a Streamlit frontend providing easy access to the QA system.
  • data/ contains all code needed to fetch and process the data used. For a more detailed description, read data/README.md.
  • evaluation/ contains code and samples used for the DARES25 paper.

Installation

This guide walks you through setting up the Earth Observation QA system from scratch. It includes the database (ArangoDB), backend (FastAPI), and frontend (Streamlit).


🧩 Prerequisites

  • Docker and Docker Compose v2+
  • Git (to clone this repository)
  • Optional: curl for quick connectivity check and API access

⚙️ Environment Setup

Create a .env file and fill in the required values:

ARANGO_ROOT_PASSWORD=
OPENAI_API_KEY=
MISTRAL_API_KEY=

ARANGO_DB=ScienceSearch

✅ These environment variables are loaded automatically by Docker.
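Before starting any containers, it can help to confirm that none of the values was left blank. The following is a minimal sketch, not part of the repository: `check_env_file` is a hypothetical helper name, and it assumes plain `KEY=value` lines as shown above.

```shell
# Hypothetical helper: report which required keys in a .env file are
# missing or have an empty value. Returns non-zero if any key is unset.
check_env_file() {
    env_file="${1:-.env}"
    missing=0
    for key in ARANGO_ROOT_PASSWORD OPENAI_API_KEY MISTRAL_API_KEY ARANGO_DB; do
        # Accept only lines of the form KEY=<non-empty value>.
        if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
            echo "missing or empty: ${key}"
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_env_file .env` from the main repo folder prints one line per key that still needs a value and stays silent when the file is complete.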


🧠 Database Setup (ArangoDB)

As of now, the database is not supplied within this repository and needs to be created separately.

# Pull docker image
docker pull arangodb:latest

# Start container on port 8529, protect with password
docker run -d -p 8529:8529 \
    -e ARANGO_ROOT_PASSWORD="<YOUR_PASSWORD>" \
    --name eo_rag_arangodb arangodb

# Download database dump from zenodo
docker exec -t eo_rag_arangodb wget -O \
    /tmp/dump.zip https://zenodo.org/records/17287798/files/dump.zip

# Unzip contents
docker exec eo_rag_arangodb unzip -d / /tmp/dump.zip

# Restore backup
# (quote the password so the shell does not treat < and > as redirections)
docker exec eo_rag_arangodb arangorestore \
    --server.password "<YOUR_PASSWORD>" \
    --server.database ScienceSearch \
    --create-database true

⚠️ Restoring the database can take a few hours depending on its size.
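The commands above take the password as a literal placeholder, while the same value also lives in .env. One possible way to avoid typing it twice is to load .env into the current shell first. The sketch below is an assumption, not something the repository ships: `load_env` and `restore_dump` are hypothetical names, and sourcing only works if the .env values contain no spaces or shell metacharacters.

```shell
# Hypothetical helper: read KEY=value pairs from a .env file into the
# current shell and export them to child processes.
load_env() {
    set -a              # export every variable assigned while sourcing
    . "${1:-./.env}"
    set +a
}

# Hypothetical wrapper around the restore command shown above, taking the
# password and database name from the environment instead of placeholders.
restore_dump() {
    docker exec eo_rag_arangodb arangorestore \
        --server.password "$ARANGO_ROOT_PASSWORD" \
        --server.database "$ARANGO_DB" \
        --create-database true
}
```

After `load_env`, running `restore_dump` should behave like the manual restore command, provided the container was started with the same password.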

Check that ArangoDB is running:

curl http://localhost:8529

or open in your browser:

👉 http://localhost:8529

Log in with:

  • Username: root
  • Password: (from .env)
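The container may need a little time after `docker run` before it answers requests, so a single curl call can fail even though nothing is wrong. A small retry loop is one way to handle that. This is a sketch: `wait_until` is a hypothetical helper, and the attempt count and delay in the usage note are arbitrary choices.

```shell
# Hypothetical helper: retry a command until it succeeds or the attempts
# run out. Usage: wait_until <attempts> <delay_seconds> <command...>
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0    # the command succeeded
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1            # every attempt failed
}
```

For example, `wait_until 30 2 curl -sf -o /dev/null http://localhost:8529` would poll the URL from the check above up to 30 times, pausing 2 seconds after each failed attempt.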

🚀 Run Backend & Frontend

Once ArangoDB is running, you can start the FastAPI backend and Streamlit frontend.

From the main repo folder:

1. Build and start the app stack

docker compose -f db-compose.yml up -d --build

This starts:

  • eo_rag_backend (FastAPI)
  • eo_rag_frontend (Streamlit)

2. Test the services

🧩 Backend (FastAPI)

Either open http://localhost:8000/docs and check the Swagger UI, or run:

curl http://localhost:8000/healthz

Expected:

{"backend":"ok", "arangodb":"ok (...version...)", "mistral":"ok"}
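To turn this response into a pass/fail check, the three fields can be matched with grep. The sketch below assumes the response has the shape shown above, with each value starting with "ok". `health_all_ok` is a hypothetical helper, and a real JSON parser would be more robust than pattern matching.

```shell
# Hypothetical helper: read a /healthz response on stdin and succeed only
# if backend, arangodb and mistral all report a value starting with "ok".
health_all_ok() {
    body=$(cat)
    for key in backend arangodb mistral; do
        printf '%s' "$body" |
            grep -Eq "\"${key}\"[[:space:]]*:[[:space:]]*\"ok" || return 1
    done
    return 0
}
```

A possible use is `curl -s http://localhost:8000/healthz | health_all_ok && echo healthy`.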

🖥️ Frontend (Streamlit)

Open: 👉 http://localhost:8501

You should see the Earth Observation QA web interface.


🧪 Validation Checklist

| Service | URL | Expected |
|---|---|---|
| ArangoDB | http://localhost:8529 | Login page |
| Backend (FastAPI) | http://localhost:8000/healthz | ok status |
| Frontend (Streamlit) | http://localhost:8501 | Web UI loads successfully |
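The checklist can also be run in one pass from a terminal. The following sketch only checks that each URL answers without an HTTP error. It does not inspect the page content, so it cannot confirm that the login page or web UI actually renders. `check_services` and the `PROBE` variable are hypothetical names introduced for illustration.

```shell
# Probe command used per URL; override PROBE to swap in another client.
PROBE="${PROBE:-curl -sf -o /dev/null --max-time 5}"

# Hypothetical helper: probe the three service URLs from the checklist
# and print one OK/FAIL line each. Returns non-zero if any probe fails.
check_services() {
    failed=0
    while read -r name url; do
        if $PROBE "$url"; then
            echo "OK   $name ($url)"
        else
            echo "FAIL $name ($url)"
            failed=1
        fi
    done <<EOF
ArangoDB http://localhost:8529
Backend http://localhost:8000/healthz
Frontend http://localhost:8501
EOF
    return "$failed"
}
```

Running `check_services` after the stack is up prints one line per service, which makes it easy to see which component is not reachable.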

🧩 Next Steps (for improvements)

  • Add automatic database restore at container startup (optional)
  • Integrate additional datasources (e.g., WebData API)

Authors:
🧠 R. El Baff and B. Schluckebier, DLR / Earth Observation AI Research

This README was partially generated by ChatGPT and reviewed in detail by Roxanne El Baff.

About

Final build at submission of the thesis: Knowledge Graph-Enhanced Retrieval-Augmented Generation for Earth Observation Data
