Commit d4a8f3d

Merge pull request #25 from SnappyLab/2-add-the-capability-to-search-for-documents-metadata
2 add the capability to search for documents metadata
2 parents a03d0d3 + d4d2388 commit d4a8f3d

48 files changed: +1746 -419 lines changed

.github/workflows/docbinder-oss.yml

Lines changed: 6 additions & 0 deletions
@@ -5,10 +5,16 @@ on:
     branches:
       - main
       - dev
+    paths-ignore:
+      - "docs/**"
+      - "mkdocs.yml"
   pull_request:
     branches:
       - main
       - dev
+    paths-ignore:
+      - "docs/**"
+      - "mkdocs.yml"
 jobs:
   test:
     runs-on: ubuntu-latest
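
The effect of the two `paths-ignore` entries can be sketched in Python: a run is skipped only when every changed file matches one of the ignored patterns. This is an approximation for illustration. `workflow_should_run` is a hypothetical helper, and `fnmatchcase` only roughly mirrors GitHub's own glob matching.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Sequence

# Patterns copied from the workflow change above.
IGNORED_PATTERNS = ("docs/**", "mkdocs.yml")


def workflow_should_run(
    changed_files: Iterable[str],
    ignore_patterns: Sequence[str] = IGNORED_PATTERNS,
) -> bool:
    """Return True if at least one changed file is NOT covered by an ignore pattern."""
    return any(
        not any(fnmatchcase(path, pattern) for pattern in ignore_patterns)
        for path in changed_files
    )
```

Under this sketch, a push that touches only `docs/index.md` and `mkdocs.yml` would not trigger the tests, while a push that also touches a file under `src/` would.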

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -77,3 +77,7 @@ ENV/
 # Credentials
 gcp_credentials.json
 *_token.json
+
+# Test files
+search_results.csv
+search_results.json

.pre-commit-config.yaml

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+repos:
+  - repo: https://github.com/astral-sh/uv-pre-commit
+    rev: 0.7.16
+    hooks:
+      - id: uv-export
+      - id: uv-lock
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.6.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml
+      - id: check-added-large-files
+  - repo: https://github.com/astral-sh/ruff-pre-commit
+    # Ruff version.
+    rev: v0.12.1
+    hooks:
+      # Run the linter.
+      - id: ruff-check
+        types_or: [ python, pyi ]
+        args: [ --select, I, --fix, --select=E501 ]
+      # Run the formatter.
+      - id: ruff-format
+        types_or: [ python, pyi ]

CONTRIBUTING.md

Lines changed: 23 additions & 1 deletion
@@ -56,4 +56,26 @@ All dependencies are tracked in `pyproject.toml`. Use `uv` commands to keep it u
 ---
 
 **Note:**
-Always use `uv` commands to manage dependencies and environments to keep `pyproject.toml` in sync.
+Always use `uv` commands to manage dependencies and environments to keep `pyproject.toml` in sync.
+
+## Code Style and Linting
+
+This project uses [Black](https://black.readthedocs.io/en/stable/) for code formatting and [Ruff](https://docs.astral.sh/ruff/) for linting. All code should be formatted and linted before committing.
+
+- Run the following before committing code:
+
+```zsh
+uv run black .
+uv run ruff check .
+```
+
+- To automatically format and lint code on every commit, install pre-commit hooks:
+
+```zsh
+uv pip install pre-commit
+pre-commit install
+```
+
+This will ensure Black and Ruff are run on staged files before each commit.
+
+Configuration for Black and Ruff is in `pyproject.toml`. This enforces consistent quotes, spacing, and other style rules for all contributors.

docs/CONTRIBUTING.md

Lines changed: 23 additions & 1 deletion
@@ -56,4 +56,26 @@ All dependencies are tracked in `pyproject.toml`. Use `uv` commands to keep it u
 ---
 
 **Note:**
-Always use `uv` commands to manage dependencies and environments to keep `pyproject.toml` in sync.
+Always use `uv` commands to manage dependencies and environments to keep `pyproject.toml` in sync.
+
+## Code Style and Linting
+
+This project uses [Black](https://black.readthedocs.io/en/stable/) for code formatting and [Ruff](https://docs.astral.sh/ruff/) for linting. All code should be formatted and linted before committing.
+
+- Run the following before committing code:
+
+```zsh
+uv run black .
+uv run ruff check .
+```
+
+- To automatically format and lint code on every commit, install pre-commit hooks:
+
+```zsh
+uv pip install pre-commit
+pre-commit install
+```
+
+This will ensure Black and Ruff are run on staged files before each commit.
+
+Configuration for Black and Ruff is in `pyproject.toml`. This enforces consistent quotes, spacing, and other style rules for all contributors.
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
+# How to Add a New Provider
+
+This guide explains how to integrate a new storage provider (e.g., Dropbox, OneDrive) into DocBinder-OSS. The process involves creating configuration and client classes, registering the provider, and ensuring compatibility with the system’s models and interfaces.
+
+---
+
+## 1. Create a Service Configuration Class
+
+Each provider must define a configuration class that inherits from [`ServiceConfig`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/base_class.py):
+
+```python
+# filepath: src/docbinder_oss/services/my_provider/my_provider_service_config.py
+from docbinder_oss.services.base_class import ServiceConfig
+
+class MyProviderServiceConfig(ServiceConfig):
+    type: str = "my_provider"
+    name: str
+    # Add any other provider-specific fields here
+    api_key: str
+```
+
+- `type` must be unique and match the provider’s identifier.
+- `name` is a user-defined label for this provider instance.
+
+---
+
+## 2. Implement the Storage Client
+
+Create a client class that inherits from [`BaseStorageClient`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/base_class.py) and implements all abstract methods:
+
+```python
+# filepath: src/docbinder_oss/services/my_provider/my_provider_client.py
+from typing import Optional, List
+from docbinder_oss.services.base_class import BaseStorageClient
+from docbinder_oss.core.schemas import File, Permission
+from .my_provider_service_config import MyProviderServiceConfig
+
+class MyProviderClient(BaseStorageClient):
+    def __init__(self, config: MyProviderServiceConfig):
+        self.config = config
+        # Initialize SDK/client here
+
+    def test_connection(self) -> bool:
+        # Implement connection test
+        pass
+
+    def list_files(self, folder_id: Optional[str] = None) -> List[File]:
+        # Implement file listing
+        pass
+
+    def get_file_metadata(self, item_id: str) -> File:
+        # Implement metadata retrieval
+        pass
+
+    def get_permissions(self, item_id: str) -> List[Permission]:
+        # Implement permissions retrieval
+        pass
+```
+
+- Use the shared models [`File`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/core/schemas.py), [`Permission`](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/core/schemas.py), etc., for return types.
+
+---
+
+## 3. Register the Provider
+
+Add an `__init__.py` in your provider’s folder with a `register()` function:
+
+```python
+# filepath: src/docbinder_oss/services/my_provider/__init__.py
+from .my_provider_client import MyProviderClient
+from .my_provider_service_config import MyProviderServiceConfig
+
+def register():
+    return {
+        "display_name": "my_provider",
+        "config_class": MyProviderServiceConfig,
+        "client_class": MyProviderClient,
+    }
+```
+
+---
+
+## 4. Ensure Discovery
+
+The system will automatically discover your provider if it’s in the `src/docbinder_oss/services/` directory and contains a `register()` function in `__init__.py`.
+
+---
+
+## 5. Update the Config File
+
+Add your provider’s configuration to `~/.config/docbinder/config.yaml`:
+
+```yaml
+providers:
+  - type: my_provider
+    name: my_instance
+    # Add other required fields
+    api_key: <your-api-key>
+```
+
+---
+
+## 6. Test Your Provider
+
+- Run the application and ensure your provider appears and works as expected.
+- The config loader will validate your config using your `ServiceConfig` subclass.
+
+---
+
+## Reference
+
+- [src/docbinder_oss/services/base_class.py](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/base_class.py)
+- [src/docbinder_oss/core/schemas.py](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/core/schemas.py)
+- [src/docbinder_oss/services/google_drive/](https://github.com/SnappyLab/DocBinder-OSS/tree/main/src/docbinder_oss/services/google_drive/) (example implementation)
+- [src/docbinder_oss/services/__init__.py](https://github.com/SnappyLab/DocBinder-OSS/blob/main/src/docbinder_oss/services/__init__.py)
+
+---
+
+**Tip:** Use the Google Drive provider as a template for your implementation. Make sure to follow the abstract method signatures and use the shared models for compatibility.
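
The steps in the guide above (config class, client class, `register()`) can be sketched end to end with a toy provider that serves files from memory. This is a minimal illustration, not DocBinder code: `ServiceConfig`, `BaseStorageClient`, `File`, and `Permission` below are simplified stand-ins written for this example, since the real definitions in `base_class.py` and `schemas.py` are not part of this diff, and the `InMemory*` names and their fields are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Optional


# Stand-ins for the shared models (assumed shapes, not the real schemas).
@dataclass
class File:
    id: str
    name: str


@dataclass
class Permission:
    user: str
    role: str


# Stand-in for ServiceConfig: every provider config carries a type and a name.
@dataclass
class ServiceConfig:
    type: str
    name: str


# Stand-in for BaseStorageClient, with the four methods named in the guide.
class BaseStorageClient(ABC):
    @abstractmethod
    def test_connection(self) -> bool: ...

    @abstractmethod
    def list_files(self, folder_id: Optional[str] = None) -> List[File]: ...

    @abstractmethod
    def get_file_metadata(self, item_id: str) -> File: ...

    @abstractmethod
    def get_permissions(self, item_id: str) -> List[Permission]: ...


@dataclass
class InMemoryServiceConfig(ServiceConfig):
    type: str = "in_memory"
    name: str = "demo"  # default only to keep the sketch short


class InMemoryClient(BaseStorageClient):
    """Toy provider backed by dictionaries instead of a remote API."""

    def __init__(self, config: InMemoryServiceConfig):
        self.config = config
        self._files = {"1": File(id="1", name="report.pdf")}
        self._permissions = {"1": [Permission(user="alice@example.com", role="owner")]}

    def test_connection(self) -> bool:
        return True  # nothing to connect to

    def list_files(self, folder_id: Optional[str] = None) -> List[File]:
        # Folders are ignored in this toy store.
        return list(self._files.values())

    def get_file_metadata(self, item_id: str) -> File:
        return self._files[item_id]

    def get_permissions(self, item_id: str) -> List[Permission]:
        return self._permissions.get(item_id, [])


def register() -> dict:
    # Same three keys as the register() example in the guide.
    return {
        "display_name": "in_memory",
        "config_class": InMemoryServiceConfig,
        "client_class": InMemoryClient,
    }
```

A caller could then build a client from the registration entry alone, for example `entry = register(); client = entry["client_class"](entry["config_class"](name="demo"))`, which is the kind of indirection that lets providers be discovered without hard-coded imports.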
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+# Google Drive Configuration Setup
+
+This guide will help you configure Google Drive as a provider for DocBinder.
+
+## Prerequisites
+
+- A Google account
+- Access to [Google Cloud Console](https://console.cloud.google.com/)
+- DocBinder installed
+
+## Step 1: Create a Google Cloud Project
+
+1. Go to the [Google Cloud Console](https://console.cloud.google.com/).
+2. Click on **Select a project** and then **New Project**.
+3. Enter a project name and click **Create**.
+
+## Step 2: Enable Google Drive API
+
+1. In your project dashboard, navigate to **APIs & Services > Library**.
+2. Search for **Google Drive API**.
+3. Click **Enable**.
+
+## Step 3: Create OAuth 2.0 Credentials
+
+1. Go to **APIs & Services > Credentials**.
+2. Click **+ CREATE CREDENTIALS** and select **OAuth client ID**.
+3. Configure the consent screen if prompted.
+4. Choose **Desktop app** or **Web application** as the application type.
+5. Enter a name and click **Create**.
+6. Download the `credentials.json` file.
+
+## Step 4: Configure DocBinder
+
+1. Place your downloaded credentials file somewhere accessible (e.g., ~/gcp_credentials.json).
+2. The application will generate a token file (e.g., ~/gcp_token.json) after the first authentication.
+
+## Step 5: Edit the Config File
+
+Create the config file, and add a provider entry for Google Drive:
+```yaml
+providers:
+  - type: google_drive
+    name: my_gdrive
+    gcp_credentials_json: ./gcp_credentials.json
+    gcp_token_json: ./gcp_token.json
+```
+
+* type: Must be google_drive.
+* name: A unique name for this provider.
+* gcp_credentials_json: Absolute/relative path to your Google Cloud credentials file.
+* gcp_token_json: Absolute/relative path where the token will be stored/generated.
+
+## Step 6: Authenticate and Test
+
+1. Run DocBinder with the Google Drive provider enabled.
+2. On first run, follow the authentication prompt to grant access.
+3. Verify that DocBinder can access your Google Drive files.
+
+## Troubleshooting
+
+- Ensure your credentials file is in the correct location.
+- Check that the Google Drive API is enabled for your project.
+- Review the [Google API Console](https://console.developers.google.com/) for error messages.
+
+## References
+
+- [Google Drive API Documentation](https://developers.google.com/drive)
+- [DocBinder OSS - GitHub](https://github.com/SnappyLab/DocBinder-OSS)
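
The first troubleshooting item in the guide above (credentials file in the wrong place) can be checked before launching the tool. The following is a sketch only: `check_google_drive_entry` is a hypothetical helper, not part of DocBinder, and it assumes the provider entry has already been loaded into a plain dictionary with the four keys shown in Step 5. Relative paths are resolved against the current working directory here, which may differ from how DocBinder itself resolves them.

```python
from pathlib import Path
from typing import Dict, List


def check_google_drive_entry(entry: Dict[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means the entry looks usable."""
    problems: List[str] = []

    if entry.get("type") != "google_drive":
        problems.append("type must be 'google_drive'")
    if not entry.get("name"):
        problems.append("name is missing or empty")

    creds = entry.get("gcp_credentials_json")
    if not creds:
        problems.append("gcp_credentials_json is missing")
    elif not Path(creds).expanduser().is_file():
        problems.append(f"credentials file not found: {creds}")

    # The token file is generated on first authentication, so only the
    # directory that will hold it needs to exist beforehand.
    token = entry.get("gcp_token_json")
    if token and not Path(token).expanduser().parent.is_dir():
        problems.append(f"token directory does not exist: {token}")

    return problems
```

Returning a list of messages rather than raising on the first failure lets a user fix every problem in one pass.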

mkdocs.yml

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@ nav:
   - Commands:
     - Main CLI: commands/main.md
     - Provider: commands/provider.md
+  - Providers:
+    - Google Drive: tool/providers/google_drive.md
+    - Custom Provider: tool/providers/custom_provider.md
   - Contributing: CONTRIBUTING.md
   - Code of Conduct: CODE_OF_CONDUCT.md
   - Security: SECURITY.md

provider_setup_example.yml

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+providers:
+  - type: google_drive
+    name: my_google_drive
+    gcp_credentials_json: gcp_credentials.json

pyproject.toml

Lines changed: 3 additions & 2 deletions
@@ -32,8 +32,10 @@ include = ["src/docbinder_oss/**"]
 
 [dependency-groups]
 dev = [
+    "black>=25.1.0",
     "mkdocs>=1.6.1",
     "mkdocs-material>=9.6.14",
+    "pre-commit>=4.2.0",
     "pytest>=8.4.0",
     "tox>=4.26.0",
     "tox-uv>=1.26.0",
@@ -47,8 +49,7 @@ testpaths = [
 ]
 
 [tool.ruff]
-# Set the maximum line length to 100.
-line-length = 100
+line-length = 125
 
 [tool.ruff.lint]
 # Add the `line-too-long` rule to the enforced rule set. By default, Ruff omits rules that
