Kevin-Tucuxi/ContractPlaybookBuilder

Contract Playbook Builder v2

An AI-powered application that generates contract playbooks from your template and negotiated agreements. It uses local LLMs (via Ollama) for clause-level legal analysis and IBM Docling for layout-aware document parsing.

Key Improvements over v1

| Feature | v1 | v2 |
| --- | --- | --- |
| Document Parsing | Basic regex | IBM Docling AI (DocLayNet, TableFormer) |
| Clause Analysis | Hard-coded templates | LLM reasoning (Qwen3, GPT-OSS, etc.) |
| Comparison | Word overlap | Semantic embeddings (vector similarity) |
| Fallback Generation | Manual patterns | LLM-generated from actual negotiations |
| Output Quality | Shallow | Matches manual expert analysis |

Prerequisites

1. Ollama (Required)

Install and run Ollama with a capable model:

# Install Ollama (macOS)
brew install ollama

# Start the server
ollama serve

# Pull one LLM (choose one)
ollama pull qwen3:30b-a3b          # Recommended - great reasoning
ollama pull gpt-oss:20b            # Alternative
ollama pull ministral:14b          # Smaller, faster

# Pull the embedding model (used by the semantic engine)
ollama pull nomic-embed-text
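
Before starting the app, it can help to confirm that the models you pulled are actually visible to the Ollama server. The sketch below is not part of this repository; it queries Ollama's `/api/tags` endpoint and reports which required models are missing. The helper names (`normalize`, `missing_models`, `fetch_tags`) are hypothetical.

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # default Ollama address


def normalize(name: str) -> str:
    """Ollama lists a model pulled without a tag as '<name>:latest'."""
    return name if ":" in name else f"{name}:latest"


def missing_models(tags_response: dict, required: list[str]) -> list[str]:
    """Return the required models that do not appear in an /api/tags response."""
    installed = {normalize(m["name"]) for m in tags_response.get("models", [])}
    return [name for name in required if normalize(name) not in installed]


def fetch_tags(host: str = OLLAMA_HOST) -> dict:
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With the server running, `missing_models(fetch_tags(), ["qwen3:30b-a3b", "nomic-embed-text"])` should return an empty list once both pulls have finished.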

2. Python 3.10+

python3 --version  # Should be 3.10 or higher

3. Docling (Optional but Recommended)

For best document parsing:

pip install docling

Quick Start

cd contract-playbook-v2
./start.sh

Then open: http://localhost:3004

How It Works

Pipeline Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         Upload Documents                         │
│              (Template + Negotiated Agreements)                  │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                    IBM Docling Parser                            │
│  • DocLayNet layout analysis                                     │
│  • TableFormer table extraction                                  │
│  • Reading order detection                                       │
│  • Hierarchical section extraction                               │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Semantic Engine                               │
│  • Generate embeddings (nomic-embed-text)                        │
│  • Store in ChromaDB vector database                             │
│  • Cluster similar clause variations                             │
│  • Detect outliers                                               │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                    LLM Analysis (Ollama)                         │
│  • Deep clause-by-clause analysis                                │
│  • Purpose, concerns, positions reasoning                        │
│  • Compare template vs negotiated semantically                   │
│  • Generate fallback language recommendations                    │
│  • Classify clause types and severity                            │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                    Playbook Generation                           │
│  • Comprehensive Excel with all analysis                         │
│  • Interactive HTML with fallback selection                      │
│  • JSON for API/integration                                      │
└─────────────────────────────────────────────────────────────────┘
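
To make the Semantic Engine stage more concrete, here is a minimal sketch of comparing a template clause against negotiated versions by cosine similarity and flagging the ones that fall below a threshold. It is an illustration under assumptions, not the code in `semantic_engine.py`: it uses toy three-dimensional vectors instead of real `nomic-embed-text` embeddings, skips ChromaDB entirely, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def split_by_similarity(template_vec, negotiated, threshold=0.75):
    """Partition negotiated clauses into those close to the template and outliers.

    `negotiated` maps a deal name to that deal's clause embedding.
    """
    similar, outliers = [], []
    for deal, vec in negotiated.items():
        target = similar if cosine_similarity(template_vec, vec) >= threshold else outliers
        target.append(deal)
    return similar, outliers


# Toy vectors standing in for real embeddings.
template = [1.0, 0.0, 0.0]
deals = {"Deal A": [0.9, 0.1, 0.0], "Deal B": [0.0, 1.0, 0.0]}
similar, outliers = split_by_similarity(template, deals)
```

Here "Deal A" points in nearly the same direction as the template and lands in `similar`, while "Deal B" is orthogonal to it and lands in `outliers`.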

What the LLM Analyzes

For each clause, the LLM provides:

  • Purpose & Rationale: What the clause accomplishes
  • Key Provisions: Main requirements and obligations
  • Customer Concerns: What customers typically object to
  • Edits to Watch: Specific modifications customers request
  • Provider Position: Why to maintain these terms
  • Negotiation Leverage: When to be flexible vs. firm
  • Risk if Modified: Consequences of weakening terms
  • Do Not Accept: Red lines that create unacceptable risk
  • Industry Context: Comparison to market standard
  • Related Clauses: Dependencies and interactions
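
One way to picture the per-clause result is as a record with one field per item in the list above. The sketch below assumes the LLM is asked to reply in JSON; the class name, field names, and the JSON shape are hypothetical and may not match what `playbook_builder.py` actually produces.

```python
import json
from dataclasses import dataclass, field, fields


@dataclass
class ClauseAnalysis:
    """Hypothetical container mirroring the ten analysis items listed above."""
    purpose_rationale: str = ""
    key_provisions: list[str] = field(default_factory=list)
    customer_concerns: list[str] = field(default_factory=list)
    edits_to_watch: list[str] = field(default_factory=list)
    provider_position: str = ""
    negotiation_leverage: str = ""
    risk_if_modified: str = ""
    do_not_accept: list[str] = field(default_factory=list)
    industry_context: str = ""
    related_clauses: list[str] = field(default_factory=list)


def parse_analysis(raw: str) -> ClauseAnalysis:
    """Build a ClauseAnalysis from a JSON string, ignoring unknown keys."""
    data = json.loads(raw)
    known = {f.name for f in fields(ClauseAnalysis)}
    return ClauseAnalysis(**{k: v for k, v in data.items() if k in known})


raw_reply = json.dumps({
    "purpose_rationale": "Caps total liability under the agreement.",
    "do_not_accept": ["Unlimited liability"],
    "unexpected_key": "dropped by parse_analysis",
})
analysis = parse_analysis(raw_reply)
```

Missing items fall back to empty defaults, so a partial model reply still yields a usable record.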

Fallback Generation

Fallbacks are derived from:

  1. Actual negotiated agreements you upload
  2. LLM synthesis of common patterns
  3. Semantic clustering to group similar variations
  4. Financial tier context (enterprise vs. small deals)

Each fallback includes:

  • What changes from standard
  • When to use this fallback
  • Protections still maintained
  • Source deals and frequency
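
The "source deals and frequency" part of a fallback can be sketched as a simple tally over the negotiated agreements. This is a hypothetical illustration: it assumes variations have already been grouped under a label (for example by the semantic clustering step), and the `Fallback` class and `summarize_variations` function are not taken from the repository.

```python
from dataclasses import dataclass, field


@dataclass
class Fallback:
    """Hypothetical record for one fallback position."""
    change_from_standard: str
    when_to_use: str = ""              # filled in later, e.g. by the LLM
    protections_maintained: str = ""   # filled in later, e.g. by the LLM
    source_deals: list[str] = field(default_factory=list)
    frequency: float = 0.0             # share of uploaded deals using this variation


def summarize_variations(observations, total_deals):
    """Group (deal, variation) pairs and rank variations by how often they occur."""
    if total_deals <= 0:
        raise ValueError("total_deals must be positive")
    deals_by_variation: dict[str, list[str]] = {}
    for deal, variation in observations:
        deals_by_variation.setdefault(variation, []).append(deal)
    fallbacks = [
        Fallback(change_from_standard=v, source_deals=d, frequency=len(d) / total_deals)
        for v, d in deals_by_variation.items()
    ]
    # Most frequent first; ties broken alphabetically for stable output.
    fallbacks.sort(key=lambda f: (-f.frequency, f.change_from_standard))
    return fallbacks


observed = [("Deal A", "net-60"), ("Deal B", "net-60"), ("Deal C", "net-45")]
ranked = summarize_variations(observed, total_deals=4)
```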

Configuration

Edit config/settings.yaml:

llm:
  primary_model: "qwen3:30b-a3b"  # Your preferred model
  ollama_host: "http://localhost:11434"
  temperature: 0.3  # Lower = more consistent

embeddings:
  model: "nomic-embed-text"
  similarity_threshold: 0.75

playbook:
  provider_name: "Your Company"
  provider_aliases:
    - "Your Company, Inc."
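
To show how the `llm` settings might translate into an actual call, the sketch below builds a request for Ollama's `/api/generate` endpoint from a settings dictionary with the same keys as the YAML above. It is an assumption about usage, not the code in `llm_client.py`; the settings are written inline as a dict rather than loaded from `config/settings.yaml`, and the function names are hypothetical.

```python
import json
import urllib.request

# Same keys as the llm section of config/settings.yaml, inlined for the example.
SETTINGS = {
    "llm": {
        "primary_model": "qwen3:30b-a3b",
        "ollama_host": "http://localhost:11434",
        "temperature": 0.3,
    }
}


def build_generate_request(settings: dict, prompt: str) -> tuple[str, dict]:
    """Return the URL and JSON payload for a non-streaming Ollama generate call."""
    llm = settings["llm"]
    url = f"{llm['ollama_host'].rstrip('/')}/api/generate"
    payload = {
        "model": llm["primary_model"],
        "prompt": prompt,
        "stream": False,  # ask for one complete JSON reply
        "options": {"temperature": llm["temperature"]},
    }
    return url, payload


def generate(settings: dict, prompt: str) -> str:
    """Send the request to a running Ollama server and return the generated text."""
    url, payload = build_generate_request(settings, prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as resp:
        return json.load(resp)["response"]
```

Keeping the payload construction separate from the network call makes it easy to check what a settings change will send before any model is loaded.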

API Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/session | Create session |
| POST | /api/session/{id}/upload/template | Upload template |
| POST | /api/session/{id}/upload/negotiated | Upload agreements |
| POST | /api/session/{id}/generate | Generate playbook |
| GET | /api/session/{id}/export/{format} | Download (json/excel/html) |
| GET | /api/session/{id}/viewer | Interactive viewer |
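
For scripting against these endpoints, a small URL builder keeps the paths in one place. The sketch below only constructs URLs from the table above. It assumes the API is served from the same address as the web UI (`http://localhost:3004`), and it deliberately says nothing about request bodies or response fields, which this README does not document. The function names are hypothetical.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:3004"  # assumption: API shares the web UI address
EXPORT_FORMATS = ("json", "excel", "html")
UPLOAD_KINDS = ("template", "negotiated")


def create_session_url(base: str = BASE_URL) -> str:
    return f"{base}/api/session"


def session_url(session_id: str, suffix: str, base: str = BASE_URL) -> str:
    """URL for a session-scoped endpoint such as 'generate' or 'viewer'."""
    return f"{base}/api/session/{quote(session_id, safe='')}/{suffix}"


def upload_url(session_id: str, kind: str, base: str = BASE_URL) -> str:
    if kind not in UPLOAD_KINDS:
        raise ValueError(f"kind must be one of {UPLOAD_KINDS}")
    return session_url(session_id, f"upload/{kind}", base)


def export_url(session_id: str, fmt: str, base: str = BASE_URL) -> str:
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"format must be one of {EXPORT_FORMATS}")
    return session_url(session_id, f"export/{fmt}", base)
```

A typical run would POST to `create_session_url()`, POST files to the two upload URLs, POST to `session_url(sid, "generate")`, then GET `export_url(sid, "excel")`.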

Output Files

After generation:

data/output/{session_id}/
├── playbook.json      # Full playbook data
├── playbook.xlsx      # Excel with analysis tabs
└── playbook.html      # Interactive viewer
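
If you post-process the results in a script, a quick check that all three files exist can save a confusing failure later. This is a small sketch based only on the directory layout shown above; the helper names are hypothetical.

```python
from pathlib import Path


def expected_outputs(session_id: str, root: str = "data/output") -> dict[str, Path]:
    """Map each export format to the file the layout above says it produces."""
    base = Path(root) / session_id
    return {
        "json": base / "playbook.json",
        "excel": base / "playbook.xlsx",
        "html": base / "playbook.html",
    }


def missing_outputs(session_id: str, root: str = "data/output") -> list[str]:
    """Return the formats whose output file is not present on disk."""
    return [fmt for fmt, path in expected_outputs(session_id, root).items()
            if not path.is_file()]
```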

Model Recommendations

| Model | Size | Speed | Quality | Use Case |
| --- | --- | --- | --- | --- |
| qwen3:30b-a3b | 30B | Medium | Excellent | Best overall |
| gpt-oss:20b | 20B | Medium | Very Good | Alternative |
| ministral:14b | 14B | Fast | Good | Quick analysis |
| qwen3-vl:30b | 30B | Medium | Excellent | If using images |

For embeddings, use nomic-embed-text (768 dimensions, fast, good quality).
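
As a sanity check on the embedding model, the sketch below builds the payload for Ollama's `/api/embeddings` endpoint and verifies that a reply has the expected 768 dimensions. It is an illustration, not the repository's code: the helper names are hypothetical, and the app itself may use Ollama's newer `/api/embed` endpoint or a client library instead.

```python
EMBEDDING_MODEL = "nomic-embed-text"
EXPECTED_DIM = 768  # dimensionality stated above for nomic-embed-text


def build_embedding_payload(text: str, model: str = EMBEDDING_MODEL) -> dict:
    """JSON body for POST {ollama_host}/api/embeddings."""
    return {"model": model, "prompt": text}


def extract_embedding(response: dict, expected_dim: int = EXPECTED_DIM) -> list[float]:
    """Pull the vector out of an /api/embeddings reply and check its length."""
    vector = response.get("embedding")
    if not isinstance(vector, list) or len(vector) != expected_dim:
        raise ValueError(f"expected a {expected_dim}-dimensional embedding")
    return vector
```

A dimension mismatch usually means a different embedding model is configured than the one the vector store was built with.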

Troubleshooting

Ollama not responding

# Check if running
curl http://localhost:11434/api/tags

# Restart
ollama serve

Model not found

# List available models
ollama list

# Pull the model
ollama pull qwen3:30b-a3b

Out of memory

Use a smaller model:

llm:
  primary_model: "ministral:14b"

Docling not parsing

When Docling is unavailable, parsing falls back to pdfplumber automatically. For better results, install Docling:

pip install docling
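
To see which parser your environment would end up with, you can check which packages are importable. The sketch below uses the standard library's `importlib.util.find_spec`; it illustrates the idea of a preferred-parser-with-fallback and is not the actual selection logic in `docling_parser.py`. The function name is hypothetical.

```python
from importlib.util import find_spec


def choose_parser(preferred=("docling", "pdfplumber")):
    """Return the first importable package from `preferred`, or None if none is installed."""
    for name in preferred:
        if find_spec(name) is not None:
            return name
    return None
```

If `choose_parser()` returns `"pdfplumber"` or `None`, Docling is not installed in the active Python environment.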

Architecture

contract-playbook-v2/
├── backend/
│   ├── api/
│   │   ├── app.py           # Flask server
│   │   └── exporters.py     # Excel/HTML generation
│   ├── parsers/
│   │   └── docling_parser.py # Document parsing
│   └── analysis/
│       ├── llm_client.py     # Ollama integration
│       ├── semantic_engine.py # Embeddings & similarity
│       └── playbook_builder.py # Main generation logic
├── frontend/
│   └── build/index.html      # Web UI
├── config/
│   └── settings.yaml         # Configuration
├── data/
│   ├── uploads/              # Uploaded documents
│   ├── embeddings/           # Vector database
│   └── output/               # Generated playbooks
└── start.sh                  # Startup script

License

MIT

About

Tool to parse and redact contracts and build out playbooks with your corpus of agreements
