Background
Current records capture text/binary payloads but lack explicit relationships. Emerging GraphRAG workflows need to encode node/edge connections alongside content.
Proposal
- Extend the Lance schema with a relationships column (e.g., list<struct { target_id: string, relation: string, weight: float? }>).
- Update serialization logic so Python callers can attach relationships when adding context.
- Provide helper APIs to query by related node ids and expose relationship metadata in search results.
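To make the proposed shape concrete, here is a minimal plain-Python sketch of the relationship struct and a lookup by related node id. It uses dataclasses as a stand-in for the Arrow struct; Relationship, ContextRecord, and find_related are hypothetical names for illustration, not existing APIs.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Relationship:
    """Stand-in for one struct in the proposed relationships list."""
    target_id: str
    relation: str
    weight: Optional[float] = None  # nullable, mirroring `weight: float?`


@dataclass
class ContextRecord:
    """Hypothetical record shape; the real store's fields may differ."""
    id: str
    text: str
    relationships: List[Relationship] = field(default_factory=list)


def find_related(records, node_id, relation=None):
    """Return records that hold an edge to node_id, optionally filtered by relation."""
    return [
        rec for rec in records
        if any(
            rel.target_id == node_id
            and (relation is None or rel.relation == relation)
            for rel in rec.relationships
        )
    ]


records = [
    ContextRecord("a", "intro", [Relationship("b", "cites", 0.9)]),
    ContextRecord("b", "details"),
    ContextRecord("c", "summary", [Relationship("b", "summarizes")]),
]
related_ids = [rec.id for rec in find_related(records, "b")]
cites_ids = [rec.id for rec in find_related(records, "b", relation="cites")]
```

Records without edges simply carry an empty list, which matches the backfill default suggested for existing datasets.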
Design Notes
- Store relationships as Arrow structs to keep everything columnar and enable joins during search.
- Backfill existing datasets by defaulting relationships to an empty list; add a lightweight migration utility for older manifests.
- Consider pairing with Lance's graph index once available to accelerate neighbor lookups.
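The backfill step could look roughly like the sketch below. It operates on plain dict rows rather than a real Lance dataset, and backfill_relationships is a hypothetical helper name; an actual migration utility would work against the manifest and the columnar data instead.

```python
def backfill_relationships(rows):
    """Return copies of rows where a missing or None relationships field becomes []."""
    migrated = []
    for row in rows:
        new_row = dict(row)  # shallow copy so the input rows are left untouched
        if new_row.get("relationships") is None:
            new_row["relationships"] = []
        migrated.append(new_row)
    return migrated


legacy_rows = [
    {"id": "a", "text": "old record"},  # written before the column existed
    {
        "id": "b",
        "text": "new record",
        "relationships": [
            {"target_id": "a", "relation": "follows", "weight": None}
        ],
    },
]
migrated_rows = backfill_relationships(legacy_rows)
```

Rows that already carry relationships pass through unchanged, so in this sketch the migration is safe to run more than once.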
Acceptance Criteria
- New records can round-trip relationships through Rust and Python APIs.
- Search responses optionally include related nodes/edges when requested.
- Documentation covers schema changes and provides examples for encoding graph edges.
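A round-trip check for the first criterion might be sketched as follows. JSON is used here only as a stand-in for the real Rust/Python boundary, and encode_edges and decode_edges are hypothetical helpers; the point is the shape of the assertion, including the nullable weight.

```python
import json


def encode_edges(edges):
    """Serialize a list of edge dicts (stand-in for writing the relationships column)."""
    return json.dumps(edges, sort_keys=True)


def decode_edges(payload):
    """Deserialize edges (stand-in for reading the relationships column back)."""
    return json.loads(payload)


edges = [
    {"target_id": "doc-2", "relation": "cites", "weight": 0.75},
    {"target_id": "doc-3", "relation": "follows", "weight": None},  # weight omitted
]
round_tripped = decode_edges(encode_edges(edges))

# The edges read back should equal the edges written, null weight included.
assert round_tripped == edges
```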