Unable to Generate Complete Tree Structure for Large PDFs Due to LLM Context Length Limit #323

@aag21

When a large PDF (e.g., 500 pages) is provided, the document content exceeds the LLM's context window limit. As a result, the LLM cannot process the entire document at once and fails to generate a complete tree structure representing the document hierarchy.

Current Behavior
Tree structure generation works for smaller documents.
For large PDFs, context length is exceeded before the entire document is processed.
The generated tree is incomplete and does not cover the full document.
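
To make the failure explicit instead of silently returning a partial tree, a pre-check along these lines could report how many pages fit before the limit is hit. This is only a sketch: `estimate_tokens`, `pages_that_fit`, and the 4-characters-per-token ratio are assumptions for illustration, not part of this project, and a real tokenizer would give different counts.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate. The chars-per-token ratio is an assumed heuristic."""
    return int(len(text) / chars_per_token) + 1


def pages_that_fit(pages: list[str], context_limit: int, reserved: int = 0) -> int:
    """Return how many leading pages fit in the context budget.

    `reserved` is the number of tokens set aside for the prompt and the
    model's answer (hypothetical parameter).
    """
    budget = context_limit - reserved
    used = 0
    for index, page in enumerate(pages):
        used += estimate_tokens(page)
        if used > budget:
            # This page would overflow the budget, so only `index` pages fit.
            return index
    return len(pages)
```

If `pages_that_fit(pages, limit)` is smaller than `len(pages)`, the document cannot be processed in a single call and the resulting tree will stop short of the last page.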

Expected Behavior
The system should be able to generate a tree structure for the entire document, regardless of document size.

Limitation
Due to context length restrictions, the current approach requires chunking the document and using vector search/retrieval, which may lose global document structure and hierarchy information.
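
One possible alternative to vector retrieval, sketched here under assumptions, is to walk the document in fixed-size page windows and merge the headings found in each window into a single tree, carrying the path of currently open sections from one window to the next so hierarchy is not lost at window boundaries. The names `Node`, `build_tree`, and `extract_headings` are hypothetical. `extract_headings` stands in for whatever LLM call returns `(level, title, page)` triples for a window; it is not an existing function of this project.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Node:
    title: str
    level: int
    page: int
    children: list["Node"] = field(default_factory=list)


# A heading is (level, title, page), with level 1 as the top level.
Heading = tuple[int, str, int]
# Hypothetical extractor: (window pages, first page index, open section path).
Extractor = Callable[[list[str], int, list[str]], list[Heading]]


def windows(n_pages: int, size: int) -> Iterable[range]:
    """Yield consecutive page ranges of at most `size` pages."""
    for start in range(0, n_pages, size):
        yield range(start, min(start + size, n_pages))


def build_tree(pages: list[str], extract_headings: Extractor,
               window_size: int = 20) -> Node:
    """Build one tree for the whole document, one window at a time."""
    root = Node("ROOT", 0, 0)
    stack = [root]  # path from the root to the most recently opened section
    for win in windows(len(pages), window_size):
        # Titles of the open sections give the extractor global context.
        open_path = [node.title for node in stack[1:]]
        window_pages = [pages[i] for i in win]
        for level, title, page in extract_headings(window_pages, win.start, open_path):
            level = max(1, level)  # never let a heading replace the root
            # Close sections at the same or a deeper level than the new heading.
            while stack[-1].level >= level:
                stack.pop()
            node = Node(title, level, page)
            stack[-1].children.append(node)
            stack.append(node)
    return root
```

Each call only sees `window_size` pages plus a short list of open section titles, so the per-call input stays bounded while the merged tree still covers every page. Whether the extractor reports heading levels consistently across windows is an open question that this sketch does not address.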
