Improved Footnote Serialization in MarkdownDocSerializer #3128

@simonschoe

Requested feature

Currently, MarkdownDocSerializer serializes footnotes more or less as-is:

Image

Serialized as:

5 https://github.com/tesseract-ocr/tesseract

6 https://github.com/VikParuchuri/surya

7 https://github.com/lukas-blecher/LaTeX-OCR

Alternatives

For downstream LLM-based applications, it would be helpful if footnotes were serialized as actual footnotes in Markdown syntax, so that the LLM can identify them as footnotes (and not, for example, as a numbered list).

^[5 https://github.com/tesseract-ocr/tesseract]

^[6 https://github.com/VikParuchuri/surya]

^[7 https://github.com/lukas-blecher/LaTeX-OCR]
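
For illustration, here is a minimal sketch of the conversion in Python. It is a standalone post-processing helper, not part of MarkdownDocSerializer. The function names are hypothetical, and it assumes each footnote arrives as a single string that starts with its number. The first function produces the inline `^[...]` form shown above (the inline-note syntax used by Pandoc). The second shows the reference-style `[^5]: ...` definition as a possible alternative.

```python
import re

# Hypothetical helpers for illustration only; not docling API.
# Assumption: a raw footnote looks like "<number> <text>".
_MARKER = re.compile(r"^\s*(\d+)\s+(.*\S)\s*$")


def to_inline_footnote(text: str) -> str:
    """Wrap the raw footnote text in inline-note syntax: ^[...]."""
    stripped = text.strip()
    return f"^[{stripped}]" if stripped else ""


def to_reference_footnote(text: str) -> str:
    """Emit a reference-style definition: [^N]: text.

    Falls back to the inline form when no leading number is found.
    """
    match = _MARKER.match(text)
    if match is None:
        return to_inline_footnote(text)
    number, body = match.groups()
    return f"[^{number}]: {body}"


raw_footnotes = [
    "5 https://github.com/tesseract-ocr/tesseract",
    "6 https://github.com/VikParuchuri/surya",
    "7 https://github.com/lukas-blecher/LaTeX-OCR",
]

for raw in raw_footnotes:
    print(to_inline_footnote(raw))
```

The reference-style variant would additionally require the in-text markers to be emitted as `[^5]`, `[^6]`, and so on, which is outside the scope of this sketch.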
