[Side-by-side test] mineru 3.x vs magic-pdf 1.3.x on Windows + RTX 4060 — 5 fewer pitfalls, 2.5× faster, but 1 new MAX_PATH gotcha #4890
gaoyuindeu
started this conversation in
Show and tell
Hi MinerU team and community,
Background: I started this morning trying to install magic-pdf 1.3.x following old tutorials and hit 7 distinct pitfalls before getting it to work. Then I realized the maintained package is now mineru 3.x (totally different name on PyPI), so I retested both side by side on the same 261-page scanned PDF, same hardware, same day.

🔗 Repo with full data, configs, and both guides: https://github.com/gaoyuindeu/mineru-windows-setup-guide
Headline numbers (Windows 11 + RTX 4060 8GB)
Compared versions: magic-pdf 1.3.12 and mineru 3.1.6.

The mineru 3.x experience is dramatically smoother. Kudos to the team for fixing 6 of the 7 issues that bit me on the old version.

The 1 remaining Windows-specific gotcha (worth flagging)
FileNotFoundError on long PDF filenames due to the Windows MAX_PATH (260-character) limit.

mineru's API-server architecture in 3.x writes through a deeply nested temp path. With a reasonably long PDF filename (mine was ~80 chars), the full path exceeds Windows' default 260-character limit, even though disk space is not the problem.
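To make the limit concrete, here is a minimal sketch that estimates whether a nested output path would hit MAX_PATH and derives a short, stable filename from a long one. The helper names `exceeds_max_path` and `short_name` are made up for illustration, and the temp root and directory names are placeholders passed in by the caller. This does not reproduce mineru's actual temp layout.

```python
import hashlib
from pathlib import PureWindowsPath

# Classic Windows limit; it counts the terminating NUL, so 259 usable characters.
MAX_PATH = 260


def exceeds_max_path(temp_root: str, nested_dirs: list[str], filename: str) -> bool:
    """Return True if temp_root\\<nested dirs>\\filename would hit MAX_PATH.

    temp_root and nested_dirs are hypothetical inputs, not mineru's real layout.
    """
    full = PureWindowsPath(temp_root, *nested_dirs, filename)
    return len(str(full)) >= MAX_PATH


def short_name(pdf_name: str, digest_len: int = 12) -> str:
    """Derive a short, deterministic filename from a hash of the original stem."""
    p = PureWindowsPath(pdf_name)
    digest = hashlib.sha1(p.stem.encode("utf-8")).hexdigest()[:digest_len]
    return digest + p.suffix.lower()
```

Because `PureWindowsPath` only manipulates strings, the check can be run on any OS before a batch job starts. A hashed name keeps the mapping back to the original file reproducible, at the cost of readability.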
Workaround that works: rename the PDF to something short (my.pdf) before running.

Suggested fix for maintainers: either (a) shorten the temp path scheme (use a hash of the basename rather than the full basename in the temp tree), or (b) detect Windows plus a long path and warn the user, or (c) document the LongPathsEnabled registry tweak in the FAQ.

Quality observation: tables are actually better in 3.x (I had this wrong at first)
My first impression was that mineru 3.x flattens tables. After comparing more carefully, that's not what's happening:

mineru 3.x's HTML is measurably cleaner: correct row numbers, no merged-cell OCR errors, proper whitespace.

magic-pdf 1.3.x forced these into HTML tables, but the cell boundaries came out broken (lots of empty <td>s). mineru 3.x renders them as paragraphs, which is structurally simpler and arguably closer to the source's intent.

So this is a net upgrade for table extraction, not a regression. Apologies for the initial mischaracterization in earlier drafts of this writeup.
One pitfall that's actually generic, not mineru-specific
pip install -U "mineru[all]" pulls torch==2.8.0+cpu on Windows. This bites every Windows + GPU user installing any PyTorch-based tool, but it is worth a one-liner mention in the Windows install docs.

Hardware data point for FAQ
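Before trusting any GPU timing, it helps to confirm that the installed PyTorch is a CUDA build rather than the +cpu wheel mentioned above. A minimal sketch, assuming only the standard `torch.__version__` string and `torch.cuda.is_available()`; the helper names `is_cpu_only_build` and `describe_torch` are made up for illustration:

```python
import importlib.util


def is_cpu_only_build(version: str) -> bool:
    """True if a PyTorch version string carries the '+cpu' local tag."""
    return version.partition("+")[2] == "cpu"


def describe_torch() -> str:
    """Summarize whether the installed PyTorch can see a GPU.

    Returns a message instead of raising if torch is not installed.
    """
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    version = torch.__version__
    if is_cpu_only_build(version):
        return f"{version}: CPU-only wheel; reinstall a CUDA build of torch"
    if not torch.cuda.is_available():
        return f"{version}: no '+cpu' tag, but torch.cuda.is_available() is False"
    return f"{version}: CUDA available on {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(describe_torch())
```

If the check reports a CPU-only wheel, the usual remedy is to reinstall torch from the PyTorch wheel index that matches your CUDA setup. The exact command depends on your driver and CUDA version, so it is not spelled out here.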
For anyone considering an 8GB consumer GPU (RTX 4060): the pipeline backend runs comfortably and finished a 261-page scanned report in 9.9 min. VRAM peak ~7GB, GPU util 99–100% during inference batches. So the "min 4GB VRAM" spec for pipeline is accurate and 8GB is plenty of headroom.

Hope this helps someone deciding which version to use, or anyone hitting MAX_PATH. Happy to discuss/clarify anything.
Test methodology, full output samples, configs, and a legacy magic-pdf survival guide are all in the repo.

Disclaimer: I'm just a user. Documentation drafted with assistance from Claude (Anthropic). Repo content is CC BY 4.0 / MIT.