Improve DataFrame docs flow #1526
This is a good start. You can get around the problem you had building the docs by building the repo in your venv first. This looks mostly like moving text around. I asked my agent to take a look, and it picked up on a couple of things that I agree with. Mostly I'm thinking about the last point it makes: I was hoping to get a fresh look at what an ideal flow for the site would be. This is a good start though.

Flow problems
Does it close #1397? Partially, but reasonably.

Issue: the page "jumps around from common operations to Arrow C interface to rendering", and it needs a home for execution metrics.

PR: pulls the Arrow deep-dive into its own page, adds a lifecycle roadmap, and links the metrics page. The main complaint (the Arrow deep-dive breaking the flow) is fixed.

Gap: the page still ends with a reference dump ("Core Classes", "Expression Classes", "Built-in Functions") that is untouched. That is part of the "all over the place" the issue named. Calling it out of scope is defensible, but #1397 asked for a "fresh look at organization," so a reviewer could push for more. Not a blocker.
Closes #1397.
Rationale for this change
The DataFrame guide currently mixes the main user flow with lower-level Arrow streaming details, display behavior, and metrics guidance. This makes the page harder to scan for new users who are trying to understand the basic DataFrame lifecycle.
What changes are included in this PR?
- Move the `__arrow_c_stream__` content into a dedicated `arrow-interface` page under the DataFrame section.

Are there any user-facing changes?
Yes, documentation-only changes. The DataFrame docs should be easier to scan and the Arrow streaming content now has a dedicated page.
Verification performed by Vandit:
- Ran `git diff --check`.
- Confirmed that `docs/source/user-guide/dataframe/index.rst` links `arrow-interface` and that `docs/source/user-guide/dataframe/arrow-interface.rst` exists with the expected heading.
- The docs build failed at the `docs/source/index.rst` IPython example because the compiled `datafusion` package was not installed in that temp venv (`ModuleNotFoundError: No module named 'datafusion'`).
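
For anyone who wants to repeat the file checks above without a full docs build, here is a rough sketch of how they could be scripted. It is not part of this PR. The helper names `toctree_links` and `check_docs` are hypothetical, and it assumes the link in `index.rst` is a plain toctree entry on its own line; a `:doc:` reference would need a different check.

```python
from pathlib import Path
import tempfile

# Hypothetical helper: assumes the page is linked as a bare toctree entry,
# i.e. a line whose stripped content is exactly the target name.
def toctree_links(index_text: str, target: str) -> bool:
    return any(line.strip() == target for line in index_text.splitlines())

# Hypothetical helper: returns a list of problems; an empty list means both checks passed.
def check_docs(root: Path) -> list[str]:
    problems = []
    df_dir = root / "docs/source/user-guide/dataframe"
    index = df_dir / "index.rst"
    page = df_dir / "arrow-interface.rst"

    if not page.is_file():
        problems.append(f"missing page: {page}")
    if not index.is_file():
        problems.append(f"missing index: {index}")
    elif not toctree_links(index.read_text(encoding="utf-8"), "arrow-interface"):
        problems.append(f"{index} does not link arrow-interface")
    return problems

if __name__ == "__main__":
    # Run from the repository root.
    for problem in check_docs(Path(".")):
        print(problem)
```

This only covers the link and file-existence checks; it says nothing about whether the page renders, which still needs a docs build in a venv where `datafusion` is installed.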