GH-48961: [Docs][Python] Doctest fails on pandas 3.0 #48969
base: main
Thank you @tadeja for looking into this! One question regarding the bump of the Python version in the Sphinx & Numpydoc job: I think it would be good if the examples worked for users with either a new or an old pandas version. What if we use arrow/python/pyarrow/table.pxi Lines 1812 to 1814 in 95a3ed4
Agreed that it doesn't make sense for us to "test Pandas logic" especially in our docs. Agreed with @AlenkaF to instantiate the table in pyarrow. Using ellipsis in this case would hide the type and potentially increase user confusion :).
Note that some examples demonstrate conversion from pandas to pyarrow, so in those cases we might remove the string column and keep only the integer ones?
rok left a comment
This looks good to me now. I think (hope) removing pandas from examples that don't require it streamlines things for readers.
day: int64
n_legs: int64
animals: string
animals: ...string
Interesting, I wasn't aware this works.
Nice!
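For readers wondering why `animals: ...string` works: a small standard-library sketch, assuming the `ELLIPSIS` option is enabled for these doctests (that configuration is not shown in this thread). With the flag set, `...` in the expected output matches any substring, including an empty one, so the same line accepts both `string` and `large_string`.

```python
import doctest

checker = doctest.OutputChecker()
want = "animals: ...string\n"

for got in ("animals: string\n", "animals: large_string\n"):
    # With ELLIPSIS, "..." matches any substring (including the empty one),
    # so both type names are accepted by the same expected line.
    assert checker.check_output(want, got, doctest.ELLIPSIS)
    # Without the flag, the expected text has to match literally.
    assert not checker.check_output(want, got, 0)
```

The trade-off raised earlier in the conversation still holds: the ellipsis keeps the doctest passing on both pandas versions, but it hides the exact type from the reader.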
animals: string
-- schema metadata --
pandas: '{"index_columns": [{"kind": "range", "name": null, "start": 0, ...
>>> reader.read_all()
I think this is a good change, just pointing out there is some interesting behavior here.
736837d to f224b15
@github-actions crossbow submit preview-docs
Revision: 186c0a9 Submitted crossbow builds: ursacomputing/crossbow @ actions-ca47b1b8be
@AlenkaF this is ready for final review.
Rationale for this change
See issue #48961
Pandas 3.0.0 changes the string storage type; see https://github.com/pandas-dev/pandas/pull/62118/changes
and https://pandas.pydata.org/docs/whatsnew/v3.0.0.html#dedicated-string-data-type-by-default
What changes are included in this PR?
Updating several doctest examples from string to large_string.
Are these changes tested?
Yes, locally.
Are there any user-facing changes?
No.
Closes #48961