-
-
Notifications
You must be signed in to change notification settings - Fork 143
Broaden read csv param types #630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Broaden the type hint for the 'names' param of read_csv (and read_table, which behaves similarly) from previous list[str], so that other valid types are accepted by mypy.
Noticed as I found clipboard after the changes to read_csv and read_table, and it calls it, so should match - but it was missing None as an option.
Match prior change to read_csv, since read_clipboard calls read_csv.
Match prior change to read_csv, read_table, read_clipboard.
This fixes the pycharm tooltip problem in gh-605, as well as allowing more list-like types of strings (tuples of strings, as well as mutable sequences of strings other than list), and callables that accept hashables, not just strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this nice PR and for combining the changes for both issues in the same PR, and the extensive tests.
I would like to add tests that make sure that a plain string is not accepted where it is not supposed to be. Something like this:
if TYPE_CHECKING_INVALID_USAGE:
pd.read_excel(path, names="foo") # type: ignore[call-overload] # pyright: ignore[reportGeneralTypeIssues]
pd.read_csv(path, usecols="foo") # type: ignore[arg-type] # pyright: ignore[reportGeneralTypeIssues]
I tried the above, with your PR, and it worked, but you should add tests for the names
and usecols
argument where str
is not accepted, and I think it is just for names
for read_excel()
.
I think that for read_excel()
you use call-overload
, but for the rest, arg-type
should work.
Strings aren't valid arguments here (except for read_excel, where we have a test now to check that this is accepted). Adding tests to make sure the type hints aren't overly wide and accept string arguments by mistake.
Added tests to make sure the plain string isn't accepted (and one for read_excel to check it is accepted where it should be). Thanks for the guidance, it made adding the tests nice and quick for me! Running the tests a few times, turns out it needs (I'm speculating maybe because |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @EFord36 . Nice work.
TBH, I've never figured out exactly why mypy picks one error over another. So we put in the error that was reported to make sure that we get through CI. For these kinds of tests where we are using |
assert_type()
to assert the type of any return valueSubmitted as 1 PR for the 2 issues because the relevant code is so close that I thought it might be quicker to review together, plus trying to have both as separate PRs would likely result in merge conflicts if/when one was merged before the other. The commits addressing the two issues are separate though, so it would be easy for me to split into 2 separate PR if this is preferred.
As well as read_csv, modify read_table, read_clipboard, read_excel, which all have the relevant params (I searched for other functions but let me know if I missed any).
Closes #605 in a different way than suggested in that ticket, in order to allow broader types than just lists of string or ints, since list-likes are allowed here. Naming of the type alias UsecolsArgType was inspired by ColspaceArgType, happy to change if desired.
Please let me know if coverage of more different types for
names
and/orusecols
is needed in the tests, or if there are stylistic details you would like me to change. There is a lot of repetition between the tests for the different functions that have these params - I'm happy to work on reducing it, but I thought it might mean restructuring some of the existing tests, so I didn't want to fiddle with it too much until I knew if it (and the actual functional change) was desired.Passes tests (
poetry run poe test_all
) locally when the last commit before my changes is7730abc
- the two commits after that (6819e32
and9c301b3
) result in test failures for me locally that seem unrelated to my change, and I get if I just run onmain
with no changes.