
gh-67041: Allow to distinguish between empty and not defined URI components #123305


Open: wants to merge 5 commits into base: main


serhiy-storchaka
Member

@serhiy-storchaka serhiy-storchaka commented Aug 25, 2024

Changes in the urllib.parse module:

  • Add an allow_none option to urlparse(), urlsplit() and urldefrag(). If it is true, undefined components are represented as None instead of an empty string.
  • Add a keep_empty option to urlunparse() and urlunsplit(). If it is true, empty non-None components are kept in the resulting string. By default, it matches the allow_none value that was used to produce the urlparse() or urlsplit() result being passed in.
  • Add a keep_empty option to the geturl() method of DefragResult, SplitResult, ParseResult and their bytes counterparts.
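
To make the motivation concrete, here is a minimal sketch of the ambiguity in the released urllib.parse, followed by a splitter that keeps the distinction. The helper name split_with_none is hypothetical and is not part of this PR; it uses the regular expression from RFC 3986, Appendix B, not the module's real parsing code.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Released behaviour: an empty query ("?") and a missing query both
# come back as "", so the trailing "?" is lost on a round trip.
empty_query = urlsplit("http://example.com/path?")
no_query = urlsplit("http://example.com/path")
round_trip = urlunsplit(empty_query)

# Regular expression taken from RFC 3986, Appendix B.
_URI_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_with_none(url):
    """Hypothetical helper: undefined components are None, empty ones are ''.

    Illustration only; it does no validation and is not the PR's code.
    """
    match = _URI_RE.match(url)  # every group is optional, so this always matches
    scheme, authority, path, query, fragment = match.group(2, 4, 5, 7, 9)
    return scheme, authority, path, query, fragment


with_marker = split_with_none("http://example.com/path?")
without_marker = split_with_none("http://example.com/path")
```

In this sketch the query is '' for the first URL and None for the second, which is the distinction that allow_none is meant to expose.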

@serhiy-storchaka serhiy-storchaka force-pushed the urllib-parse-allow-none branch from 7032015 to a60c9be Compare August 31, 2024 09:55
@serhiy-storchaka serhiy-storchaka marked this pull request as ready for review November 27, 2024 11:16
@serhiy-storchaka
Member Author

It is now ready for review. The status of allow_none is now saved in the DefragResult, SplitResult and ParseResult objects, so in most cases there is no need to pass the keep_empty argument. geturl() no longer needs the keep_empty parameter.
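
As a rough sketch of the idea (not the PR's code), a result object can remember how it was created, so that geturl() needs no extra argument. The names SketchSplitResult and make_result are hypothetical, and the recomposition follows RFC 3986, section 5.3 in simplified form; it omits the special cases that the real urlunsplit() handles.

```python
from typing import NamedTuple, Optional


class _Parts(NamedTuple):
    scheme: Optional[str]
    netloc: Optional[str]
    path: str
    query: Optional[str]
    fragment: Optional[str]


class SketchSplitResult(_Parts):
    # Hypothetical flag; the class-level default mirrors the legacy
    # behaviour of dropping empty components.
    _keep_empty = False

    def geturl(self):
        keep = self._keep_empty

        def present(value):
            # None always means "not defined"; '' is kept only on request.
            return value is not None and (keep or value != "")

        url = ""
        if present(self.scheme):
            url += self.scheme + ":"
        if present(self.netloc):
            url += "//" + self.netloc
        url += self.path
        if present(self.query):
            url += "?" + self.query
        if present(self.fragment):
            url += "#" + self.fragment
        return url


def make_result(parts, allow_none=False):
    """Hypothetical factory: store the allow_none status on the result."""
    result = SketchSplitResult(*parts)
    result._keep_empty = allow_none
    return result


kept = make_result(("http", "example.com", "/path", "", None), allow_none=True)
legacy = make_result(("http", "example.com", "/path", "", ""))
```

Here kept.geturl() preserves the empty query marker, while legacy.geturl() drops it as the released module does.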

Unfortunately, these objects now have a __dict__ and are no longer immutable. This is because non-empty __slots__ is not compatible with tuple subclasses. This is a separate, complex issue. I'll try to find a solution to it, but it may be difficult.
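
The limitation can be reproduced with a small sketch. The class names here are hypothetical, and the TypeError reflects CPython behaviour at the time of this comment; a later interpreter could lift the restriction, so the code records the outcome instead of assuming it.

```python
from collections import namedtuple

Base = namedtuple("Base", "scheme path")

# A non-empty __slots__ on a tuple subclass has been rejected by CPython
# at class-creation time; record what this interpreter does.
try:
    class Slotted(Base):
        __slots__ = ("_keep_empty",)
except TypeError:
    slots_rejected = True
else:
    slots_rejected = False


class Empty(Base):
    __slots__ = ()  # allowed: instances have no __dict__ and stay immutable


class Plain(Base):
    pass  # no __slots__: instances get a __dict__ and accept new attributes


plain = Plain("http", "/x")
plain._keep_empty = True  # works, but the object is now mutable

try:
    Empty("http", "/x")._keep_empty = True
except AttributeError:
    empty_is_closed = True
else:
    empty_is_closed = False
```

So the only way to attach an extra per-instance attribute to such a result is to give up the empty __slots__, which brings the __dict__ along with it.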

The long-term plan is to make allow_none True by default, and later deprecate allow_none=False. keep_empty=False can still be useful.

@serhiy-storchaka serhiy-storchaka marked this pull request as draft November 28, 2024 09:14
@serhiy-storchaka serhiy-storchaka marked this pull request as ready for review December 5, 2024 11:10
@serhiy-storchaka
Member Author

I am sorry, I forgot to copy the _keep_empty attribute in the copying/encoding/decoding methods. Now the PR is ready for review.
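
A small sketch of how this kind of bug arises, using hypothetical classes instead of the real result types: methods that build a new tuple, such as namedtuple's _replace(), do not carry over attributes stored in the instance __dict__ unless they are copied explicitly.

```python
from collections import namedtuple

Base = namedtuple("Base", "scheme path")


class Naive(Base):
    _keep_empty = False  # class-level default


class Careful(Base):
    _keep_empty = False  # class-level default

    def _replace(self, **changes):
        # Build the new tuple, then copy the extra attribute across.
        result = super()._replace(**changes)
        result._keep_empty = self._keep_empty
        return result


naive = Naive("http", "/x")
naive._keep_empty = True
naive_copy = naive._replace(path="/y")      # flag silently falls back to False

careful = Careful("http", "/x")
careful._keep_empty = True
careful_copy = careful._replace(path="/y")  # flag survives the copy
```

The same care would be needed in any method that returns a fresh result object, which matches the copying, encoding and decoding methods named above.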

@orsenthil, @barneygale, could you please review it?

@barneygale barneygale self-requested a review December 5, 2024 14:03