Sync WHATWG URL parser with upstream standards #17540

Closed
@TimothyGu

There have been some recent changes in the standards governing our new URL parser API. We need to keep up with those changes in our implementation of the API.

  • Add space to class string of iterator objects (whatwg/webidl@4fcfaea) (lib: add space to class string of iterator objects and updated tests accordingly #17558)
    Change the 'URLSearchParamsIterator' in

    defineIDLClass(URLSearchParamsIteratorPrototype, 'URLSearchParamsIterator', {
    to 'URLSearchParams Iterator', and update tests if necessary.

  • Percent-encode additional characters in "fragment state" (whatwg/url@7a3c69f) (url: added url fragment lookup table #17627)

    • Add a new FRAGMENT_ENCODE_SET lookup table like

      node/src/node_url.cc

      Lines 215 to 280 in e55b7d6

      static const uint8_t C0_CONTROL_ENCODE_SET[32] = {
      // 00 01 02 03 04 05 06 07
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 08 09 0A 0B 0C 0D 0E 0F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 10 11 12 13 14 15 16 17
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 18 19 1A 1B 1C 1D 1E 1F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 20 21 22 23 24 25 26 27
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 28 29 2A 2B 2C 2D 2E 2F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 30 31 32 33 34 35 36 37
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 38 39 3A 3B 3C 3D 3E 3F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 40 41 42 43 44 45 46 47
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 48 49 4A 4B 4C 4D 4E 4F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 50 51 52 53 54 55 56 57
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 58 59 5A 5B 5C 5D 5E 5F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 60 61 62 63 64 65 66 67
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 68 69 6A 6B 6C 6D 6E 6F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 70 71 72 73 74 75 76 77
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00,
      // 78 79 7A 7B 7C 7D 7E 7F
      0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x00 | 0x80,
      // 80 81 82 83 84 85 86 87
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 88 89 8A 8B 8C 8D 8E 8F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 90 91 92 93 94 95 96 97
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // 98 99 9A 9B 9C 9D 9E 9F
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // A0 A1 A2 A3 A4 A5 A6 A7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // A8 A9 AA AB AC AD AE AF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // B0 B1 B2 B3 B4 B5 B6 B7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // B8 B9 BA BB BC BD BE BF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // C0 C1 C2 C3 C4 C5 C6 C7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // C8 C9 CA CB CC CD CE CF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // D0 D1 D2 D3 D4 D5 D6 D7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // D8 D9 DA DB DC DD DE DF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // E0 E1 E2 E3 E4 E5 E6 E7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // E8 E9 EA EB EC ED EE EF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // F0 F1 F2 F3 F4 F5 F6 F7
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80,
      // F8 F9 FA FB FC FD FE FF
      0x01 | 0x02 | 0x04 | 0x08 | 0x10 | 0x20 | 0x40 | 0x80
      };
      but with the bits corresponding to 0x20 (space), 0x22 ("), 0x3C (<), 0x3E (>), and 0x60 (`) set in addition to those already set in C0_CONTROL_ENCODE_SET, per the spec.
    • Replace C0_CONTROL_ENCODE_SET with the new lookup table under kFragment state in URL::Parse().
    • Port web-platform-tests/wpt@cb0662b to
      test/fixtures/url-setter-tests.js and test/fixtures/url-tests.js.
    • Make the corresponding changes to the documentation in doc/api/url.md.
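
To make the bit layout of the proposed table concrete, here is a minimal JavaScript model of the 32-byte lookup tables. It is only a sketch for checking which bytes differ from C0_CONTROL_ENCODE_SET, not the C++ change itself. The helper names `makeTable`, `inSet`, and `percentEncodeFragment` are hypothetical, and the indexing convention (byte `c >> 3`, bit `c & 7`) is assumed from the layout of the table quoted above.

```javascript
// Build a 32-byte table where bit (c & 7) of byte (c >> 3) is set
// when byte value c must be percent-encoded. (Hypothetical helper.)
function makeTable(shouldEncode) {
  const table = new Uint8Array(32);
  for (let c = 0; c < 256; c++) {
    if (shouldEncode(c)) table[c >> 3] |= 1 << (c & 7);
  }
  return table;
}

// Matches the quoted table: 0x00-0x1F, 0x7F, and every byte >= 0x80.
const C0_CONTROL_ENCODE_SET = makeTable((c) => c <= 0x1f || c >= 0x7f);

// Fragment set: copy the C0 control table, then set the five extra bits
// for space, ", <, >, and `.
const FRAGMENT_ENCODE_SET = Uint8Array.from(C0_CONTROL_ENCODE_SET);
for (const c of [0x20, 0x22, 0x3c, 0x3e, 0x60]) {
  FRAGMENT_ENCODE_SET[c >> 3] |= 1 << (c & 7);
}

// Test whether byte value c is in the given encode set.
function inSet(table, c) {
  return (table[c >> 3] & (1 << (c & 7))) !== 0;
}

// Percent-encode the UTF-8 bytes of `input` that fall in the fragment set.
// This models only the per-byte lookup, not the full "fragment state".
function percentEncodeFragment(input) {
  let out = '';
  for (const byte of new TextEncoder().encode(input)) {
    out += inSet(FRAGMENT_ENCODE_SET, byte)
      ? '%' + byte.toString(16).toUpperCase().padStart(2, '0')
      : String.fromCharCode(byte);
  }
  return out;
}

// List the rows that would differ from C0_CONTROL_ENCODE_SET.
for (let row = 0; row < 32; row++) {
  if (FRAGMENT_ENCODE_SET[row] !== C0_CONTROL_ENCODE_SET[row]) {
    const start = (row * 8).toString(16).toUpperCase();
    const value = FRAGMENT_ENCODE_SET[row].toString(16).padStart(2, '0');
    console.log(`row starting at 0x${start}: 0x${value}`);
  }
}

console.log(percentEncodeFragment('a b"<>`')); // a%20b%22%3C%3E%60
```

Under these assumptions only three rows change: the row starting at 0x20 becomes 0x05, the row starting at 0x38 becomes 0x50, and the row starting at 0x60 becomes 0x01. Every other row stays identical to C0_CONTROL_ENCODE_SET.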

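For the first item (the iterator class string), the observable effect is the string returned by `Object.prototype.toString`. The sketch below uses a hypothetical stand-in prototype, `DemoIteratorPrototype`, rather than Node's internal `defineIDLClass`, to show how a class string containing a space surfaces through `Symbol.toStringTag`. The `classString` helper is also an illustrative name.

```javascript
// %IteratorPrototype%, the shared prototype of built-in iterator prototypes.
const IteratorPrototype = Object.getPrototypeOf(
  Object.getPrototypeOf([][Symbol.iterator]())
);

// Hypothetical stand-in for URLSearchParamsIteratorPrototype.
const DemoIteratorPrototype = Object.create(IteratorPrototype);
Object.defineProperty(DemoIteratorPrototype, Symbol.toStringTag, {
  value: 'URLSearchParams Iterator', // new class string, with the space
  writable: false,
  enumerable: false,
  configurable: true,
});

// Return the "[object X]" string that exposes an object's class string.
function classString(obj) {
  return Object.prototype.toString.call(obj);
}

const demo = Object.create(DemoIteratorPrototype);
console.log(classString(demo)); // [object URLSearchParams Iterator]

// The same check against a real iterator; the result depends on whether
// the running Node.js release includes the class string change.
console.log(classString(new URLSearchParams('a=1').entries()));
```

A test for the change could compare `classString(params.entries())` against the expected string, which is likely where existing tests that hard-code 'URLSearchParamsIterator' would need updating.
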
Labels: help wanted, whatwg-url
