HTML Rewriter #1193
    headers: { 'Content-Type': 'text/html' },
  }).text();
  strictEqual(textEscape, expectedEscape);
});
Can we have a test that makes sure Content-Type directives also work ok? As in, that they're still correctly detected as HTML?
Additionally, is there any way to force this to run for, e.g., XHTML, XML, or other things that might technically be parseable? I guess you're expected to just wrap it in `new Response(incomingBody, { headers: { 'Content-Type': 'text/html' } })`?
The rewriter isn't affected by the Content-Type headers, so I'm not sure we need that here, but happy to add it if you still think it's worthwhile!
ah, if it doesn't care about the content-type header, then that's fine. Since you were using it in every test, I assumed that meant the rewriter wouldn't fire unless it detected HTML content type.
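For reference, the re-wrapping pattern asked about in this thread can be sketched with the standard Fetch API. Per the reply above, the rewriter itself ignores `Content-Type`, so this would only matter for downstream consumers that do inspect the header. The helper name `asHtmlResponse` is hypothetical and not part of this PR.

```javascript
// Hypothetical helper (not part of the PR): re-wrap a response so that it
// advertises an HTML content type while keeping status and other headers.
function asHtmlResponse(incoming) {
  // Copy the existing headers, then override only Content-Type.
  const headers = new Headers(incoming.headers);
  headers.set('Content-Type', 'text/html');
  return new Response(incoming.body, {
    status: incoming.status,
    statusText: incoming.statusText,
    headers,
  });
}

const xmlResponse = new Response('<p>hi</p>', {
  headers: { 'Content-Type': 'application/xml' },
});
const htmlResponse = asHtmlResponse(xmlResponse);
console.log(htmlResponse.headers.get('Content-Type')); // prints "text/html"
```

Passing `incoming.body` hands the original stream to the new response, so the original response's body should not be read again afterwards.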
zkat left a comment
lgtm
Thanks for adding this! I looked for something like this in Fastly's JS API a month ago and didn't find it, so I ended up writing my own (non-streaming) transformer using htmlparser2 and cheerio. I switched over to this one today and performance is way better (from 4s to transform 1000 small documents locally down to under 1s), and my WASM file size dropped by 7 MB.
@IGx89 great to hear, thank you!
Adds an HTML rewriter feature with an interface mostly the same as Akamai's html-rewriter. Uses the `lol-html` library under the hood.

A few potentially questionable design decisions which we may want to go a different way on:

- … `lol-html`'s C API into this repo. This was done for a few reasons:
  - … `rustc` version that we build with, so I patched that change.
  - … `lol-html` repo and is not published on crates.io.
  - … `lib.name` in its `Cargo.toml`, which is incompatible with StarlingMonkey's `add_rust_lib` function, so I patched that too.

  … `rustc` we compile with (which is a change made in upstream StarlingMonkey already), patch `add_rust_lib` to support custom `lib.name`s, then grab the C API crate through `cargo` with a simple wrapper library, similar to the existing crates in StarlingMonkey. Alternatively, we could make a fork of `lol-html` and point at that instead.
- … `insert_implicit_close` feature of Akamai's API, because it seems less important and I don't see a simple way to do this from `lol-html`. I also extended their API with an `escapeHTML` option for all insertion functions, which allows inserting HTML content as text.
- … `fastly:html-rewriter`. Ideally someone who knows their way around the module system better can tell me what to do here 😄

BEGIN_COMMIT_OVERRIDE
feat: HTML Rewriter
END_COMMIT_OVERRIDE
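To illustrate what "inserting HTML content as text" means for the `escapeHTML` option mentioned in the description, here is a minimal sketch of HTML escaping in plain JavaScript. It is a hand-rolled illustration under assumptions: the function name `escapeHtmlText` and the exact set of escaped characters are hypothetical, and this is not the PR's or `lol-html`'s implementation.

```javascript
// Illustrative only: a hand-rolled escaper, NOT the PR's or lol-html's code.
// The set of characters escaped here is an assumption.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Replace each markup-significant character with its entity so that the
// browser renders the string literally instead of parsing it as markup.
function escapeHtmlText(input) {
  return String(input).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtmlText('<b>Tom & "Jerry"</b>'));
// prints: &lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;
```

With escaping applied, an inserted string such as `<b>bold</b>` shows up on the page as visible text rather than as a bold element.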