HTMLRewriter

Use Bun's HTMLRewriter to transform HTML documents with CSS selectors

HTMLRewriter lets you use CSS selectors to transform HTML documents. It works with Request and Response objects as well as strings. Bun's implementation is based on Cloudflare's lol-html.


Usage

A common use case is rewriting URLs in HTML content. Here's an example that points every image at a rickroll thumbnail and wraps each image in a link to the video:

// Replace all images with a rickroll
const rewriter = new HTMLRewriter().on("img", {
  element(img) {
    // Famous rickroll video thumbnail
    img.setAttribute("src", "https://img.youtube.com/vi/dQw4w9WgXcQ/maxresdefault.jpg");

    // Wrap the image in a link to the video
    img.before('<a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ" target="_blank">', {
      html: true,
    });
    img.after("</a>", { html: true });

    // Add some fun alt text
    img.setAttribute("alt", "Definitely not a rickroll");
  },
});

// An example HTML document
const html = `
<html>
<body>
  <img src="/cat.jpg">
  <img src="dog.png">
  <img src="https://example.com/bird.webp">
</body>
</html>
`;

const result = rewriter.transform(html);
console.log(result);

This replaces all images with a thumbnail of Rick Astley and wraps each <img> in a link, producing a diff like this:

<html>
  <body>
-   <img src="/cat.jpg" />
-   <img src="dog.png" />
-   <img src="https://example.com/bird.webp" />
+   <a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ" target="_blank">
+     <img src="https://img.youtube.com/vi/dQw4w9WgXcQ/maxresdefault.jpg" alt="Definitely not a rickroll" />
+   </a>
+   <a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ" target="_blank">
+     <img src="https://img.youtube.com/vi/dQw4w9WgXcQ/maxresdefault.jpg" alt="Definitely not a rickroll" />
+   </a>
+   <a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ" target="_blank">
+     <img src="https://img.youtube.com/vi/dQw4w9WgXcQ/maxresdefault.jpg" alt="Definitely not a rickroll" />
+   </a>
  </body>
</html>

Now every image on the page will be replaced with a thumbnail of Rick Astley, and clicking any image will lead to a very famous video.
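
As a sketch of the URL-rewriting use case mentioned at the start of this section, the following moves root-relative image sources and link targets onto a CDN. The CDN origin and the helper names toCdnUrl and createCdnRewriter are hypothetical and chosen only for illustration. The code assumes it runs under Bun, where HTMLRewriter is a global.

```javascript
// Hypothetical CDN origin, used only for illustration.
const CDN_ORIGIN = "https://cdn.example.com";

// Map a URL found in the page to its CDN equivalent.
// Only root-relative paths ("/cat.jpg") are moved; relative paths,
// protocol-relative URLs ("//host/x") and absolute URLs are left untouched.
function toCdnUrl(url, cdnOrigin = CDN_ORIGIN) {
  if (typeof url !== "string") return url;
  if (url.startsWith("/") && !url.startsWith("//")) {
    return cdnOrigin + url;
  }
  return url;
}

// Build a rewriter that applies toCdnUrl to <img src> and <a href>.
// HTMLRewriter is only looked up when this function is called.
function createCdnRewriter(cdnOrigin = CDN_ORIGIN) {
  // One handler factory shared by both selectors.
  const rewriteAttribute = name => ({
    element(el) {
      const value = el.getAttribute(name);
      if (value !== null) {
        el.setAttribute(name, toCdnUrl(value, cdnOrigin));
      }
    },
  });

  return new HTMLRewriter()
    .on("img[src]", rewriteAttribute("src"))
    .on("a[href]", rewriteAttribute("href"));
}
```

Applied to the example document above with createCdnRewriter().transform(html), this would be expected to move /cat.jpg to the CDN and leave dog.png and the absolute bird.webp URL unchanged. Keeping the URL mapping in a plain function makes that policy testable without running the rewriter.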

Input types

HTMLRewriter can transform HTML from various sources. The input is automatically handled based on its type:

// From Response
rewriter.transform(new Response("<div>content</div>"));

// From string
rewriter.transform("<div>content</div>");

// From ArrayBuffer
rewriter.transform(new TextEncoder().encode("<div>content</div>").buffer);

// From Blob
rewriter.transform(new Blob(["<div>content</div>"]));

// From File
rewriter.transform(Bun.file("index.html"));

Note that the Cloudflare Workers implementation of HTMLRewriter supports only Response objects.

Element Handlers

The on(selector, handlers) method allows you to register handlers for HTML elements that match a CSS selector. The handlers are called for each matching element during parsing:

rewriter.on("div.content", {
  // Handle elements
  element(element) {
    element.setAttribute("class", "new-content");
    element.append("<p>New content</p>", { html: true });
  },
  // Handle text nodes
  text(text) {
    text.replace("new text");
  },
  // Handle comments
  comments(comment) {
    comment.remove();
  },
});

The handlers can be asynchronous and return a Promise. Note that async operations will block the transformation until they complete:

rewriter.on("div", {
  async element(element) {
    await Bun.sleep(1000);
    element.setInnerContent("<span>replace</span>", { html: true });
  },
});

CSS Selector Support

The on() method supports a wide range of CSS selectors:

// Tag selectors
rewriter.on("p", handler);

// Class selectors
rewriter.on("p.red", handler);

// ID selectors
rewriter.on("h1#header", handler);

// Attribute selectors
rewriter.on("p[data-test]", handler); // Has attribute
rewriter.on('p[data-test="one"]', handler); // Exact match
rewriter.on('p[data-test="one" i]', handler); // Case-insensitive
rewriter.on('p[data-test="one" s]', handler); // Case-sensitive
rewriter.on('p[data-test~="two"]', handler); // Word match
rewriter.on('p[data-test^="a"]', handler); // Starts with
rewriter.on('p[data-test$="1"]', handler); // Ends with
rewriter.on('p[data-test*="b"]', handler); // Contains
rewriter.on('p[data-test|="a"]', handler); // Dash-separated

// Combinators
rewriter.on("div span", handler); // Descendant
rewriter.on("div > span", handler); // Direct child

// Pseudo-classes
rewriter.on("p:nth-child(2)", handler);
rewriter.on("p:first-child", handler);
rewriter.on("p:nth-of-type(2)", handler);
rewriter.on("p:first-of-type", handler);
rewriter.on("p:not(:first-child)", handler);

// Universal selector
rewriter.on("*", handler);

Element Operations

Elements provide various methods for manipulation. All modification methods return the element instance for chaining:

rewriter.on("div", {
  element(el) {
    // Attributes
    el.setAttribute("class", "new-class").setAttribute("data-id", "123");

    const classAttr = el.getAttribute("class"); // "new-class"
    const hasId = el.hasAttribute("id"); // boolean
    el.removeAttribute("class");

    // Content manipulation
    el.setInnerContent("New content"); // Escapes HTML by default
    el.setInnerContent("<p>HTML content</p>", { html: true }); // Parses HTML
    el.setInnerContent(""); // Clear content

    // Position manipulation
    el.before("Content before").after("Content after").prepend("First child").append("Last child");

    // HTML content insertion
    el.before("<span>before</span>", { html: true })
      .after("<span>after</span>", { html: true })
      .prepend("<span>first</span>", { html: true })
      .append("<span>last</span>", { html: true });

    // Removal
    el.remove(); // Remove element and contents
    el.removeAndKeepContent(); // Remove only the element tags

    // Properties
    console.log(el.tagName); // Lowercase tag name
    console.log(el.namespaceURI); // Element's namespace URI
    console.log(el.selfClosing); // Whether element is self-closing (e.g. <div />)
    console.log(el.canHaveContent); // Whether element can contain content (false for void elements like <br>)
    console.log(el.removed); // Whether element was removed

    // Attributes iteration
    for (const [name, value] of el.attributes) {
      console.log(name, value);
    }

    // End tag handling
    el.onEndTag(endTag => {
      endTag.before("Before end tag");
      endTag.after("After end tag");
      endTag.remove(); // Remove the end tag
      console.log(endTag.name); // Tag name in lowercase
    });
  },
});

Text Operations

Text handlers provide methods for text manipulation. Text chunks represent portions of text content and provide information about their position in the text node:

rewriter.on("p", {
  text(text) {
    // Content
    console.log(text.text); // Text content
    console.log(text.lastInTextNode); // Whether this is the last chunk
    console.log(text.removed); // Whether text was removed

    // Manipulation
    text.before("Before text").after("After text").replace("New text").remove();

    // HTML content insertion
    text
      .before("<span>before</span>", { html: true })
      .after("<span>after</span>", { html: true })
      .replace("<span>replace</span>", { html: true });
  },
});
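
Because a single text node may be delivered as several chunks, a replacement that looks at only one chunk can miss text that straddles a chunk boundary. A minimal sketch of one way to handle this, assuming the chunk properties and methods shown above (text, lastInTextNode, replace, remove): buffer the chunks and emit the transformed text once the node is complete. The helper name bufferedTextHandler is hypothetical.

```javascript
// Build a text handler that collects every chunk of a text node and
// applies transformText to the complete text exactly once.
function bufferedTextHandler(transformText) {
  let buffer = "";

  return {
    text(chunk) {
      buffer += chunk.text;

      if (chunk.lastInTextNode) {
        // Final chunk: emit the whole node's transformed text in its place.
        chunk.replace(transformText(buffer));
        buffer = ""; // Start fresh for the next text node.
      } else {
        // Intermediate chunk: its text is already in the buffer, so drop it.
        chunk.remove();
      }
    },
  };
}
```

Under Bun this could be registered with new HTMLRewriter().on("p", bufferedTextHandler(s => s.toUpperCase())). Since replace() escapes HTML by default, text that contains character references may need { html: true }; that is worth checking against your own input.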

Comment Operations

Comment handlers let you manipulate comments with methods similar to those available on text chunks:

rewriter.on("*", {
  comments(comment) {
    // Content
    console.log(comment.text); // Comment text
    comment.text = "New comment text"; // Set comment text
    console.log(comment.removed); // Whether comment was removed

    // Manipulation
    comment.before("Before comment").after("After comment").replace("New comment").remove();

    // HTML content insertion
    comment
      .before("<span>before</span>", { html: true })
      .after("<span>after</span>", { html: true })
      .replace("<span>replace</span>", { html: true });
  },
});

Document Handlers

The onDocument(handlers) method allows you to handle document-level events. These handlers are called for events at the document level, such as the doctype or the end of the document, rather than for elements matching a selector:

rewriter.onDocument({
  // Handle doctype
  doctype(doctype) {
    console.log(doctype.name); // "html"
    console.log(doctype.publicId); // public identifier if present
    console.log(doctype.systemId); // system identifier if present
  },
  // Handle text nodes
  text(text) {
    console.log(text.text);
  },
  // Handle comments
  comments(comment) {
    console.log(comment.text);
  },
  // Handle document end
  end(end) {
    end.append("<!-- Footer -->", { html: true });
  },
});

Response Handling

When transforming a Response:

  • The status code, headers, and other response properties are preserved; headers are cloned to the new response
  • The body is transformed while maintaining streaming capabilities
  • Content-encoding (like gzip) is handled automatically
  • The original response body is marked as used after transformation
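
The properties listed above can be checked with a short experiment. This is a sketch that assumes Bun, where HTMLRewriter is a global. The function name inspectTransformedResponse and the x-demo header are hypothetical and exist only for this illustration.

```javascript
// Transform a Response and report which of its properties carried over.
async function inspectTransformedResponse() {
  const original = new Response("<h1>Oops</h1>", {
    status: 404,
    headers: {
      "content-type": "text/html; charset=utf-8",
      "x-demo": "1", // Hypothetical header, used to observe header cloning.
    },
  });

  const transformed = new HTMLRewriter()
    .on("h1", {
      element(el) {
        el.setInnerContent("Page not found");
      },
    })
    .transform(original);

  return {
    status: transformed.status,
    demoHeader: transformed.headers.get("x-demo"),
    body: await transformed.text(),
    originalBodyUsed: original.bodyUsed,
  };
}
```

Going by the list above, the returned object would be expected to report the original 404 status and the x-demo header, a body containing the rewritten heading, and a consumed original body.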

Error Handling

HTMLRewriter operations can throw errors in several cases:

  • Invalid selector syntax in on() method
  • Invalid HTML content in transformation methods
  • Stream errors when processing Response bodies
  • Memory allocation failures
  • Invalid input types (e.g., passing Symbol)
  • Body already used errors

Errors should be caught and handled appropriately:

try {
  const result = rewriter.transform(input);
  // Process result
} catch (error) {
  console.error("HTMLRewriter error:", error);
}

See also

You can also read the Cloudflare documentation, which this API is intended to be compatible with.