# rehype-parse

[![Build][build-badge]][build]
[![Coverage][coverage-badge]][coverage]
[![Downloads][downloads-badge]][downloads]
[![Size][size-badge]][size]
[![Sponsors][sponsors-badge]][collective]
[![Backers][backers-badge]][collective]
[![Chat][chat-badge]][chat]

**[rehype][]** plugin to add support for parsing HTML input.

## Contents

* [What is this?](#what-is-this)
* [When should I use this?](#when-should-i-use-this)
* [Install](#install)
* [Use](#use)
* [API](#api)
  * [`unified().use(rehypeParse[, options])`](#unifieduserehypeparse-options)
* [Examples](#examples)
  * [Example: fragment versus document](#example-fragment-versus-document)
  * [Example: whitespace around and inside `<html>`](#example-whitespace-around-and-inside-html)
  * [Example: parse errors](#example-parse-errors)
* [Syntax](#syntax)
* [Syntax tree](#syntax-tree)
* [Types](#types)
* [Compatibility](#compatibility)
* [Security](#security)
* [Contribute](#contribute)
* [Sponsor](#sponsor)
* [License](#license)

## What is this?

This package is a [unified][] ([rehype][]) plugin that defines how to take HTML
as input and turn it into a syntax tree.
When it’s used, HTML can be parsed and other rehype plugins can be used after
it.

See [the monorepo readme][rehype] for info on what the rehype ecosystem is.

## When should I use this?

This plugin adds support to unified for parsing HTML.
You can alternatively use [`rehype`][rehype-core] instead, which combines
unified, this plugin, and [`rehype-stringify`][rehype-stringify].

When you’re in a browser, trust your content, don’t need positional info, and
value a smaller bundle size, you can use
[`rehype-dom-parse`][rehype-dom-parse] instead.

This plugin is built on [`parse5`][parse5] and
[`hast-util-from-parse5`][hast-util-from-parse5], which deal with
HTML-compliant tokenizing, parsing, and creating nodes.
rehype focusses on making it easier to transform content by abstracting such
internals away.

## Install

This package is [ESM only][esm].
In Node.js (version 12.20+, 14.14+, or 16.0+), install with [npm][]:

```sh
npm install rehype-parse
```

In Deno with [Skypack][]:

```js
import rehypeParse from 'https://cdn.skypack.dev/rehype-parse@8?dts'
```

In browsers with [Skypack][]:

```html
<script type="module">
  import rehypeParse from 'https://cdn.skypack.dev/rehype-parse@8?min'
</script>
```

## Use

Say we have the following module `example.js`:

```js
import {unified} from 'unified'
import rehypeParse from 'rehype-parse'
import rehypeRemark from 'rehype-remark'
import remarkStringify from 'remark-stringify'

main()

async function main() {
  // Parse HTML, transform the tree to markdown, and serialize it.
  const file = await unified()
    .use(rehypeParse)
    .use(rehypeRemark)
    .use(remarkStringify)
    .process('<h1>Hello, world!</h1>')

  console.log(String(file))
}
```

…running that with `node example.js` yields:

```markdown
# Hello, world!
```

## API

This package exports no identifiers.
The default export is `rehypeParse`.

### `unified().use(rehypeParse[, options])`

Add support for parsing HTML input.

##### `options`

Configuration (optional).

###### `options.fragment`

Specify whether to parse as a fragment (`boolean`, default: `false`).
The default is to expect a whole document.
In document mode, unopened `html`, `head`, and `body` elements are opened.

###### `options.space`

Which space the document is in (`'svg'` or `'html'`, default: `'html'`).

When an `<svg>` element is found in the HTML space, `rehype-parse` already
automatically switches to and from the SVG space when entering and exiting it.

> 👉 **Note**: rehype is not an XML parser.
> It supports SVG as embedded in HTML.
> It does not support the features available in XML.
> Passing SVG files might break but fragments of modern SVG should be fine.

> 👉 **Note**: make sure to set `fragment: true` if `space: 'svg'`.

###### `options.emitParseErrors`

Emit [HTML parse errors][parse-errors] as warning messages (`boolean`, default:
`false`).

Specific rules can be turned off by setting their IDs in `options` to `false`
(or `0`).
The default, when `emitParseErrors: true`, is `true` (or `1`), and means that
rules emit as warnings.
Rules can also be configured with `2`, to turn them into fatal errors.
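The three severity values described above (`false` or `0`, `true` or `1`, and `2`) can be hard to read at a glance. The following is a minimal sketch, assuming a hypothetical helper `parseErrorOptions` and made-up severity names `off`, `warn`, and `error` (none of which are part of `rehype-parse`), that builds an options object for rule IDs such as `missingDoctype` and `duplicateAttribute`:

```js
// Hypothetical severity names (an assumption for this sketch), mapped to the
// values that the `emitParseErrors` documentation above describes.
const severities = new Map([
  ['off', false], // Turn the rule off.
  ['warn', true], // Emit as a warning (the default when enabled).
  ['error', 2] // Turn into a fatal error.
])

// Hypothetical helper, not part of `rehype-parse`: build an options object
// that enables parse errors and sets a severity per rule ID.
function parseErrorOptions(rules = {}) {
  const options = {emitParseErrors: true}

  for (const [ruleId, level] of Object.entries(rules)) {
    if (!severities.has(level)) {
      throw new Error('Unknown severity `' + level + '` for `' + ruleId + '`')
    }

    options[ruleId] = severities.get(level)
  }

  return options
}

const options = parseErrorOptions({
  missingDoctype: 'off',
  duplicateAttribute: 'error'
})

console.log(options)
```

The resulting object could then be passed as the second argument, as in `unified().use(rehypeParse, options)`. Rules that are not named keep the default warning behavior.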

The list of parse errors:

* `abandonedHeadElementChild` — unexpected metadata element after head ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/abandoned-head-element-child/index.html))
* [`abruptClosingOfEmptyComment`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-closing-of-empty-comment) — unexpected abruptly closed empty comment ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/abrupt-closing-of-empty-comment/index.html))
* [`abruptDoctypePublicIdentifier`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-doctype-public-identifier) — unexpected abruptly closed public identifier ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/abrupt-doctype-public-identifier/index.html))
* [`abruptDoctypeSystemIdentifier`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-doctype-system-identifier) — unexpected abruptly closed system identifier ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/abrupt-doctype-system-identifier/index.html))
* [`absenceOfDigitsInNumericCharacterReference`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-absence-of-digits-in-numeric-character-reference) — unexpected non-digit at start of numeric character reference ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/absence-of-digits-in-numeric-character-reference/index.html))
* [`cdataInHtmlContent`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-cdata-in-html-content) — unexpected CDATA section in HTML ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/cdata-in-html-content/index.html))
* [`characterReferenceOutsideUnicodeRange`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-character-reference-outside-unicode-range) — unexpected too big numeric character reference ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/character-reference-outside-unicode-range/index.html))
* `closingOfElementWithOpenChildElements` — unexpected closing tag with open child elements ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/closing-of-element-with-open-child-elements/index.html))
* [`controlCharacterInInputStream`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-control-character-in-input-stream) — unexpected control character ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/control-character-in-input-stream/index.html))
* [`controlCharacterReference`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-control-character-reference) — unexpected control character reference ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/control-character-reference/index.html))
* `disallowedContentInNoscriptInHead` — disallowed content inside `<noscript>` in `<head>` ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/disallowed-content-in-noscript-in-head/index.html))
* [`duplicateAttribute`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-duplicate-attribute) — unexpected duplicate attribute ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/duplicate-attribute/index.html))
* [`endTagWithAttributes`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-end-tag-with-attributes) — unexpected attribute on closing tag ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/end-tag-with-attributes/index.html))
* [`endTagWithTrailingSolidus`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-end-tag-with-trailing-solidus) — unexpected slash at end of closing tag ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/end-tag-with-trailing-solidus/index.html))
* `endTagWithoutMatchingOpenElement` — unexpected unopened end tag ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/end-tag-without-matching-open-element/index.html))
* [`eofBeforeTagName`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-before-tag-name) — unexpected end of file ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/eof-before-tag-name/index.html))
* [`eofInCdata`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-in-cdata) — unexpected end of file in CDATA ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/eof-in-cdata/index.html))
* [`eofInComment`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-in-comment) — unexpected end of file in comment ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/eof-in-comment/index.html))
* [`eofInDoctype`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-in-doctype) — unexpected end of file in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/eof-in-doctype/index.html))
* `eofInElementThatCanContainOnlyText` — unexpected end of file in element that can only contain text ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/eof-in-element-that-can-contain-only-text/index.html))
* [`eofInScriptHtmlCommentLikeText`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-in-script-html-comment-like-text) — unexpected end of file in comment inside script ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/eof-in-script-html-comment-like-text/index.html))
* [`eofInTag`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-eof-in-tag) — unexpected end of file in tag ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/eof-in-tag/index.html))
* [`incorrectlyClosedComment`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment) — incorrectly closed comment ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/incorrectly-closed-comment/index.html))
* [`incorrectlyOpenedComment`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-opened-comment) — incorrectly opened comment ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/incorrectly-opened-comment/index.html))
* [`invalidCharacterSequenceAfterDoctypeName`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-invalid-character-sequence-after-doctype-name) — invalid sequence after doctype name ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/invalid-character-sequence-after-doctype-name/index.html))
* [`invalidFirstCharacterOfTagName`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-invalid-first-character-of-tag-name) — invalid first character in tag name ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/invalid-first-character-of-tag-name/index.html))
* `misplacedDoctype` — misplaced doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/misplaced-doctype/index.html))
* `misplacedStartTagForHeadElement` — misplaced `<head>` start tag ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/misplaced-start-tag-for-head-element/index.html))
* [`missingAttributeValue`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-attribute-value) — missing attribute value ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-attribute-value/index.html))
* `missingDoctype` — missing doctype before other content ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-doctype/index.html))
* [`missingDoctypeName`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-doctype-name) — missing doctype name ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-doctype-name/index.html))
* [`missingDoctypePublicIdentifier`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-doctype-public-identifier) — missing public identifier in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-doctype-public-identifier/index.html))
* [`missingDoctypeSystemIdentifier`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-doctype-system-identifier) — missing system identifier in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-doctype-system-identifier/index.html))
* [`missingEndTagName`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-end-tag-name) — missing name in end tag ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-end-tag-name/index.html))
* [`missingQuoteBeforeDoctypePublicIdentifier`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-quote-before-doctype-public-identifier) — missing quote before public identifier in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-quote-before-doctype-public-identifier/index.html))
* [`missingQuoteBeforeDoctypeSystemIdentifier`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-quote-before-doctype-system-identifier) — missing quote before system identifier in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-quote-before-doctype-system-identifier/index.html))
* [`missingSemicolonAfterCharacterReference`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-semicolon-after-character-reference) — missing semicolon after character reference ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-semicolon-after-character-reference/index.html))
* [`missingWhitespaceAfterDoctypePublicKeyword`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-whitespace-after-doctype-public-keyword) — missing whitespace after public identifier in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-whitespace-after-doctype-public-keyword/index.html))
* [`missingWhitespaceAfterDoctypeSystemKeyword`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-whitespace-after-doctype-system-keyword) — missing whitespace after system identifier in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-whitespace-after-doctype-system-keyword/index.html))
* [`missingWhitespaceBeforeDoctypeName`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-whitespace-before-doctype-name) — missing whitespace before doctype name ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-whitespace-before-doctype-name/index.html))
* [`missingWhitespaceBetweenAttributes`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-whitespace-between-attributes) — missing whitespace between attributes ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-whitespace-between-attributes/index.html))
* [`missingWhitespaceBetweenDoctypePublicAndSystemIdentifiers`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-missing-whitespace-between-doctype-public-and-system-identifiers) — missing whitespace between public and system identifiers in doctype ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/missing-whitespace-between-doctype-public-and-system-identifiers/index.html))
* [`nestedComment`](https://html.spec.whatwg.org/multipage/parsing.html#parse-error-nested-comment) — unexpected nested comment ([example](https://github.com/rehypejs/rehype/blob/main/test/parse-error/nested-comment/index.html))
* `nestedNoscriptInHead` — unexpected nested `