@sirreal sirreal commented Sep 13, 2024

In the rules for parsing tokens in foreign content, there are two places that indicate tokens should be processed "according to the rules given in the section corresponding to the current insertion mode in HTML content."

First (bold mine):

A start tag whose tag name is one of: "b", "big", "blockquote", "body", "br", "center", "code", "dd", "div", "dl", "dt", "em", "embed", "h1", "h2", "h3", "h4", "h5", "h6", "head", "hr", "i", "img", "li", "listing", "menu", "meta", "nobr", "ol", "p", "pre", "ruby", "s", "small", "span", "strong", "strike", "sub", "sup", "table", "tt", "u", "ul", "var"
A start tag whose tag name is "font", if the token has any attributes named "color", "face", or "size"
An end tag whose tag name is "br", "p"
Parse error.

While the current node is not a MathML text integration point, an HTML integration point, or an element in the HTML namespace, pop elements from the stack of open elements.

Reprocess the token according to the rules given in the section corresponding to the current insertion mode in HTML content.

And later (bold mine):

Any other end tag
Run these steps:

  1. Initialize node to be the current node (the bottommost node of the stack).
  2. If node's tag name, converted to ASCII lowercase, is not the same as the tag name of the token, then this is a parse error.
  3. Loop: If node is the topmost element in the stack of open elements, then return. (fragment case)
  4. If node's tag name, converted to ASCII lowercase, is the same as the tag name of the token, pop elements from the stack of open elements until node has been popped from the stack, and then return.
  5. Set node to the previous entry in the stack of open elements.
  6. If node is not an element in the HTML namespace, return to the step labeled loop.
  7. Otherwise, process the token according to the rules given in the section corresponding to the current insertion mode in HTML content.
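
The steps above can be sketched as follows. This is a minimal illustration in Python, not the WordPress HTML API code: `Node`, `any_other_end_tag`, and the `process_in_html_mode` callback are hypothetical names, and namespaces are modeled as plain strings.

```python
from dataclasses import dataclass

HTML_NS = "html"  # illustrative namespace labels, not real namespace URIs
SVG_NS = "svg"


@dataclass
class Node:
    tag_name: str
    namespace: str


def any_other_end_tag(stack, token_name, process_in_html_mode):
    """Sketch of the "any other end tag" steps in foreign content.

    `stack` is the stack of open elements: index 0 is the topmost element
    and the last item is the current node. Returns the parse errors raised.
    """
    errors = []
    index = len(stack) - 1
    node = stack[index]  # step 1: start at the current node
    if node.tag_name.lower() != token_name:  # step 2
        errors.append("end tag does not match current node")
    while True:
        if index == 0:  # step 3: reached the topmost element (fragment case)
            return errors
        if node.tag_name.lower() == token_name:  # step 4
            del stack[index:]  # pop up to and including node
            return errors
        index -= 1  # step 5: previous entry in the stack
        node = stack[index]
        if node.namespace == HTML_NS:  # steps 6 and 7
            process_in_html_mode(token_name)
            return errors
```

Step 7 is the second place where the token is handed to the current HTML insertion mode, which is why the callback is passed in rather than routed back through the foreign content rules.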

While working on fragment parsing and html5lib-tests in dmsnell#22, I discovered an infinite loop that seems to occur in the following situation: at an svg:svg context node, create a fragment parser with the HTML </p>. This hits the first rule quoted above.

In a full parser, the instruction "While the current node is not a MathML text integration point, an HTML integration point, or an element in the HTML namespace, pop elements from the stack of open elements." ensures that reprocessing the token does not re-enter the foreign content rules: once the elements have been popped, the token is no longer in foreign content. In a fragment parser, however, there may be no nodes to pop, and the context element may keep selecting foreign content handling, which triggers an infinite loop.

The instruction in the specification seems to indicate that the token should be handled in the current HTML insertion mode. The fix sends the token to the same place in both cases: a reproduction of the HTML handling that switches on the current insertion mode.
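
The loop and the fix can be illustrated with a small model. This is a sketch under stated assumptions, not the WordPress implementation: `FragmentParserSketch` and all of its methods are hypothetical, the dispatcher is reduced to a namespace check on the adjusted current node, integration points are ignored, and a step limit stands in for the infinite loop so the buggy path terminates.

```python
from dataclasses import dataclass

HTML_NS = "html"  # illustrative namespace labels, not real namespace URIs
SVG_NS = "svg"


@dataclass
class Node:
    tag_name: str
    namespace: str


class FragmentParserSketch:
    MAX_STEPS = 50  # guard: stands in for "loops forever"

    def __init__(self, context, reprocess_through_dispatcher):
        self.context = context
        # Assumption: a fragment parser starts with only a root html element.
        self.stack = [Node("html", HTML_NS)]
        self.reprocess_through_dispatcher = reprocess_through_dispatcher
        self.steps = 0
        self.handled_in_html = []

    def adjusted_current_node(self):
        # Fragment case: with a single element on the stack, the context
        # element decides whether foreign content rules apply.
        if len(self.stack) == 1:
            return self.context
        return self.stack[-1]

    def dispatch(self, token):
        self.steps += 1
        if self.steps > self.MAX_STEPS:
            raise RuntimeError("token reprocessed without making progress")
        if self.adjusted_current_node().namespace != HTML_NS:
            self.step_in_foreign_content(token)
        else:
            self.step_in_html_insertion_mode(token)

    def step_in_foreign_content(self, token):
        # Models only the rule for tokens such as </p>: pop foreign
        # elements, then hand the token to HTML handling.
        while self.stack[-1].namespace != HTML_NS:
            self.stack.pop()  # nothing to pop when the stack is just html
        if self.reprocess_through_dispatcher:
            self.dispatch(token)  # buggy: the svg context re-selects this rule
        else:
            self.step_in_html_insertion_mode(token)  # fixed: go straight to HTML

    def step_in_html_insertion_mode(self, token):
        self.handled_in_html.append(token)
```

With an `svg` context node, `FragmentParserSketch(Node("svg", SVG_NS), True).dispatch("</p>")` keeps bouncing between `dispatch` and `step_in_foreign_content` until the guard raises, while passing `False` handles the token in the HTML insertion mode after a single dispatch.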

This fixes the infinite loop that appeared in dmsnell#22.

Observed in dmsnell#22.

Follow-up to [58868]
See: #61576


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

This condition:

> Reprocess the token according to the rules given in the section
> corresponding to the current insertion mode in HTML content.

was resulting in an infinite loop in fragment cases. In full documents,
after popping nodes the context is always moved so that foreign content
parsing is not used. This is not guaranteed in a fragment and could
cause an infinite loop.

github-actions bot commented Sep 13, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Core Committers: Use this line as a base for the props when committing in SVN:

Props jonsurrell, dmsnell.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

@github-actions

Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

dmsnell commented Sep 16, 2024

Merged in [59024]
1637791

@dmsnell dmsnell closed this Sep 16, 2024
@sirreal sirreal deleted the html-api/process-in-html-insertion branch September 16, 2024 18:23