Skip to content

Comments

Implement <select> parser “relaxation” — for “customizable” <select>#113

Open
sideshowbarker wants to merge 6 commits intomasterfrom
sideshowbarker/relaxed-select-parsing
Open

Implement <select> parser “relaxation” — for “customizable” <select>#113
sideshowbarker wants to merge 6 commits intomasterfrom
sideshowbarker/relaxed-select-parsing

Conversation

@sideshowbarker
Copy link
Member

@sideshowbarker sideshowbarker commented Dec 30, 2025

This change, in order to support the "customizable" <select> element:

  • "relaxes" parsing rules in <select> (to allow more elements as children)
  • adds the <selectedcontent> element for cloning selected <option> content
  • aligns all select-related tree construction with the WHATWG spec (whatwg/html PR #10557)

ElementName.java

  • Add <selectedcontent> element with TreeBuilder.SELECTEDCONTENT group

TreeBuilder.java

Structural changes:

  • Remove dead IN_SELECT / IN_SELECT_IN_TABLE constants and all associated mode handlers and end-tag special cases — with select "relaxation", those modes are never entered
  • Fix resetTheInsertionMode to skip past <select> elements on the stack instead of incorrectly returning IN_BODY — an enclosing <td>/<th> now correctly yields IN_CELL, an enclosing <table> yields IN_TABLE, etc.

Start-tag alignment with spec:

  • <hr>: use generateImpliedEndTags() instead of manual isCurrent/pop
  • <option>: remove redundant pop loop (spec says parse error only after generateImpliedEndTagsExceptFor("optgroup"))
  • <optgroup>: remove redundant pop loop, add missing reconstructTheActiveFormattingElements(), fix parse error to check both <option> and <optgroup> in scope
  • <select>: check fragment case first, remove generateImpliedEndTags from nested case, add framesetOk = false
  • <input>: pop select and reprocess when select is in scope
  • Table elements (<caption>, <tbody>, <tr>, <td>, <th>): treat as stray start tags, removing the old IN_SELECT_IN_TABLE dual-scope check

End-tag changes:

  • Fold </select> into the standard block-element end-tag group (with </div>, </fieldset>, </button>, etc.)
  • Remove dedicated </option> and </optgroup> end-tag cases — "any other end tag" handles them

Cloning hook:

  • Add optionElementPopped() hook called from all pop() variants
  • Default is no-op; DOM-side subclasses override to implement "maybe clone an option into selectedcontent"

Spec comments:

  • Add spec URLs and verbatim spec-text comments throughout all select-related code sections

DOMTreeBuilder.java

  • Implement optionElementPopped(): walk ancestors to find <select>, find <selectedcontent> descendant, check selectedness, deep-clone option children via cloneNode(true)

SAXTreeBuilder.java

  • Implement optionElementPopped() as no-op (streaming/SAX mode has no live DOM to clone into)

XOMTreeBuilder.java

  • Implement optionElementPopped() using XOM's Node.copy() for deep cloning

BrowserTreeBuilder.java

  • Implement optionElementPopped() using cloneNodeDeep() for deep cloning

CharBufferNode.java

  • Add getBuffer() accessor for direct char[] access during cloning

ParentNode.java

  • Add clearChildren() method, to support clearing <selectedcontent>

@hsivonen

This comment was marked as outdated.

@sideshowbarker

This comment was marked as outdated.

Copy link
Member

@hsivonen hsivonen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry about the delay.

As discussed on Matrix, I think the way this patch makes the cloning work on what the parser saw is incorrect and the spec instead requires the cloning to work on what's in the live DOM, which may be different if the option element opens, the network stalls, setTimeout modifies the DOM, and network resumes.

Further remarks in an inline comment.

@sideshowbarker sideshowbarker force-pushed the sideshowbarker/relaxed-select-parsing branch 4 times, most recently from 020921d to f7b4c0f Compare February 2, 2026 05:31
This change, in order to support the “customizable” <select> element:

  - “relaxes” parsing rules in <select> (to allow more elements as children)
  - adds the <selectedcontent> element for cloning selected <option> content

ElementName.java
----------------
  - Add <selectedcontent> element with TreeBuilder.SELECTEDCONTENT group

TreeBuilder.java
----------------
  - Add SELECTEDCONTENT group constant and IN_SELECT_IN_CONTENT mode
  - Track selectedContentPointer and activeOptionStackPos for cloning
  - Allow <div>, <span>, and other elements in <select> in “relaxed” mode
  - Deep clone <option> content to <selectedcontent> when <option> closes
  - Handle <selectedcontent> special-casing (store pointer, no stack push)

SAXTreeBuilder.java
-------------------
  - Implement clearSelectedContentChildren() for clearing before clone
  - Implement deepCloneChildren() and deepCloneNode() for <option> cloning

ParentNode.java
---------------
  - Add clearChildren() method, to support clearing <selectedcontent>

html5lib-tests submodule
------------------------
  - Updated to pull in corresponding tests for “relaxed” <select> parsing
The initial implementation did inline cloning during parsing (mirroring
elements/characters into <selectedcontent> as tokens were processed) AND
deep-cloned from the DOM when option was popped.

The inline cloning was wrong per spec — cloning should operate on the
DOM, not on what the parser saw (the adoption agency algorithm can
restructure the tree) — and redundant, since pop() cleared
<selectedcontent> and deep-cloned from the DOM anyway.

This replaces all inline cloning machinery with a single overridable
hook method, cloneOptionContentToSelectedContent(), called when <option>
is popped. The tree builder tracks which <option> is active and where
<selectedcontent> is, but delegates the actual DOM cloning to subclasses.
Add necessary corresponding implementations to the other tree builders
that construct persistent trees:

  - DOMTreeBuilder: uses the DOM cloneNode method for deep cloning
  - XOMTreeBuilder: uses XOM’s Node.copy() for deep cloning
  - BrowserTreeBuilder: uses the existing cloneNodeDeep method
Add a getBuffer() accessor to CharBufferNode so that deepCloneNode()
can copy directly from char[] to char[] via the Characters constructor,
instead of going through toString() and toCharArray().
@sideshowbarker sideshowbarker force-pushed the sideshowbarker/relaxed-select-parsing branch from f7b4c0f to 87fe37a Compare February 2, 2026 05:35
…comments

Move selectedcontent cloning from TreeBuilder to DOM-side subclasses:
the optionElementPopped() hook now walks the live DOM to find the
ancestor <select> and its <selectedcontent>, check selectedness, and
clone option children — rather than relying on parser-tracked state.
Implemented in all four subclasses (DOMTreeBuilder, SAXTreeBuilder,
XOMTreeBuilder, BrowserTreeBuilder).

Remove all dead IN_SELECT / IN_SELECT_IN_TABLE mode code (handlers,
constants, end-tag special cases) — with <select> “relaxation”, those
modes are never entered.

Fix resetTheInsertionMode to skip past <select> elements on the stack
instead of incorrectly returning IN_BODY — an enclosing td/th now
correctly yields IN_CELL, an enclosing table yields IN_TABLE, etc.

Align start-tag handling with the spec:

- HR: use generateImpliedEndTags() instead of manual isCurrent/pop
- OPTION/OPTGROUP: remove redundant pop loops (spec says parse error
  only), add missing reconstructTheActiveFormattingElements for
  optgroup, fix optgroup to check both option and optgroup in scope
- SELECT: check fragment case first, remove generateImpliedEndTags
  from nested case, add framesetOk = false
- Table elements (caption, tbody, tr, td, th): treat as stray start
  tags, removing the old IN_SELECT_IN_TABLE dual-scope check

Fold </select> end tag into the standard block-element group (with
div, fieldset, button, etc.) — the dedicated resetTheInsertionMode
call is no longer needed.

Add spec URLs and verbatim spec-text comments throughout all
<select>-related code sections — for traceability and review-ability.
@sideshowbarker sideshowbarker force-pushed the sideshowbarker/relaxed-select-parsing branch from 0466829 to b3bbf98 Compare February 20, 2026 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants