Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Script Enforcement Mechanism to use flags #533

Merged
merged 5 commits into from
Oct 28, 2024
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 38 additions & 42 deletions spec/index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -127,8 +127,6 @@ spec:ECMA-262; urlPrefix: https://tc39.github.io/ecma262/
type:dfn; text:current realm record; url: current-realm
spec: HTML; urlPrefix: https://html.spec.whatwg.org/
type: dfn; text: prepare the script element; url: prepare-the-script-element
type: dfn; text: The text insertion mode; url: parsing-main-incdata
type: dfn; text: reentrant invocation of the parser; url: nestedParsing
type: dfn; text: get the text steps; url: get-the-text-steps
type: dfn; text: set the inner text steps; url: set-the-inner-text-steps
type: dfn; text: src; url: attr-script-src
Expand Down Expand Up @@ -1073,20 +1071,6 @@ Given a {{TrustedType}} type (|expectedType|), a [=realm/global object=] (|globa
1. Return a new instance of an interface with a type
name |trustedTypeName|, with its associated data value set to |dataString|.

## <dfn abstract-op>Prepare the script text</dfn> ## {#prepare-script-text}

Given an {{HTMLScriptElement}} (|script|), this algorithm performs the following steps:

1. If |script|'s [=script text=] value is not equal to its [=child text content=],
set |script|'s [=script text=] to the result of executing [$Get Trusted Type compliant string$], with the following arguments:
* {{TrustedScriptURL}} as |expectedType|,
* |script|'s {{Document}}'s [=relevant global object=] as |global|,
* |script|'s [=child text content=] attribute value,
* `HTMLScriptElement text` as |sink|,
* `'script'` as |sinkGroup|.

If the algorithm threw an error, rethrow the error.

## Get Trusted Types-compliant attribute value ## {#validate-attribute-mutation}
To <dfn abstract-op export>get Trusted Types-compliant attribute value</dfn> on {{Attr}} |attribute| with {{Element}} |element| and {{TrustedType}} or a string |newValue|, perform the following steps:

Expand Down Expand Up @@ -1177,12 +1161,17 @@ partial interface HTMLScriptElement {

#### Slots with trusted values #### {#slots-with-trusted-values}

This document modifies {{HTMLScriptElement}}s. Each script has:
An {{HTMLScriptElement}} and {{SVGScriptElement}} have:

: an associated boolean <dfn export for="HTMLScriptElement,SVGScriptElement">is trusted</dfn>.
:: A boolean indicating whether a script element is considered trustworthy for execution.
Initially true.
lukewarlow marked this conversation as resolved.
Show resolved Hide resolved

Note: This boolean is initially true so that parsed scripts are trusted.

: an associated string <dfn export for="HTMLScriptElement">script text</dfn>.
:: A string, containing the body of the script to execute that was set
through a compliant sink. Equivalent to script's
[=child text content=]. Initially an empty string.
: an associated boolean <dfn export for="HTMLScriptElement,SVGScriptElement">changed by trusted sink</dfn>.
:: A boolean indicating whether a script element has been modified by a trusted sink.
Initially false.

#### The {{HTMLScriptElement/innerText}} IDL attribute #### {#the-innerText-idl-attribute}

Expand All @@ -1191,7 +1180,7 @@ The {{HTMLScriptElement/innerText}} setter steps are:
1. Let |value| be the result of calling [$Get Trusted Type compliant string$] with
{{TrustedScript}}, [=this=]'s [=relevant global object=], the given value, `HTMLScriptElement innerText`, and
`script`.
1. Set [=this=]'s [=script text=] value to |value|.
1. Set [=this=]'s [=HTMLScriptElement/changed by trusted sink=] to true.
1. Run [=set the inner text steps=] with [=this=] and |value|.

The {{HTMLScriptElement/innerText}} getter steps are:
Expand All @@ -1206,7 +1195,7 @@ empty string instead, and then do as described below:
1. Let |value| be the result of calling [$Get Trusted Type compliant string$] with
{{TrustedScript}}, [=this=]'s [=relevant global object=], the given value, `HTMLScriptElement textContent`, and
`script`.
1. Set [=this=]'s [=script text=] value to |value|.
1. Set [=this=]'s [=HTMLScriptElement/changed by trusted sink=] to true.
1. Run [=set text content=] with [=this=] and |value|.

The {{HTMLScriptElement/textContent}} getter steps are:
Expand All @@ -1220,7 +1209,7 @@ Update the {{HTMLScriptElement/text}} setter steps algorithm as follows.
1. <ins>Let |value| be the result of calling [$Get Trusted Type compliant string$] with
{{TrustedScript}}, [=this=]'s [=relevant global object=], the given value, `HTMLScriptElement text`, and
`script`.</ins>
1. <ins>Set [=this=]'s [=script text=] value to the given value.</ins>
1. Set [=this=]'s [=HTMLScriptElement/changed by trusted sink=] to true.
1. [=String replace all=] with the given value within [=this=].


Expand All @@ -1233,29 +1222,25 @@ The {{HTMLScriptElement/src}} setter steps are:
`script`.</ins>
1. <ins>Set [=this=]'s [=src=] content attribute to |value|.</ins>

#### Setting slot values from parser #### {#setting-slot-values-from-parser}
#### Script children changed steps #### {#script-children-changed-steps}

This document modifies the HTML parser to set the [=script text=] value when the script is created.
This document modifies the [=children changed steps=] for {{HTMLScriptElement}} as follows:

Modify the [=The text insertion mode=] algorithm as follows:
1. Set [=this=]'s [=HTMLScriptElement/is trusted=] to false.

<dl class="switch">
<dt id="scriptEndTag">An end tag whose tag name is "script"</dt>
<dd>
<p>...</p>
1. If [=this=]'s [=HTMLScriptElement/changed by trusted sink=] is true, set [=this=]'s [=HTMLScriptElement/is trusted=] to true.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag is used to say hey this is an API change but it's a trusted one. We unset the flag once used.


<ins><p>Set <var>script</var>'s [=script text=] value to its [=child text content=].</p></ins>
1. Set [=this=]'s [=HTMLScriptElement/changed by trusted sink=] to false.

<p>If the <span>active speculative HTML parser</span> is null, then <span>prepare the script
element</span> <var>script</var>. This might cause some script to execute, which might cause
<span data-x="dom-document-write">new characters to be inserted into the tokenizer</span>, and
might cause the tokenizer to output more tokens, resulting in a [=reentrant invocation of the parser=].</p>
Note: This relies on the children changed steps never being called by the parser.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can we assume this? I think these steps are called indirectly by the parser, in the usual insertion path, no? And it seems that browsers expect this too: https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/renderer/core/html/html_script_element.cc;l=82;drc=d81683e439e0b80cac98e017d94449d59f48fa0d.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the slow response I've been OoO I didn't neccesarily expect this PR to be merged until someone with more knowledge of the status quo had reviewed, apologies for not making that clearer.

So from all I could work out, the specification children changed steps are only called from APIs and NOT the parser.
If that's not true then this will need reverting.
The ChildrenChanged method implemented in Chromium and WebKit are called by the parser but that's because they aren't a 100% match for the spec.

Fwiw I did a quick prototype in Chromium and it seemed to work as expected (it was a few months back at this point so can't fully remember). I've also got a draft PR to update WebKit to use this new model too WebKit/WebKit#30649.

Copy link
Member

@domfarolino domfarolino Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the slow response I've been OoO

Hah, not a problem, my delay since Anne pinged be about this was far worse.

I didn't neccesarily expect this PR to be merged until someone with more knowledge of the status quo had reviewed, apologies for not making that clearer.

Do you think we should revert it?

So from all I could work out, the specification children changed steps are only called from APIs and NOT the parser.
If that's not true then this will need reverting.

Hmm, spec-wise, I think the children changed steps are indeed called from the parser. At the very least, the HTML's adoption agency algorithm creates an element and then immediately appends it to the DOM, which goes through the https://dom.spec.whatwg.org/#concept-node-insert path, which itself calls the children changed steps.

I am really not familiar with the HTML parser, so I can't find any other paths from parser -> DOM insertion hook (which calls children changed steps), but I'm pretty sure they exist. For example, when an iframe initializes its document, it does so only as a result of its HTML insertion steps, and the only thing that calls these steps is the DOM insertion algorithm—the same thing that calls the children changed steps. So the DOM insertion algorithm must be invoked by the parser, or else I think parser-inserted iframes would never get initialized.


<p>...</p>
</dd>
</dl>
Issue: Need to double check how [part of script element's spec](https://html.spec.whatwg.org/#prepare-the-script-element:~:text=When%20a%20script%20element%20el%20that%20is%20not%20parser%2Dinserted%20experiences) fits into this. These steps need to happen before prepare the script is called.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to change the html spec when upstreaming to run the prepare the script (under relevant conditions) at the end of the children changed steps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly it would rely on having a bit more information in the children changed steps algorithm if we want to inline it. Because it needs to know what type of change it is (insertion specifically in this case).

I suspect this is why some Chrome and WebKit's childrenChanged functions include more than the dom spec's algorithm. (And is why Firefox implements it in a way that also gives them this more granular informaion).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@domfarolino should probably look at this.

Also would that create issues with re-entrant invocations?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would seem that whatwg/html#10188 already changes that part of the HTML spec to be defined in terms of the children changed steps so I think we'd just need to put our new steps first and then run the post-insertion steps and it'll fix the concerns I had here.


Issue: The above algorithm doesn't account for the case when the script element's content is changed mid-parse. Implementors should ensure they protect against this case. See [https://github.com/w3c/trusted-types/issues/507](https://github.com/w3c/trusted-types/issues/507).
This document modifies the [=children changed steps=] for {{SVGScriptElement}} as follows:

1. Set [=this=]'s [=SVGScriptElement/is trusted=] to false.

Note: This relies on the children changed steps never being called by the parser.

#### Slot value verification #### {#slot-value-verification}

Expand All @@ -1276,11 +1261,22 @@ The first few steps of the [=prepare the script element=] algorithm are modified
<p class=note>This is done so that if a parser-inserted <code id=script-processing-model:the-script-element-28><a href=https://html.spec.whatwg.org/#the-script-element>script</a></code> element fails to
run when the parser tries to run it, but it is later executed after a script dynamically
updates it, it will execute in an async fashion even if the <code id=script-processing-model:attr-script-async-5><a href=https://html.spec.whatwg.org/#attr-script-async>async</a></code> attribute isn't set.</p>
<li><ins><p>Execute the [$Prepare the script text$] algorithm on <var>el</var>. If that algorithm threw an error, then return.</p></ins></li>
<li><p>Let <var ignore="">source text</var> be <var>el</var>'s <del><a id=script-processing-model:child-text-content href=https://dom.spec.whatwg.org/#concept-child-text-content data-x-internal=child-text-content>child text content</a>.</del> <ins>[=script text=] value.</ins>

<li><p>Let <var>source text</var> be <var>el</var>'s <a id=script-processing-model:child-text-content href=https://dom.spec.whatwg.org/#concept-child-text-content data-x-internal=child-text-content>child text content</a>.

<li><ins>
<p>If <var>el</var>'s [=HTMLScriptElement/is trusted=] is false:
<ol>
<li><p>Set <var>source text</var> to the result of executing [$Get Trusted Type compliant string$], with
{{TrustedScript}}, <var>el</var>'s [=relevant global object=], <var>source text</var>, `'HTMLScriptElement text'`,
and `'script'`.
<p>If that algorithm threw an error, then return.
</ol></ins>
<li>...
</ol>

Issue: There's no proper definition for the processing of SVG script elements. However, you should apply a similar change to the processing of {{SVGScriptElement}}s.

## Integration with DOM ## {#integration-with-dom}

Note: See [https://github.com/whatwg/dom/pull/1258](https://github.com/whatwg/dom/pull/1258) and [https://github.com/whatwg/dom/pull/1268](https://github.com/whatwg/dom/pull/1268) which upstream this integration.
Expand Down
Loading