Allow specifying additional "special" schemes. #749

tmccombs · 2023-02-05T05:56:05Z

The parsing algorithm behaves differently for certain schemes that are considered "special". In addition, the scheme of a non-special URL cannot be changed to a special scheme. In some applications, especially non-web-browser applications, it is desirable for additional schemes to be treated the same way as the listed special schemes, and to be able to change the protocol/scheme to and from other special schemes.
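To make the difference concrete, here is a minimal sketch using the standard `URL` class as exposed by browsers and Node.js. The host `example.com` and the variable names are placeholders chosen for illustration; `git` stands in for any scheme that is not on the special list.

```typescript
// Special scheme: an empty path is normalized to "/" and the
// scheme's default port (80 for http) is dropped on parsing.
const specialUrl = new URL("http://example.com:80");
const specialPath = specialUrl.pathname; // "/"
const specialPort = specialUrl.port;     // ""

// Non-special scheme: the path stays empty and the port is kept,
// because the parser knows no default port for "git".
const nonSpecialUrl = new URL("git://example.com:80");
const nonSpecialPath = nonSpecialUrl.pathname; // ""
const nonSpecialPort = nonSpecialUrl.port;     // "80"

// The protocol setter silently ignores a change from a
// non-special scheme to a special one.
nonSpecialUrl.protocol = "http";
const schemeAfterIgnoredSet = nonSpecialUrl.protocol; // still "git:"

// Switching between two special schemes is allowed.
const webUrl = new URL("http://example.com/");
webUrl.protocol = "https";
const schemeAfterAllowedSet = webUrl.protocol; // "https:"
```

The ignored assignment does not throw, so code that tries to convert a custom scheme to `http` this way can fail without any visible error.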

I think there are a few ways this could be addressed:

1. Change the API to allow passing a list of additional special schemes into the constructor for URL.
2. Change the API to allow specifying that a URL should be treated as a special URL during construction.
3. Add a new URLFactory (or URLBuilder) class that allows configuring the set of special schemes for any URLs created with it.
4. Do not specify any additional required API, but say that an implementation is allowed to treat additional schemes as special, and potentially include an API for registering additional special schemes.

Some examples of schemes that applications may wish to treat as special:

- git
- sftp
- gopher
- http+unix and https+unix or similar (in fact, maybe it would be worth specifying that an existing special scheme followed by a "+" and a suffix is also a special scheme?)
- a custom scheme intended for opening an http resource in a specific application
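The "+" suffix idea could be sketched roughly as the predicate below. This is only an illustration of the suggestion, not anything the standard defines: the function name `isSpecialOrDerived` is hypothetical, and the set contains the schemes the URL Standard currently lists as special.

```typescript
// Schemes the URL Standard currently treats as special.
const SPECIAL_SCHEMES = new Set(["ftp", "file", "http", "https", "ws", "wss"]);

// Hypothetical rule: a scheme counts as special if it is on the list,
// or if it is a listed scheme followed by "+" and a non-empty suffix.
function isSpecialOrDerived(scheme: string): boolean {
  // Accept both "http" and "http:" and compare case-insensitively.
  const normalized = scheme.toLowerCase().replace(/:$/, "");
  if (SPECIAL_SCHEMES.has(normalized)) return true;

  const plusIndex = normalized.indexOf("+");
  const hasBaseAndSuffix = plusIndex > 0 && plusIndex < normalized.length - 1;
  return hasBaseAndSuffix && SPECIAL_SCHEMES.has(normalized.slice(0, plusIndex));
}
```

Under this rule `http+unix` and `https+unix` would qualify, while `git`, `sftp`, and `gopher` would still need to be registered some other way.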

I follow the rust-url repository, which aims at implementing this specification, and issues related to this come up pretty frequently. For example:

Related issues for this repository:


annevk · 2023-02-06T10:17:11Z

I think the answer here is 5. It's worth clarifying in the standard that this is a non-goal, as it indeed occasionally comes up.

Instead what you'd do is define a processor that takes a URL and turns it into a data structure suitable for further usage. E.g., what we do in https://fetch.spec.whatwg.org/#data-urls for data: URLs. Such a scheme-specific processor can take care of adding a path, further processing an opaque host, etc.
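As a rough sketch of what such a scheme-specific processor might look like for `git:` URLs: the `GitLocation` type and `processGitUrl` function are hypothetical names, and the fallback to port 9418 (the conventional git daemon port) is an assumption made for this example rather than something the URL Standard specifies.

```typescript
// Hypothetical result type produced by the scheme-specific processor.
interface GitLocation {
  host: string;
  port: number;
  path: string;
}

// Assumed default for the example: the conventional git daemon port.
const ASSUMED_GIT_DEFAULT_PORT = 9418;

function processGitUrl(input: string): GitLocation {
  // Step 1: generic, scheme-agnostic parsing by the standard parser.
  const parsed = new URL(input);
  if (parsed.protocol !== "git:") {
    throw new TypeError(`expected a git: URL, got ${parsed.protocol}`);
  }
  if (parsed.hostname === "") {
    throw new TypeError("git: URL must have a host");
  }

  // Step 2: scheme-specific post-processing.
  return {
    // Non-special hosts are opaque, so lowercase here for comparison purposes.
    host: parsed.hostname.toLowerCase(),
    // No default port is known to the parser, so apply one here.
    port: parsed.port === "" ? ASSUMED_GIT_DEFAULT_PORT : Number(parsed.port),
    // An empty path is left empty by the parser; add "/" ourselves.
    path: parsed.pathname === "" ? "/" : parsed.pathname,
  };
}
```

The generic parser stays unchanged, and only applications that claim to support `git:` need to agree on the second step.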

The reason for that is that URL parsing ought to be stable over time and across implementations. Implementations should not have differing views as to what a URL string represents, how it serializes once parsed, etc. And if URLs are further processed, ideally that aligns across implementations as well, but that will only happen in implementations purporting to support the scheme, which will be a subset.

tmccombs · 2023-02-06T18:15:52Z

My point is that a subset of custom schemes are basically identical to http/https, but use a different scheme to convey some additional information. Such a separate processor would have to duplicate a lot of what the URL parser already implements.

annevk · 2023-02-06T18:17:45Z

Yeah, understood.

annevk added the "clarification" label (Standard could be clearer) on Feb 6, 2023

agowa mentioned this issue Jul 10, 2023

Proposal for new version of parsing spec #778

Closed
