HTML entities are not being preserved #37

Open
cimmanon opened this issue Oct 31, 2013 · 5 comments


@cimmanon

All HTML entities found in my templates are being converted to the actual character (e.g. &amp;amp; becomes &amp;, &amp;copy; becomes ©). For most characters, this doesn't matter a whole lot, but a raw ampersand causes HTML validation to fail.

Template source:

<p class="copyright">&copy;2013 Company Name here</p>

Rendered document's source:

<p class='copyright'>©2013 Company Name here</p>
@cdsmith
Member

cdsmith commented Nov 1, 2013

For the copyright symbol, this is expected behavior. The entity is
unnecessary as long as you're working in UTF-8 or another good enough
character set.

This is probably incorrect if it's happening for &amp;amp;. Are you sure it is?
Or just speculating?
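
To make the distinction concrete, here is a small Python sketch using only the standard library's `html` module. It is an illustration of the general decode-then-serialize round trip, not of how Heist or xmlhtml work internally, and the sample string is made up. Decoding turns both entities into characters. On the way back out, a serializer would need to re-escape the ampersand to restore the original markup, while the copyright sign can stay as a literal character in a UTF-8 document.

```python
from html import escape, unescape

# Hypothetical template text containing both kinds of entity.
source = "&copy;2013 Smith &amp; Sons"

# Parsing decodes every entity into the character it stands for.
decoded = unescape(source)

# Serializing with a conventional escaper restores &amp; but leaves
# the copyright sign as a literal character.
reescaped = escape(decoded, quote=False)

print(decoded)    # ©2013 Smith & Sons
print(reescaped)  # ©2013 Smith &amp; Sons
```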

@mightybyte
Member

I just checked and it looks like it is indeed happening for &amp;amp;.

@mightybyte
Member

@cdsmith Any idea where this might be in xmlhtml?

@cdsmith
Member

cdsmith commented Mar 30, 2016

https://github.com/snapframework/xmlhtml/blob/master/src/Text/XmlHtml/HTML/Render.hs#L60

Currently, it seems that the '&' character is only escaped when it is an "ambiguous ampersand", as defined at https://www.w3.org/TR/html5/syntax.html#text. Non-ambiguous ampersands should not cause HTML validation to fail. I believe this was done deliberately, and I think it has something to do with certain browsers not liking the escaping of & inside a URL, but I don't recall the details.
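
As a rough illustration of the idea (a Python sketch, not xmlhtml's actual Haskell code), the function below escapes `&` only when the characters after it would otherwise read as a character reference: a name or number terminated by `;`. The function name and the regular expression are assumptions made up for this example. The sketch also ignores the legacy named references that HTML parsers accept without a trailing semicolon, so it is not a complete serializer.

```python
import re

# An ampersand followed by something shaped like a character reference:
# decimal (&#169;), hexadecimal (&#xA9;), or named (&copy;, &foo;).
_REFERENCE_LIKE = re.compile(r"&(?=#[0-9]+;|#[xX][0-9a-fA-F]+;|[0-9A-Za-z]+;)")


def escape_risky_ampersands(text: str) -> str:
    """Escape only the ampersands a parser could read as a reference.

    Hypothetical helper: every other ampersand is left as a raw '&'.
    """
    return _REFERENCE_LIKE.sub("&amp;", text)


print(escape_risky_ampersands("Tom & Jerry"))          # left as-is
print(escape_risky_ampersands("page?a=1&b=2"))         # left as-is
print(escape_risky_ampersands("literal &copy; text"))  # ampersand escaped
```

Under this scheme a bare `&` in running text or in a query string is emitted raw, which matches the behavior reported in this issue.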

@mightybyte
Member

Hmmm, I can't decide what to do on this issue. The existing behavior appears to be correct, but on the other hand maybe a few unnecessary entities are tolerable if they are causing validation problems. I think I'm leaning slightly more towards closing this issue and leaving things as-is, but I could certainly be persuaded otherwise. Thoughts?
