If a blog entry contains an HTML named entity, such as the `&mdash;` produced by [[plugins/rst]] for blockquote citations, it's pasted into the Atom feed as-is. However, Atom feeds don't have a DTD, so named entities beyond `&lt;`, `&gt;`, `&quot;`, `&amp;` and `&apos;` are undefined there, and a feed that contains them isn't well-formed XML.

Possible solutions:

* Put HTML in Atom feeds as type="html" (and use ESCAPE=HTML) instead

* Keep HTML in Atom feeds as type="xhtml", but replace named entities with numeric ones,
  like in the re-escape-entities branch in my repository: http://git.debian.org/?p=users/smcv/ikiwiki.git;a=commitdiff;h=c0eb041c65d0653bacf0d4acb7a602e9bda8888e
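
To make the two options concrete, here is a minimal Python sketch. It is only an illustration, not the code from the branch above (ikiwiki itself is written in Perl), and the function names `numeric_entities` and `escaped_for_type_html` are made up for this example. It assumes the entity names of interest are the HTML 4 ones listed in Python's `html.entities.name2codepoint`.

```python
import html
import re
from html.entities import name2codepoint

# The five entities XML predefines; these are safe without a DTD.
XML_PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}

# Matches a named entity reference such as &mdash; (not numeric ones).
_ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def numeric_entities(markup):
    """Option 2 (hypothetical helper): keep type="xhtml", but rewrite
    named entities as numeric character references."""
    def replace(match):
        name = match.group(1)
        # Leave XML's own entities and unknown names untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]
    return _ENTITY_RE.sub(replace, markup)


def escaped_for_type_html(markup):
    """Option 1 (hypothetical helper): escape the whole fragment, which
    is roughly what ESCAPE=HTML would do for a type="html" element."""
    return html.escape(markup)


print(numeric_entities("a &mdash; b &amp; c"))
print(escaped_for_type_html("<p>a &mdash; b</p>"))
```

With the first option the feed reader has to unescape and parse the HTML itself; with the second the content stays inline XHTML and only the entity references change.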

(Also, the HTML in RSS feeds would probably interoperate better if it were escaped with ESCAPE=HTML rather than wrapped in a CDATA section?)