author | JoeRayhawk <JoeRayhawk@web> | 2010-10-05 01:09:42 +0000 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2010-10-05 01:09:42 +0000 |
commit | 380a80adfa8df712876095d684e8a564b8a86aea (patch) | |
tree | 2a8ede9900526ceebe7b56504f8f36823325e3e0 | |
parent | 082649f8698c5ca71aad504384f6aa4724420a8e (diff) | |
Bug: UTF-16 and UTF-32 are unhandled: New
-rw-r--r-- | doc/bugs/UTF-16_and_UTF-32_are_unhandled.mdwn | 20 |
1 files changed, 20 insertions, 0 deletions
diff --git a/doc/bugs/UTF-16_and_UTF-32_are_unhandled.mdwn b/doc/bugs/UTF-16_and_UTF-32_are_unhandled.mdwn
new file mode 100644
index 000000000..21df334a8
--- /dev/null
+++ b/doc/bugs/UTF-16_and_UTF-32_are_unhandled.mdwn
@@ -0,0 +1,20 @@
+Wide characters should probably be supported, or, at the very least, warned about.
+
+Test case:
+
+    mkdir -p ikiwiki-utf-test/raw ikiwiki-utf-test/rendered
+    for page in txt mdwn; do
+        echo hello > ikiwiki-utf-test/raw/$page.$page
+        for text in 8 16 16BE 16LE 32 32BE 32LE; do
+            iconv -t UTF$text ikiwiki-utf-test/raw/$page.$page > ikiwiki-utf-test/raw/$page-utf$text.$page;
+        done
+    done
+    ikiwiki --verbose --plugin txt --plugin mdwn ikiwiki-utf-test/raw/ ikiwiki-utf-test/rendered/
+    www-browser ikiwiki-utf-test/rendered/ || x-www-browser ikiwiki-utf-test/rendered/
+    # rm -r ikiwiki-utf-test/ # some browsers rather stupidly daemonize themselves, so this operation can't easily be safely automated
+
+BOMless LE and BE input is probably a lost cause.
+
+Optimally, UTF-16 (which is ubiquitous in the Windows world) and UTF-32 should be fully supported, probably by converting to mostly-UTF-8 and using `&#xXXXX;` or `&#DDDDD;` XML escapes where necessary.
+
+Suboptimally, UTF-16 and UTF-32 should be converted to UTF-8 where cleanly possible and a warning printed where impossible.
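
Note: the "suboptimal" behaviour described in the report (convert to UTF-8 where cleanly possible, warn otherwise) could be sketched roughly as below. This is not ikiwiki code, which is written in Perl; it is a minimal Python illustration, and the function name `to_utf8` and the warning strings are hypothetical. It assumes the encoding is chosen from a leading BOM only and that BOMless input is treated as UTF-8.

```python
import codecs
from typing import Optional, Tuple

# Hypothetical lookup table. UTF-32 BOMs are checked before UTF-16 because
# the UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE).
# A UTF-16LE file whose first character is U+0000 would be misdetected.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def to_utf8(raw: bytes) -> Tuple[bytes, Optional[str]]:
    """Return (data, warning); warning is None when no problem was detected."""
    for bom, encoding in _BOMS:
        if raw.startswith(bom):
            try:
                # Strip the BOM, then decode with the endianness it announced.
                text = raw[len(bom):].decode(encoding)
            except UnicodeDecodeError as err:
                # Not cleanly convertible: hand the bytes back with a warning.
                return raw, "invalid %s input: %s" % (encoding, err)
            return text.encode("utf-8"), None
    # No BOM: assume UTF-8 and only verify that it decodes.
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw, "no BOM and not valid UTF-8; encoding unknown"
    return raw, None
```

One consequence worth noting: BOMless UTF-16 text in the ASCII range (for example the `16LE` and `16BE` files from the test case) is also valid UTF-8 with embedded NUL bytes, so this sketch passes it through without a warning. That is consistent with the report's remark that BOMless input is probably a lost cause.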