author joey <joey@0fa5a96a-9a0e-0410-b3b2-a0fd24251071> 2006-04-04 19:34:50 +0000
committer joey <joey@0fa5a96a-9a0e-0410-b3b2-a0fd24251071> 2006-04-04 19:34:50 +0000
commit f50bd57bcebe08d26653299b189fe82beaea4a0f
tree abf11303982b8a1780667696347e554c55c6d907 /doc/todo/utf8.mdwn
parent a0321594fb72ab1d215204f1838d2593e0b24f95
proper binmode settings so that, with -CSD, ikiwiki will support unicode
however, due to robustness concerns, that's not enabled by default yet
Diffstat (limited to 'doc/todo/utf8.mdwn')
-rw-r--r-- doc/todo/utf8.mdwn 27
1 files changed, 27 insertions, 0 deletions
diff --git a/doc/todo/utf8.mdwn b/doc/todo/utf8.mdwn
new file mode 100644
index 000000000..536ec75b2
--- /dev/null
+++ b/doc/todo/utf8.mdwn
@@ -0,0 +1,27 @@
+ikiwiki should support utf-8 pages, both input and output
+
+Currently ikiwiki is believed to be utf-8 clean itself; it tells perl to use
+binmode when reading possibly binary files (such as images) and it uses
+utf-8 compatible regexps etc.
+
+utf-8 IO is not enabled by default though. While you can probably embed
+utf-8 in pages anyway, ikiwiki will not treat it right in the cases where
+it deals with things on a per-character basis (mostly when escaping and
+de-escaping special characters in filenames).
+
+To enable utf-8, edit ikiwiki and add -CSD to the perl hashbang line.
+(This should probably be configurable via a --utf8 or better --encoding=
+switch.)
+
+The following problems have been observed when running ikiwiki this way:
+
+* If invalid utf-8 creeps into a file, ikiwiki will crash rendering it as
+ follows:
+
+ Malformed UTF-8 character (unexpected continuation byte 0x97, with no preceding start byte) in substitution iterator at /usr/bin/markdown line 1317.
+ Malformed UTF-8 character (fatal) at /usr/bin/markdown line 1317.
+
+ In this example, a literal 0x97 character had gotten into a markdown
+ file.
+
+ Here, let's put one in this file: "—"
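
The failure described in the todo item can be reproduced outside ikiwiki. The following is a minimal sketch in Python (not Perl, and not part of ikiwiki) showing why a lone 0x97 byte is rejected by a strict utf-8 decoder: it has the bit pattern of a continuation byte but no preceding start byte. The helper name `find_invalid_utf8` is hypothetical, and the guess that the stray byte originated as a Windows-1252 dash is an assumption, not something the commit states.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, byte) for the first byte that breaks strict
    utf-8 decoding, or None if the data is valid utf-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start, data[err.start]
    return None


# The last line of the todo file, with a literal 0x97 byte between the quotes.
sample = b'Here, let\'s put one in this file: "\x97"'
bad = find_invalid_utf8(sample)

# 0x97 is 0b10010111: the leading bits "10" mark a continuation byte.
is_continuation = bad is not None and (bad[1] & 0b11000000) == 0b10000000

# Assumption: the byte came from a Windows-1252 document, where 0x97 is U+2014.
as_cp1252 = bytes([0x97]).decode("cp1252")

# Valid utf-8 passes the check untouched.
clean = find_invalid_utf8("plain ascii and \u2014 encoded properly".encode("utf-8"))
```

A check of this kind, run before pages are rendered, would let a wiki report the offending file and offset instead of dying inside the markdown processor; whether ikiwiki should do so is left open by the todo item.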