Isn't this functionality part of what [[plugins/toc]] needs and does? If so, the [[plugins/toc]] plugin's code could probably be split into the part that implements [[plugins/headinganchors]]'s functionality and the TOC generation itself. That would bring more order to the code and to the set of available plugins. --Ivan Z.

> Indeed it is. Except [[plugins/toc]] generates headings differently, and independently of this. *Even* if [[toc]]'s functionality were split out, you'd probably want to retain backwards compatibility there, so it's unlikely that this will happen... Also see [[todo/toc-with-human-readable-anchors]]. --[[anarcat]]

---

A patch to make it more like MediaWiki:

<pre>--- headinganchors.pm
+++ headinganchors.pm
@@ -5,6 +5,7 @@
 use warnings;
 use strict;
 use IkiWiki 2.00;
+use URI::Escape;
 
 sub import {
         hook(type => "sanitize", id => "headinganchors", call => \&headinganchors);
@@ -14,9 +15,11 @@
         my $str = shift;
         $str =~ s/^\s+//;
         $str =~ s/\s+$//;
-        $str = lc($str);
-        $str =~ s/[&\?"\'\.,\(\)!]//mig;
-        $str =~ s/[^a-z]/_/mig;
+        $str =~ s/\s/_/g;
+        $str =~ s/"//g;
+        $str =~ s/^[^a-zA-Z]/z-/; # must start with an alphabetical character
+        $str = uri_escape_utf8($str);
+        $str =~ s/%/./g;
         return $str;
 }
 </pre>

--Changaco

> This was applied in 3.20110608 --[[smcv]]
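
For anyone who wants to try the patched transformation outside ikiwiki, here is a minimal standalone sketch. The substitutions are copied from the patch above; the sub name `heading_to_anchor` and the sample headings are only illustrative, and it assumes the `URI::Escape` module is installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use URI::Escape qw(uri_escape_utf8);

# Hypothetical standalone wrapper around the substitutions from the patch;
# the name heading_to_anchor is not part of ikiwiki.
sub heading_to_anchor {
    my $str = shift;
    $str =~ s/^\s+//;             # trim leading whitespace
    $str =~ s/\s+$//;             # trim trailing whitespace
    $str =~ s/\s/_/g;             # each remaining whitespace character becomes "_"
    $str =~ s/"//g;               # drop double quotes
    $str =~ s/^[^a-zA-Z]/z-/;     # a non-alphabetic first character is replaced by "z-"
    $str = uri_escape_utf8($str); # percent-encode the UTF-8 bytes of unsafe characters
    $str =~ s/%/./g;              # MediaWiki style: "." instead of "%"
    return $str;
}

print heading_to_anchor("  Hello World "), "\n";    # Hello_World
print heading_to_anchor("Q&A"), "\n";               # Q.26A
print heading_to_anchor("Caf\x{e9} menu"), "\n";    # Caf.C3.A9_menu
```

Note that the `z-` substitution replaces the first character rather than prefixing it, so a heading that starts with a digit loses that digit.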

----

I think the change below would keep the generated HTML source clean for the
browser without changing how the page renders:

        #use URI::Escape
        .
        .

        #$str = uri_escape_utf8($str);
        $str = Encode::decode_utf8($str);
        #$str =~ s/%/./g;

Don't you think?
--[[mathdesc]]

> Older HTML and URI specifications didn't allow Unicode in IDs or fragments,
> but HTML5 and IRIs do. See also [[plugins/contrib/i18nheadinganchors]]
> and its [[plugins/contrib/i18nheadinganchors/discussion]] page.
>
> I think we should probably try to make these autogenerated IDs
> punctuation-independent by stripping most non-word characters, like
> Pandoc does: I would not expect changing
> `## Headings, maybe with punctuation` to
> `## Headings (maybe with punctuation)` to have any effect on the
> generated "slug" `headings-maybe-with-punctuation`. --[[smcv]]
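
A rough sketch of what such a punctuation-independent slug could look like, loosely following the rules Pandoc documents for its automatic identifiers (drop characters other than alphanumerics, `_`, `-` and `.`; turn spaces into hyphens; lowercase; strip everything before the first letter; fall back to `section`). The sub name `slugify` is hypothetical, and this is not what headinganchors currently does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical Pandoc-like slug generator; an illustration only, not ikiwiki code.
sub slugify {
    my $str = shift;
    $str = lc $str;              # lowercase everything
    $str =~ s/[^\w\s.-]//g;      # drop punctuation except "_", "-" and "."
    $str =~ s/^\s+|\s+$//g;      # trim surrounding whitespace
    $str =~ s/\s+/-/g;           # runs of whitespace become a single hyphen
    $str =~ s/^\P{L}+//;         # remove everything before the first letter
    return length($str) ? $str : 'section';
}

print slugify("Headings, maybe with punctuation"), "\n";
print slugify("Headings (maybe with punctuation)"), "\n";
# both lines print: headings-maybe-with-punctuation
```

With this approach, two headings that differ only in punctuation would collide, so a real implementation would still need some way to keep the generated IDs unique within a page.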