Table directive cannot deal with Chinese when format is csv

    \[[!table format=csv data="""
    a,b,c
    1,2,你好
    """
    ]]

But the example below works well.

    \[[!table format=csv data="""
    a,b,c
    1,2,3
    """
    ]]


The example below works well too:

    \[[!table format=dsv delimiter=, data="""
    a,b,c
    1,2,你好
    """
    ]]

----

> You don't say what actually happens when you try this, but I hit something similar trying unicode symbols in a CSV-based table. (I wasn't aware of the DSV work-around. Thanks!) The specific error I get is

    [\[!table Error: Cannot decode string with wide characters at /usr/lib/x86_64-linux-gnu/perl/5.24/Encode.pm line 243.]]

> That file is owned by the `libperl5` package, but I think I've seen an error mentioning `Text::CSV` i.e. `libtext-csv-perl` when I've encountered this before. -- [[Jon]]

>> A related problem, also fixed by using DSV, is messing up the encoding of non-ASCII, non-wide characters, e.g. £ (workaround was to use £ instead) -- [[Jon]]

>>> Sorry, I have faced the same error: \[[!table Error: Cannot decode string with wide characters at /usr/lib/x86_64-linux-gnu/perl/5.24/Encode.pm line 243.]] -- [[tumashu1]]
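Both symptoms above come from Perl's `Encode` layer rather than from CSV parsing itself. A minimal sketch (core `Encode` only; the strings are illustrative) that reproduces them: decoding a string that already contains wide characters croaks with exactly the error quoted above, while byte-sized characters such as £ survive the extra round trip but come out double-encoded.

```perl
use strict;
use warnings;
use utf8;
use Encode qw(decode_utf8 encode_utf8);

# Case 1: applying decode_utf8() to data that is already decoded
# characters (> 0xFF) croaks, as in the table plugin's error.
my $chars = "你好";
eval { decode_utf8($chars) };
print "error: $@";    # Cannot decode string with wide characters at ...

# Case 2: characters that fit in one byte (e.g. £, U+00A3) do not
# croak, but an extra encode pass produces mojibake instead.
my $double = encode_utf8(encode_utf8("£"));
binmode STDOUT, ':raw';
print "$double\n";    # the bytes render as "£" on a UTF-8 terminal
```

This matches the two reports: wide characters (Chinese, ✓) die outright, while £ merely comes out garbled.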

---

The patch below seems to deal with this problem:

    From ad1a92c796d907ad293e572a168b6e9a8219623f Mon Sep 17 00:00:00 2001
    From: Feng Shu <tumashu@163.com>
    Date: Sun, 2 Dec 2018 08:41:39 +0800
    Subject: [PATCH 2/2] Fix table plugin can handle UTF-8 csv format
    
    ---
     IkiWiki/Plugin/table.pm | 3 ++-
     1 file changed, 2 insertions(+), 1 deletion(-)
    
    diff --git a/IkiWiki/Plugin/table.pm b/IkiWiki/Plugin/table.pm
    index f3c425a37..7fea8ab1c 100644
    --- a/IkiWiki/Plugin/table.pm
    +++ b/IkiWiki/Plugin/table.pm
    @@ -135,6 +135,7 @@ sub split_csv ($$) {
     	my $csv = Text::CSV->new({ 
     		sep_char	=> $delimiter,
     		binary		=> 1,
    +		decode_utf8 => 1,
     		allow_loose_quotes => 1,
     	}) || error("could not create a Text::CSV object");
     	
    @@ -143,7 +144,7 @@ sub split_csv ($$) {
     	foreach my $line (@text_lines) {
     		$l++;
     		if ($csv->parse($line)) {
    -			push(@data, [ map { decode_utf8 $_ } $csv->fields() ]);
    +			push(@data, [ $csv->fields() ]);
     		}
     		else {
     			debug(sprintf(gettext('parse fail at line %d: %s'), 
    -- 
    2.19.0

> Thanks, I've applied that patch and added test coverage. [[done]] --[[smcv]]
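For reference, the effect of the patch can be exercised directly: with `decode_utf8 => 1`, Text::CSV returns character strings itself, so the plugin's old per-field `decode_utf8` call (which caused the croak) is no longer needed. A minimal sketch, assuming Text::CSV (Debian: `libtext-csv-perl`) is installed; the input line is illustrative:

```perl
use strict;
use warnings;
use utf8;
use Text::CSV;
use Encode qw(encode_utf8);

# Same constructor as the patched split_csv().
my $csv = Text::CSV->new({
	sep_char           => ',',
	binary             => 1,
	decode_utf8        => 1,
	allow_loose_quotes => 1,
}) or die "could not create a Text::CSV object";

# UTF-8 bytes, as read from the page source.
my $line = encode_utf8("1,2,你好");
$csv->parse($line) or die "parse fail: $line";
my @fields = $csv->fields();    # ("1", "2", "你好") as character strings
```

Decoding any of these fields again with `Encode::decode_utf8` would reintroduce the "wide characters" croak, which is why the patch removes the `map { decode_utf8 $_ }` as well as adding the attribute.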

----

I can confirm that the above patch fixes the issue for me. Thanks! I'm not an ikiwiki committer, but I would encourage them to consider the above. Whilst I'm at it, I would be *really* grateful for some input on [[todo/support_multi-row_table_headers]] which relates to the same plugin. [[Jon]]

----

I've hit this bug with an inline-table and 3.20190228-1 (so: patch applied), with the following definition

    [[\!table class=fullwidth_table delimiter="      " data="""    
     
    Number  Title   Own?    Read?    
    I (HB1), 70 (PB1), 5 (PB50)     Dune    O       ✓"""]]

I'm going to attempt to work around it by moving to an external CSV. -- [[Jon]]

> What version of Text::CSV (Debian: `libtext-csv-perl`) are you using?
> What version of Text::CSV::XS (Debian: `libtext-csv-xs-perl`) are you
> using, if any?
>
> I couldn't reproduce this with `libtext-csv-perl_2.00-1` and
> `libtext-csv-xs-perl_1.39-1`, assuming that the whitespace in
> `delimiter="..."` was meant to be a literal tab character, and that
> the data row has literal tabs before Dune, before O and before ✓.
>
> It would be great if you could modify `t/table.t` to include a failing
> test-case, and push it to your github fork or something, so I can apply
> it without having to guess precisely what the whitespace should be.
> --[[smcv]]