aboutsummaryrefslogtreecommitdiff
path: root/doc/bugs/table_can_not_deal_with_Chinese_.mdwn
blob: f115fbebbf18b7a6e6edbb07a3b99f3f90ed9257 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
Table directive can not deal with Chinese, when format csv

    \[[!table format=csv data="""
    a,b,c
    1,2,你好
    """
    ]]

But the below example works well.

    \[[!table format=csv data="""
    a,b,c
    1,2,3
    """
    ]]


The below example works well too

    \[[!table format=dsv delimiter=, data="""
    a,b,c
    1,2,你好
    """
    ]]

----

> You don't say what actually happens when you try this, but I hit something similar trying unicode symbols in a CSV-based table. (I wasn't aware of the DSV work-around. Thanks!) The specific error  I get trying is

    [\[!table Error: Cannot decode string with wide characters at /usr/lib/x86_64-linux-gnu/perl/5.24/Encode.pm line 243.]]

> That file is owned by the `libperl5` package, but I think I've seen an error mentioning `Text::CSV` i.e. `libtext-csv-perl` when I've encountered this before. -- [[Jon]]

>> A related problem, also fixed by using DSV, is messing up the encoding of non-ASCII, non-wide characters, e.g. £ (workaround was to use £ instead) -- [[Jon]]

>>> Sorry, I have faced the same error: \[[!table Error: Cannot decode string with wide characters at /usr/lib/x86_64-linux-gnu/perl/5.24/Encode.pm line 243.]] -- [[tumashu1]]

---

The below patch seem to deal with this problem:

    From d6ed90331b31e4669222c6831f7a0f40f0677fe1 Mon Sep 17 00:00:00 2001
    From: Feng Shu <tumashu@163.com>
    Date: Sun, 2 Dec 2018 08:41:39 +0800
    Subject: [PATCH 2/2] Fix table plugin can handle UTF-8 csv format
    
    ---
     IkiWiki/Plugin/table.pm | 3 ++-
     1 file changed, 2 insertions(+), 1 deletion(-)
    
    From ad1a92c796d907ad293e572a168b6e9a8219623f Mon Sep 17 00:00:00 2001
    From: Feng Shu <tumashu@163.com>
    Date: Sun, 2 Dec 2018 08:41:39 +0800
    Subject: [PATCH 2/2] Fix table plugin can handle UTF-8 csv format
    
    ---
     IkiWiki/Plugin/table.pm | 3 ++-
     1 file changed, 2 insertions(+), 1 deletion(-)
    
    diff --git a/IkiWiki/Plugin/table.pm b/IkiWiki/Plugin/table.pm
    index f3c425a37..7fea8ab1c 100644
    --- a/IkiWiki/Plugin/table.pm
    +++ b/IkiWiki/Plugin/table.pm
    @@ -135,6 +135,7 @@ sub split_csv ($$) {
     	my $csv = Text::CSV->new({ 
     		sep_char	=> $delimiter,
     		binary		=> 1,
    +		decode_utf8 => 1,
     		allow_loose_quotes => 1,
     	}) || error("could not create a Text::CSV object");
     	
    @@ -143,7 +144,7 @@ sub split_csv ($$) {
     	foreach my $line (@text_lines) {
     		$l++;
     		if ($csv->parse($line)) {
    -			push(@data, [ map { decode_utf8 $_ } $csv->fields() ]);
    +			push(@data, [ $csv->fields() ]);
     		}
     		else {
     			debug(sprintf(gettext('parse fail at line %d: %s'), 
    -- 
    2.19.0

> Thanks, I've applied that patch and added test coverage. [[done]] --[[smcv]]

----

I can confirm that the above patch fixes the issue for me. Thanks! I'm not an ikiwiki committer, but I would encourage them to consider the above. Whilst I'm at it, I would be *really* grateful for some input on [[todo/support_multi-row_table_headers]] which relates to the same plugin. [[Jon]]

----

I've hit this bug with an inline-table and 3.20190228-1 (so: patch applied), with the following definition

    [[\!table class=fullwidth_table delimiter="      " data="""    
     
    Number  Title   Own?    Read?    
    I (HB1), 70 (PB1), 5 (PB50)     Dune    O       ✓"""]]

I'm going to attempt to work around it by moving to an external CSV. ­— [[Jon]]

> What version of Text::CSV (Debian: `libtext-csv-perl`) are you using?
> What version of Text::CSV::XS (Debian: `libtext-csv-xs-perl`) are you
> using, if any?
>
> I could't reproduce this with `libtext-csv-perl_2.00-1` and
> `libtext-csv-xs-perl_1.39-1`, assuming that the whitespace in
> `delimiter="..."` was meant to be a literal tab character, and that
> the data row has literal tabs before Dune, before O and before ✓.
>
> It would be great if you could modify `t/table.t` to include a failing
> test-case, and push it to your github fork or something, so I can apply
> it without having to guess precisely what the whitespace should be.
> --[[smcv]]

>> Sorry, I appreciate as bug reports go my last post was not that useful.
>> It's serving as a sort-of personal placeholder to investigate further.
>> The issue can be seen live [here](https://jmtd.net/fiction/sf_masterworks/),
>> the source is [here](https://jmtd.net/tmp/sf_masterworks.mdwn). The web
>> servers versions are ikiwiki                  3.20190228-1,
>> libtext-csv-perl         1.33-2 and libtext-csv-xs-perl is not installed.
>> I'll do some futher diagnosis and poking around.
>> ­— [[Jon]]
>> 
>> OK: issue exists with oldstable/Stretch, and is seemingly fixed in stable/Buster.
>> `libcsv-text-xs-perl` doesn't seem to matter (presence or absence doesn't change
>> the bug). Upgrading just `libtext-csv-perl` on a Stretch host with Buster's
>> 1.99-1 is not sufficent to fix it. As per the error, I think libperl5 might be
>> relevant, i.e. bug present in 5.24.1-3+deb9u5 and fixed by 5.28.1-6.
>>
>> EDIT: yes indeed merely upgrading to libperl5.28=5.28.1-6 in stretch fixes the
>> issue.
>> — [[Jon]]
>>
>> Post Buster-upgrade, and it's *still* broken on my webhost, which shows
>> `[\[!table Error: Wide character at /usr/lib/x86_64-linux-gnu/perl/5.28/Encode.pm line 296.]]`
>> with libperl5.28:amd64	5.28.1-6, libtext-csv-perl	1.99-1 and 
>> libtext-csv-xs-perl	1.38-1. Further fiddling will commence.
>> (removing libtext-csv-xs-perl does not help.)
— [[Jon]]