aboutsummaryrefslogtreecommitdiff
path: root/doc/bugs/Slow_Filecheck_attachments___34__snails_it_all__34__/discussion.mdwn
blob: 629aba71e3e98797352bf9450ae09974ad69291c (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
##Foreword :
Disabling of filecheck is not actually possible because btw it cause the attachment.pm to malfunction and
any of pagespec that could contain a *mimetype* condition.

attachment.pm imports "statically"  filecheck so actually disabling it should be *interdicted* .

<pre>
sub import {
        add_underlay("attachment");
        add_underlay("javascript");
        add_underlay("jquery");
        hook(type => "getsetup", id => "attachment", call => \&getsetup);
        hook(type => "checkconfig", id => "attachment", call => \&checkconfig);
        hook(type => "formbuilder_setup", id => "attachment", call => \&formbuilder_setup);
        hook(type => "formbuilder", id => "attachment", call => \&formbuilder, last => 1);
        IkiWiki::loadplugin("filecheck");
}
</pre>

----

## How bad is it ?

So I tried on three pages to inline <tt>!mimetype(image/*)</tt> while I allowed attachment of <tt>mimetype(image/*)</tt>

My profiling tests in the bug report shows that most of the time is spend in the "Fallback using file" block code,
I tried to comment that block and see how it'll perform. Obviously this is much much faster ... but is the mimetype
discovered using only *File::MimeInfo* ?


Dumping some strings before return to STDERR, rebuilding . This is just a  [[!toggle  id="code-test" text="dumpdebug adding"]] 

[[!toggleable  id="code-test" text="""
<pre>
sub match_mimetype ($$;@) {
        my $page=shift;
        my $wanted=shift;

        my %params=@_;
        my $file=exists $params{file} ? $params{file} : IkiWiki::srcfile($IkiWiki::pagesources{$page});
        if (! defined $file) {
                return IkiWiki::ErrorReason->new("file does not exist");
        }

        # Get the mime type.
        #
        # First, try File::Mimeinfo. This is fast, but doesn't recognise
        # all files.
        eval q{use File::MimeInfo::Magic};
        my $mimeinfo_ok=! $@;
        my $mimetype;
        print STDERR " --- match_mimetype (".$file.")\n";
        if ($mimeinfo_ok) {
                my $mimetype=File::MimeInfo::Magic::magic($file);
        }

        # Fall back to using file, which has a more complete
        # magic database.
        #if (! defined $mimetype) {
        #       open(my $file_h, "-|", "file", "-bi", $file);
        #       $mimetype=<$file_h>;
        #       chomp $mimetype;
        #       close $file_h;
        #}

        if (! defined $mimetype || $mimetype !~s /;.*//) {
                # Fall back to default value.
                $mimetype=File::MimeInfo::Magic::default($file)
                        if $mimeinfo_ok;
                if (! defined $mimetype) {
                        $mimetype="unknown";
                }
        }

        my $regexp=IkiWiki::glob2re($wanted);
        if ($mimetype!~$regexp) {
                 print STDERR " xxx MIME unknown ($mimetype - $wanted - $regexp ) \n";
                return IkiWiki::FailReason->new("file MIME type is $mimetype, not $wanted");
        }
        else {
                print STDERR " vvv MIME found\n";
                return IkiWiki::SuccessReason->new("file MIME type is $mimetype");
        }
}
</pre>
"""]]

The results dump to stderr (or a file called... 'say *mime*) looks like this :
<pre>
--- match_mimetype (/usr/share/ikiwiki/attachment/ikiwiki/jquery.fileupload-ui.js)
 xxx MIME unknown (text/plain - image/* - (?i-xsm:^image\/.*$) )
 --- match_mimetype (/usr/share/ikiwiki/locale/fr/directives/ikiwiki/directive/fortune.mdwn)
 xxx MIME unknown (text/plain - image/* - (?i-xsm:^image\/.*$) )
 --- match_mimetype (/usr/share/ikiwiki/locale/fr/basewiki/shortcuts.mdwn)
 xxx MIME unknown (text/plain - image/* - (?i-xsm:^image\/.*$) 
 --- match_mimetype (/usr/share/ikiwiki/smiley/smileys/alert.png)
 xxx MIME unknown (application/octet-stream - image/* - (?i-xsm:^image\/.*$) )
 --- match_mimetype (/usr/share/ikiwiki/attachment/ikiwiki/images/ui-bg_flat_75_ffffff_40x100.png)
 xxx MIME unknown (application/octet-stream - image/* - (?i-xsm:^image\/.*$) 
</pre>

<tt>---</tt> prepend signals the file on analysis<br/>
<tt>xxx</tt> prepend signals a returns failure : mime is unknown, the match is a failure<br/>
<tt>vvv</tt> prepend signals a return success.<br/>


This is nasty-scary results ! Something missed me or this mime-filecheck is plain nuts ?

*Question 1* : How many files have been analysed : **3055** (yet on a tiny tiny wiki)
<pre>grep "^ --- " mime | wc -l
3055
</pre>

*Question 2* : How many time it fails : *all the time*
<pre>
 grep "^ xxx " mime | wc -l
3055
</pre>

*Question 1bis*  : Doh btw , how many files have been re-analysed ?  ** 2835 ** OMG !!
<pre>grep "^ --- " mime | sort -u | wc -l
220
</pre>

## Conclusion

- Only the system command *file -bi* works. While it is **should** be easy on the cpu , it's also hard on the I/O -> VM :( 
- Something nasty with the mime  implementation and/or my system configuration -> Hints ? :D
- Need to cache during the rebuild : a same page needs not being rechecked for its mime while it's locked !


--mathdesc

> >        if ($mimeinfo_ok) {
> >                my $mimetype=File::MimeInfo::Magic::magic($file);
> >        }
> 
> That seems strange to me, `my` restricts scope of $mimetype to enclosing if block, thus, assigned value will be dropped - I think, it is the problem.
> Try removing that stray `my`.
>
> --isbear