Ikiwiki currently has only one type of dependency between pages
(plus wikilinks, special-cased on the side). This has caused various
problems, and it has long seemed to me that ikiwiki needs to get
smarter about what types of dependencies it supports.

### unnecessary work

The current single dependency type causes the depending page to be rebuilt
whenever a matching dependency is added, removed, or *modified*. But a
great many things don't care about the modification case, and often cause
unnecessary page rebuilds:

* map only cares if the pages are added or removed. Content change does
  not matter (unless show=title is used).
* brokenlinks, orphans, pagecount, ditto (generally)
* inline in archive mode cares about the page title and author changing,
  but not content. (Ditto for map with show=title.)
* Causes extra work when solving the [[bugs/transitive_dependencies]]
  problem.

### two types of dependencies needed for [[tracking_bugs_with_dependencies]]

>>  it seems that there are two types of dependency, and ikiwiki
>>  currently only handles one of them.  The first type is "Rebuild this
>>  page when any of these other pages changes" - ikiwiki handles this.
>>  The second type is "rebuild this page when set of pages referred to by
>>  this pagespec changes" - ikiwiki doesn't seem to handle this.  I
>>  suspect that named pagespecs would make that second type of dependency
>>  more important.  I'll try to come up with a good example. -- [[Will]]

>>> Hrm, I was going to build an example of this with backlinks, but it
>>> looks like that is handled as a special case at the moment (line 458 of
>>> render.pm).  I'll see if I can break things another way.  Fixing this
>>> properly would allow removal of that special case. -- [[Will]]

>>>> I can't quite understand the distinction you're trying to draw
>>>> between the two types of dependencies. Backlinks are a very special
>>>> case though and I'll be surprised if they fit well into pagespecs.
>>>> --[[Joey]] 

>>>>> The issue is that the existential pagespec matching allows you to build things that have similar
>>>>> problems to backlinks.
>>>>> e.g. the following inline:

    \[[!inline pages="define(~done, link(done)) and link(~done)" archive=yes]]

>>>>> includes any page that links to a page that links to done.  Now imagine I add a new link to 'done' on
>>>>> some random page somewhere, one that is linked to by another page that was not previously included.  The set of
>>>>> pages accepted by the pagespec, and hence the set of pages inlined, will change.  But there is no dependency
>>>>> anywhere on the page that I altered, so ikiwiki will not rebuild the page with the inline in it.  The page that I
>>>>> altered affects the set of pages matched by the pagespec without itself being matched by the pagespec, and so
>>>>> without being included in the dependency list.
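
The scenario above can be reproduced with a toy model. This Python sketch is an illustration only, not ikiwiki code: the link tables and both helper functions are hypothetical, and the pagespec is translated by hand rather than parsed.

```python
def links_to_done(links):
    """Pages that link directly to 'done' (the ~done set in the pagespec)."""
    return {page for page, targets in links.items() if "done" in targets}


def matched(links):
    """Pages that link to some page in the ~done set."""
    done_set = links_to_done(links)
    return {page for page, targets in links.items() if targets & done_set}


# Toy wiki (made-up data): page name -> set of pages it links to.
before = {
    "index": {"bug1"},
    "bug1": set(),
    "done": set(),
}
# Edit only bug1, adding a link to done.
after = dict(before, bug1={"done"})

changed_pages = {p for p in before if before[p] != after[p]}
old_match = matched(before)
new_match = matched(after)

# The edited page is in neither the old nor the new matched set, so a
# dependency recorded only on matched pages never notices the edit.
unseen = not (changed_pages & (old_match | new_match))
```

Here `index` enters the matched set although only `bug1` was edited, which is the case the existing dependency check cannot see.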

>>>>> To make this work well, I think you need to recognise two types of dependencies for each page (and no
>>>>> special cases for particular types of links, eg backlinks).  The first type of dependency says, "The content of
>>>>> this page depends upon the content of these other pages".  The `add_depends()` in the shortcuts
>>>>> plugin is of this form: any time the shortcuts page is edited, any page with a shortcut on it
>>>>> is rebuilt.  The inline plugin also needs to add dependencies of this form to detect when the inlined
>>>>> content changes.  By contrast, the map plugin does not need a dependency of this form, because it
>>>>> doesn't actually care about the content of any pages, just which pages it needs to include (which we'll handle next).

>>>>> The second type of dependency says, "The content of this page depends upon the exact set of pages matched
>>>>> by this pagespec".  The first type of dependency was about the content of some pages, the second type is about
>>>>> which pages get matched by a pagespec.  This is the type of dependency tracking that the map plugin needs.
>>>>> If the set of pages matched by the map's pagespec changes, then the page with the map on it needs to be rebuilt to show a different list of pages.
>>>>> Inline needs this type of dependency as well as the previous type: this type handles a change in which pages
>>>>> are inlined, the previous type handles a change in the content of any of those pages.  Shortcut does not need this type of
>>>>> dependency.  Most of the places that use `add_depends()` seem to need this type of dependency rather than the first type.

>>>>>> Note that inline and map currently achieve the second type of dependency by
>>>>>> explicitly calling `add_depends` for each page they display.
>>>>>> If any of those pages are removed, the regular pagespec would not
>>>>>> match them -- since they're gone. However, the explicit dependency
>>>>>> on them does cause them to match. It's an ugly corner I'd like to
>>>>>> get rid of. --[[Joey]]

>>>>> Implementation Details:  The first type of dependency can be handled very similarly to the current
>>>>> dependency system.  You just need to keep a list of pages that the content depends upon.  You could
>>>>> keep that list as a pagespec, but if you do this you might want to check that the pagespec doesn't change,
>>>>> possibly by adding a dependency of the second type along with the dependency of the first type.

>>>>>> An example of the current system not tracking enough data is 
>>>>>> described in [[bugs/transitive_dependencies]].
>>>>>> --[[Joey]] 

>>>>> The second type of dependency is a little more tricky.  For each page, we'd need a list of pagespecs that
>>>>> the page depended on, and for each pagespec you'd want to store the list of pages that currently match it.
>>>>> On refresh, you'd need to check each pagespec to see if the set of pages that match it has changed, and if
>>>>> that set has changed, then rebuild the dependent page(s).  Oh, and for this second type of dependency, I
>>>>> don't think you can merge pagespecs.  If I wanted to know if either "\*" or "link(done)" changes, then just checking
>>>>> to see if the set of pages matched by "\* or link(done)" changes doesn't work.
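
The bookkeeping described above, and the reason merging pagespecs loses information, can be sketched as follows. This is a Python model under stated assumptions, not ikiwiki's implementation: `SetDependencies`, `match_glob`, `match_link`, and the toy wiki are all hypothetical, and pagespecs are represented as plain predicate functions.

```python
from fnmatch import fnmatchcase


def match_glob(pattern):
    """Predicate standing in for a glob pagespec such as "*"."""
    return lambda page, links: fnmatchcase(page, pattern)


def match_link(target):
    """Predicate standing in for link(target)."""
    return lambda page, links: target in links[page]


def matching(spec, links):
    """The set of pages currently matched by a predicate."""
    return frozenset(p for p in links if spec(p, links))


class SetDependencies:
    def __init__(self):
        # dependent page -> {spec name: (predicate, last matched set)}
        self.deps = {}

    def add(self, page, name, spec, links):
        self.deps.setdefault(page, {})[name] = (spec, matching(spec, links))

    def refresh(self, links):
        """Return pages whose matched sets changed, and store the new sets."""
        rebuild = set()
        for page, specs in self.deps.items():
            for name, (spec, old) in specs.items():
                new = matching(spec, links)
                if new != old:
                    rebuild.add(page)
                    specs[name] = (spec, new)
        return rebuild


def either(page, links):
    """Predicate standing in for the merged spec "* or link(done)"."""
    return match_glob("*")(page, links) or match_link("done")(page, links)


wiki = {"a": set(), "b": set(), "done": set()}

separate = SetDependencies()
separate.add("report", "*", match_glob("*"), wiki)
separate.add("report", "link(done)", match_link("done"), wiki)

merged = SetDependencies()
merged.add("report", "* or link(done)", either, wiki)

wiki["a"] = {"done"}  # page a now links to done

separate_result = separate.refresh(wiki)  # link(done) set changed
merged_result = merged.refresh(wiki)      # merged set is still every page
```

With the specs tracked separately, `report` is flagged for rebuild. With the merged spec the matched set is every page both before and after the edit, so nothing is flagged.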

>>>>> The current system works because even though you usually want dependencies of the second type, the set of pages
>>>>> referred to by a pagespec can only change if one of those pages itself changes.  i.e. A dependency check of the
>>>>> first type will catch a dependency change of the second type with current pagespecs.
>>>>> This doesn't work with backlinks, and it doesn't work with existential matching.  Backlinks are currently special-cased.  I don't know
>>>>> how to special-case existential matching - I suspect you're better off just getting the dependency tracking right.

>>>>> I also tried to come up with other possible solutions: e.g. can we find the dependencies for a pagespec?  That
>>>>> would be the set of pages where a change on one of those pages could lead to a change in the set of pages matched by the pagespec.
>>>>> For old-style pagespecs without backlinks, the dependency set for a pagespec is the same as the set of pages the pagespec matches.
>>>>> Unfortunately, with existential matching, the set of pages that each
>>>>> pagespec depends upon can quickly become "*", which is not very useful.  -- [[Will]]

### proposal

I propose the following. --[[Joey]] 

* Add a second type of dependency, call it a "presence dependency".
* `add_depends` defaults to adding a regular ("full") dependency, as
  before. (So nothing breaks.)
* `add_depends($page, $spec, presence => 1)` adds a presence dependency.
* `refresh` only looks at added/removed pages when resolving presence
  dependencies.
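
A rough Python model of the proposal, assuming pagespecs are reduced to simple globs and that the refresh step already knows which pages were added, removed, and modified. The `depends` table and `needs_rebuild` are hypothetical names for illustration; only `add_depends` and the `presence` flag come from the proposal, and the real function is Perl.

```python
from fnmatch import fnmatchcase

# dependent page -> list of (glob, presence_only) pairs
depends = {}


def add_depends(page, spec, presence=False):
    """Record a dependency; a full dependency is the default, as before."""
    depends.setdefault(page, []).append((spec, presence))


def needs_rebuild(page, added, removed, modified):
    """Decide whether any of the page's dependencies fires."""
    for spec, presence_only in depends.get(page, []):
        # Presence dependencies only look at added and removed pages.
        changed = set(added) | set(removed)
        if not presence_only:
            changed |= set(modified)
        if any(fnmatchcase(p, spec) for p in changed):
            return True
    return False


add_depends("sitemap", "blog/*", presence=True)  # only lists page names
add_depends("feed", "blog/*")                    # shows page content

content_edit = ([], [], ["blog/a"])
sitemap_on_edit = needs_rebuild("sitemap", *content_edit)
feed_on_edit = needs_rebuild("feed", *content_edit)
sitemap_on_add = needs_rebuild("sitemap", ["blog/new"], [], [])
```

A content edit to `blog/a` rebuilds `feed` but not `sitemap`; adding `blog/new` rebuilds both.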

This seems straightforwardly doable. I'd like [[Will]]'s feedback on it, if
possible. The two types of dependencies I am proposing are not identical
to the two types he talks about above, but I hope are close enough that
they can be used.

This doesn't deal with the stuff that only depends on the metadata of a
page, as collected in the scan pass, changing.  But it does leave a window
open for adding such a dependency type later.

----

I implemented the above in a branch.
[[!template id=gitbranch branch=origin/dependency-types author="[[joey]]"]]

Then I found some problems:

* Something simple like pagecount, that seems like it could use a
  presence dependency, can have a pagespec that uses metadata, like
  `author()` or `copyright()`.
* pagestats, orphans and brokenlinks cannot use presence dependencies
  because they need to update when links change.

Now I'm thinking about having a special dependency type that looks at page
metadata, and fires if the metadata changes. And it seems links should
either be included in that, or there should be a way to make a dependency
that fires when a page's links change. (And what about backlinks?)

It's easy to see when a page's links change, since there is `%oldlinks`.
To see when metadata is changed is harder, since it's stored in the
pagestate by the meta plugin. Also, there are many different types of
metadata, that would need to be matched with the pagespecs somehow.

Quick alternative: Make `add_depends` look at the pagespec. I.e., if it
is a simple page name, or a glob, we know a presence dependency
can be valid. If it's more complex, convert the dependency from
presence to full.
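
One way to picture this check, as a hedged Python sketch. The function name `effective_type` and the regular expression are assumptions made for illustration, not the test ikiwiki performs: anything that is not a single token free of whitespace and parentheses is treated as "more complex".

```python
import re

# Assumption: a "simple" spec is one token with no whitespace and no
# parentheses, i.e. a bare page name or a glob.
_SIMPLE_SPEC = re.compile(r"[^\s()]+")


def effective_type(spec, requested="presence"):
    """Downgrade a requested presence dependency to full when the
    pagespec is too complex to be trusted with presence-only checks."""
    if requested == "presence" and not _SIMPLE_SPEC.fullmatch(spec):
        return "full"
    return requested


simple_name = effective_type("index")
simple_glob = effective_type("blog/*")
with_function = effective_type("link(done)")
with_metadata = effective_type("blog/* and author(joey)")
conservative = effective_type("blog/* or news/*")
```

The last case shows the cost of such a conservative test: `blog/* or news/*` could safely stay a presence dependency, but this sketch converts it to full anyway.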

There is a lot to dislike about this method. Its parsing of the pagespec,
as currently implemented, does not let plugins add new types of pagespecs
that only care about presence. Its pagespec parsing is also subject to
false negatives (though these should be somewhat rare, and no false
positives). Still, it does work, and it makes things like simple maps and
pagecounts much more efficient.

---- 

Link dependencies:

* `add_depends($page, $spec, links => 1, presence => 1)`
  adds a links + presence dependency.
* Use backlinks change code to detect changes to link dependencies too.
* So, brokenlinks can fire whenever any links in any of the
  pages it's tracking change, or when pages are added or
  removed.
* To determine if a pagespec is valid to be used with a links dependency,
  use the same set that are valid for presence dependencies. But also
  allow `backlink()` to be used in it, since that matches pages
  that the page links to, which is just what link dependencies are
  triggered on.
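
The links + presence behaviour described above might be modelled like this. It is a Python sketch under assumptions: `oldlinks` mirrors the `%oldlinks` table mentioned earlier, while `link_dependency_fires` and the `spec_matches` predicate are hypothetical stand-ins for ikiwiki's pagespec matching.

```python
def link_dependency_fires(spec_matches, oldlinks, links):
    """True if a tracked page was added, removed, or changed its links.

    oldlinks and links map page name -> list of link targets, from the
    previous and the current build. Content-only edits leave both tables
    equal and therefore do not fire.
    """
    added = links.keys() - oldlinks.keys()
    removed = oldlinks.keys() - links.keys()
    relinked = {
        page
        for page in links.keys() & oldlinks.keys()
        if set(links[page]) != set(oldlinks[page])
    }
    return any(spec_matches(page) for page in added | removed | relinked)


def track_all(page):
    """Stand-in predicate: track every page, as brokenlinks might."""
    return True


oldlinks = {"a": ["b"], "c": []}

content_only = link_dependency_fires(track_all, oldlinks, {"a": ["b"], "c": []})
new_link = link_dependency_fires(track_all, oldlinks, {"a": ["b", "missing"], "c": []})
page_added = link_dependency_fires(track_all, oldlinks, {"a": ["b"], "c": [], "d": []})
page_removed = link_dependency_fires(track_all, oldlinks, {"a": ["b"]})
```

Only the content-only case leaves the dependency quiet; a new link, an added page, and a removed page all fire it.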

[[done]]