aboutsummaryrefslogtreecommitdiff
path: root/doc/todo/headless_git_branches.mdwn
blob: d9bb38099d052c57417f9c4b9c5e69114405a26b (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
Ikiwiki should really survive being asked to work with a git branch that has no existing commits.

    mkdir iki-gittest
    cd iki-gittest
    GIT_DIR=barerepo.git git init
    git clone barerepo.git srcdir
    ikiwiki --rcs=git srcdir destdir

I've fixed this initial construction case, and, based on my testing, I've
also fixed the post-update executing on a new master, and ikiwiki.cgi
executing on a non-existent master cases.

Please commit so my users stop whining at me about having clean branches to push to, the big babies.

Summary: Change three scary loud failure cases related to empty branches into three mostly quiet success cases.

[[!tag patch]]

> FWIW, [[The_TOVA_Company]] apparently wants this feature (and I hope
> I don't mind that I mention they were willing to pay someone for it,
> but I told them I'd not done any of the work. :) )
> 
> Code review follows, per hunk.. --[[Joey]] 

<pre>
diff --git a/IkiWiki/Plugin/git.pm b/IkiWiki/Plugin/git.pm
index cf7fbe9..e5bafcf 100644
--- a/IkiWiki/Plugin/git.pm
+++ b/IkiWiki/Plugin/git.pm
@@ -439,17 +439,21 @@ sub git_commit_info ($;$) {
 
 	my @opts;
 	push @opts, "--max-count=$num" if defined $num;
-
-	my @raw_lines = run_or_die('git', 'log', @opts,
-		'--pretty=raw', '--raw', '--abbrev=40', '--always', '-c',
-		'-r', $sha1, '--', '.');
-
+	my @raw_lines;
 	my @ci;
-	while (my $parsed = parse_diff_tree(\@raw_lines)) {
-		push @ci, $parsed;
-	}
+        
+	# Test to see if branch actually exists yet.
+	if (run_or_non('git', 'show-ref', '--quiet', '--verify', '--', 'refs/heads/' . $config{gitmaster_branch}) ) {
+		@raw_lines = run_or_die('git', 'log', @opts,
+			'--pretty=raw', '--raw', '--abbrev=40', '--always', '-c',
+			'-r', $sha1, '--', '.');
+
+		while (my $parsed = parse_diff_tree(\@raw_lines)) {
+			push @ci, $parsed;
+		}
 
-	warn "Cannot parse commit info for '$sha1' commit" if !@ci;
+		warn "Cannot parse commit info for '$sha1' commit" if !@ci;
+	};
 
 	return wantarray ? @ci : $ci[0];
 }
</pre>

My concern is that this adds a bit of slowdown (git show-ref is fast, but
It's still extra work) to a very hot code path that is run to eg,
update recentchanges after every change. 

Seems not ideal to do extra work every time to handle a case
that will literally happen a maximum of once in the entire lifecycle of a
wiki (and zero times more typically, since the setup automator puts in a
.gitignore file that works around this problem).

So as to not just say "no" ... what if it always tried to run git log,
and if it failed (or returned no parsed lines), then it could look 
at git show-ref to deduce whether to throw an error or not.
--[[Joey]] 

> Ah, but then git-log would still complain "bad revision 'HEAD'"
> --[[Joey]] 

    jrayhawk@piny:/srv/git/jrayhawk.git$ time perl -e 'for( $i = 1; $i < 10000; $i++) { system("git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"); }'
    
    real  0m10.988s
    user  0m0.120s
    sys   0m1.210s

> FWIW, "an extra millisecond per edit" vs "full git coverage" is no
> contest for me; I use that patch on seven different systems, including
> freedesktop.org, because I've spent more time explaining to users either
> why Ikiwiki won't work on their empty repositories or why their
> repositories need useless initial commits (a la Branchable) that make
> pushing not work and why denyNonFastForwards=0 and git push -f are
> necessary than all the milliseconds that could've been saved in the
> world.
> 
> But, since we're having fun rearranging deck chairs on the RMS Perl
> (toot toot)...
> 
> There's some discrepency here I wasn't expecting:

    jrayhawk@piny:/srv/git/jrayhawk.git$ time dash -c 'i=0; while [ $i -lt 10000 ]; do i=$((i+1)); git show-ref --quiet --verify -- refs/heads/master; done'
    
    real  0m9.986s
    user  0m0.170s
    sys   0m0.940s

> While looking around in the straces, I notice Perl, unlike {b,d}ash
> appears to do PATH lookup on every invocation of git, adding up to
> around 110 microseconds apiece on a post-2.6.38 16-thread QPI system:

    29699      0.000112 execve("/home/jrayhawk/bin/git", ["git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"], [/* 17 vars */]) = -1 ENOENT (No such file or directory)
    29699      0.000116 execve("/usr/local/bin/git", ["git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"], [/* 17 vars */]) = -1 ENOENT (No such file or directory)
    29699      0.000084 execve("/usr/bin/git", ["git", "show-ref", "--quiet", "--verify", "--", "refs/heads/master"], [/* 17 vars */]) = 0

> You can probably save a reasonable number of context switches and
> RCU-heavy (or, previously, lock-heavy) dentry lookups by doing a Perl
> equivalent of `which git` and using the result. It might even add up to
> a whole millisecond in some circumstances!
> 
> No idea where the rest of that time is going. Probably cache misses
> on the giant Perl runtime or something.
> 
> ...
> 
> Now I feel dirty for having spent more time talking about optimization
> than that optimization is likely to save. This must be what being an
> engineer feels like.
> --jrayhawk

<pre>
@@ -474,7 +478,10 @@ sub rcs_update () {
 	# Update working directory.
 
 	if (length $config{gitorigin_branch}) {
-		run_or_cry('git', 'pull', '--prune', $config{gitorigin_branch});
+		run_or_cry('git', 'fetch', '--prune', $config{gitorigin_branch});
+		if (run_or_non('git', 'show-ref', '--quiet', '--verify', '--', 'refs/remotes/' . $config{gitorigin_branch} . '/' . $config{gitmaster_branch}) ) {
+			run_or_cry('git', 'merge', $config{gitorigin_branch} . '/' . $config{gitmaster_branch});
+		}
 	}
 }

</pre>

Same concern here about extra work. Code path is nearly as hot, being
called on every refresh. Probably could be dealt with similarly as above.

Also, is there any point in breaking the pull up into a
fetch followed by a merge? --[[Joey]] 

> The same benchmarking applies, at least.
>
> Re: fetch/merge: We can't test for the nonexistence of the origin branch
> without fetching it, and we can't merge it if it is, indeed,
> nonexistant.
>
> Unless you're implying that it would be better to just spam stderr with
> unnecessary scary messages and/or ignore/suppress them and lose the
> ability to respond appropriately to every other error condition. As
> maintainer, you deal with a disproportionate amount of the resulting
> support fallout, so I'm perfectly satisfied letting you make that call.
> --jrayhawk

<pre>
@@ -559,7 +566,7 @@ sub rcs_commit_helper (@) {
 	# So we should ignore its exit status (hence run_or_non).
 	if (run_or_non('git', 'commit', '-m', $params{message}, '-q', @opts)) {
 		if (length $config{gitorigin_branch}) {
-			run_or_cry('git', 'push', $config{gitorigin_branch});
+			run_or_cry('git', 'push', $config{gitorigin_branch}, $config{gitmaster_branch});
 		}
 	}
 	
</pre>

This seems fine to apply. --[[Joey]]

> Hooray!
> --jrayhawk