doc/design-paper/roadmap-2007.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690

\documentclass{article}

\usepackage{url}

\newenvironment{tightlist}{\begin{list}{$\bullet$}{
  \setlength{\itemsep}{0mm}
    \setlength{\parsep}{0mm}
    %  \setlength{\labelsep}{0mm}
    %  \setlength{\labelwidth}{0mm}
    %  \setlength{\topsep}{0mm}
    }}{\end{list}}
\newcommand{\tmp}[1]{{\bf #1} [......] \\}
\newcommand{\plan}[1]{ {\bf (#1)}}

\begin{document}

\title{Tor Development Roadmap: Wishlist for Nov 2006--Dec 2007}
\author{Roger Dingledine \and Nick Mathewson \and Shava Nerad}

\maketitle
\pagestyle{plain}

% TO DO:
%   add cites
%   add time estimates


\section{Introduction}
%Hi, Roger!  Hi, Shava.  This paragraph should get deleted soon.  Right now,
%this document goes into about as much detail as I'd like to go into for a
%technical audience, since that's the audience I know best.  It doesn't have
%time estimates everywhere.  It isn't well prioritized, and it doesn't
%distinguish well between things that need lots of research and things that
%don't.  The breakdowns don't all make sense.  There are lots of things where
%I don't make it clear how they fit into larger goals, and lots of larger
%goals that don't break down into little things. It isn't all stuff we can do
%for sure, and it isn't even all stuff we can do for sure in 2007.  The
%tmp\{\} macro indicates stuff I haven't said enough about.  That said, here
%plangoes...

Tor (the software) and Tor (the overall software/network/support/document
suite) are now experiencing all the crises of success.  Over the next year,
we're probably going to grow more in terms of users, developers, and funding
than before.  This gives us the opportunity to perform long-neglected
maintenance tasks.

\section{Code and design infrastructure}

\subsection{Protocol revision}
To maintain backward compatibility, we've postponed major protocol
changes and redesigns for a long time.  Because of this, there are a number
of sensible revisions we've been putting off until we could deploy several of
them at once.  To do each of these, we first need to discuss design
alternatives with other cryptographers and outside collaborators to
make sure that our choices are secure.

First of all, our protocol needs better {\bf versioning support} so that we
can make backward-incompatible changes to our core protocol.  There are
difficult anonymity issues here, since many naive designs would make it easy
to tell clients apart (and then track them) based on their supported versions.

With protocol versioning support would come the ability to {\bf future-proof
  our ciphersuites}.  For example, not only our OR protocol, but also our
directory protocol, is pretty firmly tied to the SHA-1 hash function, which
though not yet known to be insecure for our purposes, has begun to show
its age.  We should
remove assumptions throughout our design based on the assumption that public
keys, secret keys, or digests will remain any particular size indefinitely.

Our OR {\bf authentication protocol}, though provably
secure\cite{tap:pet2006}, relies more on particular aspects of RSA and our
implementation thereof than we had initially believed.  To future-proof
against changes, we should replace it with a less delicate approach.

\plan{For all the above: 2 person-months to specify, spread over several
  months with time for interaction with external participants.  One
  person-month to implement.  Start specifying in early 2007.}

We might design a {\bf stream migration} feature so that streams tunneled
over Tor could be more resilient to dropped connections and changed IPs.
\plan{Not in 2007.}

A new protocol could support {\bf multiple cell sizes}.  Right now, all data
passes through the Tor network divided into 512-byte cells.  This is
efficient for high-bandwidth protocols, but inefficient for protocols
like SSH or AIM that send information in small chunks.  Of course, we need to
investigate the extent to which multiple sizes could make it easier for an
adversary to fingerprint a traffic pattern. \plan{Not in 2007.}

As a part of our design, we should investigate possible {\bf cipher modes}
other than counter mode.  For example, a mode with built-in integrity
checking, error propagation, and random access could simplify our protocol
significantly.  Sadly, many of these are patented and unavailable for us.
\plan{Not in 2007.}

\subsection{Scalability}

\subsubsection{Improved directory efficiency}
Right now, clients download a statement of the {\bf network status} made by
each directory authority.  We could reduce network bandwidth significantly by
having the authorities jointly sign a statement reflecting their vote on the
current network status.  This would save clients up to 160K per hour, and
make their view of the network more uniform.  Of course, we'd need to make
sure the voting process was secure and resilient to failures in the
network.\plan{Must do; specify in 2006. 2 weeks to specify, 3-4 weeks to
  implement.}

We should {\bf shorten router descriptors}, since the current format includes
a great deal of information that's only of interest to the directory
authorities, and not of interest to clients.  We can do this by having each
router upload a short-form and a long-form signed descriptor, and having
clients download only the short form.  Even a naive version of this would
save about 40\% of the bandwidth currently spent by clients downloading
descriptors.\plan{Must do; specify in 2006. 3-4 weeks.}

We should {\bf have routers upload their descriptors even less often}, so
that clients do not need to download replacements every 18 hours whether any
information has changed or not.  (As of Tor 0.1.2.3-alpha, clients tolerate
routers that don't upload often, but routers still upload at least every 18
hours to support older clients.) \plan{Must do, but not until 0.1.1.x is
deprecated in mid 2007. 1 week.}

\subsubsection{Non-clique topology}
Our current network design achieves a certain amount of its anonymity by
making clients act like each other through the simple expedient of making
sure that all clients know all servers, and that any server can talk to any
other server.  But as the number of servers increases to serve an
ever-greater number of clients, these assumptions become impractical.

At worst, if these scalability issues become troubling before a solution is
found, we can design and build a solution to {\bf split the network into
multiple slices} until a better solution comes along.  This is not ideal,
since rather than looking like all other users from a point of view of path
selection, users would ``only'' look like 200,000--300,000 other
users.\plan{Not unless needed.}

We are in the process of designing {\bf improved schemes for network
  scalability}.  Some approaches focus on limiting what an adversary can know
about what a user knows; others focus on reducing the extent to which an
adversary can exploit this knowledge.  These are currently in their infancy,
and will probably not be needed in 2007, but they must be designed in 2007 if
they are to be deployed in 2008.\plan{Design in 2007; unknown difficulty.
  Write a paper.}

\subsubsection{Relay incentives}
To support more users on the network, we need to get more servers.  So far,
we've relied on volunteerism to attract server operators, and so far it's
served us well.  But in the long run, we need to {\bf design incentives for
  users to run servers} and relay traffic for others.  Most obviously, we
could try to build the network so that servers offered improved service for
other servers, but we would need to do so without weakening anonymity and
making it obvious which connections originate from users running servers.  We
have some preliminary designs~\cite{incentives-txt,tor-challenges},
but need to perform
some more research to make sure they would be safe and effective.\plan{Write
  a draft paper; 2 person-months.}

\subsection{Portability}
Our {\bf Windows implementation}, though much improved, continues to lag
behind Unix and Mac OS X, especially when running as a server.  We hope to
merge promising patches from Mike Chiussi to address this point, and bring
Windows performance on par with other platforms.\plan{Do in 2007; 1.5 months
  to integrate not counting Mike's work.}

We should have {\bf better support for portable devices}, including modes of
operation that require less RAM, and that write to disk less frequently (to
avoid wearing out flash RAM).\plan{Optional; 2 weeks.}

We should {\bf stop using socketpair on Windows}; instead, we can use
in-memory structures to communicate between cpuworkers and the main thread,
and between connections.\plan{Optional; 1 week.}

\subsection{Performance: resource usage}
We've been working on {\bf using less RAM}, especially on servers.  This has
paid off a lot for directory caches in the 0.1.2, which in some cases are
using 90\% less memory than they used to require.  But we can do better,
especially in the area around our buffer management algorithms, by using an
approach more like the BSD and Linux kernels use instead of our current ring
buffer approach.  (For OR connections, we can just use queues of cell-sized
chunks produced with a specialized allocator.)  This could potentially save
around 25 to 50\% of the memory currently allocated for network buffers, and
make Tor a more attractive proposition for restricted-memory environments
like old computers, mobile devices, and the like.\plan{Do in 2007; 2-3 weeks
  plus one week measurement.}

We should improve our {\bf bandwidth limiting}.  The current system has been
crucial in making users willing to run servers: nobody is willing to run a
server if it might use an unbounded amount of bandwidth, especially if they
are charged for their usage.  We can make our system better by letting users
configure bandwidth limits independently for their own traffic and traffic
relayed for others; and by adding write limits for users running directory
servers.\plan{Do in 2006; 2-3 weeks.}

On many hosts, sockets are still in short supply, and will be until we can
migrate our protocol to UDP.  We can {\bf use fewer sockets} by making our
self-to-self connections happen internally to the code rather than involving
the operating system's socket implementation.\plan{Optional; 1 week.}

\subsection{Performance: network usage}
We know too little about how well our current path
selection algorithms actually spread traffic around the network in practice.
We should {\bf research the efficacy of our traffic allocation} and either
assure ourselves that it is close enough to optimal as to need no improvement
(unlikely) or {\bf identify ways to improve network usage}, and get more
users' traffic delivered faster.  Performing this research will require
careful thought about anonymity implications.

We should also {\bf examine the efficacy of our congestion control
  algorithm}, and see whether we can improve client performance in the
presence of a congested network through dynamic `sendme' window sizes or
other means.  This will have anonymity implications too if we aren't careful.

\plan{For both of the above: research, design and write
  a measurement tool in 2007: 1 month.  See if we can interest a graduate
  student.}

We should work on making Tor's cell-based protocol  perform better on
networks with low bandwidth
and high packet loss.\plan{Do in 2007 if we're funded to do it; 4-6 weeks.}

\subsection{Performance scenario: one Tor client, many users}
We should {\bf improve Tor's performance when a single Tor handles many
  clients}.  Many organizations want to manage a single Tor client on their
firewall for many users, rather than having each user install a separate
Tor client.  We haven't optimized for this scenario, and it is likely that
there are some code paths in the current implementation that become
inefficient when a single Tor is servicing hundreds or thousands of client
connections.  (Additionally, it is likely that such clients have interesting
anonymity requirements the we should investigate.)  We should profile Tor
under appropriate loads, identify bottlenecks, and fix them.\plan{Do in 2007
  if we're funded to do it; 4-8 weeks.}

\subsection{Tor servers on asymmetric bandwidth}

Tor should work better on servers that have asymmetric connections like cable
or DSL.  Because Tor has separate TCP connections between each
hop, if the incoming bytes are arriving just fine and the outgoing bytes are
all getting dropped on the floor, the TCP push-back mechanisms don't really
transmit this information back to the incoming streams.\plan{Do in 2007 since
  related to bandwidth limiting.  3-4 weeks.}

\subsection{Running Tor as both client and server}

Many performance tradeoffs and balances that might need more attention.
We first need to track and fix whatever bottlenecks emerge; but we also
need to invent good algorithms for prioritizing the client's traffic
without starving the server's traffic too much.\plan{No idea; try
profiling and improving things in 2007.}

\subsection{Protocol redesign for UDP}
Tor has relayed only TCP traffic since its first versions, and has used
TLS-over-TCP to do so.  This approach has proved reliable and flexible, but
in the long term we will need to allow UDP traffic on the network, and switch
some or all of the network to using a UDP transport.  {\bf Supporting UDP
  traffic} will make Tor more suitable for protocols that require UDP, such
as many VOIP protocols.  {\bf Using a UDP transport} could greatly reduce
resource limitations on servers, and make the network far less interruptible
by lossy connections.  Either of these protocol changes would require a great
deal of design work, however.  We hope to be able to enlist the aid of a few
talented graduate students to assist with the initial design and
specification, but the actual implementation will require significant testing
of different reliable transport approaches.\plan{Maybe do a design in 2007 if
we find an interested academic.  Ian or Ben L might be good partners here.}

\section{Blocking resistance}

\subsection{Design for blocking resistance}
We have written a design document explaining our general approach to blocking
resistance.  We should workshop it with other experts in the field to get
their ideas about how we can improve Tor's efficacy as an anti-censorship
tool.

\subsection{Implementation: client-side and bridges-side}

Our anticensorship design calls for some nodes to act as ``bridges''
that are outside a national firewall, and others inside the firewall to
act as pure clients.  This part of the design is quite clear-cut; we're
probably ready to begin implementing it.  To {\bf implement bridges}, we
need to have servers publish themselves as limited-availability relays
to a special bridge authority if they judge they'd make good servers.
We will also need to help provide documentation for port forwarding,
and an easy configuration tool for running as a bridge.

To {\bf implement clients}, we need to provide a flexible interface to
learn about bridges and to act on knowledge of bridges. We also need
to teach them how to know to use bridges as their first hop, and how to
fetch directory information from both classes of directory authority.

Clients also need to {\bf use the encrypted directory variant} added in Tor
0.1.2.3-alpha.  This will let them retrieve directory information over Tor
once they've got their initial bridges. We may want to get the rest of the
Tor user base to begin using this encrypted directory variant too, to
provide cover.

Bridges will want to be able to {\bf listen on multiple addresses and ports}
if they can, to give the adversary more ports to block.

\subsection{Research: anonymity implications from becoming a bridge}

\subsection{Implementation: bridge authority}

The design here is also reasonably clear-cut: we need to run some
directory authorities with a slightly modified protocol that doesn't leak
the entire list of bridges. Thus users can learn up-to-date information
for bridges they already know about, but they can't learn about arbitrary
new bridges.

\subsection{Normalizing the Tor protocol on the wire}
Additionally, we should {\bf resist content-based filters}.  Though an
adversary can't see what users are saying, some aspects of our protocol are
easy to fingerprint {\em as} Tor.  We should correct this where possible.

Look like Firefox; or look like nothing?
Future research: investigate timing similarities with other protocols.

\subsection{Access control for bridges}
Design/impl: password-protecting bridges, in light of above.
And/or more general access control.

\subsection{Research: scanning-resistance}

\subsection{Research/Design/Impl: how users discover bridges}
Our design anticipates an arms race between discovery methods and censors.
We need to begin the infrastructure on our side quickly, preferably in a
flexible language like Python, so we can adapt quickly to censorship.

phase one: personal bridges
phase two: families of personal bridges
phase three: more structured social network
phase four: bag of tricks
Research: phase five...

Integration with Psiphon, etc?

\subsection{Document best practices for users}
Document best practices for various activities common among
blocked users (e.g. WordPress use).

\subsection{Research: how to know if a bridge has been blocked?}

\subsection{GeoIP maintenance, and "private" user statistics}
How to know if the whole idea is working?

\subsection{Research: hiding whether the user is reading or publishing?}

\subsection{Research: how many bridges do you need to know to maintain
reachability?}

\subsection{Resisting censorship of the Tor website, docs, and mirrors}

We should take some effort to consider {\bf initial distribution of Tor and
  related information} in countries where the Tor website and mirrors are
censored.  (Right now, most countries that block access to Tor block only the
main website and leave mirrors and the network itself untouched.)  Falling
back on word-of-mouth is always a good last resort, but we should also take
steps to make sure it's relatively easy for users to get ahold of a copy.

\section{Security}

\subsection{Security research projects}

We should investigate approaches with some promise to help Tor resist
end-to-end traffic correlation attacks.  It's an open research question
whether (and to what extent) {\bf mixed-latency} networks, {\bf low-volume
  long-distance padding}, or other approaches can resist these attacks, which
are currently some of the most effective against careful Tor users.  We
should research these questions and perform simulations to identify
opportunities for strengthening our design without dropping performance to
unacceptable levels. %Cite something
\plan{Start doing this in 2007; write a paper.  8-16 weeks.}

We've got some preliminary results suggesting that {\bf a topology-aware
  routing algorithm}~\cite{feamster:wpes2004} could reduce Tor users'
vulnerability against local or ISP-level adversaries, by ensuring that they
are never in a position to watch both ends of a connection.  We need to
examine the effects of this approach in more detail and consider side-effects
on anonymity against other kinds of adversaries.  If the approach still looks
promising, we should investigate ways for clients to implement it (or an
approximation of it) without having to download routing tables for the whole
Internet. \plan{Not in 2007 unless a graduate student wants to do it.}

%\tmp{defenses against end-to-end correlation}  We don't expect any to work
%right now, but it would be useful to learn that one did.  Alternatively,
%proving that one didn't would free up researchers in the field to go work on
%other things.
%
% See above; I think I got this.

We should research the efficacy of {\bf website fingerprinting} attacks,
wherein an adversary tries to match the distinctive traffic and timing
pattern of the resources constituting a given website to the traffic pattern
of a user's client.  These attacks work great in simulations, but in
practice we hear they don't work nearly as well.  We should get some actual
numbers to investigate the issue, and figure out what's going on.  If we
resist these attacks, or can improve our design to resist them, we should.
% add cites
\plan{Possibly part of end-to-end correlation paper.  Otherwise, not in 2007
  unless a graduate student is interested.}

\subsection{Implementation security}
Right now, each Tor node stores its keys unencrypted.  We should {\bf encrypt
  more Tor keys} so that Tor authorities can require a startup password.  We
should look into adding intermediary medium-term ``signing keys'' between
identity keys and onion keys, so that a password could be required to replace
a signing key, but not to start Tor.  This would improve Tor's long-term
security, especially in its directory authority infrastructure.\plan{Design this
  as a part of the revised ``v2.1'' directory protocol; implement it in
  2007. 3-4 weeks.}

We should also {\bf mark RAM that holds key material as non-swappable} so
that there is no risk of recovering key material from a hard disk
compromise.  This would require submitting patches upstream to OpenSSL, where
support for marking memory as sensitive is currently in a very preliminary
state.\plan{Nice to do, but not in immediate Tor scope.}

There are numerous tools for identifying trouble spots in code (such as
Coverity or even VS2005's code analysis tool) and we should convince somebody
to run some of them against the Tor codebase.  Ideally, we could figure out a
way to get our code checked periodically rather than just once.\plan{Almost
  no time once we talk somebody into it.}

We should try {\bf protocol fuzzing} to identify errors in our
implementation.\plan{Not in 2007 unless we find a grad student or
  undergraduate who wants to try.}

Our guard nodes help prevent an attacker from being able to become a chosen
client's entry point by having each client choose a few favorite entry points
as ``guards'' and stick to them.   We should implement a {\bf directory
  guards} feature to keep adversaries from enumerating Tor users by acting as
a directory cache.\plan{Do in 2007; 2 weeks.}

\subsection{Detect corrupt exits and other servers}
With the success of our network, we've attracted servers in many locations,
operated by many kinds of people.  Unfortunately, some of these locations
have compromised or defective networks, and some of these people are
untrustworthy or incompetent.  Our current design relies on authority
administrators to identify bad nodes and mark them as nonfunctioning.  We
should {\bf automate the process of identifying malfunctioning nodes} as
follows:

We should create a generic {\bf feedback mechanism for add-on tools} like
Mike Perry's ``Snakes on a Tor'' to report failing nodes to authorities.
\plan{Do in 2006; 1-2 weeks.}

We should write tools to {\bf detect more kinds of innocent node failure},
such as nodes whose network providers intercept SSL, nodes whose network
providers censor popular websites, and so on.  We should also try to detect
{\bf routers that snoop traffic}; we could do this by launching connections
to throwaway accounts, and seeing which accounts get used.\plan{Do in 2007;
  ask Mike Perry if he's interested.  4-6 weeks.}

We should add {\bf an efficient way for authorities to mark a set of servers
  as probably collaborating} though not necessarily otherwise dishonest.
This happens when an administrator starts multiple routers, but doesn't mark
them as belonging to the same family.\plan{Do during v2.1 directory protocol
  redesign; 1-2 weeks to implement.}

To avoid attacks where an adversary claims good performance in order to
attract traffic, we should {\bf have authorities measure node performance}
(including stability and bandwidth) themselves, and not simply believe what
they're told.  Measuring stability can be done by tracking MTBF.  Measuring
bandwidth can be tricky, since it's hard to distinguish between a server with
low capacity, and a high-capacity server with most of its capacity in
use.\plan{Do ``Stable'' in 2007; 2-3 weeks.  ``Fast'' will be harder; do it
  if we can interest a grad student.}

{\bf Operating a directory authority should be easier.}  We rely on authority
operators to keep the network running well, but right now their job involves
too much busywork and administrative overhead.  A better interface for them
to use could free their time to work on exception cases rather than on
adding named nodes to the network.\plan{Do in 2007; 4-5 weeks.}

\subsection{Protocol security}

In addition to other protocol changes discussed above,
% And should we move some of them down here? -NM
we should add {\bf hooks for denial-of-service resistance}; we have some
preliminary designs, but we shouldn't postpone them until we really need them.
If somebody tries a DDoS attack against the Tor network, we won't want to
wait for all the servers and clients to upgrade to a new
version.\plan{Research project; do this in 2007 if funded.}

\section{Development infrastructure}

\subsection{Build farm}
We've begun to deploy a cross-platform distributed build farm of hosts
that build and test the Tor source every time it changes in our development
repository.

We need to {\bf get more participants}, so that we can test a larger variety
of platforms.  (Previously, we've only found out when our code had broken on
obscure platforms when somebody got around to building it.)

We need also to {\bf add our dependencies} to the build farm, so that we can
ensure that libraries we need (especially libevent) do not stop working on
any important platform between one release and the next.

\plan{This is ongoing as more buildbots arrive.}

\subsection{Improved testing harness}
Currently, our {\bf unit tests} cover only about 20\% of the code base.  This
is uncomfortably low; we should write more and switch to a more flexible
testing framework.\plan{Ongoing basis, time permitting.}

We should also write flexible {\bf automated single-host deployment tests} so
we can more easily verify that the current codebase works with the
network.\plan{Worthwhile in 2007; would save lots of time.  2-4 weeks.}

We should build automated {\bf stress testing} frameworks so we can see which
realistic loads cause Tor to perform badly, and regularly profile Tor against
these loads.  This would give us {\it in vitro} performance values to
supplement our deployment experience.\plan{Worthwhile in 2007; 2-6 weeks.}

We should improve our memory profiling code.\plan{...}


\subsection{Centralized build system}
We currently rely on a separate packager to maintain the packaging system and
to build Tor on each platform for which we distribute binaries.  Separate
package maintainers is sensible, but separate package builders has meant
long turnaround times between source releases and package releases.  We
should create the necessary infrastructure for us to produce binaries for all
major packages within an hour or so of source release.\plan{We should
  brainstorm this at least in 2007.}

\subsection{Improved metrics}
We need a way to {\bf measure the network's health, capacity, and degree of
  utilization}.  Our current means for doing this are ad hoc and not
completely accurate

We need better ways to {\bf tell which countries are users are coming from,
  and how many there are}.  A good perspective of the network helps us
allocate resources and identify trouble spots, but our current approaches
will work less and less well as we make it harder for adversaries to
enumerate users.  We'll probably want to shift to a smarter, statistical
approach rather than our current ``count and extrapolate'' method.

\plan{All of this in 2007 if funded; 4-8 weeks}

% \tmp{We'd like to know how much of the network is getting used.}
% I think this is covered above -NM

\subsection{Controller library}
We've done lots of design and development on our controller interface, which
allows UI applications and other tools to interact with Tor.  We could
encourage the development of more such tools by releasing a {\bf
  general-purpose controller library}, ideally with API support for several
popular programming languages.\plan{2006 or 2007; 1-2 weeks.}

\section{User experience}

\subsection{Get blocked less, get blocked less broadly}
Right now, some services block connections from the Tor network because
they don't have a better
way to keep vandals from abusing them than blocking IP addresses associated
with vandalism.  Our approach so far has been to educate them about better
solutions that currently exist, but we should also {\bf create better
solutions for limiting vandalism by anonymous users} like credential and
blind-signature based implementations, and encourage their use. Other
promising starting points including writing a patch and explanation for
Wikipedia, and helping Freenode to document, maintain, and expand its
current Tor-friendly position.\plan{Do a writeup here in 2007; 1-2 weeks.}

Those who do block Tor users also block overbroadly, sometimes blacklisting
operators of Tor servers that do not permit exit to their services.  We could
obviate innocent reasons for doing so by designing a {\bf narrowly-targeted Tor
  RBL service} so that those who wanted to overblock Tor could no longer
plead incompetence.\plan{Possibly in 2007 if we decide it's a good idea; 3
  weeks.}

\subsection{All-in-one bundle}
We need a well-tested, well-documented bundle of Tor and supporting
applications configured to use it correctly.  We have an initial
implementation well under way, but it will need additional work in
identifying requisite Firefox extensions, identifying security threats,
improving user experience, and so on.  This will need significantly more work
before it's ready for a general public release.

\subsection{LiveCD Tor}
We need a nice bootable livecd containing a minimal OS and a few applications
configured to use it correctly.  The Anonym.OS project demonstrated that this
is quite feasible, but their project is not currently maintained.

\subsection{A Tor client in a VM}
\tmp{a.k.a JanusVM} which is quite related to the firewall-level deployment
section below. JanusVM is a Linux kernel running in VMWare. It gets an IP
address from the network, and serves as a DHCP server for its host Windows
machine. It intercepts all outgoing traffic and redirects it into Privoxy,
Tor, etc. This Linux-in-Windows approach may help us with scalability in
the short term, and it may also be a good long-term solution rather than
accepting all security risks in Windows.

%\subsection{Interface improvements}
%\tmp{Allow controllers to manipulate server status.}
% (Why is this in the User Experience section?) -RD
% I think it's better left to a generic ``make controller iface better'' item.

\subsection{Firewall-level deployment}
Another useful deployment mode for some users is using {\bf Tor in a firewall
  configuration}, and directing all their traffic through Tor.  This can be a
little tricky to set up currently, but it's an effective way to make sure no
traffic leaves the host un-anonymized.  To achieve this, we need to {\bf
  improve and port our new TransPort} feature which allows Tor to be used
without SOCKS support; to {\bf add an anonymizing DNS proxy} feature to Tor;
and to {\bf construct a recommended set of firewall configurations} to redirect
traffic to Tor.

This is an area where {\bf deployment via a livecd}, or an installation
targeted at specialized home routing hardware, could be useful.

\subsection{Assess software and configurations for anonymity risks}
Right now, users and packagers are more or less on their own when selecting
Firefox extensions.  We should {\bf assemble a recommended list of browser
  extensions} through experiment, and include this in the application bundles
we distribute.

We should also describe {\bf best practices for using Tor with each class of
  application}. For example, Ethan Zuckerman has written a detailed
tutorial on how to use Tor, Firefox, GMail, and Wordpress to blog with
improved safety. There are many other cases on the Internet where anonymity
would be helpful, and there are a lot of ways to screw up using Tor.

The Foxtor and Torbutton extensions serve similar purposes; we should pick a
favorite, and merge in the useful features of the other.

%\tmp{clean up our own bundled software:
%E.g. Merge the good features of Foxtor into Torbutton}
%
% What else did you have in mind? -NM

\subsection{Localization}
Right now, most of our user-facing code is internationalized.  We need to
internationalize the last few hold-outs (like the Tor expert installer), and get
more translations for the parts that are already internationalized.

Also, we should look into a {\bf unified translator's solution}.  Currently,
since different tools have been internationalized using the
framework-appropriate method, different tools require translators to localize
them via different interfaces.  Inasmuch as possible, we should make
translators only need to use a single tool to translate the whole Tor suite.

\section{Support}

It would be nice to set up some {\bf user support infrastructure} and
{\bf contributor support infrastructure}, especially focusing on server
operators and on coordinating volunteers.

This includes intuitive and easy ticket systems for bug reports and
feature suggestions (not just mailing lists with a half dozen people
and no clear roles for who answers what), but it also includes a more
personalized and efficient framework for interaction so we keep the
attention and interest of the contributors, and so we make them feel
helpful and wanted.

\section{Documentation}

\subsection{Unified documentation scheme}

We need to {\bf inventory our documentation.}  Our documentation so far has
been mostly produced on an {\it ad hoc} basis, in response to particular
needs and requests.  We should figure out what documentation we have, which of
it (if any) should get priority, and whether we can't put it all into a
single format.

We could {\bf unify the docs} into a single book-like thing.  This will also
help us identify what sections of the ``book'' are missing.

\subsection{Missing technical documentation}

We should {\bf revise our design paper} to reflect the new decisions and
research we've made since it was published in 2004.  This will help other
researchers evaluate and suggest improvements to Tor's current design.

Other projects sometimes implement the client side of our protocol.  We
encourage this, but we should write {\bf a document about how to avoid
excessive resource use}, so we don't need to worry that they will do so
without regard to the effect of their choices on server resources.

\subsection{Missing user documentation}

Our documentation falls into two broad categories: some is `discoursive' and
explains in detail why users should take certain actions, and other
documentation is `comprehensive' and describes all of Tor's features.  Right
now, we have no document that is both deep, readable, and thorough.  We
should correct this by identifying missing spots in our design.

\bibliographystyle{plain} \bibliography{tor-design}

\end{document}