start a roadmap for 2008 and beyond. based on 2007 roadmap as a starting point.
svn:r13083
author: Roger Dingledine <arma@torproject.org> 2008-01-09 14:21:00 +0000
committer: Roger Dingledine <arma@torproject.org> 2008-01-09 14:21:00 +0000
commit: 958c524a2b9b7f8d3bb79ab46e452800899d1c50 (patch)
tree: 04ca4e2c270478e391b8d247345679b7eaf0b061 /doc/design-paper
parent: 3618b7eac30bfe85b17c2795ae249fd6c2347905 (diff)
1 files changed, 690 insertions, 0 deletions
diff --git a/doc/design-paper/roadmap-future.tex b/doc/design-paper/roadmap-future.tex
new file mode 100644
index 000000000..cebe4a590
--- /dev/null
+++ b/doc/design-paper/roadmap-future.tex
@@ -0,0 +1,690 @@
+\documentclass{article}
+
+\usepackage{url}
+
+\newenvironment{tightlist}{\begin{list}{$\bullet$}{
+  \setlength{\itemsep}{0mm}
+    \setlength{\parsep}{0mm}
+    %  \setlength{\labelsep}{0mm}
+    %  \setlength{\labelwidth}{0mm}
+    %  \setlength{\topsep}{0mm}
+    }}{\end{list}}
+\newcommand{\tmp}[1]{{\bf #1} [......] \\}
+\newcommand{\plan}[1]{ {\bf (#1)}}
+
+\begin{document}
+
+\title{Tor Development Roadmap: Wishlist for Nov 2006--Dec 2007}
+\author{Roger Dingledine \and Nick Mathewson \and Shava Nerad}
+
+\maketitle
+\pagestyle{plain}
+
+% TO DO:
+%   add cites
+%   add time estimates
+
+
+\section{Introduction}
+%Hi, Roger!  Hi, Shava.  This paragraph should get deleted soon.  Right now,
+%this document goes into about as much detail as I'd like to go into for a
+%technical audience, since that's the audience I know best.  It doesn't have
+%time estimates everywhere.  It isn't well prioritized, and it doesn't
+%distinguish well between things that need lots of research and things that
+%don't.  The breakdowns don't all make sense.  There are lots of things where
+%I don't make it clear how they fit into larger goals, and lots of larger
+%goals that don't break down into little things. It isn't all stuff we can do
+%for sure, and it isn't even all stuff we can do for sure in 2007.  The
+%tmp\{\} macro indicates stuff I haven't said enough about.  That said, here
+%goes...
+
+Tor (the software) and Tor (the overall software/network/support/document
+suite) are now experiencing all the crises of success.  Over the next year,
+we're probably going to grow more in terms of users, developers, and funding
+than before.  This gives us the opportunity to perform long-neglected
+maintenance tasks.
+
+\section{Code and design infrastructure}
+
+\subsection{Protocol revision}
+To maintain backward compatibility, we've postponed major protocol
+changes and redesigns for a long time.  Because of this, there are a number
+of sensible revisions we've been putting off until we could deploy several of
+them at once.  To do each of these, we first need to discuss design
+alternatives with other cryptographers and outside collaborators to
+make sure that our choices are secure.
+
+First of all, our protocol needs better {\bf versioning support} so that we
+can make backward-incompatible changes to our core protocol.  There are
+difficult anonymity issues here, since many naive designs would make it easy
+to tell clients apart (and then track them) based on their supported versions.
+
+With protocol versioning support would come the ability to {\bf future-proof
+  our ciphersuites}.  For example, not only our OR protocol, but also our
+directory protocol, is pretty firmly tied to the SHA-1 hash function, which,
+though not yet known to be insecure for our purposes, has begun to show
+its age.  We should remove assumptions throughout our design that public
+keys, secret keys, or digests will remain any particular size indefinitely.
+
+Our OR {\bf authentication protocol}, though provably
+secure\cite{tap:pet2006}, relies more on particular aspects of RSA and our
+implementation thereof than we had initially believed.  To future-proof
+against changes, we should replace it with a less delicate approach.
+
+\plan{For all the above: 2 person-months to specify, spread over several
+  months with time for interaction with external participants.  One
+  person-month to implement.  Start specifying in early 2007.}
+
+We might design a {\bf stream migration} feature so that streams tunneled
+over Tor could be more resilient to dropped connections and changed IPs.
+\plan{Not in 2007.}
+
+A new protocol could support {\bf multiple cell sizes}.  Right now, all data
+passes through the Tor network divided into 512-byte cells.  This is
+efficient for high-bandwidth protocols, but inefficient for protocols
+like SSH or AIM that send information in small chunks.  Of course, we need to
+investigate the extent to which multiple sizes could make it easier for an
+adversary to fingerprint a traffic pattern. \plan{Not in 2007.}
+
+As a part of our design, we should investigate possible {\bf cipher modes}
+other than counter mode.  For example, a mode with built-in integrity
+checking, error propagation, and random access could simplify our protocol
+significantly.  Sadly, many of these are patented and unavailable for us.
+\plan{Not in 2007.}
+
+\subsection{Scalability}
+
+\subsubsection{Improved directory efficiency}
+Right now, clients download a statement of the {\bf network status} made by
+each directory authority.  We could reduce network bandwidth significantly by
+having the authorities jointly sign a statement reflecting their vote on the
+current network status.  This would save clients up to 160K per hour, and
+make their view of the network more uniform.  Of course, we'd need to make
+sure the voting process was secure and resilient to failures in the
+network.\plan{Must do; specify in 2006. 2 weeks to specify, 3-4 weeks to
+  implement.}
+
+We should {\bf shorten router descriptors}, since the current format includes
+a great deal of information that's only of interest to the directory
+authorities, and not of interest to clients.  We can do this by having each
+router upload a short-form and a long-form signed descriptor, and having
+clients download only the short form.  Even a naive version of this would
+save about 40\% of the bandwidth currently spent by clients downloading
+descriptors.\plan{Must do; specify in 2006. 3-4 weeks.}
+
+We should {\bf have routers upload their descriptors even less often}, so
+that clients do not need to download replacements every 18 hours whether any
+information has changed or not.  (As of Tor 0.1.2.3-alpha, clients tolerate
+routers that don't upload often, but routers still upload at least every 18
+hours to support older clients.) \plan{Must do, but not until 0.1.1.x is
+deprecated in mid 2007. 1 week.}
+
+\subsubsection{Non-clique topology}
+Our current network design achieves a certain amount of its anonymity by
+making clients act like each other through the simple expedient of making
+sure that all clients know all servers, and that any server can talk to any
+other server.  But as the number of servers increases to serve an
+ever-greater number of clients, these assumptions become impractical.
+
+At worst, if these scalability issues become troubling before a solution is
+found, we can design and build a solution to {\bf split the network into
+multiple slices} until a better solution comes along.  This is not ideal,
+since rather than looking like all other users from the point of view of path
+selection, users would ``only'' look like 200,000--300,000 other
+users.\plan{Not unless needed.}
+
+We are in the process of designing {\bf improved schemes for network
+  scalability}.  Some approaches focus on limiting what an adversary can know
+about what a user knows; others focus on reducing the extent to which an
+adversary can exploit this knowledge.  These are currently in their infancy,
+and will probably not be needed in 2007, but they must be designed in 2007 if
+they are to be deployed in 2008.\plan{Design in 2007; unknown difficulty.
+  Write a paper.}
+
+\subsubsection{Relay incentives}
+To support more users on the network, we need to get more servers.  So far,
+we've relied on volunteerism to attract server operators, and so far it's
+served us well.  But in the long run, we need to {\bf design incentives for
+  users to run servers} and relay traffic for others.  Most obviously, we
+could try to build the network so that servers offered improved service for
+other servers, but we would need to do so without weakening anonymity or
+making it obvious which connections originate from users running servers.  We
+have some preliminary designs~\cite{incentives-txt,tor-challenges},
+but need to perform
+some more research to make sure they would be safe and effective.\plan{Write
+  a draft paper; 2 person-months.}
+
+\subsection{Portability}
+Our {\bf Windows implementation}, though much improved, continues to lag
+behind Unix and Mac OS X, especially when running as a server.  We hope to
+merge promising patches from Mike Chiussi to address this point, and bring
+Windows performance on par with other platforms.\plan{Do in 2007; 1.5 months
+  to integrate not counting Mike's work.}
+
+We should have {\bf better support for portable devices}, including modes of
+operation that require less RAM, and that write to disk less frequently (to
+avoid wearing out flash RAM).\plan{Optional; 2 weeks.}
+
+We should {\bf stop using socketpair on Windows}; instead, we can use
+in-memory structures to communicate between cpuworkers and the main thread,
+and between connections.\plan{Optional; 1 week.}
+
+\subsection{Performance: resource usage}
+We've been working on {\bf using less RAM}, especially on servers.  This has
+paid off a lot for directory caches in the 0.1.2.x series, which in some
+cases are using 90\% less memory than they used to require.  But we can do
+better, especially in the area around our buffer management algorithms, by
+using an approach more like the one the BSD and Linux kernels use instead of
+our current ring buffer approach.  (For OR connections, we can just use
+queues of cell-sized
+chunks produced with a specialized allocator.)  This could potentially save
+around 25 to 50\% of the memory currently allocated for network buffers, and
+make Tor a more attractive proposition for restricted-memory environments
+like old computers, mobile devices, and the like.\plan{Do in 2007; 2-3 weeks
+  plus one week measurement.}
+
+We should improve our {\bf bandwidth limiting}.  The current system has been
+crucial in making users willing to run servers: nobody is willing to run a
+server if it might use an unbounded amount of bandwidth, especially if they
+are charged for their usage.  We can make our system better by letting users
+configure bandwidth limits independently for their own traffic and traffic
+relayed for others; and by adding write limits for users running directory
+servers.\plan{Do in 2006; 2-3 weeks.}
+
+On many hosts, sockets are still in short supply, and will be until we can
+migrate our protocol to UDP.  We can {\bf use fewer sockets} by making our
+self-to-self connections happen internally to the code rather than involving
+the operating system's socket implementation.\plan{Optional; 1 week.}
+
+\subsection{Performance: network usage}
+We know too little about how well our current path
+selection algorithms actually spread traffic around the network in practice.
+We should {\bf research the efficacy of our traffic allocation} and either
+assure ourselves that it is close enough to optimal to need no improvement
+(unlikely) or {\bf identify ways to improve network usage}, and get more
+users' traffic delivered faster.  Performing this research will require
+careful thought about anonymity implications.
+
+We should also {\bf examine the efficacy of our congestion control
+  algorithm}, and see whether we can improve client performance in the
+presence of a congested network through dynamic `sendme' window sizes or
+other means.  This will have anonymity implications too if we aren't careful.
+
+\plan{For both of the above: research, design and write
+  a measurement tool in 2007: 1 month.  See if we can interest a graduate
+  student.}
+
+We should work on making Tor's cell-based protocol perform better on
+networks with low bandwidth and high packet loss.\plan{Do in 2007 if we're
+  funded to do it; 4-6 weeks.}
+
+\subsection{Performance scenario: one Tor client, many users}
+We should {\bf improve Tor's performance when a single Tor handles many
+  clients}.  Many organizations want to manage a single Tor client on their
+firewall for many users, rather than having each user install a separate
+Tor client.  We haven't optimized for this scenario, and it is likely that
+there are some code paths in the current implementation that become
+inefficient when a single Tor is servicing hundreds or thousands of client
+connections.  (Additionally, it is likely that such clients have interesting
+anonymity requirements that we should investigate.)  We should profile Tor
+under appropriate loads, identify bottlenecks, and fix them.\plan{Do in 2007
+  if we're funded to do it; 4-8 weeks.}
+
+\subsection{Tor servers on asymmetric bandwidth}
+
+Tor should work better on servers that have asymmetric connections like cable
+or DSL.  Because Tor has separate TCP connections between each
+hop, if the incoming bytes are arriving just fine and the outgoing bytes are
+all getting dropped on the floor, the TCP push-back mechanisms don't really
+transmit this information back to the incoming streams.\plan{Do in 2007 since
+  related to bandwidth limiting.  3-4 weeks.}
+
+\subsection{Running Tor as both client and server}
+
+There are many performance tradeoffs and balances here that might need more
+attention.
+We first need to track and fix whatever bottlenecks emerge; but we also
+need to invent good algorithms for prioritizing the client's traffic
+without starving the server's traffic too much.\plan{No idea; try
+profiling and improving things in 2007.}
+
+\subsection{Protocol redesign for UDP}
+Tor has relayed only TCP traffic since its first versions, and has used
+TLS-over-TCP to do so.  This approach has proved reliable and flexible, but
+in the long term we will need to allow UDP traffic on the network, and switch
+some or all of the network to using a UDP transport.  {\bf Supporting UDP
+  traffic} will make Tor more suitable for protocols that require UDP, such
+as many VOIP protocols.  {\bf Using a UDP transport} could greatly reduce
+resource limitations on servers, and make the network far less interruptible
+by lossy connections.  Either of these protocol changes would require a great
+deal of design work, however.  We hope to be able to enlist the aid of a few
+talented graduate students to assist with the initial design and
+specification, but the actual implementation will require significant testing
+of different reliable transport approaches.\plan{Maybe do a design in 2007 if
+we find an interested academic.  Ian or Ben L might be good partners here.}
+
+\section{Blocking resistance}
+
+\subsection{Design for blocking resistance}
+We have written a design document explaining our general approach to blocking
+resistance.  We should workshop it with other experts in the field to get
+their ideas about how we can improve Tor's efficacy as an anti-censorship
+tool.
+
+\subsection{Implementation: client-side and bridge-side}
+
+Our anticensorship design calls for some nodes to act as ``bridges''
+that are outside a national firewall, and others inside the firewall to
+act as pure clients.  This part of the design is quite clear-cut; we're
+probably ready to begin implementing it.  To {\bf implement bridges}, we
+need to have servers publish themselves as limited-availability relays
+to a special bridge authority if they judge they'd make good servers.
+We will also need to help provide documentation for port forwarding,
+and an easy configuration tool for running as a bridge.
+
+To {\bf implement clients}, we need to provide a flexible interface to
+learn about bridges and to act on knowledge of bridges. We also need
+to teach clients to use bridges as their first hop, and how to
+fetch directory information from both classes of directory authority.
+
+Clients also need to {\bf use the encrypted directory variant} added in Tor
+0.1.2.3-alpha.  This will let them retrieve directory information over Tor
+once they've got their initial bridges. We may want to get the rest of the
+Tor user base to begin using this encrypted directory variant too, to
+provide cover.
+
+Bridges will want to be able to {\bf listen on multiple addresses and ports}
+if they can, to give the adversary more ports to block.
+
+\subsection{Research: anonymity implications from becoming a bridge}
+
+\subsection{Implementation: bridge authority}
+
+The design here is also reasonably clear-cut: we need to run some
+directory authorities with a slightly modified protocol that doesn't leak
+the entire list of bridges. Thus users can learn up-to-date information
+for bridges they already know about, but they can't learn about arbitrary
+new bridges.
+
+\subsection{Normalizing the Tor protocol on the wire}
+Additionally, we should {\bf resist content-based filters}.  Though an
+adversary can't see what users are saying, some aspects of our protocol are
+easy to fingerprint {\em as} Tor.  We should correct this where possible.
+
+Look like Firefox; or look like nothing?
+Future research: investigate timing similarities with other protocols.
+
+\subsection{Access control for bridges}
+Design/impl: password-protecting bridges, in light of above.
+And/or more general access control.
+
+\subsection{Research: scanning-resistance}
+
+\subsection{Research/Design/Impl: how users discover bridges}
+Our design anticipates an arms race between discovery methods and censors.
+We need to begin the infrastructure on our side quickly, preferably in a
+flexible language like Python, so we can adapt quickly to censorship.
+
+\begin{tightlist}
+\item phase one: personal bridges
+\item phase two: families of personal bridges
+\item phase three: more structured social network
+\item phase four: bag of tricks
+\item Research: phase five...
+\end{tightlist}
+
+Integration with Psiphon, etc?
+
+\subsection{Document best practices for users}
+Document best practices for various activities common among
+blocked users (e.g. WordPress use).
+
+\subsection{Research: how to know if a bridge has been blocked?}
+
+\subsection{GeoIP maintenance, and ``private'' user statistics}
+How to know if the whole idea is working?
+
+\subsection{Research: hiding whether the user is reading or publishing?}
+
+\subsection{Research: how many bridges do you need to know to maintain
+reachability?}
+
+\subsection{Resisting censorship of the Tor website, docs, and mirrors}
+
+We should take some effort to consider {\bf initial distribution of Tor and
+  related information} in countries where the Tor website and mirrors are
+censored.  (Right now, most countries that block access to Tor block only the
+main website and leave mirrors and the network itself untouched.)  Falling
+back on word-of-mouth is always a good last resort, but we should also take
+steps to make sure it's relatively easy for users to get ahold of a copy.
+
+\section{Security}
+
+\subsection{Security research projects}
+
+We should investigate approaches with some promise to help Tor resist
+end-to-end traffic correlation attacks.  It's an open research question
+whether (and to what extent) {\bf mixed-latency} networks, {\bf low-volume
+  long-distance padding}, or other approaches can resist these attacks, which
+are currently some of the most effective against careful Tor users.  We
+should research these questions and perform simulations to identify
+opportunities for strengthening our design without dropping performance to
+unacceptable levels. %Cite something
+\plan{Start doing this in 2007; write a paper.  8-16 weeks.}
+
+We've got some preliminary results suggesting that {\bf a topology-aware
+  routing algorithm}~\cite{feamster:wpes2004} could reduce Tor users'
+vulnerability to local or ISP-level adversaries, by ensuring that such
+adversaries are never in a position to watch both ends of a connection.  We
+need to
+examine the effects of this approach in more detail and consider side-effects
+on anonymity against other kinds of adversaries.  If the approach still looks
+promising, we should investigate ways for clients to implement it (or an
+approximation of it) without having to download routing tables for the whole
+Internet. \plan{Not in 2007 unless a graduate student wants to do it.}
+
+%\tmp{defenses against end-to-end correlation}  We don't expect any to work
+%right now, but it would be useful to learn that one did.  Alternatively,
+%proving that one didn't would free up researchers in the field to go work on
+%other things.
+%
+% See above; I think I got this.
+
+We should research the efficacy of {\bf website fingerprinting} attacks,
+wherein an adversary tries to match the distinctive traffic and timing
+pattern of the resources constituting a given website to the traffic pattern
+of a user's client.  These attacks work great in simulations, but in
+practice we hear they don't work nearly as well.  We should get some actual
+numbers to investigate the issue, and figure out what's going on.  If we
+resist these attacks, or can improve our design to resist them, we should.
+% add cites
+\plan{Possibly part of end-to-end correlation paper.  Otherwise, not in 2007
+  unless a graduate student is interested.}
+
+\subsection{Implementation security}
+Right now, each Tor node stores its keys unencrypted.  We should {\bf encrypt
+  more Tor keys} so that Tor authorities can require a startup password.  We
+should look into adding intermediary medium-term ``signing keys'' between
+identity keys and onion keys, so that a password could be required to replace
+a signing key, but not to start Tor.  This would improve Tor's long-term
+security, especially in its directory authority infrastructure.\plan{Design this
+  as a part of the revised ``v2.1'' directory protocol; implement it in
+  2007. 3-4 weeks.}
+
+We should also {\bf mark RAM that holds key material as non-swappable} so
+that there is less risk of key material being recovered from swap space
+after a hard disk compromise.  This would require submitting patches
+upstream to OpenSSL, where
+support for marking memory as sensitive is currently in a very preliminary
+state.\plan{Nice to do, but not in immediate Tor scope.}
+
+There are numerous tools for identifying trouble spots in code (such as
+Coverity or even VS2005's code analysis tool) and we should convince somebody
+to run some of them against the Tor codebase.  Ideally, we could figure out a
+way to get our code checked periodically rather than just once.\plan{Almost
+  no time once we talk somebody into it.}
+
+We should try {\bf protocol fuzzing} to identify errors in our
+implementation.\plan{Not in 2007 unless we find a grad student or
+  undergraduate who wants to try.}
+
+Our guard nodes help prevent an attacker from being able to become a chosen
+client's entry point by having each client choose a few favorite entry points
+as ``guards'' and stick to them.   We should implement a {\bf directory
+  guards} feature to keep adversaries from enumerating Tor users by acting as
+a directory cache.\plan{Do in 2007; 2 weeks.}
+
+\subsection{Detect corrupt exits and other servers}
+With the success of our network, we've attracted servers in many locations,
+operated by many kinds of people.  Unfortunately, some of these locations
+have compromised or defective networks, and some of these people are
+untrustworthy or incompetent.  Our current design relies on authority
+administrators to identify bad nodes and mark them as nonfunctioning.  We
+should {\bf automate the process of identifying malfunctioning nodes} as
+follows:
+
+We should create a generic {\bf feedback mechanism for add-on tools} like
+Mike Perry's ``Snakes on a Tor'' to report failing nodes to authorities.
+\plan{Do in 2006; 1-2 weeks.}
+
+We should write tools to {\bf detect more kinds of innocent node failure},
+such as nodes whose network providers intercept SSL, nodes whose network
+providers censor popular websites, and so on.  We should also try to detect
+{\bf routers that snoop traffic}; we could do this by launching connections
+to throwaway accounts, and seeing which accounts get used.\plan{Do in 2007;
+  ask Mike Perry if he's interested.  4-6 weeks.}
+
+We should add {\bf an efficient way for authorities to mark a set of servers
+  as probably collaborating} though not necessarily otherwise dishonest.
+This happens when an administrator starts multiple routers, but doesn't mark
+them as belonging to the same family.\plan{Do during v2.1 directory protocol
+  redesign; 1-2 weeks to implement.}
+
+To avoid attacks where an adversary claims good performance in order to
+attract traffic, we should {\bf have authorities measure node performance}
+(including stability and bandwidth) themselves, and not simply believe what
+they're told.  Measuring stability can be done by tracking MTBF.  Measuring
+bandwidth can be tricky, since it's hard to distinguish between a server with
+low capacity, and a high-capacity server with most of its capacity in
+use.\plan{Do ``Stable'' in 2007; 2-3 weeks.  ``Fast'' will be harder; do it
+  if we can interest a grad student.}
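+
+As a rough illustration of the stability measurement (the notation here is
+our own assumption, not part of any specification): if an authority has
+observed a router through $n$ runs that each ended in a failure, with
+uptimes $u_1, \ldots, u_n$, then
+\[
+  \mathrm{MTBF} = \frac{1}{n} \sum_{i=1}^{n} u_i ,
+\]
+so a router that ran for 10, 20, and 30 hours before its three observed
+failures would have an MTBF of 20 hours.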
+
+{\bf Operating a directory authority should be easier.}  We rely on authority
+operators to keep the network running well, but right now their job involves
+too much busywork and administrative overhead.  A better interface for them
+to use could free their time to work on exception cases rather than on
+adding named nodes to the network.\plan{Do in 2007; 4-5 weeks.}
+
+\subsection{Protocol security}
+
+In addition to other protocol changes discussed above,
+% And should we move some of them down here? -NM
+we should add {\bf hooks for denial-of-service resistance}; we have some
+preliminary designs, but we shouldn't postpone them until we really need them.
+If somebody tries a DDoS attack against the Tor network, we won't want to
+wait for all the servers and clients to upgrade to a new
+version.\plan{Research project; do this in 2007 if funded.}
+
+\section{Development infrastructure}
+
+\subsection{Build farm}
+We've begun to deploy a cross-platform distributed build farm of hosts
+that build and test the Tor source every time it changes in our development
+repository.
+
+We need to {\bf get more participants}, so that we can test a larger variety
+of platforms.  (Previously, we've only found out that our code had broken on
+obscure platforms when somebody got around to building it there.)
+
+We also need to {\bf add our dependencies} to the build farm, so that we can
+ensure that libraries we need (especially libevent) do not stop working on
+any important platform between one release and the next.
+
+\plan{This is ongoing as more buildbots arrive.}
+
+\subsection{Improved testing harness}
+Currently, our {\bf unit tests} cover only about 20\% of the code base.  This
+is uncomfortably low; we should write more and switch to a more flexible
+testing framework.\plan{Ongoing basis, time permitting.}
+
+We should also write flexible {\bf automated single-host deployment tests} so
+we can more easily verify that the current codebase works with the
+network.\plan{Worthwhile in 2007; would save lots of time.  2-4 weeks.}
+
+We should build automated {\bf stress testing} frameworks so we can see which
+realistic loads cause Tor to perform badly, and regularly profile Tor against
+these loads.  This would give us {\it in vitro} performance values to
+supplement our deployment experience.\plan{Worthwhile in 2007; 2-6 weeks.}
+
+We should improve our memory profiling code.\plan{...}
+
+
+\subsection{Centralized build system}
+We currently rely on a separate packager to maintain the packaging system and
+to build Tor on each platform for which we distribute binaries.  Having
+separate package maintainers is sensible, but having separate package
+builders has meant
+long turnaround times between source releases and package releases.  We
+should create the necessary infrastructure for us to produce binaries for all
+major packages within an hour or so of source release.\plan{We should
+  brainstorm this at least in 2007.}
+
+\subsection{Improved metrics}
+We need a way to {\bf measure the network's health, capacity, and degree of
+  utilization}.  Our current means for doing this are ad hoc and not
+completely accurate.
+
+We need better ways to {\bf tell which countries our users are coming from,
+  and how many there are}.  A good perspective of the network helps us
+allocate resources and identify trouble spots, but our current approaches
+will work less and less well as we make it harder for adversaries to
+enumerate users.  We'll probably want to shift to a smarter, statistical
+approach rather than our current ``count and extrapolate'' method.
+
+\plan{All of this in 2007 if funded; 4-8 weeks.}
+
+% \tmp{We'd like to know how much of the network is getting used.}
+% I think this is covered above -NM
+
+\subsection{Controller library}
+We've done lots of design and development on our controller interface, which
+allows UI applications and other tools to interact with Tor.  We could
+encourage the development of more such tools by releasing a {\bf
+  general-purpose controller library}, ideally with API support for several
+popular programming languages.\plan{2006 or 2007; 1-2 weeks.}
+
+\section{User experience}
+
+\subsection{Get blocked less, get blocked less broadly}
+Right now, some services block connections from the Tor network because
+they don't have a better
+way to keep vandals from abusing them than blocking IP addresses associated
+with vandalism.  Our approach so far has been to educate them about better
+solutions that currently exist, but we should also {\bf create better
+solutions for limiting vandalism by anonymous users} like credential and
+blind-signature based implementations, and encourage their use. Other
+promising starting points include writing a patch and explanation for
+Wikipedia, and helping Freenode to document, maintain, and expand its
+current Tor-friendly position.\plan{Do a writeup here in 2007; 1-2 weeks.}
+
+Those who do block Tor users also block overbroadly, sometimes blacklisting
+operators of Tor servers that do not permit exit to their services.  We could
+obviate innocent reasons for doing so by designing a {\bf narrowly-targeted Tor
+  RBL service} so that those who wanted to overblock Tor could no longer
+plead incompetence.\plan{Possibly in 2007 if we decide it's a good idea; 3
+  weeks.}
+
+\subsection{All-in-one bundle}
+We need a well-tested, well-documented bundle of Tor and supporting
+applications configured to use it correctly.  We have an initial
+implementation well under way, but it will need additional work in
+identifying requisite Firefox extensions, identifying security threats,
+improving user experience, and so on.  This will need significantly more work
+before it's ready for a general public release.
+
+\subsection{LiveCD Tor}
+We need a nice bootable LiveCD containing a minimal OS and a few applications
+configured to use Tor correctly.  The Anonym.OS project demonstrated that this
+is quite feasible, but their project is not currently maintained.
+
+\subsection{A Tor client in a VM}
+\tmp{a.k.a.\ JanusVM} which is quite related to the firewall-level deployment
+section below. JanusVM is a Linux kernel running in VMware. It gets an IP
+address from the network, and serves as a DHCP server for its host Windows
+machine. It intercepts all outgoing traffic and redirects it into Privoxy,
+Tor, etc. This Linux-in-Windows approach may help us with scalability in
+the short term, and it may also be a good long-term solution rather than
+accepting all security risks in Windows.
+
+%\subsection{Interface improvements}
+%\tmp{Allow controllers to manipulate server status.}
+% (Why is this in the User Experience section?) -RD
+% I think it's better left to a generic ``make controller iface better'' item.
+
+\subsection{Firewall-level deployment}
+Another useful deployment mode for some users is using {\bf Tor in a firewall
+  configuration}, and directing all their traffic through Tor.  This can be a
+little tricky to set up currently, but it's an effective way to make sure no
+traffic leaves the host un-anonymized.  To achieve this, we need to {\bf
+  improve and port our new TransPort} feature, which allows Tor to be used
+without SOCKS support; to {\bf add an anonymizing DNS proxy} feature to Tor;
+and to {\bf construct a recommended set of firewall configurations} to redirect
+traffic to Tor.
+
+This is an area where {\bf deployment via a livecd}, or an installation
+targeted at specialized home routing hardware, could be useful.
+
+\subsection{Assess software and configurations for anonymity risks}
+Right now, users and packagers are more or less on their own when selecting
+Firefox extensions.  We should {\bf assemble a recommended list of browser
+  extensions} through experiment, and include this in the application bundles
+we distribute.
+
+We should also describe {\bf best practices for using Tor with each class of
+  application}. For example, Ethan Zuckerman has written a detailed
+tutorial on how to use Tor, Firefox, GMail, and Wordpress to blog with
+improved safety. There are many other cases on the Internet where anonymity
+would be helpful, and there are a lot of ways to screw up using Tor.
+
+The Foxtor and Torbutton extensions serve similar purposes; we should pick a
+favorite, and merge in the useful features of the other.
+
+%\tmp{clean up our own bundled software:
+%E.g. Merge the good features of Foxtor into Torbutton}
+%
+% What else did you have in mind? -NM
+
+\subsection{Localization}
+Right now, most of our user-facing code is internationalized.  We need to
+internationalize the last few hold-outs (like the Tor expert installer), and get
+more translations for the parts that are already internationalized.
+
+Also, we should look into a {\bf unified translator's solution}.  Currently,
+since different tools have been internationalized using the
+framework-appropriate method, different tools require translators to localize
+them via different interfaces.  As far as possible, we should make
+translators only need to use a single tool to translate the whole Tor suite.
+
+\section{Support}
+
+It would be nice to set up some {\bf user support infrastructure} and
+{\bf contributor support infrastructure}, especially focusing on server
+operators and on coordinating volunteers.
+
+This includes intuitive and easy ticket systems for bug reports and
+feature suggestions (not just mailing lists with a half dozen people
+and no clear roles for who answers what), but it also includes a more
+personalized and efficient framework for interaction so we keep the
+attention and interest of the contributors, and so we make them feel
+helpful and wanted.
+
+\section{Documentation}
+
+\subsection{Unified documentation scheme}
+
+We need to {\bf inventory our documentation.}  Our documentation so far has
+been mostly produced on an {\it ad hoc} basis, in response to particular
+needs and requests.  We should figure out what documentation we have, which of
+it (if any) should get priority, and whether we can put it all into a
+single format.
+
+We could {\bf unify the docs} into a single book-like thing.  This will also
+help us identify what sections of the ``book'' are missing.
+
+\subsection{Missing technical documentation}
+
+We should {\bf revise our design paper} to reflect the new decisions and
+research we've made since it was published in 2004.  This will help other
+researchers evaluate and suggest improvements to Tor's current design.
+
+Other projects sometimes implement the client side of our protocol.  We
+encourage this, but we should write {\bf a document about how to avoid
+excessive resource use}, so we don't need to worry that they will do so
+without regard to the effect of their choices on server resources.
+
+\subsection{Missing user documentation}
+
+Our documentation falls into two broad categories: some is `discursive' and
+explains in detail why users should take certain actions, and other
+documentation is `comprehensive' and describes all of Tor's features.  Right
+now, we have no document that is at once deep, readable, and thorough.  We
+should correct this by identifying missing spots in our design.
+
+\bibliographystyle{plain} \bibliography{tor-design}
+
+\end{document}
+