2 files changed, 46 insertions, 47 deletions
diff --git a/doc/TODO b/doc/TODO
index 9aaabf7bc..5be3f3ee2 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -5,6 +5,7 @@ rename ACI to CircID
 rotate tls-level connections -- make new ones, expire old ones.
 dirserver shouldn't put you in running-routers list if you haven't
   uploading a descriptor recently
+look at having smallcells and largecells
 
 Legend:
 SPEC!!  - Not specified
diff --git a/doc/tor-design.tex b/doc/tor-design.tex
index 0e75b1bac..466a28a50 100644
--- a/doc/tor-design.tex
+++ b/doc/tor-design.tex
@@ -52,7 +52,7 @@
 \begin{abstract}
 We present Tor, a circuit-based low-latency anonymous communication
 system. Tor is the successor to Onion Routing
-and addresses many limitations in the original Onion Routing design.
+and addresses various limitations in the original Onion Routing design.
 Tor works in a real-world Internet environment, requires no special
 privileges such as root- or kernel-level access,
 requires little synchronization or coordination between nodes, and
@@ -388,7 +388,8 @@ they avoid the well-known inefficiencies of tunneling TCP over TCP
 
 Distributed-trust anonymizing systems need to prevent attackers from
 adding too many servers and thus compromising too many user paths.
-Tor relies on a centrally maintained set of well-known servers. Tarzan
+Tor relies on a small set of well-known servers to make
+decisions about which nodes can join. Tarzan
 and MorphMix allow unknown users to run servers, and limit an attacker
 from becoming too much of the network based on a limited resource such
 as number of IPs controlled. Crowds suggests requiring written, notarized
@@ -440,13 +441,13 @@ so that it can serve as a test-bed for future research in low-latency
 anonymity systems.  Many of the open problems in low-latency anonymity
 networks, such as generating dummy traffic or preventing Sybil attacks
 \cite{sybil}, may be solvable independently from the issues solved by
-Tor. Hopefully future systems will not need to reinvent Tor's design
-decisions.  (But note that while a flexible design benefits researchers,
+Tor. Hopefully future systems will not need to reinvent Tor's design.
+(But note that while a flexible design benefits researchers,
 there is a danger that differing choices of extensions will make users
 distinguishable. Experiments should be run on a separate network.)
 
-\textbf{Conservative design:} The protocol's design and security
-parameters must be conservative. Additional features impose implementation
+\textbf{Simple design:} The protocol's design and security
+parameters must be well-understood. Additional features impose implementation
 and complexity costs; adding unproven techniques to the design threatens
 deployability, readability, and ease of security analysis. Tor aims to
 deploy a simple and stable system that integrates the best well-understood
@@ -454,14 +455,15 @@ approaches to protecting anonymity.
 
 \SubSection{Non-goals}
 \label{subsec:non-goals}
-In favoring conservative, deployable designs, we have explicitly deferred
+In favoring simple, deployable designs, we have explicitly deferred
 a number of goals, either because they are solved elsewhere, or because
 they are an open research question.
 
 \textbf{Not Peer-to-peer:} Tarzan and MorphMix aim to scale to completely
 decentralized peer-to-peer environments with thousands of short-lived
 servers, many of which may be controlled by an adversary.  This approach
-is appealing, but still has many open problems.
+is appealing, but still has many open problems
+\cite{tarzan:ccs02,morphmix:fc04}.
 
 \textbf{Not secure against end-to-end attacks:} Tor does not claim
 to provide a definitive solution to end-to-end timing or intersection
@@ -522,9 +524,10 @@ network and correlating traffic entering and leaving the network---either
 because of relationships in packet timing; relationships in the volume
 of data sent; or relationships in any externally visible user-selected
 options. The adversary can also mount active attacks by compromising
-routers or keys; by replaying traffic; by selectively DoSing trustworthy
-routers to encourage users to send their traffic through compromised
-routers, or DoSing users to see if the traffic elsewhere in the
+routers or keys; by replaying traffic; by selectively denying service
+to trustworthy routers to encourage users to send their traffic through
+compromised routers, or denying service to users to see if the traffic
+elsewhere in the
 network stops; or by introducing patterns into traffic that can later be
 detected. The adversary might attack the directory servers to give users
 differing views of network state. Additionally, he can try to decrease
@@ -587,8 +590,10 @@ fairness issues.
 % I think we should describe connections before cells. -NM
 
 Traffic passes from one OR to another, or between a user's OP and an OR,
-in fixed-size cells. Each cell is 256
-bytes, and consists of a header and a payload. The header includes an
+in fixed-size cells. Each cell is 256 bytes (but see
+Section~\ref{sec:conclusion}
+for a discussion of allowing large cells and small cells on the same
+network), and consists of a header and a payload. The header includes an
 anonymous circuit identifier (ACI) that specifies which circuit the
 % Should we replace ACI with circID ? What is this 'anonymous circuit'
 % thing anyway? -RD
@@ -611,7 +616,8 @@ be multiplexed over a circuit); an end-to-end checksum for integrity
 checking; the length of the relay payload; and a relay command. Relay
 commands can be one of: \emph{relay
 data} (for data flowing down the stream), \emph{relay begin} (to open a
-stream), \emph{relay end} (to close a stream), \emph{relay connected}
+stream), \emph{relay end} (to close a stream cleanly), \emph{relay
+teardown} (to close a broken stream), \emph{relay connected}
 (to notify the OP that a relay begin has succeeded), \emph{relay
 extend} and \emph{relay extended} (to extend the circuit by a hop,
 and to acknowledge), \emph{relay truncate} and \emph{relay truncated}
@@ -621,9 +627,6 @@ implement long-range dummies).
 
 We describe each of these cell types in more detail below.
 
-% Nick: should there have been a table here? -RD
-% Maybe. -NM
-
 \SubSection{Circuits and streams}
 \label{subsec:circuits}
 
@@ -638,8 +641,9 @@ open many TCP streams.
 In Tor, each circuit can be shared by many TCP streams.  To avoid
 delays, users construct circuits preemptively.  To limit linkability
 among the streams, users rotate connections by building a new circuit
-periodically (currently every minute) if the previous one has been
-used, and expire old used circuits that are no longer in use. Thus
+periodically if the previous one has been used,
+and expire old used circuits that are no longer in use. Tor considers
+making a new circuit once a minute: thus
 even heavy users spend a negligible amount of time and CPU in
 building circuits, but only a limited number of requests can be linked
 to each other by a given exit node. Also, because circuits are built
@@ -745,25 +749,25 @@ applications like Mozilla and ssh have this flaw.
 
 In the case of Mozilla, we're fine: the filtering web proxy called Privoxy
 does the SOCKS call safely, and Mozilla talks to Privoxy safely. But a
-portable general solution, such as for ssh, is an open problem. We could
+portable general solution, such as for ssh, is an open problem. We can
 modify the local nameserver, but this approach is invasive, brittle, and
-not portable. We could encourage the resolver library to do resolution
+not portable. We can encourage the resolver library to do resolution
 via TCP rather than UDP, but this approach is hard to do right, and also
-has portability problems. Our current answer is to encourage the use of
-privacy-aware proxies like Privoxy wherever possible, and also provide
-a tool similar to \emph{dig} that can do a private lookup through the
-Tor network.
+has portability problems. We can provide a tool similar to \emph{dig} that
+can do a private lookup through the Tor network. Our current answer is to
+encourage the use of privacy-aware proxies like Privoxy wherever possible.
 
 Ending a Tor stream is analogous to ending a TCP stream: it uses a
 two-step handshake for normal operation, or a one-step handshake for
 errors. If one side of the stream closes abnormally, that node simply
 sends a relay teardown cell, and tears down the stream. If one side
-% Nick: mention relay teardown in 'cell' subsec? good enough name? -RD
 of the stream closes the connection normally, that node sends a relay
 end cell down the circuit. When the other side has sent back its own
 relay end, the stream can be torn down. This two-step handshake allows
 for TCP-based applications that, for example, close a socket for writing
-but are still willing to read.
+but are still willing to read. Remember that all relay cells use layered
+encryption, so only the destination OR knows what type of relay cell
+it is.
 
 \SubSection{Integrity checking on streams}
 
@@ -815,6 +819,7 @@ that Alice or Bob tear down the circuit if they receive a bad hash.
 Volunteers are generally more willing to run services that can limit
 their bandwidth usage.  To accomodate them, Tor servers use a token
 bucket approach to limit the number of bytes they
+% XXX cite token bucket?
 receive. Tokens are added to the bucket each second (when the bucket is
 full, new tokens are discarded.) Each token represents permission to
 receive one byte from the network---to receive a byte, the connection
@@ -947,17 +952,6 @@ to slow down other users when they build new circuits.
 
 % What about link-to-link rate limiting?
 
-More worrisome are distributed denial of service attacks wherein an
-attacker uses a large number of compromised hosts throughout the network
-to consume the Tor network's resources.  Although these attacks are not
-new to the networking literature, some proposed approaches are a poor
-fit to anonymous networks.  For example, solutions based on backtracking
-harmful traffic \cite{XXX} could allow an anonymity-breaking
-adversary to exploit the backtracking mechanism.
-% XXX I don't see how you would do DDoS through Tor. And even if you
-%     did, it seems ok to track you down. Should we remove this
-%     paragraph? -RD
-
 Attackers also have an opportunity to attack the Tor network by mounting
 attacks on its hosts and network links. Disrupting a single circuit or
 link breaks all currently open streams passing along that part of the
@@ -1001,7 +995,7 @@ network.  (Using a private exit (if one exists) is a more secure way
 for a client to connect to a given host or network---an external
 adversary cannot eavesdrop traffic between the private exit and the
 final destination, and so is less sure of Alice's destination and
-activities.)  is less sure of Alice's destination. More generally,
+activities.)  In general,
 nodes can require a variety of forms of traffic authentication
 \cite{or-discex00}.
 
@@ -1187,7 +1181,7 @@ but refuses to relay traffic from other routers, the directory servers
 must build circuits and use them to anonymously test router reliability
 \cite{mix-acc}.
 
-When a client Alice retrieves a consensus directory, she uses it if it
+When Alice retrieves a consensus directory, she uses it if it
 is signed by a majority of the directory servers she knows.
 
 Using directory servers rather than flooding provides simplicity and
@@ -1221,8 +1215,9 @@ Our design for location-hidden servers has the following properties:
   simply by sending many requests to talk to Bob.  Thus, Bob needs a
   way to filter incoming requests.
 \item[Robust:] Bob should be able to maintain a long-term pseudonymous
-  identity even in the presence of router failure.  Thus, Bob's identity
-  must not be tied to a single OR.
+  identity even in the presence of router failure.  Thus, Bob's service
+  must not be tied to a single OR, and Bob must be able to tie his service
+  to new ORs.
 \item[Smear-resistant:] An attacker should not be able to use rendezvous
   points to smear an OR.  That is, if a social attacker tries to host a 
   location-hidden service that is illegal or disreputable, it should not
@@ -1327,8 +1322,8 @@ remains a SOCKS proxy.  Thus we must encode all of the necessary
 information into the fully qualified domain name Alice uses when
 establishing her connections.  Location-hidden services use a virtual
 top level domain called `.onion': thus hostnames take the form
-x.y.onion where x encodes the hash of PK, and y is the authentication
-cookie. Alice's onion proxy examines hostnames and recognizes when
+x.y.onion where x is the authentication cookie, and y encodes the hash
+of PK. Alice's onion proxy examines hostnames and recognizes when
 they're destined for a hidden server. If so, it decodes the PK and
 starts the rendezvous as described in the table above.
 
@@ -1342,7 +1337,7 @@ self-authenticating, and so the client can recognize the same service
 with confidence later on. His design also differs from ours in the
 following ways: First, Goldberg suggests that the client should
 manually hunt down a current location of the service via Gnutella;
-whereas our use of the DHT makes lookup faster, more robust, and
+whereas our use of CFS makes lookup faster, more robust, and
 transparent to the user. Second, in Tor the client and server
 negotiate ephemeral keys via Diffie-Hellman, so at no point in the
 path is the plaintext exposed. Third, our design tries to minimize the
@@ -1546,7 +1541,9 @@ them.
   traffic once the circuits have been closed.)  Additionally, building
   circuits that cross jurisdictions can make legal coercion
   harder---this phenomenon is commonly called ``jurisdictional
-  arbitrage.''
+  arbitrage.'' The JAP project recently experienced this issue, when
+  the German government successfully ordered them to add a backdoor to
+  all of their nodes.
 
   
 \item \emph{Run a recipient.} By running a Web server, an adversary
@@ -1890,7 +1887,8 @@ issues remaining to be ironed out. In particular:
 
 %% commented out for anonymous submission
 %\Section{Acknowledgments}
-% Peter Palfrader for editing
+% Peter Palfrader, Geoff Goodell, Adam Shostack, Joseph Sokol-Margolis
+%   for editing and comments
 % Bram Cohen for congestion control discussions
 % Adam Back for suggesting telescoping circuits