author: Nick Mathewson <nickm@torproject.org> 2006-10-31 19:17:18 +0000
committer: Nick Mathewson <nickm@torproject.org> 2006-10-31 19:17:18 +0000
commit: bba78b9c1fd5af82ccd6ec2ff64da775069c5f67
tree: aa68217abd66d83bfa65bc562ad89599ba654d4e /doc/design-paper
parent: 1bf1f9d2fca917099c7e26e8f46df36329cd5c65
r9450@Kushana: nickm | 2006-10-31 14:16:35 -0500
checkpoint some blocking tweaks and edits svn:r8882
Diffstat (limited to 'doc/design-paper')
-rw-r--r-- doc/design-paper/blocking.tex 253
1 files changed, 141 insertions, 112 deletions
diff --git a/doc/design-paper/blocking.tex b/doc/design-paper/blocking.tex
index 214f0d139..bd74c6302 100644
--- a/doc/design-paper/blocking.tex
+++ b/doc/design-paper/blocking.tex
@@ -56,7 +56,7 @@ corporations who don't want to reveal information to their competitors,
and law enforcement and government intelligence agencies who need to do
operations on the Internet without being noticed.
-Historically, research on anonymizing systems has assumed a passive
+Historically, research on anonymizing systems has focused on a passive
attacker who monitors the user (call her Alice) and tries to discover her
activities, yet lets her reach any piece of the network. In more modern
threat models such as Tor's, the adversary is allowed to perform active
@@ -65,22 +65,23 @@ into revealing her destination, or intercepting some of her connections
to run a man-in-the-middle attack. But these systems still assume that
Alice can eventually reach the anonymizing network.
-An increasing number of users are making use of the Tor software
-not so much for its anonymity properties but for its censorship
-resistance properties -- if they access Internet sites like Wikipedia
-and Blogspot via Tor, they are no longer affected by local censorship
+An increasing number of users are using the Tor software
+less for its anonymity properties than for its censorship
+resistance properties---if they use Tor to access Internet sites like
+Wikipedia
+and Blogspot, they are no longer affected by local censorship
and firewall rules. In fact, an informal user study (described in
Appendix~\ref{app:geoip}) showed China as the third largest user base
for Tor clients, with perhaps ten thousand people accessing the Tor
network from China each day.
The current Tor design is easy to block if the attacker controls Alice's
-connection to the Tor network --- by blocking the directory authorities,
+connection to the Tor network---by blocking the directory authorities,
by blocking all the server IP addresses in the directory, or by filtering
based on the signature of the Tor TLS handshake. Here we describe a
design that builds upon the current Tor network to provide an anonymizing
network that also resists this blocking. Specifically,
-Section~\ref{sec:adversary} discusses our threat model --- that is,
+Section~\ref{sec:adversary} discusses our threat model---that is,
the assumptions we make about our adversary; Section~\ref{sec:current-tor}
describes the components of the current Tor design and how they can be
leveraged for a new blocking-resistant design; Section~\ref{sec:related}
@@ -98,70 +99,76 @@ assumptions about what adversaries to expect and what problems are
in the critical path to a solution. Here we try to enumerate our best
understanding of the current situation around the world.
-In the traditional security style, we aim to describe a strong attacker
---- if we can defend against this attacker, we inherit protection
+In the traditional security style, we aim to describe a strong
+attacker---if we can defend against this attacker, we inherit protection
against weaker attackers as well. After all, we want a general design
-that will work for people in China, people in Iran, people in Thailand,
-whistleblowers in firewalled corporate networks, and people in whatever
-turns out to be the next oppressive situation. In fact, by designing with
+that will work for citizens of China, Iran, Thailand, and other censored
+countries; for
+whistleblowers in firewalled corporate networks; and for people in
+unanticipated oppressive situations. In fact, by designing with
a variety of adversaries in mind, we can take advantage of the fact that
-adversaries will be in different stages of the arms race at each location.
+adversaries will be in different stages of the arms race at each location,
+and thereby retain partial utility in servers even when they are blocked
+by some of the adversaries.
We assume there are three main network attacks in use by censors
currently~\cite{clayton:pet2006}:
\begin{tightlist}
-\item Block destination by automatically searching for certain strings
-in TCP packets.
-\item Block destination by manually listing its IP address at the
+\item Block a destination or type of traffic by automatically searching for
+ certain strings or patterns in TCP packets.
+\item Block a destination by manually listing its IP address at the
firewall.
\item Intercept DNS requests and give bogus responses for certain
destination hostnames.
\end{tightlist}
-We assume the network firewall has very limited CPU per
+We assume the network firewall has limited CPU and memory per
connection~\cite{clayton:pet2006}. Against an adversary who spends
hours looking through the contents of each packet, we would need
some stronger mechanism such as steganography, which introduces its
own problems~\cite{active-wardens,tcpstego,bar}.
-More broadly, we assume that the chance that the authorities try to
-block a given system grows as its popularity grows. That is, a system
+More broadly, we assume that the authorities are more likely to
+block a given system as its popularity grows. That is, a system
used by only a few users will probably never be blocked, whereas a
well-publicized system with many users will receive much more scrutiny.
We assume that readers of blocked content are not in as much danger
as publishers. So far in places like China, the authorities mainly go
-after people who publish materials and coordinate organized movements
-against the state~\cite{mackinnon}. If they find that a user happens
+after people who publish materials and coordinate organized
+movements~\cite{mackinnon}.
+If they find that a user happens
to be reading a site that should be blocked, the typical response is
simply to block the site. Of course, even with an encrypted connection,
the adversary may be able to distinguish readers from publishers by
observing whether Alice is mostly downloading bytes or mostly uploading
-them --- we discuss this issue more in Section~\ref{subsec:upload-padding}.
+them---we discuss this issue more in Section~\ref{subsec:upload-padding}.
We assume that while various different regimes can coordinate and share
-notes, there will be a significant time lag between one attacker learning
+notes, there will be a time lag between one attacker learning
how to overcome a facet of our design and other attackers picking it up.
Similarly, we assume that in the early stages of deployment the insider
threat isn't as high of a risk, because no attackers have put serious
effort into breaking the system yet.
-We assume that government-level attackers are not always uniform across
+We do not assume that government-level attackers are always uniform across
the country. For example, there is no single centralized place in China
that coordinates its censorship decisions and steps.
We assume that our users have control over their hardware and
-software --- they don't have any spyware installed, there are no
+software---they don't have any spyware installed, there are no
cameras watching their screen, etc. Unfortunately, in many situations
-these threats are very real~\cite{zuckerman-threatmodels}; yet
+these threats are real~\cite{zuckerman-threatmodels}; yet
software-based security systems like ours are poorly equipped to handle
a user who is entirely observed and controlled by the adversary. See
Section~\ref{subsec:cafes-and-livecds} for more discussion of what little
we can do about this issue.
-We assume that widespread access to the Internet is economically and/or
-socially valuable in each deployment country. After all, if censorship
+We assume that widespread access to the Internet is economically,
+politically, and/or
+socially valuable to the policymakers of each deployment country. After
+all, if censorship
is more important than Internet access, the firewall administrators have
an easy job: they should simply block everything. The corollary to this
assumption is that we should design so that increased blocking of our
@@ -178,9 +185,13 @@ real Tor network.
Tor is popular and sees a lot of use. It's the largest anonymity
network of its kind.
-Tor has attracted more than 800 routers from around the world.
-A few sentences about how Tor works.
-In this section, we examine some of the reasons why Tor has taken off,
+Tor has attracted more than 800 volunteer-operated routers from around the
+world. Tor protects users by routing their traffic through a multiply
+encrypted ``circuit'' built of a few randomly selected servers, each of which
+can remove only a single layer of encryption. Each server sees only the step
+before it and the step after it in the circuit, and so no single server can
+learn the connection between a user and her chosen communication partners.
+In this section, we examine some of the reasons why Tor has become popular,
with particular emphasis on how we can take advantage of these properties
for a blocking-resistance design.
@@ -196,39 +207,40 @@ can't learn your location.
For blocking-resistance, we care most clearly about the first
property. But as the arms race progresses, the second property
-will become important --- for example, to discourage an adversary
+will become important---for example, to discourage an adversary
from volunteering a relay in order to learn that Alice is reading
-or posting to certain websites. The third property is not so clearly
-important in this context, but we believe it will turn out to be helpful:
-consider websites and other Internet services that have been pressured
-recently into treating clients differently depending on their network
+or posting to certain websites. The third property helps keep users safe from
+collaborating websites: consider websites and other Internet services
+that have been pressured
+recently into revealing the identity of bloggers~\cite{arrested-bloggers}
+or treating clients differently depending on their network
location~\cite{google-geolocation}.
% and cite{goodell-syverson06} once it's finalized.
The Tor design provides other features as well over manual or ad
hoc circumvention techniques.
-Firstly, the Tor directory authorities automatically aggregate, test,
+First, the Tor directory authorities automatically aggregate, test,
and publish signed summaries of the available Tor routers. Tor clients
can fetch these summaries to learn which routers are available and
-which routers have desired properties. Directory information is cached
+which routers are suitable for their needs. Directory information is cached
throughout the Tor network, so once clients have bootstrapped they never
need to interact with the authorities directly. (To tolerate a minority
-of compromised directory authorities, we use a threshold trust scheme ---
+of compromised directory authorities, we use a threshold trust scheme---
see Section~\ref{subsec:trust-chain} for details.)
-Secondly, Tor clients can be configured to use any directory authorities
+Second, Tor clients can be configured to use any directory authorities
they want. They use the default authorities if no others are specified,
but it's easy to start a separate (or even overlapping) Tor network just
by running a different set of authorities and convincing users to prefer
a modified client. For example, we could launch a distinct Tor network
inside China; some users could even use an aggregate network made up of
-both the main network and the China network. But we should not be too
-quick to create other Tor networks --- part of Tor's anonymity comes from
+both the main network and the China network. (But we should not be too
+quick to create other Tor networks---part of Tor's anonymity comes from
users behaving like other users, and there are many unsolved anonymity
-questions if different users know about different pieces of the network.
+questions if different users know about different pieces of the network.)
-Thirdly, in addition to automatically learning from the chosen directories
+Third, in addition to automatically learning from the chosen directories
which Tor routers are available and working, Tor takes care of building
paths through the network and rebuilding them as needed. So the user
never has to know how paths are chosen, never has to manually pick
@@ -242,7 +254,7 @@ of directory authorities, its own set of Tor routers (called the Blossom
network), and uses Tor's flexible path-building to let users view Internet
resources from any point in the Blossom network.
-Fourthly, Tor separates the role of \emph{internal relay} from the
+Fourth, Tor separates the role of \emph{internal relay} from the
role of \emph{exit relay}. That is, some volunteers choose just to relay
traffic between Tor users and Tor routers, and others choose to also allow
connections to external Internet resources. Because we don't force all
@@ -252,13 +264,14 @@ user has for her first hop, and the more options she has for her last hop,
the less likely it is that a given attacker will be watching both ends
of her circuit~\cite{tor-design}. As a bonus, because our design attracts
more internal relays that want to help out but don't want to deal with
-being an exit relay, we end up with more options for the first hop ---
-the one most critical to being able to reach the Tor network.
+being an exit relay, we end up with more options for the first hop---the
+one most critical to being able to reach the Tor network.
-Fifthly, Tor is sustainable. Zero-Knowledge Systems offered the commercial
-but now-defunct Freedom Network~\cite{freedom21-security}, a design with
+Fifth, Tor is sustainable. Zero-Knowledge Systems offered the commercial
+but now defunct Freedom Network~\cite{freedom21-security}, a design with
security comparable to Tor's, but its funding model relied on collecting
-money from users to pay relays. Modern commercial proxy systems similarly
+money from users to pay relay operators. Modern commercial proxy systems
+similarly
need to keep collecting money to support their infrastructure. On the
other hand, Tor has built a self-sustaining community of volunteers who
donate their time and resources. This community trust is rooted in Tor's
@@ -268,11 +281,11 @@ expert to decide, whether it is safe to use. Further, Tor's modularity
as described above, along with its open license, mean that its impact
will continue to grow.
-Sixthly, Tor has an established user base of hundreds of
+Sixth, Tor has an established user base of hundreds of
thousands of people from around the world. This diversity of
users contributes to sustainability as above: Tor is used by
ordinary citizens, activists, corporations, law enforcement, and
-even governments and militaries~\cite{tor-use-cases}, and they can
+even government and military users~\cite{tor-use-cases}, and they can
only achieve their security goals by blending together in the same
network~\cite{econymics,usability:weis2006}. This user base also provides
something else: hundreds of thousands of different and often-changing
@@ -289,14 +302,14 @@ our repertoire of building blocks and ideas.
Relay-based blocking-resistance schemes generally have two main
components: a relay component and a discovery component. The relay part
encompasses the process of establishing a connection, sending traffic
-back and forth, and so on --- everything that's done once the user knows
+back and forth, and so on---everything that's done once the user knows
where he's going to connect. Discovery is the step before that: the
process of finding one or more usable relays.
-For example, we described several pieces of Tor in the previous section,
-but we can divide them into the process of building paths and sending
+For example, we can divide the pieces of Tor in the previous section
+into the process of building paths and sending
traffic over them (relay) and the process of learning from the directory
-servers about what routers are available (discovery). With this distinction
+servers about what routers are available (discovery). With this distinction
in mind, we now examine several categories of relay-based schemes.
\subsection{Centrally-controlled shared proxies}
@@ -312,14 +325,15 @@ In terms of the relay component, single proxies provide weak security
compared to systems that distribute trust over multiple relays, since a
compromised proxy can trivially observe all of its users' actions, and
an eavesdropper only needs to watch a single proxy to perform timing
-correlation attacks against all its users' traffic. Worse, all users
+correlation attacks against all its users' traffic and thus learn where
+everyone is connecting. Worse, all users
need to trust the proxy company to have good security itself as well as
to not reveal user activities.
On the other hand, single-hop proxies are easier to deploy, and they
can provide better performance than distributed-trust designs like Tor,
since traffic only goes through one relay. They're also more convenient
-from the user's perspective --- since users entirely trust the proxy,
+from the user's perspective---since users entirely trust the proxy,
they can just use their web browser directly.
Whether public proxy schemes are more or less scalable than Tor is
@@ -333,9 +347,9 @@ log in to those websites and relay their traffic through them. When
these websites get blocked (generally soon after the company becomes
popular), if the company cares about users in the blocked areas, they
start renting lots of disparate IP addresses and rotating through them
-as they get blocked. They notify their users of new addresses by email,
-for example. It's an arms race, since attackers can sign up to receive the
-email too, but they have one nice trick available to them: because they
+as they get blocked. They notify their users of new addresses (by email,
+for example). It's an arms race, since attackers can sign up to receive the
+email too, but operators have one nice trick available to them: because they
have a list of paying subscribers, they can notify certain subscribers
about updates earlier than others.
@@ -347,7 +361,7 @@ Discovery in the face of a government-level firewall is a complex and
unsolved
topic, and we're stuck in this same arms race ourselves; we explore it
in more detail in Section~\ref{sec:discovery}. But first we examine the
-other end of the spectrum --- getting volunteers to run the proxies,
+other end of the spectrum---getting volunteers to run the proxies,
and telling only a few people about each proxy.
\subsection{Independent personal proxies}
@@ -365,11 +379,12 @@ actually install the Circumventor \emph{on} the computer that is blocked
from accessing Web sites. You, or a friend of yours, has to install the
Circumventor on some \emph{other} machine which is not censored.''
-This tactic has great advantages in terms of blocking-resistance ---
-recall our assumption in Section~\ref{sec:adversary} that the attention
+This tactic has great advantages in terms of blocking-resistance---recall
+our assumption in Section~\ref{sec:adversary} that the attention
a system attracts from the attacker is proportional to its number of
users and level of publicity. If each proxy only has a few users, and
-there is no central list of proxies, most of them will never get noticed.
+there is no central list of proxies, most of them will never get noticed by
+the censors.
On the other hand, there's a huge scalability question that so far has
prevented these schemes from being widely useful: how does the fellow
@@ -381,8 +396,8 @@ Ohio find a person in China who needs it?
%discovery is also hard because the hosts keep vanishing if they're
%on dynamic ip. But not so bad, since they can use dyndns addresses.
-This challenge leads to a hybrid design --- centrally-distributed
-personal proxies --- which we will investigate in more detail in
+This challenge leads to a hybrid design---centrally-distributed
+personal proxies---which we will investigate in more detail in
Section~\ref{sec:discovery}.
\subsection{Open proxies}
@@ -449,13 +464,13 @@ more subtle variant on this theory is that we've positioned Tor in the
public eye as a tool for retaining civil liberties in more free countries,
so perhaps blocking authorities don't view it as a threat. (We revisit
this idea when we consider whether and how to publicize a Tor variant
-that improves blocking-resistance --- see Section~\ref{subsec:publicity}
+that improves blocking-resistance---see Section~\ref{subsec:publicity}
for more discussion.)
-The broader explanation is that most government-level filters are not
-created by people setting out to block all possible ways to bypass
-them. They're created by people who want to do a good enough job that
-they can still appear in control. They realize that there will always
+The broader explanation is that the maintenance of most government-level
+filters is aimed at stopping widespread information flow and appearing to be
+in control, not at the impossible goal of blocking all possible ways to bypass
+censorship. Censors realize that there will always
be ways for a few people to get around the firewall, and as long as Tor
has not publicly threatened their control, they see no urgent need to
block it yet.
@@ -481,6 +496,12 @@ to get more relay addresses, and to distribute them to users differently.
\subsection{Bridge relays}
+Today, Tor servers operate on fewer than a thousand distinct IP addresses; an adversary
+could enumerate and block them all with little trouble. To provide a
+means of ingress to the network, we need a larger set of entry points, most
+of which an adversary won't be able to enumerate easily. Fortunately, we
+have such a set: the Tor userbase.
+
Hundreds of thousands of people around the world use Tor. We can leverage
our already self-selected user base to produce a list of thousands of
often-changing IP addresses. Specifically, we can give them a little
@@ -530,7 +551,8 @@ infrastructure and trust chain.
Bridges use Tor to publish their descriptors privately and securely,
so even an attacker monitoring the bridge directory authority's network
can't make a list of all the addresses contacting the authority and
-track them that way.
+track them that way. Bridges may publish to only a subset of the
+authorities, to limit the potential impact of an authority compromise.
%\subsection{A simple matter of engineering}
%
@@ -554,7 +576,7 @@ track them that way.
%
%Lastly, since bridge authorities don't answer full network statuses,
%we need to add a new way for users to learn the current status for a
-%single relay or a small set of relays --- to answer such questions as
+%single relay or a small set of relays---to answer such questions as
%``is it running?'' or ``is it behaving correctly?'' We describe in
%Section~\ref{subsec:enclave-dirs} a way for the bridge authority to
%publish this information without resorting to signing each answer
@@ -610,7 +632,7 @@ However, connecting directly to the directory cache involves a plaintext
HTTP request. A censor could create a network signature for the request
and/or its response, thus preventing these connections. To resolve this
vulnerability, we've modified the Tor protocol so that users can connect
-to the directory cache via the main Tor port --- they establish a TLS
+to the directory cache via the main Tor port---they establish a TLS
connection with the bridge as normal, and then send a special ``begindir''
relay command to establish an internal connection to its directory cache.
@@ -625,7 +647,8 @@ be most useful, because clients behind standard firewalls will have
the best chance to reach them. Is this the best choice in all cases,
or should we encourage some fraction of them to pick random ports, or other
ports commonly permitted through firewalls like 53 (DNS) or 110
-(POP)? We need
+(POP)? Or perhaps we should use a port where TLS traffic is expected, like
+443 (HTTPS), 993 (IMAPS), or 995 (POP3S)? We need
more research on our potential users, and their current and anticipated
firewall restrictions.
@@ -633,23 +656,25 @@ Furthermore, we need to look at the specifics of Tor's TLS handshake.
Right now Tor uses some predictable strings in its TLS handshakes. For
example, it sets the X.509 organizationName field to ``Tor'', and it puts
the Tor server's nickname in the certificate's commonName field. We
-should tweak the handshake protocol so it doesn't rely on any details
-in the certificate headers, yet it remains secure. Should we replace
-it with blank entries for each field, or should we research the common
-values that Firefox and Internet Explorer use and try to imitate those?
-
-Worse, Tor's TLS handshake involves sending two certificates in each
-direction: one certificate contains the self-signed identity key for
-the router, and the second contains the current link key, signed by the
+should tweak the handshake protocol so it doesn't rely on any unusual details
+in the certificate, yet it remains secure; the certificate itself
+should be made to resemble an ordinary HTTPS certificate. We should also try
+to make our advertised cipher-suites closer to what an ordinary web server
+would support.
+
+Tor's TLS handshake uses two-certificate chains: one certificate
+contains the self-signed identity key for
+the router, and the second contains a current TLS key, signed by the
identity key. We use these to authenticate that we're talking to the right
-router, and also to establish perfect forward secrecy for that link.
-How much will these extra certificates make Tor's TLS handshake stand
-out? We have to work on normalizing our appearance not just in terms
-of the fields used in each certificate, but also in the number of
-certificates we present for each side.
-% Nick, I need help with the above paragraph. What are the two certs
-% for really, and how much work would it be to start acting like a normal
-% browser? -RD
+router, and to limit the impact of TLS-key exposure. Most (though far from
+all) consumer-oriented HTTPS services provide only a single certificate.
+These extra certificates may help identify Tor's TLS handshake; instead,
+bridges should consider using only a single TLS key certificate signed by
+their identity key, and providing the full value of the identity key in an
+early handshake cell. More significantly, Tor currently has all clients
+present certificates, so that clients are harder to distinguish from servers.
+But in a blocking-resistance environment, clients should not present
+certificates at all.
Lastly, what if the adversary starts observing the network traffic even
more closely? Even if our TLS handshake looks innocent, our traffic timing
@@ -672,7 +697,7 @@ network once he knows the IP address and ORPort of a bridge. What about
local spoofing attacks? That is, since we never learned an identity
key fingerprint for the bridge, a local attacker could intercept our
connection and pretend to be the bridge we had in mind. It turns out
-that giving false information isn't that bad --- since the Tor client
+that giving false information isn't that bad---since the Tor client
ships with trusted keys for the bridge directory authority and the Tor
network directory authorities, the user can learn whether he's being
given a real connection to the bridge authorities or not. (After all,
@@ -681,8 +706,8 @@ him a bad connection each time, there's nothing we can do.)
What about anonymity-breaking attacks from observing traffic, if the
blocked user doesn't start out knowing the identity key of his intended
-bridge? The vulnerabilities aren't so bad in this case either ---
-the adversary could do similar attacks just by monitoring the network
+bridge? The vulnerabilities aren't so bad in this case either---the
+adversary could do similar attacks just by monitoring the network
traffic.
% cue paper by steven and george
@@ -710,7 +735,7 @@ Section~\ref{sec:related}.
In this section we describe four approaches to adding discovery
components for our design, in order of increasing complexity. Note that
-we can deploy all four schemes at once --- bridges and blocked users can
+we can deploy all four schemes at once---bridges and blocked users can
use the discovery approach that is most appropriate for their situation.
\subsection{Independent bridges, no central discovery}
@@ -763,7 +788,7 @@ available bridges),
\subsection{Social networks with directory-side support}
-Pick some seeds --- trusted people in the blocked area --- and give
+Pick some seeds---trusted people in the blocked area---and give
them each a few hundred bridge addresses. Run a website next to the
bridge authority, where they can log in (they only need persistent
pseudonyms). Give them tokens slowly over time. They can use these
@@ -803,9 +828,9 @@ Most government firewalls are not perfect. They allow connections to
Google cache or some open proxy servers, or they let file-sharing or
Skype or World-of-Warcraft connections through.
For users who can't use any of these techniques, hopefully they know
-a friend who can --- for example, perhaps the friend already knows some
+a friend who can---for example, perhaps the friend already knows some
bridge relay addresses.
-(If they can't get around it at all, then we can't help them --- they
+(If they can't get around it at all, then we can't help them---they
should go meet more people.)
Some techniques are sufficient to get us an IP address and a port,
@@ -879,9 +904,9 @@ reward good behavior, hard to punish bad behavior.
\subsection{How to allocate bridge addresses to users}
Hold a fraction in reserve, in case our currently deployed tricks
-all fail at once --- so we can move to new approaches quickly.
+all fail at once---so we can move to new approaches quickly.
(Bridges that sign up and don't get used yet will be sad; but this
-is a transient problem --- if bridges are on by default, nobody will
+is a transient problem---if bridges are on by default, nobody will
mind not being used.)
Perhaps each bridge should be known by a single bridge directory
@@ -984,7 +1009,7 @@ solution though.
\subsection{Possession of Tor in oppressed areas}
Many people speculate that installing and using a Tor client in areas with
-particularly extreme firewalls is a high risk --- and the risk increases
+particularly extreme firewalls is a high risk---and the risk increases
as the firewall gets more restrictive. This is probably true, but there's
a counter pressure as well: as the firewall gets more restrictive, more
ordinary people use Tor for more mainstream activities, such as learning
@@ -1021,7 +1046,7 @@ we try to make it hard to enumerate all bridges, it's still possible to
learn about some of them, and for some people just the fact that they're
running one might signal to an attacker that they place a high value
on their anonymity. Second, there are some more esoteric attacks on Tor
-relays that are not as well-understood or well-tested --- for example, an
+relays that are not as well-understood or well-tested---for example, an
attacker may be able to ``observe'' whether the bridge is sending traffic
even if he can't actually watch its network, by relaying traffic through
it and noticing changes in traffic timing~\cite{attack-tor-oak05}. On
@@ -1044,7 +1069,7 @@ For Internet cafe Windows computers that let you attach your own USB key,
a USB-based Tor image would be smart. There's Torpark, and hopefully
there will be more thoroughly analyzed options down the road. Worries
about hardware or
-software keyloggers and other spyware --- and physical surveillance.
+software keyloggers and other spyware---and physical surveillance.
If the system lets you boot from a CD or from a USB key, you can gain
a bit more security by bringing a privacy LiveCD with you. Hardware
@@ -1069,10 +1094,10 @@ they demand that the next Tor server in the path prove knowledge of
its private key~\cite{tor-design}. This step prevents the first node
in the path from just spoofing the rest of the path. Secondly, the
Tor directory authorities provide a signed list of servers along with
-their public keys --- so unless the adversary can control a threshold
+their public keys---so unless the adversary can control a threshold
of directory authorities, he can't trick the Tor client into using other
Tor servers. Thirdly, the location and keys of the directory authorities,
-in turn, is hard-coded in the Tor source code --- so as long as the user
+in turn, is hard-coded in the Tor source code---so as long as the user
got a genuine version of Tor, he can know that he is using the genuine
Tor network. And lastly, the source code and other packages are signed
with the GPG keys of the Tor developers, so users can confirm that they
@@ -1091,8 +1116,8 @@ community, though, this question remains a critical weakness.
\subsection{Security through obscurity: publishing our design}
Many other schemes like dynaweb use the typical arms race strategy of
-not publishing their plans. Our goal here is to produce a design ---
-a framework --- that can be public and still secure. Where's the tradeoff?
+not publishing their plans. Our goal here is to produce a design---a
+framework---that can be public and still secure. Where's the tradeoff?
\section{Performance improvements}
\label{sec:performance}
@@ -1131,7 +1156,8 @@ The first answer is to aim to get volunteers both from traditionally
``consumer'' networks and also from traditionally ``producer'' networks.
The second answer (not so good) would be to encourage more use of consumer
-networks for popular and useful websites.
+networks for popular and useful websites. (But P2P exists; minor websites
+exist; gaming exists; IM exists; ...)
Other attack: China pressures Verizon to discourage its users from
running bridges.
@@ -1141,7 +1167,7 @@ running bridges.
If it's trivial to verify that we're a bridge, and we run on a predictable
port, then it's conceivable our attacker would scan the whole Internet
looking for bridges. (In fact, he can just scan likely networks like
-cablemodem and DSL services --- see Section~\ref{block-cable} for a related
+cablemodem and DSL services---see Section~\ref{block-cable} for a related
attack.) It would be nice to slow down this attack. It would
be even nicer to make it hard to learn whether we're a bridge without
first knowing some secret.
@@ -1152,6 +1178,9 @@ it or something when he connects. We'd need to give him an ID key for the
bridge too, and wait to present the password until we've TLSed, else the
adversary can pretend to be the bridge and MITM him to learn the password.
+We could use some kind of ID-based knocking protocol, or we could act like an
+unconfigured HTTPS server if treated like one.
+
\subsection{How to motivate people to run bridge relays}
One of the traditional ways to get people to run software that benefits
@@ -1161,7 +1190,7 @@ will be pleased to run it. We take a similar approach here, by leveraging
the fact that these users are already interested in protecting their
own Internet traffic, so they will install and run the software.
-Make all Tor users become bridges if they're reachable -- needs more work
+Make all Tor users become bridges if they're reachable---needs more work
on usability first, but we're making progress.
Also, we can make a snazzy network graph with Vidalia that emphasizes
@@ -1218,7 +1247,7 @@ Assuming actually crossing the firewall is the risky part of the
operation, can we have some bridge relays inside the blocked area too,
and more established users can use them as relays so they don't need to
communicate over the firewall directly at all? A simple example here is
-to make new blocked users into internal bridges also -- so they sign up
+to make new blocked users into internal bridges also---so they sign up
on the BDA as part of doing their query, and we give out their addresses
rather than (or along with) the external bridge addresses. This design
is a lot trickier because it brings in the complexity of whether the