Diffstat (limited to 'doc')
-rw-r--r-- doc/Makefile.am | 7
-rw-r--r-- doc/spec/Makefile.am | 12
-rw-r--r-- doc/spec/README | 10
-rw-r--r-- doc/spec/address-spec.txt | 58
-rw-r--r-- doc/spec/bridges-spec.txt | 249
-rw-r--r-- doc/spec/control-spec-v0.txt | 498
-rw-r--r-- doc/spec/control-spec.txt | 2001
-rw-r--r-- doc/spec/dir-spec-v1.txt | 314
-rw-r--r-- doc/spec/dir-spec-v2.txt | 896
-rw-r--r-- doc/spec/dir-spec.txt | 2440
-rw-r--r-- doc/spec/path-spec.txt | 657
-rw-r--r-- doc/spec/proposals/000-index.txt | 196
-rw-r--r-- doc/spec/proposals/001-process.txt | 184
-rw-r--r-- doc/spec/proposals/098-todo.txt | 107
-rw-r--r-- doc/spec/proposals/099-misc.txt | 28
-rw-r--r-- doc/spec/proposals/100-tor-spec-udp.txt | 422
-rw-r--r-- doc/spec/proposals/101-dir-voting.txt | 283
-rw-r--r-- doc/spec/proposals/102-drop-opt.txt | 38
-rw-r--r-- doc/spec/proposals/103-multilevel-keys.txt | 204
-rw-r--r-- doc/spec/proposals/104-short-descriptors.txt | 181
-rw-r--r-- doc/spec/proposals/105-handshake-revision.txt | 323
-rw-r--r-- doc/spec/proposals/106-less-tls-constraint.txt | 111
-rw-r--r-- doc/spec/proposals/107-uptime-sanity-checking.txt | 54
-rw-r--r-- doc/spec/proposals/108-mtbf-based-stability.txt | 88
-rw-r--r-- doc/spec/proposals/109-no-sharing-ips.txt | 90
-rw-r--r-- doc/spec/proposals/110-avoid-infinite-circuits.txt | 120
-rw-r--r-- doc/spec/proposals/111-local-traffic-priority.txt | 151
-rw-r--r-- doc/spec/proposals/112-bring-back-pathlencoinweight.txt | 163
-rw-r--r-- doc/spec/proposals/113-fast-authority-interface.txt | 85
-rw-r--r-- doc/spec/proposals/114-distributed-storage.txt | 439
-rw-r--r-- doc/spec/proposals/115-two-hop-paths.txt | 385
-rw-r--r-- doc/spec/proposals/116-two-hop-paths-from-guard.txt | 118
-rw-r--r-- doc/spec/proposals/117-ipv6-exits.txt | 410
-rw-r--r-- doc/spec/proposals/118-multiple-orports.txt | 84
-rw-r--r-- doc/spec/proposals/119-controlport-auth.txt | 140
-rw-r--r-- doc/spec/proposals/120-shutdown-descriptors.txt | 83
-rw-r--r-- doc/spec/proposals/121-hidden-service-authentication.txt | 776
-rw-r--r-- doc/spec/proposals/122-unnamed-flag.txt | 136
-rw-r--r-- doc/spec/proposals/123-autonaming.txt | 54
-rw-r--r-- doc/spec/proposals/124-tls-certificates.txt | 313
-rw-r--r-- doc/spec/proposals/125-bridges.txt | 291
-rw-r--r-- doc/spec/proposals/126-geoip-reporting.txt | 410
-rw-r--r-- doc/spec/proposals/127-dirport-mirrors-downloads.txt | 155
-rw-r--r-- doc/spec/proposals/128-bridge-families.txt | 64
-rw-r--r-- doc/spec/proposals/129-reject-plaintext-ports.txt | 114
-rw-r--r-- doc/spec/proposals/130-v2-conn-protocol.txt | 184
-rw-r--r-- doc/spec/proposals/131-verify-tor-usage.txt | 148
-rw-r--r-- doc/spec/proposals/132-browser-check-tor-service.txt | 145
-rw-r--r-- doc/spec/proposals/133-unreachable-ors.txt | 128
-rw-r--r-- doc/spec/proposals/134-robust-voting.txt | 123
-rw-r--r-- doc/spec/proposals/135-private-tor-networks.txt | 281
-rw-r--r-- doc/spec/proposals/136-legacy-keys.txt | 100
-rw-r--r-- doc/spec/proposals/137-bootstrap-phases.txt | 235
-rw-r--r-- doc/spec/proposals/138-remove-down-routers-from-consensus.txt | 49
-rw-r--r-- doc/spec/proposals/139-conditional-consensus-download.txt | 94
-rw-r--r-- doc/spec/proposals/140-consensus-diffs.txt | 156
-rw-r--r-- doc/spec/proposals/141-jit-sd-downloads.txt | 323
-rw-r--r-- doc/spec/proposals/142-combine-intro-and-rend-points.txt | 277
-rw-r--r-- doc/spec/proposals/143-distributed-storage-improvements.txt | 194
-rw-r--r-- doc/spec/proposals/144-enforce-distinct-providers.txt | 165
-rw-r--r-- doc/spec/proposals/145-newguard-flag.txt | 39
-rw-r--r-- doc/spec/proposals/146-long-term-stability.txt | 84
-rw-r--r-- doc/spec/proposals/147-prevoting-opinions.txt | 58
-rw-r--r-- doc/spec/proposals/148-uniform-client-end-reason.txt | 57
-rw-r--r-- doc/spec/proposals/149-using-netinfo-data.txt | 42
-rw-r--r-- doc/spec/proposals/150-exclude-exit-nodes.txt | 47
-rw-r--r-- doc/spec/proposals/151-path-selection-improvements.txt | 148
-rw-r--r-- doc/spec/proposals/152-single-hop-circuits.txt | 62
-rw-r--r-- doc/spec/proposals/153-automatic-software-update-protocol.txt | 175
-rw-r--r-- doc/spec/proposals/154-automatic-updates.txt | 377
-rw-r--r-- doc/spec/proposals/155-four-hidden-service-improvements.txt | 120
-rw-r--r-- doc/spec/proposals/156-tracking-blocked-ports.txt | 527
-rw-r--r-- doc/spec/proposals/157-specific-cert-download.txt | 102
-rw-r--r-- doc/spec/proposals/158-microdescriptors.txt | 198
-rw-r--r-- doc/spec/proposals/159-exit-scanning.txt | 142
-rw-r--r-- doc/spec/proposals/160-bandwidth-offset.txt | 105
-rw-r--r-- doc/spec/proposals/161-computing-bandwidth-adjustments.txt | 174
-rw-r--r-- doc/spec/proposals/162-consensus-flavors.txt | 188
-rw-r--r-- doc/spec/proposals/163-detecting-clients.txt | 115
-rw-r--r-- doc/spec/proposals/164-reporting-server-status.txt | 91
-rw-r--r-- doc/spec/proposals/165-simple-robust-voting.txt | 133
-rw-r--r-- doc/spec/proposals/166-statistics-extra-info-docs.txt | 391
-rw-r--r-- doc/spec/proposals/167-params-in-consensus.txt | 47
-rw-r--r-- doc/spec/proposals/168-reduce-circwindow.txt | 134
-rw-r--r-- doc/spec/proposals/169-eliminating-renegotiation.txt | 404
-rw-r--r-- doc/spec/proposals/170-user-path-config.txt | 95
-rw-r--r-- doc/spec/proposals/171-separate-streams.txt | 357
-rw-r--r-- doc/spec/proposals/172-circ-getinfo-option.txt | 138
-rw-r--r-- doc/spec/proposals/173-getinfo-option-expansion.txt | 101
-rw-r--r-- doc/spec/proposals/174-optimistic-data-server.txt | 242
-rw-r--r-- doc/spec/proposals/175-automatic-node-promotion.txt | 238
-rw-r--r-- doc/spec/proposals/176-revising-handshake.txt | 623
-rw-r--r-- doc/spec/proposals/177-flag-abstention.txt | 104
-rw-r--r-- doc/spec/proposals/ideas/xxx-auto-update.txt | 39
-rw-r--r-- doc/spec/proposals/ideas/xxx-bridge-disbursement.txt | 174
-rw-r--r-- doc/spec/proposals/ideas/xxx-bwrate-algs.txt | 106
-rw-r--r-- doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt | 138
-rw-r--r-- doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt | 44
-rw-r--r-- doc/spec/proposals/ideas/xxx-crypto-migration.txt | 384
-rw-r--r-- doc/spec/proposals/ideas/xxx-crypto-requirements.txt | 72
-rw-r--r-- doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt | 360
-rw-r--r-- doc/spec/proposals/ideas/xxx-encrypted-services.txt | 66
-rw-r--r-- doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt | 44
-rw-r--r-- doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt | 137
-rw-r--r-- doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt | 97
-rw-r--r-- doc/spec/proposals/ideas/xxx-hide-platform.txt | 37
-rw-r--r-- doc/spec/proposals/ideas/xxx-pluggable-transport.txt | 312
-rw-r--r-- doc/spec/proposals/ideas/xxx-port-knocking.txt | 91
-rw-r--r-- doc/spec/proposals/ideas/xxx-rate-limit-exits.txt | 63
-rw-r--r-- doc/spec/proposals/ideas/xxx-using-spdy.txt | 143
-rw-r--r-- doc/spec/proposals/ideas/xxx-what-uses-sha1.txt | 247
-rwxr-xr-x doc/spec/proposals/reindex.py | 117
-rw-r--r-- doc/spec/rend-spec.txt | 966
-rw-r--r-- doc/spec/socks-extensions.txt | 78
-rw-r--r-- doc/spec/tor-fw-helper-spec.txt | 57
-rw-r--r-- doc/spec/tor-spec.txt | 1004
-rw-r--r-- doc/spec/version-spec.txt | 44
117 files changed, 11 insertions, 27211 deletions
diff --git a/doc/Makefile.am b/doc/Makefile.am
index c8bffc931..6cc0ea99f 100644
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@@ -1,4 +1,3 @@
-
# We use a two-step process to generate documentation from asciidoc files.
#
# First, we use asciidoc/a2x to process the asciidoc files into .1.in and
@@ -36,16 +35,12 @@ endif
EXTRA_DIST = HACKING asciidoc-helper.sh \
$(html_in) $(man_in) $(txt_in) \
tor-rpm-creation.txt \
- tor-win32-mingw-creation.txt
+ tor-win32-mingw-creation.txt spec/README
docdir = @docdir@
asciidoc_product = $(nodist_man_MANS) $(doc_DATA)
-SUBDIRS = spec
-
-DIST_SUBDIRS = spec
-
# Generate the html documentation from asciidoc, but don't do
# machine-specific replacements yet
$(html_in) :
diff --git a/doc/spec/Makefile.am b/doc/spec/Makefile.am
deleted file mode 100644
index a4fba780e..000000000
--- a/doc/spec/Makefile.am
+++ /dev/null
@@ -1,12 +0,0 @@
-
-EXTRA_DIST = \
- address-spec.txt \
- bridges-spec.txt \
- control-spec.txt \
- dir-spec.txt \
- path-spec.txt \
- rend-spec.txt \
- socks-extensions.txt \
- tor-spec.txt \
- version-spec.txt
-
diff --git a/doc/spec/README b/doc/spec/README
new file mode 100644
index 000000000..a7fa17002
--- /dev/null
+++ b/doc/spec/README
@@ -0,0 +1,10 @@
+The Tor specifications and proposals have moved to a new repository.
+
+To browse the specifications, go to
+ https://gitweb.torproject.org/torspec.git/tree
+
+To check out the specification repository, run
+ git clone git://git.torproject.org/torspec.git
+
+For other information on the repository, see
+ http://gitweb.torproject.org/torspec.git
diff --git a/doc/spec/address-spec.txt b/doc/spec/address-spec.txt
deleted file mode 100644
index ce6d2b65e..000000000
--- a/doc/spec/address-spec.txt
+++ /dev/null
@@ -1,58 +0,0 @@
-
- Special Hostnames in Tor
- Nick Mathewson
-
-1. Overview
-
- Most of the time, Tor treats user-specified hostnames as opaque: When
- the user connects to www.torproject.org, Tor picks an exit node and uses
- that node to connect to "www.torproject.org". Some hostnames, however,
- can be used to override Tor's default behavior and circuit-building
- rules.
-
- These hostnames can be passed to Tor as the address part of a SOCKS4a or
- SOCKS5 request. If the application is connected to Tor using an IP-only
- method (such as SOCKS4, TransPort, or NATDPort), these hostnames can be
- substituted for certain IP addresses using the MapAddress configuration
- option or the MAPADDRESS control command.
-
-2. .exit
-
- SYNTAX: [hostname].[name-or-digest].exit
- [name-or-digest].exit
-
- Hostname is a valid hostname; [name-or-digest] is either the nickname of a
- Tor node or the hex-encoded digest of that node's public key.
-
- When Tor sees an address in this format, it uses the node named by
- [name-or-digest] as the exit node. If no "hostname" component is given,
- Tor defaults to the published IPv4 address of the exit node.
-
- It is valid to try to resolve hostnames, and in fact upon success Tor
- will cache an internal mapaddress of the form
- "www.google.com.foo.exit=64.233.161.99.foo.exit" to speed subsequent
- lookups.
-
- The .exit notation is disabled by default as of Tor 0.2.2.1-alpha, due
- to potential application-level attacks.
-
- EXAMPLES:
- www.example.com.exampletornode.exit
-
- Connect to www.example.com from the node called "exampletornode".
-
- exampletornode.exit
-
- Connect to the published IP address of "exampletornode" using
- "exampletornode" as the exit.
-
-3. .onion
-
- SYNTAX: [digest].onion
-
- The digest is the first eighty bits of a SHA1 hash of the identity key for
- a hidden service, encoded in base32.
-
- When Tor sees an address in this format, it tries to look up and connect to
- the specified hidden service. See rend-spec.txt for full details.
-
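Reviewer note on the removed address-spec.txt: the two special suffixes it describes can be recognized with plain string handling. The sketch below is illustrative only and is not taken from Tor's source. `split_exit` and `onion_address` are hypothetical helper names, `onion_address` assumes its input is already the encoded identity key that rend-spec.txt says to hash, and the lowercase output is a convention assumed here rather than something the text above states.

```python
import base64
import hashlib


def split_exit(address):
    """Split a .exit address into (hostname, node); hostname may be None.

    Returns None if the address does not use the .exit notation.
    """
    suffix = ".exit"
    if not address.endswith(suffix):
        return None
    rest = address[:-len(suffix)]
    # The last label before ".exit" names the exit node (nickname or digest);
    # everything before it, if present, is the hostname to connect to.
    hostname, _, node = rest.rpartition(".")
    if not node:
        return None
    return (hostname or None, node)


def onion_address(encoded_identity_key):
    """Build [digest].onion from the bytes of an encoded identity key.

    Assumption: the caller passes the key in whatever encoding
    rend-spec.txt specifies; this helper only does the hash and base32 steps.
    """
    digest = hashlib.sha1(encoded_identity_key).digest()
    # First eighty bits = first ten bytes, which base32-encode to 16 characters.
    return base64.b32encode(digest[:10]).decode("ascii").lower() + ".onion"
```

For the examples in the removed text, `split_exit("www.example.com.exampletornode.exit")` yields the hostname and node separately, while `split_exit("exampletornode.exit")` yields no hostname, which is the case where Tor falls back to the node's published IPv4 address.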
diff --git a/doc/spec/bridges-spec.txt b/doc/spec/bridges-spec.txt
deleted file mode 100644
index 647118815..000000000
--- a/doc/spec/bridges-spec.txt
+++ /dev/null
@@ -1,249 +0,0 @@
-
- Tor bridges specification
-
-0. Preface
-
- This document describes the design decisions around support for bridge
- users, bridge relays, and bridge authorities. It acts as an overview
- of the bridge design and deployment for developers, and it also tries
- to point out limitations in the current design and implementation.
-
- For more details on what all of these mean, look at blocking.tex in
- /doc/design-paper/
-
-1. Bridge relays
-
- Bridge relays are just like normal Tor relays except they don't publish
- their server descriptors to the main directory authorities.
-
-1.1. PublishServerDescriptor
-
- To configure your relay to be a bridge relay, just add
- BridgeRelay 1
- PublishServerDescriptor bridge
- to your torrc. This will cause your relay to publish its descriptor
- to the bridge authorities rather than to the default authorities.
-
- Alternatively, you can say
- BridgeRelay 1
- PublishServerDescriptor 0
- which will cause your relay to not publish anywhere. This could be
- useful for private bridges.
-
-1.2. Recommendations.
-
- Bridge relays should use an exit policy of "reject *:*". This is
- because they only need to relay traffic between the bridge users
- and the rest of the Tor network, so there's no need to let people
- exit directly from them.
-
- We invented the RelayBandwidth* options for this situation: Tor clients
- that also want to relay traffic. See proposal 111 for details. Relay
- operators should feel free to rate-limit their relayed traffic.
-
-1.3. Implementation note.
-
- Vidalia 0.0.15 has turned its "Relay" settings page into a tri-state
- "Don't relay" / "Relay for the Tor network" / "Help censored users".
-
- If you click the third choice, it forces your exit policy to reject *:*.
-
- If all the bridges end up on port 9001, that's not so good. On the
- other hand, putting the bridges on a low-numbered port in the Unix
- world requires jumping through extra hoops. The current compromise is
- that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
- other platforms.
-
- At the bottom of the relay config settings window, Vidalia displays
- the bridge identifier to the operator (see Section 3.1) so he can pass
- it on to bridge users.
-
-2. Bridge authorities.
-
- Bridge authorities are like normal v3 directory authorities, except
- they don't create their own network-status documents or votes. So if
- you ask a bridge authority for a network-status document or consensus,
- they behave like a directory mirror: they give you one from one of
- the main authorities. But if you ask the bridge authority for the
- descriptor corresponding to a particular identity fingerprint, it will
- happily give you the latest descriptor for that fingerprint.
-
- To become a bridge authority, add these lines to your torrc:
- AuthoritativeDirectory 1
- BridgeAuthoritativeDir 1
-
- Right now there's one bridge authority, running on the Tonga relay.
-
-2.1. Exporting bridge-purpose descriptors
-
- We've added a new purpose for server descriptors: the "bridge"
- purpose. With the new router-descriptors file format that includes
- annotations, it's easy to look through it and find the bridge-purpose
- descriptors.
-
- Currently we export the bridge descriptors from Tonga to the
- BridgeDB server, so it can give them out according to the policies
- in blocking.pdf.
-
-2.2. Reachability/uptime testing
-
- Right now the bridge authorities do active reachability testing of
- bridges, so we know which ones to recommend for users.
-
- But in the design document, we suggested that bridges should publish
- anonymously (i.e. via Tor) to the bridge authority, so somebody watching
- the bridge authority can't just enumerate all the bridges. But if we're
- doing active measurement, the game is up. Perhaps we should back off on
- this goal, or perhaps we should do our active measurement anonymously?
-
- Answering this issue is scheduled for 0.2.1.x.
-
-2.3. Future work: migrating to multiple bridge authorities
-
- Having only one bridge authority is both a trust bottleneck (if you
- break into one place you learn about every single bridge we've got)
- and a robustness bottleneck (when it's down, bridge users become sad).
-
- Right now if we put up a second bridge authority, all the bridges would
- publish to it, and (assuming the code works) bridge users would query
- a random bridge authority. This resolves the robustness bottleneck,
- but makes the trust bottleneck even worse.
-
- In 0.2.2.x and later we should think about better ways to have multiple
- bridge authorities.
-
-3. Bridge users.
-
- Bridge users are like ordinary Tor users except they use encrypted
- directory connections by default, and they use bridge relays as both
- entry guards (their first hop) and directory guards (the source of
- all their directory information).
-
- To become a bridge user, add the following line to your torrc:
- UseBridges 1
-
- and then add at least one "Bridge" line to your torrc based on the
- format below.
-
-3.1. Format of the bridge identifier.
-
- The canonical format for a bridge identifier contains an IP address,
- an ORPort, and an identity fingerprint:
- bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
-
- However, the identity fingerprint can be left out, in which case the
- bridge user will connect to that relay and use it as a bridge regardless
- of what identity key it presents:
- bridge 128.31.0.34:9009
- This might be useful for cases where only short bridge identifiers
- can be communicated to bridge users.
-
- In a future version we may also support bridge identifiers that are
- only a key fingerprint:
- bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
- and the bridge user can fetch the latest descriptor from the bridge
- authority (see Section 3.4).
-
-3.2. Bridges as entry guards
-
- For now, bridge users add their bridge relays to their list of "entry
- guards" (see path-spec.txt for background on entry guards). They are
- managed by the entry guard algorithms exactly as if they were a normal
- entry guard -- their keys and timing get cached in the "state" file,
- etc. This means that when the Tor user starts up with "UseBridges"
- disabled, he will skip past the bridge entries since they won't be
- listed as up and usable in his networkstatus consensus. But to be clear,
- the "entry_guards" list doesn't currently distinguish guards by purpose.
-
- Internally, each bridge user keeps a smartlist of "bridge_info_t"
- that reflects the "bridge" lines from his torrc along with a download
- schedule (see Section 3.5 below). When he starts Tor, he attempts
- to fetch a descriptor for each configured bridge (see Section 3.4
- below). When he succeeds at getting a descriptor for one of the bridges
- in his list, he adds it directly to the entry guard list using the
- normal add_an_entry_guard() interface. Once a bridge descriptor has
- been added, should_delay_dir_fetches() will stop delaying further
- directory fetches, and the user begins to bootstrap his directory
- information from that bridge (see Section 3.3).
-
- Currently bridge users cache their bridge descriptors to the
- "cached-descriptors" file (annotated with purpose "bridge"), but
- they don't make any attempt to reuse descriptors they find in this
- file. The theory is that either the bridge is available now, in which
- case you can get a fresh descriptor, or it's not, in which case an
- old descriptor won't do you much good.
-
- We could disable writing out the bridge lines to the state file, if
- we think this is a problem.
-
- As an exception, if we get an application request when we have one
- or more bridge descriptors but we believe none of them are running,
- we mark them all as running again. This is similar to the exception
- already in place to help long-idle Tor clients realize they should
- fetch fresh directory information rather than just refuse requests.
-
-3.3. Bridges as directory guards
-
- In addition to using bridges as the first hop in their circuits, bridge
- users also use them to fetch directory updates. Other than initial
- bootstrapping to find a working bridge descriptor (see Section 3.4
- below), all further non-anonymized directory fetches will be redirected
- to the bridge.
-
- This means that bridge relays need to have cached answers for all
- questions the bridge user might ask. This makes the upgrade path
- tricky --- for example, if we migrate to a v4 directory design, the
- bridge user would need to keep using v3 so long as his bridge relays
- only knew how to answer v3 queries.
-
- In a future design, for cases where the user has enough information
- to build circuits yet the chosen bridge doesn't know how to answer a
- given query, we might teach bridge users to make an anonymized request
- to a more suitable directory server.
-
-3.4. How bridge users get their bridge descriptor
-
- Bridge users can fetch bridge descriptors in two ways: by going directly
- to the bridge and asking for "/tor/server/authority", or by going to
- the bridge authority and asking for "/tor/server/fp/ID". By default,
- they will only try the direct queries. If the user sets
- UpdateBridgesFromAuthority 1
- in his config file, then he will try querying the bridge authority
- first for bridges where he knows a digest (if he only knows an IP
- address and ORPort, then his only option is a direct query).
-
- If the user has at least one working bridge, then he will do further
- queries to the bridge authority through a full three-hop Tor circuit.
- But when bootstrapping, he will make a direct begin_dir-style connection
- to the bridge authority.
-
- As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
- from the bridge authority and it returns a 404 not found, the user
- will automatically fall back to trying a direct query. Therefore it is
- recommended that bridge users always set UpdateBridgesFromAuthority,
- since at worst it will delay their fetches a little bit and notify
- the bridge authority of the identity fingerprint (but not location)
- of their intended bridges.
-
-3.5. Bridge descriptor retry schedule
-
- Bridge users try to fetch a descriptor for each bridge (using the
- steps in Section 3.4 above) on startup. Whenever they receive a
- bridge descriptor, they reschedule a new descriptor download for 1
- hour from then.
-
- If on the other hand it fails, they try again after 15 minutes for the
- first attempt, after 15 minutes for the second attempt, and after 60
- minutes for subsequent attempts.
-
- In 0.2.2.x we should come up with some smarter retry schedules.
-
-3.6. Implementation note.
-
- Vidalia 0.1.0 has a new checkbox in its Network config window called
- "My ISP blocks connections to the Tor network." Users who click that
- box change their configuration to:
- UseBridges 1
- UpdateBridgesFromAuthority 1
- and should add at least one bridge identifier.
-
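Reviewer note on the removed bridges-spec.txt: Sections 3.1 and 3.5 describe a line format and a retry schedule that are easy to misread in prose. The following is a minimal sketch under stated assumptions, not Tor's implementation. `parse_bridge_line` and `retry_delay_minutes` are hypothetical names, matching the keyword case-insensitively is an assumption, and the fingerprint-only form is rejected because the text above calls it a possible future extension.

```python
import string


def parse_bridge_line(line):
    """Parse 'bridge ADDRESS:ORPORT [FINGERPRINT]' into (host, port, fingerprint).

    The fingerprint is returned as 40 uppercase hex digits with spaces
    removed, or None when the line omits it.
    """
    parts = line.split()
    # Assumption: the keyword is matched case-insensitively.
    if len(parts) < 2 or parts[0].lower() != "bridge":
        raise ValueError("not a bridge line")
    host, sep, port = parts[1].rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError("expected address:ORPort")
    # The fingerprint may be written in space-separated groups of four.
    fingerprint = "".join(parts[2:]).upper() or None
    if fingerprint is not None:
        if len(fingerprint) != 40 or any(c not in string.hexdigits for c in fingerprint):
            raise ValueError("fingerprint must be 40 hex digits")
    return host, int(port), fingerprint


def retry_delay_minutes(failed_attempts):
    """Delay before the next fetch after the given number of failures (1-based).

    Follows Section 3.5: 15 minutes after the first and second failures,
    60 minutes after each later one.
    """
    return 15 if failed_attempts <= 2 else 60
```

A successful fetch is handled separately in Section 3.5 (a new download is scheduled one hour later), so it is deliberately left out of `retry_delay_minutes`.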
diff --git a/doc/spec/control-spec-v0.txt b/doc/spec/control-spec-v0.txt
deleted file mode 100644
index 3515d395a..000000000
--- a/doc/spec/control-spec-v0.txt
+++ /dev/null
@@ -1,498 +0,0 @@
-
- TC: A Tor control protocol (Version 0)
-
--1. Deprecation
-
-THIS PROTOCOL IS DEPRECATED. It is still documented here because Tor
-0.1.1.x happens to support much of it; but the support for v0 is not
-maintained, so you should expect it to rot in unpredictable ways. Support
-for v0 will be removed some time after Tor 0.1.2.
-
-0. Scope
-
-This document describes an implementation-specific protocol that is used
-for other programs (such as frontend user-interfaces) to communicate
-with a locally running Tor process. It is not part of the Tor onion
-routing protocol.
-
-We're trying to be pretty extensible here, but not infinitely
-forward-compatible.
-
-1. Protocol outline
-
-TC is a bidirectional message-based protocol. It assumes an underlying
-stream for communication between a controlling process (the "client") and
-a Tor process (the "server"). The stream may be implemented via TCP,
-TLS-over-TCP, a Unix-domain socket, or so on, but it must provide
-reliable in-order delivery. For security, the stream should not be
-accessible by untrusted parties.
-
-In TC, the client and server send typed variable-length messages to each
-other over the underlying stream. By default, all messages from the server
-are in response to messages from the client. Some client requests, however,
-will cause the server to send messages to the client indefinitely far into
-the future.
-
-Servers respond to messages in the order they're received.
-
-2. Message format
-
-The messages take the following format:
-
- Length [2 octets; big-endian]
- Type [2 octets; big-endian]
- Body [Length octets]
-
-Upon encountering a recognized Type, implementations behave as described in
-section 3 below. If the type is not recognized, servers respond with an
-"ERROR" message (code UNRECOGNIZED; see 3.1 below), and clients simply ignore
-the message.
-
-2.1. Types and encodings
-
- All numbers are given in big-endian (network) order.
-
- OR identities are given in hexadecimal, in the same format as identity key
- fingerprints, but without spaces; see tor-spec.txt for more information.
-
-3. Message types
-
- Message types are drawn from the following ranges:
-
- 0x0000-0xEFFF : Reserved for use by official versions of this spec.
- 0xF000-0xFFFF : Unallocated; usable by unofficial extensions.
-
-3.1. ERROR (Type 0x0000)
-
- Sent in response to a message that could not be processed as requested.
-
- The body of the message begins with a 2-byte error code. The following
- values are defined:
-
- 0x0000 Unspecified error
- []
-
- 0x0001 Internal error
- [Something went wrong inside Tor, so that the client's
- request couldn't be fulfilled.]
-
- 0x0002 Unrecognized message type
- [The client sent a message type we don't understand.]
-
- 0x0003 Syntax error
- [The client sent a message body in a format we can't parse.]
-
- 0x0004 Unrecognized configuration key
- [The client tried to get or set a configuration option we don't
- recognize.]
-
- 0x0005 Invalid configuration value
- [The client tried to set a configuration option to an
- incorrect, ill-formed, or impossible value.]
-
- 0x0006 Unrecognized byte code
- [The client tried to set a byte code (in the body) that
- we don't recognize.]
-
- 0x0007 Unauthorized.
- [The client tried to send a command that requires
- authorization, but it hasn't sent a valid AUTHENTICATE
- message.]
-
- 0x0008 Failed authentication attempt
- [The client sent a well-formed but incorrect authorization message.]
-
- 0x0009 Resource exhausted
- [The server didn't have enough of a given resource to
- fulfill a given request.]
-
- 0x000A No such stream
-
- 0x000B No such circuit
-
- 0x000C No such OR
-
- The rest of the body should be a human-readable description of the error.
-
- In general, new error codes should only be added when they don't fall under
- one of the existing error codes.
-
-3.2. DONE (Type 0x0001)
-
- Sent from server to client in response to a request that was successfully
- completed, with no more information needed. The body is usually empty but
- may contain a message.
-
-3.3. SETCONF (Type 0x0002)
-
- Change the value of a configuration variable. The body contains a list of
- newline-terminated key-value configuration lines. An individual key-value
- configuration line consists of the key, followed by a space, followed by
- the value. The server behaves as though it had just read the key-value pair
- in its configuration file.
-
- The server responds with a DONE message on success, or an ERROR message on
- failure.
-
- When a configuration option takes multiple values, or when multiple
- configuration keys form a context-sensitive group (see below), then
- setting _any_ of the options in a SETCONF command is taken to reset all of
- the others. For example, if two ORBindAddress values are configured,
- and a SETCONF command arrives containing a single ORBindAddress value, the
- new command's value replaces the two old values.
-
- To _remove_ all settings for a given option entirely (and go back to its
- default value), send a single line containing the key and no value.
-
-3.4. GETCONF (Type 0x0003)
-
- Request the value of a configuration variable. The body contains one or
- more NL-terminated strings for configuration keys. The server replies
- with a CONFVALUE message.
-
- If an option appears multiple times in the configuration, all of its
- key-value pairs are returned in order.
-
- Some options are context-sensitive, and depend on other options with
- different keywords. These cannot be fetched directly. Currently there
- is only one such option: clients should use the "HiddenServiceOptions"
- virtual keyword to get all HiddenServiceDir, HiddenServicePort,
- HiddenServiceNodes, and HiddenServiceExcludeNodes option settings.
-
-3.5. CONFVALUE (Type 0x0004)
-
- Sent in response to a GETCONF message; contains a list of "Key Value\n"
- (A non-whitespace keyword, a single space, a non-NL value, a NL)
- strings.
-
-3.6. SETEVENTS (Type 0x0005)
-
- Request the server to inform the client about interesting events.
- The body contains a list of 2-byte event codes (see "event" below).
- Any events *not* listed in the SETEVENTS body are turned off; thus, sending
- SETEVENTS with an empty body turns off all event reporting.
-
- The server responds with a DONE message on success, and an ERROR message
- if one of the event codes isn't recognized. (On error, the list of active
- event codes isn't changed.)
-
-3.7. EVENT (Type 0x0006)
-
- Sent from the server to the client when an event has occurred and the
- client has requested that kind of event. The body contains a 2-byte
- event code followed by additional event-dependent information. Event
- codes are:
- 0x0001 -- Circuit status changed
-
- Status [1 octet]
- 0x00 Launched - circuit ID assigned to new circuit
- 0x01 Built - all hops finished, can now accept streams
- 0x02 Extended - one more hop has been completed
- 0x03 Failed - circuit closed (was not built)
- 0x04 Closed - circuit closed (was built)
- Circuit ID [4 octets]
- (Must be unique to Tor process/time)
- Path [NUL-terminated comma-separated string]
- (For extended/failed, is the portion of the path that is
- built)
-
- 0x0002 -- Stream status changed
-
- Status [1 octet]
- (Sent connect=0,sent resolve=1,succeeded=2,failed=3,
- closed=4, new connection=5, new resolve request=6,
- stream detached from circuit and still retriable=7)
- Stream ID [4 octets]
- (Must be unique to Tor process/time)
- Target [NUL-terminated address-port string]
-
- 0x0003 -- OR Connection status changed
-
- Status [1 octet]
- (Launched=0,connected=1,failed=2,closed=3)
- OR nickname/identity [NUL-terminated]
-
- 0x0004 -- Bandwidth used in the last second
-
- Bytes read [4 octets]
- Bytes written [4 octets]
-
- 0x0005 -- Notice/warning/error occurred
-
- Message [NUL-terminated]
-
- <obsolete: use 0x0007-0x000B instead.>
-
- 0x0006 -- New descriptors available
-
- OR List [NUL-terminated, comma-delimited list of
- OR identity]
-
- 0x0007 -- Debug message occurred
- 0x0008 -- Info message occurred
- 0x0009 -- Notice message occurred
- 0x000A -- Warning message occurred
- 0x000B -- Error message occurred
-
- Message [NUL-terminated]
-
-3.8. AUTHENTICATE (Type 0x0007)
-
- Sent from the client to the server. Contains a 'magic cookie' to prove
- that client is really allowed to control this Tor process. The server
- responds with DONE or ERROR.
-
- The format of the 'cookie' is implementation-dependent; see 4.1 below for
- information on how the standard Tor implementation handles it.
-
-3.9. SAVECONF (Type 0x0008)
-
- Sent from the client to the server. Instructs the server to write out
- its config options into its torrc. Server returns DONE if successful, or
- ERROR if it can't write the file or some other error occurs.
-
-3.10. SIGNAL (Type 0x0009)
-
- Sent from the client to the server. The body contains one byte that
- indicates the action the client wishes the server to take.
-
- 1 (0x01) -- Reload: reload config items, refetch directory.
- 2 (0x02) -- Controlled shutdown: if server is an OP, exit immediately.
- If it's an OR, close listeners and exit after 30 seconds.
- 10 (0x0A) -- Dump stats: log information about open connections and
- circuits.
- 12 (0x0C) -- Debug: switch all open logs to loglevel debug.
- 15 (0x0F) -- Immediate shutdown: clean up and exit now.
-
- The server responds with DONE if the signal is recognized (or simply
- closes the socket if it was asked to close immediately), else ERROR.
-
-3.11. MAPADDRESS (Type 0x000A)
-
- Sent from the client to the server. The body contains a sequence of
- address mappings, each consisting of the address to be mapped, a single
- space, the replacement address, and a NL character.
-
- Addresses may be IPv4 addresses, IPv6 addresses, or hostnames.
-
- The client sends this message to the server in order to tell it that future
- SOCKS requests for connections to the original address should be replaced
- with connections to the specified replacement address. If the addresses
- are well-formed, and the server is able to fulfill the request, the server
- replies with a single DONE message containing the source and destination
- addresses. If the request is malformed, the server replies with a syntax
- error message. If the server can't fulfill the request, it replies with an
- internal ERROR message.
-
- The client may decline to provide a body for the original address, and
- instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or
- "." for hostname), signifying that the server should choose the original
- address itself, and return that address in the DONE message. The server
- should ensure that it returns an element of address space that is unlikely
- to be in actual use. If there is already an address mapped to the
- destination address, the server may reuse that mapping.
-
- If the original address is already mapped to a different address, the old
- mapping is removed. If the original address and the destination address
- are the same, the server removes any mapping in place for the original
- address.
-
- {Note: This feature is designed to be used to help Tor-ify applications
- that need to use SOCKS4 or hostname-less SOCKS5. There are three
- approaches to doing this:
- 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead.
- 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS
- feature) to resolve the hostname remotely. This doesn't work
- with special addresses like x.onion or x.y.exit.
- 3. Use MAPADDRESS to map an IP address to the desired hostname, and then
- arrange to fool the application into thinking that the hostname
- has resolved to that IP.
- This functionality is designed to help implement the 3rd approach.}
-
- [XXXX When, if ever, can mappings expire? Should they expire?]
- [XXXX What addresses, if any, are safe to use?]
-
-3.12 GETINFO (Type 0x000B)
-
- Sent from the client to the server. The message body is as for GETCONF:
- one or more NL-terminated strings. The server replies with an INFOVALUE
- message.
-
- Unlike GETCONF, this message is used for data that are not stored in the
- Tor configuration file.
-
- Recognized keys and their values include:
-
- "version" -- The version of the server's software, including the name
- of the software. (example: "Tor 0.0.9.4")
-
- "desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest server
- descriptor for a given OR, NUL-terminated. If no such OR is known, the
- corresponding value is an empty string.
-
- "network-status" -- a space-separated list of all known OR identities.
- This is in the same format as the router-status line in directories;
- see tor-spec.txt for details.
-
- "addr-mappings/all"
- "addr-mappings/config"
- "addr-mappings/cache"
- "addr-mappings/control" -- a NL-terminated list of address mappings, each
- in the form of "from-address" SP "to-address". The 'config' key
- returns those address mappings set in the configuration; the 'cache'
- key returns the mappings in the client-side DNS cache; the 'control'
- key returns the mappings set via the control interface; the 'all'
- target returns the mappings set through any mechanism.
-
-3.13 INFOVALUE (Type 0x000C)
-
- Sent from the server to the client in response to a GETINFO message.
- Contains one or more items of the format:
-
- Key [(NUL-terminated string)]
- Value [(NUL-terminated string)]
-
- The keys match those given in the GETINFO message.
-
-3.14 EXTENDCIRCUIT (Type 0x000D)
-
- Sent from the client to the server. The message body contains two fields:
- Circuit ID [4 octets]
- Path [NUL-terminated, comma-delimited string of OR nickname/identity]
-
- This request takes one of two forms: either the Circuit ID is zero, in
- which case it is a request for the server to build a new circuit according
- to the specified path, or the Circuit ID is nonzero, in which case it is a
- request for the server to extend an existing circuit with that ID according
- to the specified path.
-
- If the request is successful, the server sends a DONE message containing
- a message body consisting of the four-octet Circuit ID of the newly created
- circuit.
-
-3.15 ATTACHSTREAM (Type 0x000E)
-
- Sent from the client to the server. The message body contains two fields:
- Stream ID [4 octets]
- Circuit ID [4 octets]
-
- This message informs the server that the specified stream should be
- associated with the specified circuit. Each stream may be associated with
- at most one circuit, and multiple streams may share the same circuit.
- Streams can only be attached to completed circuits (that is, circuits that
- have sent a circuit status 'built' event).
-
- If the circuit ID is 0, responsibility for attaching the given stream is
- returned to Tor.
-
- {Implementation note: By default, Tor automatically attaches streams to
- circuits itself, unless the configuration variable
- "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams
- via TC when "__LeaveStreamsUnattached" is false may cause a race between
- Tor and the controller, as both attempt to attach streams to circuits.}
-
-3.16 POSTDESCRIPTOR (Type 0x000F)
-
- Sent from the client to the server. The message body contains one field:
- Descriptor [NUL-terminated string]
-
- This message informs the server about a new descriptor.
-
- The descriptor, when parsed, must contain a number of well-specified
- fields, including fields for its nickname and identity.
-
- If there is an error in parsing the descriptor, the server must send an
- appropriate error message. If the descriptor is well-formed but the server
- chooses not to add it, it must reply with a DONE message whose body
- explains why the server was not added.
-
-3.17 FRAGMENTHEADER (Type 0x0010)
-
- Sent in either direction. Used to encapsulate messages longer than 65535
- bytes in length.
-
- Underlying type [2 bytes]
- Total Length [4 bytes]
- Data [Rest of message]
-
- A FRAGMENTHEADER message MUST be followed immediately by a number of
- FRAGMENT messages, such that lengths of the "Data" fields of the
- FRAGMENTHEADER and FRAGMENT messages add to the "Total Length" field of the
- FRAGMENTHEADER message.
-
- Implementations MUST NOT fragment messages of length less than 65536 bytes.
- Implementations MUST be able to process fragmented messages that are not
- optimally packed.
-
-3.18 FRAGMENT (Type 0x0011)
-
- Data [Entire message]
-
- See FRAGMENTHEADER for more information.
-
-3.19 REDIRECTSTREAM (Type 0x0012)
-
- Sent from the client to the server. The message body contains two fields:
- Stream ID [4 octets]
- Address [variable-length, NUL-terminated.]
-
- Tells the server to change the exit address on the specified stream. No
- remapping is performed on the new provided address.
-
- To be sure that the modified address will be used, this message must be sent
- after a new stream event is received, and before attaching this stream to
- a circuit.
-
-3.20 CLOSESTREAM (Type 0x0013)
-
- Sent from the client to the server. The message body contains three
- fields:
- Stream ID [4 octets]
- Reason [1 octet]
- Flags [1 octet]
-
- Tells the server to close the specified stream. The reason should be
- one of the Tor RELAY_END reasons given in tor-spec.txt. Flags is not
- used currently. Tor may hold the stream open for a while to flush
- any data that is pending.
-
-3.21 CLOSECIRCUIT (Type 0x0014)
-
- Sent from the client to the server. The message body contains two
- fields:
- Circuit ID [4 octets]
- Flags [1 octet]
-
- Tells the server to close the specified circuit. If the LSB of the flags
- field is nonzero, do not close the circuit unless it is unused.
-
-4. Implementation notes
-
-4.1. Authentication
-
- By default, the current Tor implementation trusts all local users.
-
- If the 'CookieAuthentication' option is true, Tor writes a "magic cookie"
- file named "control_auth_cookie" into its data directory. To authenticate,
- the controller must send the contents of this file.
-
- If the 'HashedControlPassword' option is set, it must contain the salted
- hash of a secret password. The salted hash is computed according to the
- S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier.
- This is then encoded in hexadecimal, prefixed by the indicator sequence
- "16:". Thus, for example, the password 'foo' could encode to:
- 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2
-    ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-    salt              hashed value
-                    indicator
- You can generate the salted hash of a password by calling
- 'tor --hash-password <password>'
- or by using the example code in the Python and Java controller libraries.
- To authenticate under this scheme, the controller sends Tor the original
- secret that was used to generate the password.
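The layout above can be sketched in Python as follows. This is a minimal illustration, not Tor's own code: it assumes the iterated-and-salted S2K variant of RFC 2440 with SHA-1 as the hash (the 40 hex digits of the hashed value suggest a 160-bit digest) and treats the octet after the salt as the RFC 2440 count specifier. The function name `hash_control_password` is hypothetical.

```python
import hashlib
import os


def hash_control_password(password: bytes, salt: bytes = None,
                          indicator: int = 0x60) -> str:
    """Return a value laid out as "16:" + hex(salt + indicator + hash).

    Assumptions (not stated in the text above): SHA-1 is the hash, and
    `indicator` is the RFC 2440 one-octet iteration count specifier.
    """
    if salt is None:
        salt = os.urandom(8)  # 8 octets, shown as 16 hex digits
    if len(salt) != 8:
        raise ValueError("salt must be exactly 8 octets")
    # RFC 2440, section 3.6.1.3: decode the count specifier into an
    # octet count (0x60 decodes to 65536).
    count = (16 + (indicator & 15)) << ((indicator >> 4) + 6)
    material = salt + password
    # At least one full copy of salt+password is always hashed.
    count = max(count, len(material))
    full, rest = divmod(count, len(material))
    digest = hashlib.sha1(material * full + material[:rest]).digest()
    return "16:" + (salt + bytes([indicator]) + digest).hex().upper()
```

Calling `hash_control_password(b"foo")` yields a fresh random salt each time, so two runs give different strings for the same password.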
-
-4.2. Don't let the buffer get too big.
-
- If you ask for lots of events, and 16MB of them queue up on the buffer,
- the Tor process will close the socket.
-
diff --git a/doc/spec/control-spec.txt b/doc/spec/control-spec.txt
deleted file mode 100644
index f86f94ba6..000000000
--- a/doc/spec/control-spec.txt
+++ /dev/null
@@ -1,2001 +0,0 @@
-
- TC: A Tor control protocol (Version 1)
-
-0. Scope
-
- This document describes an implementation-specific protocol that is used
- for other programs (such as frontend user-interfaces) to communicate with a
- locally running Tor process. It is not part of the Tor onion routing
- protocol.
-
- This protocol replaces version 0 of TC, which is now deprecated. For
- reference, version 0 of TC is described in "control-spec-v0.txt". Implementors are
- recommended to avoid using TC directly, but instead to use a library that
- can easily be updated to use the newer protocol. (Version 0 is used by Tor
- versions 0.1.0.x; the protocol in this document only works with Tor
- versions in the 0.1.1.x series and later.)
-
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
- NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
- "OPTIONAL" in this document are to be interpreted as described in
- RFC 2119.
-
-1. Protocol outline
-
- TC is a bidirectional message-based protocol. It assumes an underlying
- stream for communication between a controlling process (the "client"
- or "controller") and a Tor process (or "server"). The stream may be
- implemented via TCP, TLS-over-TCP, a Unix-domain socket, and so on,
- but it must provide reliable in-order delivery. For security, the
- stream should not be accessible by untrusted parties.
-
- In TC, the client and server send typed messages to each other over the
- underlying stream. The client sends "commands" and the server sends
- "replies".
-
- By default, all messages from the server are in response to messages from
- the client. Some client requests, however, will cause the server to send
- messages to the client indefinitely far into the future. Such
- "asynchronous" replies are marked as such.
-
- Servers respond to messages in the order messages are received.
-
-2. Message format
-
-2.1. Description format
-
- The message formats listed below use ABNF as described in RFC 2234.
- The protocol itself is loosely based on SMTP (see RFC 2821).
-
- We use the following nonterminals from RFC 2822: atom, qcontent
-
- We define the following general-use nonterminals:
-
- String = DQUOTE *qcontent DQUOTE
-
- There are explicitly no limits on line length. All 8-bit characters are
- permitted unless explicitly disallowed.
-
- Wherever CRLF is specified to be accepted from the controller, Tor MAY also
- accept LF. Tor, however, MUST NOT generate LF instead of CRLF.
- Controllers SHOULD always send CRLF.
-
-2.2. Commands from controller to Tor
-
- Command = Keyword Arguments CRLF / "+" Keyword Arguments CRLF Data
- Keyword = 1*ALPHA
- Arguments = *(SP / VCHAR)
-
- Specific commands and their arguments are described below in section 3.
-
-2.3. Replies from Tor to the controller
-
- Reply = SyncReply / AsyncReply
- SyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine
- AsyncReply = *(MidReplyLine / DataReplyLine) EndReplyLine
-
- MidReplyLine = StatusCode "-" ReplyLine
- DataReplyLine = StatusCode "+" ReplyLine Data
- EndReplyLine = StatusCode SP ReplyLine
- ReplyLine = [ReplyText] CRLF
- ReplyText = XXXX
- StatusCode = 3DIGIT
-
- Specific replies are mentioned below in section 3, and described more fully
- in section 4.
-
- [Compatibility note: versions of Tor before 0.2.0.3-alpha sometimes
- generate AsyncReplies of the form "*(MidReplyLine / DataReplyLine)".
- This is incorrect, but controllers that need to work with these
- versions of Tor should be prepared to get multi-line AsyncReplies with
- the final line (usually "650 OK") omitted.]
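As an illustration of the reply grammar in this section, the following Python sketch groups the lines of one reply and collects the Data section that follows a DataReplyLine. It assumes the caller has already split the stream on CRLF and stripped the line endings; `parse_reply` and its return shape are hypothetical names chosen for this example, and the pre-0.2.0.3-alpha quirk described in the compatibility note is not handled.

```python
def parse_reply(lines):
    """Parse one reply into (status, entries).

    `lines` is an iterable of str with the trailing CRLF removed.
    Each entry is (separator, text, data), where data is a list of
    lines for a DataReplyLine ("+") and None otherwise.
    """
    it = iter(lines)
    entries = []
    for line in it:
        well_formed = (
            len(line) >= 4
            and all(c in "0123456789" for c in line[:3])
            and line[3] in "-+ "
        )
        if not well_formed:
            raise ValueError("malformed reply line: %r" % line)
        status, sep, text = line[:3], line[3], line[4:]
        data = None
        if sep == "+":
            # DataReplyLine: read up to the lone "." terminator.
            data = []
            for dline in it:
                if dline == ".":
                    break
                # Undo dot-escaping on lines that begin with ".".
                data.append(dline[1:] if dline.startswith(".") else dline)
            else:
                raise ValueError("unterminated data section")
        entries.append((sep, text, data))
        if sep == " ":
            # EndReplyLine closes the reply.
            return status, entries
    raise ValueError("reply ended without an EndReplyLine")
```

The returned status is that of the EndReplyLine; a stricter parser might also check that every line of the reply carries the same status code.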
-
-2.4. General-use tokens
-
- ; CRLF means, "the ASCII Carriage Return character (decimal value 13)
- ; followed by the ASCII Linefeed character (decimal value 10)."
- CRLF = CR LF
-
- ; How a controller tells Tor about a particular OR. There are four
- ; possible formats:
- ; $Fingerprint -- The router whose identity key hashes to the fingerprint.
- ; This is the preferred way to refer to an OR.
- ; $Fingerprint~Nickname -- The router whose identity key hashes to the
- ; given fingerprint, but only if the router has the given nickname.
- ; $Fingerprint=Nickname -- The router whose identity key hashes to the
- ; given fingerprint, but only if the router is Named and has the given
- ; nickname.
- ; Nickname -- The Named router with the given nickname, or, if no such
- ; router exists, any router whose nickname matches the one given.
- ; This is not a safe way to refer to routers, since Named status
- ; could under some circumstances change over time.
- ;
- ; The tokens that implement the above follow:
-
- ServerSpec = LongName / Nickname
- LongName = Fingerprint [ ( "=" / "~" ) Nickname ]
-
- Fingerprint = "$" 40*HEXDIG
- NicknameChar = "a"-"z" / "A"-"Z" / "0" - "9"
- Nickname = 1*19 NicknameChar
-
- ; What follows is an outdated way to refer to ORs.
- ; Feature VERBOSE_NAMES replaces ServerID with LongName in events and
- ; GETINFO results. VERBOSE_NAMES can be enabled starting in Tor version
- ; 0.1.2.2-alpha and it is always-on in 0.2.2.1-alpha and later.
- ServerID = Nickname / Fingerprint
-
-
- ; Unique identifiers for streams or circuits. Currently, Tor only
- ; uses digits, but this may change
- StreamID = 1*16 IDChar
- CircuitID = 1*16 IDChar
- IDChar = ALPHA / DIGIT
-
- Address = ip4-address / ip6-address / hostname (XXXX Define these)
-
- ; A "Data" section is a sequence of octets concluded by the terminating
- ; sequence CRLF "." CRLF. The terminating sequence may not appear in the
- ; body of the data. Leading periods on lines in the data are escaped with
- ; an additional leading period as in RFC 2821 section 4.5.2.
- Data = *DataLine "." CRLF
- DataLine = CRLF / "." 1*LineItem CRLF / NonDotItem *LineItem CRLF
- LineItem = NonCR / 1*CR NonCRLF
- NonDotItem = NonDotCR / 1*CR NonCRLF
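The dot-escaping rule for Data sections can be sketched as a pair of helpers. This is an illustrative sketch under stated assumptions: the caller's payload uses "\n" between lines, Latin-1 is used only so that all 8-bit characters survive unchanged, and `decode_data` is handed exactly one complete Data section. The names `encode_data` and `decode_data` are hypothetical.

```python
def encode_data(payload: str) -> bytes:
    """Encode a payload as a Data section ending in "." CRLF."""
    out = []
    for line in payload.split("\n"):
        if line.startswith("."):
            line = "." + line  # escape a leading period
        out.append(line + "\r\n")
    out.append(".\r\n")  # terminating sequence
    return "".join(out).encode("latin-1")


def decode_data(raw: bytes) -> str:
    """Decode one complete Data section back into its payload."""
    text = raw.decode("latin-1")
    if not text.endswith(".\r\n"):
        raise ValueError("missing terminating sequence")
    lines = text[:-3].split("\r\n")
    lines.pop()  # drop the empty piece after the last CRLF
    # Remove the extra period added by the sender.
    return "\n".join(l[1:] if l.startswith(".") else l for l in lines)
```

Because a payload line consisting of a single period is sent as "..", the terminating sequence cannot appear inside the encoded body.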
-
-3. Commands
-
- All commands are case-insensitive, but most keywords are case-sensitive.
-
-3.1. SETCONF
-
- Change the value of one or more configuration variables. The syntax is:
-
- "SETCONF" 1*(SP keyword ["=" value]) CRLF
- value = String / QuotedString
-
- Tor behaves as though it had just read each of the key-value pairs
- from its configuration file. Keywords with no corresponding values have
- their configuration values reset to 0 or NULL (use RESETCONF if you want
- to set it back to its default). SETCONF is all-or-nothing: if there
- is an error in any of the configuration settings, Tor sets none of them.
-
- Tor responds with a "250 configuration values set" reply on success.
- If some of the listed keywords can't be found, Tor replies with a
- "552 Unrecognized option" message. Otherwise, Tor responds with a
- "513 syntax error in configuration values" reply on syntax error, or a
- "553 impossible configuration setting" reply on a semantic error.
-
- When a configuration option takes multiple values, or when multiple
- configuration keys form a context-sensitive group (see GETCONF below), then
- setting _any_ of the options in a SETCONF command is taken to reset all of
- the others. For example, if two ORBindAddress values are configured, and a
- SETCONF command arrives containing a single ORBindAddress value, the new
- command's value replaces the two old values.
-
- Sometimes it is not possible to change configuration options solely by
- issuing a series of SETCONF commands, because the value of one of the
- configuration options depends on the value of another which has not yet
- been set. Such situations can be overcome by setting multiple configuration
- options with a single SETCONF command (e.g. SETCONF ORPort=443
- ORListenAddress=9001).
-
-3.2. RESETCONF
-
- Remove all settings for a given configuration option entirely, assign
- its default value (if any), and then assign the String provided.
- Typically the String is left empty, to simply set an option back to
- its default. The syntax is:
-
- "RESETCONF" 1*(SP keyword ["=" String]) CRLF
-
- Otherwise it behaves like SETCONF above.
-
-3.3. GETCONF
-
- Request the value of a configuration variable. The syntax is:
-
- "GETCONF" 1*(SP keyword) CRLF
-
- If all of the listed keywords exist in the Tor configuration, Tor replies
- with a series of reply lines of the form:
- 250 keyword=value
- If any option is set to a 'default' value semantically different from an
- empty string, Tor may reply with a reply line of the form:
- 250 keyword
-
- Value may be a raw value or a quoted string. Tor will try to use
- unquoted values except when the value could be misinterpreted through
- not being quoted.
-
- If some of the listed keywords can't be found, Tor replies with a
- "552 unknown configuration keyword" message.
-
- If an option appears multiple times in the configuration, all of its
- key-value pairs are returned in order.
-
- Some options are context-sensitive, and depend on other options with
- different keywords. These cannot be fetched directly. Currently there
- is only one such option: clients should use the "HiddenServiceOptions"
- virtual keyword to get all HiddenServiceDir, HiddenServicePort,
- HiddenServiceNodes, and HiddenServiceExcludeNodes option settings.
-
-3.4. SETEVENTS
-
- Request the server to inform the client about interesting events. The
- syntax is:
-
- "SETEVENTS" [SP "EXTENDED"] *(SP EventCode) CRLF
-
- EventCode = "CIRC" / "STREAM" / "ORCONN" / "BW" / "DEBUG" /
- "INFO" / "NOTICE" / "WARN" / "ERR" / "NEWDESC" / "ADDRMAP" /
- "AUTHDIR_NEWDESCS" / "DESCCHANGED" / "STATUS_GENERAL" /
- "STATUS_CLIENT" / "STATUS_SERVER" / "GUARD" / "NS" / "STREAM_BW" /
- "CLIENTS_SEEN" / "NEWCONSENSUS" / "BUILDTIMEOUT_SET" / "SIGNAL"
-
- Any events *not* listed in the SETEVENTS line are turned off; thus, sending
- SETEVENTS with an empty body turns off all event reporting.
-
- The server responds with a "250 OK" reply on success, and a "552
- Unrecognized event" reply if one of the event codes isn't recognized. (On
- error, the list of active event codes isn't changed.)
-
- If the flag string "EXTENDED" is provided, Tor may provide extra
- information with events for this connection; see 4.1 for more information.
- NOTE: All events on a given connection will be provided in extended format,
- or none.
- NOTE: "EXTENDED" is only supported in Tor 0.1.1.9-alpha or later.
-
- Each event is described in more detail in Section 4.1.
-
-3.5. AUTHENTICATE
-
- Sent from the client to the server. The syntax is:
- "AUTHENTICATE" [ SP 1*HEXDIG / QuotedString ] CRLF
-
- The server responds with "250 OK" on success or "515 Bad authentication" if
- the authentication cookie is incorrect. Tor closes the connection on an
- authentication failure.
-
- The format of the 'cookie' is implementation-dependent; see 5.1 below for
- information on how the standard Tor implementation handles it.
-
- Before the client has authenticated, no command other than PROTOCOLINFO,
- AUTHENTICATE, or QUIT is valid. If the controller sends any other command,
- or sends a malformed command, or sends an unsuccessful AUTHENTICATE
- command, or sends PROTOCOLINFO more than once, Tor sends an error reply and
- closes the connection.
-
- To prevent some cross-protocol attacks, the AUTHENTICATE command is still
- required even if all authentication methods in Tor are disabled. In this
- case, the controller should just send "AUTHENTICATE" CRLF.
-
- (Versions of Tor before 0.1.2.16 and 0.2.0.4-alpha did not close the
- connection after an authentication failure.)
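A controller might assemble the AUTHENTICATE line along these lines. This is a sketch, not a reference implementation: the helper name `authenticate_command` is hypothetical, and because QuotedString is not defined in this part of the document, the backslash escaping of `"` and `\` shown here is an assumption.

```python
def authenticate_command(cookie: bytes = None, password: str = None) -> bytes:
    """Build an AUTHENTICATE command line.

    cookie   -- raw secret octets, sent as hex digits
    password -- secret sent as a quoted string (escaping is assumed)
    With neither argument, the bare command is produced, as used when
    all authentication methods are disabled.
    """
    if cookie is not None and password is not None:
        raise ValueError("give a cookie or a password, not both")
    if cookie is not None:
        if not cookie:
            raise ValueError("the grammar requires at least one hex digit")
        arg = " " + cookie.hex().upper()
    elif password is not None:
        escaped = password.replace("\\", "\\\\").replace('"', '\\"')
        arg = ' "' + escaped + '"'
    else:
        arg = ""
    return ("AUTHENTICATE" + arg + "\r\n").encode("latin-1")
```

After sending the line, the controller should expect either "250 OK" or an error reply followed by the connection being closed.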
-
-3.6. SAVECONF
-
- Sent from the client to the server. The syntax is:
- "SAVECONF" CRLF
-
- Instructs the server to write out its config options into its torrc. Server
- returns "250 OK" if successful, or "551 Unable to write configuration
- to disk" if it can't write the file or some other error occurs.
-
- See also the "getinfo config-text" command, if the controller wants
- to write the torrc file itself.
-
-3.7. SIGNAL
-
- Sent from the client to the server. The syntax is:
-
- "SIGNAL" SP Signal CRLF
-
- Signal = "RELOAD" / "SHUTDOWN" / "DUMP" / "DEBUG" / "HALT" /
- "HUP" / "INT" / "USR1" / "USR2" / "TERM" / "NEWNYM" /
- "CLEARDNSCACHE"
-
- The meanings of the signals are:
-
- RELOAD -- Reload: reload config items, refetch directory. (like HUP)
- SHUTDOWN -- Controlled shutdown: if server is an OP, exit immediately.
- If it's an OR, close listeners and exit after 30 seconds.
- (like INT)
- DUMP -- Dump stats: log information about open connections and
- circuits. (like USR1)
- DEBUG -- Debug: switch all open logs to loglevel debug. (like USR2)
- HALT -- Immediate shutdown: clean up and exit now. (like TERM)
- CLEARDNSCACHE -- Forget the client-side cached IPs for all hostnames.
- NEWNYM -- Switch to clean circuits, so new application requests
- don't share any circuits with old ones. Also clears
- the client-side DNS cache. (Tor MAY rate-limit its
- response to this signal.)
-
- The server responds with "250 OK" if the signal is recognized (or simply
- closes the socket if it was asked to close immediately), or "552
- Unrecognized signal" if the signal is unrecognized.
-
-3.8. MAPADDRESS
-
- Sent from the client to the server. The syntax is:
-
- "MAPADDRESS" 1*(Address "=" Address SP) CRLF
-
- The first address in each pair is an "original" address; the second is a
- "replacement" address. The client sends this message to the server in
- order to tell it that future SOCKS requests for connections to the original
- address should be replaced with connections to the specified replacement
- address. If the addresses are well-formed, and the server is able to
- fulfill the request, the server replies with a 250 message:
- 250-OldAddress1=NewAddress1
- 250 OldAddress2=NewAddress2
-
- containing the source and destination addresses. If the request is
- malformed, the server replies with "512 syntax error in command
- argument". If the server can't fulfill the request, it replies with
- "451 resource exhausted".
-
- The client may decline to provide a body for the original address, and
- instead send a special null address ("0.0.0.0" for IPv4, "::0" for IPv6, or
- "." for hostname), signifying that the server should choose the original
- address itself, and return that address in the reply. The server
- should ensure that it returns an element of address space that is unlikely
- to be in actual use. If there is already an address mapped to the
- destination address, the server may reuse that mapping.
-
- If the original address is already mapped to a different address, the old
- mapping is removed. If the original address and the destination address
- are the same, the server removes any mapping in place for the original
- address.
-
- Example:
- C: MAPADDRESS 0.0.0.0=torproject.org 1.2.3.4=tor.freehaven.net
- S: 250-127.192.10.10=torproject.org
- S: 250 1.2.3.4=tor.freehaven.net
-
- {Note: This feature is designed to be used to help Tor-ify applications
- that need to use SOCKS4 or hostname-less SOCKS5. There are three
- approaches to doing this:
- 1. Somehow make them use SOCKS4a or SOCKS5-with-hostnames instead.
- 2. Use tor-resolve (or another interface to Tor's resolve-over-SOCKS
- feature) to resolve the hostname remotely. This doesn't work
- with special addresses like x.onion or x.y.exit.
- 3. Use MAPADDRESS to map an IP address to the desired hostname, and then
- arrange to fool the application into thinking that the hostname
- has resolved to that IP.
- This functionality is designed to help implement the 3rd approach.}
-
- Mappings set by the controller last until the Tor process exits:
- they never expire. If the controller wants the mapping to last only
- a certain time, then it must explicitly un-map the address when that
- time has elapsed.
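The example exchange in this section could be produced and consumed by helpers like the following. This is a hedged sketch with hypothetical names (`mapaddress_command`, `parse_mapaddress_reply`); it follows the spacing of the example (a space after the keyword and between pairs) rather than the literal grammar, and it assumes reply lines arrive with CRLF already stripped.

```python
def mapaddress_command(pairs):
    """Build a MAPADDRESS line from (original, replacement) pairs."""
    if not pairs:
        raise ValueError("at least one mapping is required")
    body = " ".join("%s=%s" % (orig, repl) for orig, repl in pairs)
    return "MAPADDRESS " + body + "\r\n"


def parse_mapaddress_reply(lines):
    """Return [(original, replacement), ...] from a 250 reply."""
    mappings = []
    for line in lines:
        if not (line.startswith("250-") or line.startswith("250 ")):
            raise ValueError("unexpected reply line: %r" % line)
        original, sep, replacement = line[4:].partition("=")
        if not sep:
            raise ValueError("no mapping in line: %r" % line)
        mappings.append((original, replacement))
    return mappings
```

When a null address such as "0.0.0.0" is sent as the original, the first element of the matching returned pair is the address the server chose.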
-
-3.9. GETINFO
-
- Sent from the client to the server. The syntax is as for GETCONF:
- "GETINFO" 1*(SP keyword) CRLF
- The server replies with the requested values in the format described
- below, or with a 551 or 552 error.
-
- Unlike GETCONF, this message is used for data that are not stored in the Tor
- configuration file, and that may be longer than a single line. On success,
- one ReplyLine is sent for each requested value, followed by a final 250 OK
- ReplyLine. If a value fits on a single line, the format is:
- 250-keyword=value
- If a value must be split over multiple lines, the format is:
- 250+keyword=
- value
- .
- Recognized keys and their values include:
-
- "version" -- The version of the server's software, including the name
- of the software. (example: "Tor 0.0.9.4")
-
- "config-file" -- The location of Tor's configuration file ("torrc").
-
- "config-text" -- The contents that Tor would write if you send it
- a SAVECONF command, so the controller can write the file to
- disk itself. [First implemented in 0.2.2.7-alpha.]
-
- ["exit-policy/prepend" -- The default exit policy lines that Tor will
- *prepend* to the ExitPolicy config option.
- -- Never implemented. Useful?]
-
- "exit-policy/default" -- The default exit policy lines that Tor will
- *append* to the ExitPolicy config option.
-
- "desc/id/<OR identity>" or "desc/name/<OR nickname>" -- the latest
- server descriptor for a given OR, NUL-terminated.
-
- "desc-annotations/id/<OR identity>" -- outputs the annotations string
- (source, timestamp of arrival, purpose, etc) for the corresponding
- descriptor. [First implemented in 0.2.0.13-alpha.]
-
- "extra-info/digest/<digest>" -- the extrainfo document whose digest (in
- hex) is <digest>. Only available if we're downloading extra-info
- documents.
-
- "ns/id/<OR identity>" or "ns/name/<OR nickname>" -- the latest router
- status info (v2 directory style) for a given OR. Router status
- info is as given in
- dir-spec.txt, and reflects the current beliefs of this Tor about the
- router in question. Like directory clients, controllers MUST
- tolerate unrecognized flags and lines. The published date and
- descriptor digest are those believed to be best by this Tor,
- not necessarily those for a descriptor that Tor currently has.
- [First implemented in 0.1.2.3-alpha.]
-
- "ns/all" -- Router status info (v2 directory style) for all ORs we
- have an opinion about, joined by newlines. [First implemented
- in 0.1.2.3-alpha.]
-
- "ns/purpose/<purpose>" -- Router status info (v2 directory style)
- for all ORs of this purpose. Mostly designed for /ns/purpose/bridge
- queries. [First implemented in 0.2.0.13-alpha.]
-
- "desc/all-recent" -- the latest server descriptor for every router that
- Tor knows about.
-
- "network-status" -- a space-separated list (v1 directory style)
- of all known OR identities. This is in the same format as the
- router-status line in v1 directories; see dir-spec-v1.txt section
- 3 for details. (If VERBOSE_NAMES is enabled, the output will
- not conform to dir-spec-v1.txt; instead, the result will be a
- space-separated list of LongName, each preceded by a "!" if it is
- believed to be not running.) This option is deprecated; use
- "ns/all" instead.
-
- "address-mappings/all"
- "address-mappings/config"
- "address-mappings/cache"
- "address-mappings/control" -- a \r\n-separated list of address
- mappings, each in the form of "from-address to-address expiry".
- The 'config' key returns those address mappings set in the
- configuration; the 'cache' key returns the mappings in the
- client-side DNS cache; the 'control' key returns the mappings set
- via the control interface; the 'all' target returns the mappings
- set through any mechanism.
- Expiry is formatted as with ADDRMAP events, except that "expiry" is
- always a time in GMT or the string "NEVER"; see section 4.1.7.
- First introduced in 0.2.0.3-alpha.
-
- "addr-mappings/*" -- as for address-mappings/*, but without the
- expiry portion of the value. Use of this value is deprecated
- since 0.2.0.3-alpha; use address-mappings instead.
-
- "address" -- the best guess at our external IP address. If we
- have no guess, return a 551 error. (Added in 0.1.2.2-alpha)
-
- "fingerprint" -- the contents of the fingerprint file that Tor
- writes as a server, or a 551 if we're not a server currently.
- (Added in 0.1.2.3-alpha)
-
- "circuit-status"
- A series of lines as for a circuit status event. Each line is of
- the form:
- CircuitID SP CircStatus [SP Path] CRLF
-
- "stream-status"
- A series of lines as for a stream status event. Each is of the form:
- StreamID SP StreamStatus SP CircID SP Target CRLF
-
- "orconn-status"
- A series of lines as for an OR connection status event. In Tor
- 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor
- 0.2.2.1-alpha and later by default, each line is of the form:
- LongName SP ORStatus CRLF
-
- In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
- VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line
- is of the form:
- ServerID SP ORStatus CRLF
-
- "entry-guards"
- A series of lines listing the currently chosen entry guards, if any.
- In Tor 0.1.2.2-alpha with feature VERBOSE_NAMES enabled and in Tor
- 0.2.2.1-alpha and later by default, each line is of the form:
- LongName SP Status [SP ISOTime] CRLF
-
- In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
- VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, each line
- is of the form:
- ServerID2 SP Status [SP ISOTime] CRLF
- ServerID2 = Nickname / 40*HEXDIG
-
- The definition of Status is the same for both:
- Status = "up" / "never-connected" / "down" /
- "unusable" / "unlisted"
-
- [From 0.1.1.4-alpha to 0.1.1.10-alpha, entry-guards was called
- "helper-nodes". Tor still supports calling "helper-nodes", but it
- is deprecated and should not be used.]
-
- [Older versions of Tor (before 0.1.2.x-final) generated 'down' instead
- of unlisted/unusable. Current Tors never generate 'down'.]
-
- [XXXX ServerID2 differs from ServerID in not prefixing fingerprints
- with a $. This is an implementation error. It would be nice to add
- the $ back in if we can do so without breaking compatibility.]
-
- "traffic/read" -- Total bytes read (downloaded).
-
- "traffic/written" -- Total bytes written (uploaded).
-
- "accounting/enabled"
- "accounting/hibernating"
- "accounting/bytes"
- "accounting/bytes-left"
- "accounting/interval-start"
- "accounting/interval-wake"
- "accounting/interval-end"
- Information about accounting status. If accounting is enabled,
- "enabled" is 1; otherwise it is 0. The "hibernating" field is "hard"
- if we are accepting no data; "soft" if we're accepting no new
- connections, and "awake" if we're not hibernating at all. The "bytes"
- and "bytes-left" fields contain (read-bytes SP write-bytes), for the
- start and the rest of the interval respectively. The 'interval-start'
- and 'interval-end' fields are the borders of the current interval; the
- 'interval-wake' field is the time within the current interval (if any)
- where we plan[ned] to start being active. The times are GMT.
-
- "config/names"
- A series of lines listing the available configuration options. Each is
- of the form:
- OptionName SP OptionType [ SP Documentation ] CRLF
- OptionName = Keyword
- OptionType = "Integer" / "TimeInterval" / "TimeMsecInterval" /
- "DataSize" / "Float" / "Boolean" / "Time" / "CommaList" /
- "Dependant" / "Virtual" / "String" / "LineList"
- Documentation = Text
-
- "info/names"
- A series of lines listing the available GETINFO options. Each is of
- one of these forms:
- OptionName SP Documentation CRLF
- OptionPrefix SP Documentation CRLF
- OptionPrefix = OptionName "/*"
-
- "events/names"
- A space-separated list of all the events supported by this version of
- Tor's SETEVENTS.
-
- "features/names"
- A space-separated list of all the features supported by this version of
- Tor's USEFEATURE.
-
- "ip-to-country/*"
- Maps IP addresses to 2-letter country codes. For example,
- "GETINFO ip-to-country/18.0.0.1" should give "US".
-
- "next-circuit/IP:port"
- XXX todo.
-
- "process/pid" -- Process id belonging to the main tor process.
- "process/uid" -- User id running the tor process, -1 if unknown (this is
- unimplemented on Windows, returning -1).
- "process/user" -- Username under which the tor process is running,
- providing an empty string if none exists (this is unimplemented on
- Windows, returning an empty string).
- "process/descriptor-limit" -- Upper bound on the file descriptor limit, -1
- if unknown.
-
- "dir/status-vote/current/consensus" [added in Tor 0.2.1.6-alpha]
- "dir/status/authority"
- "dir/status/fp/<F>"
- "dir/status/fp/<F1>+<F2>+<F3>"
- "dir/status/all"
- "dir/server/fp/<F>"
- "dir/server/fp/<F1>+<F2>+<F3>"
- "dir/server/d/<D>"
- "dir/server/d/<D1>+<D2>+<D3>"
- "dir/server/authority"
- "dir/server/all"
- A series of lines listing directory contents, provided according to the
- specification for the URLs listed in Section 4.4 of dir-spec.txt. Note
- that Tor MUST NOT provide private information, such as descriptors for
- routers not marked as general-purpose. When asked for 'authority'
- information for which this Tor is not authoritative, Tor replies with
- an empty string.
-
- "status/circuit-established"
- "status/enough-dir-info"
- "status/good-server-descriptor"
- "status/accepted-server-descriptor"
- "status/..."
- These provide the current internal Tor values for various Tor
- states. See Section 4.1.10 for explanations. (Only a few of the
- status events are available as getinfo's currently. Let us know if
- you want more exposed.)
- "status/reachability-succeeded/or"
- 0 or 1, depending on whether we've found our ORPort reachable.
- "status/reachability-succeeded/dir"
- 0 or 1, depending on whether we've found our DirPort reachable.
- "status/reachability-succeeded"
- "OR=" ("0"/"1") SP "DIR=" ("0"/"1")
- Combines status/reachability-succeeded/*; controllers MUST ignore
- unrecognized elements in this entry.
- "status/bootstrap-phase"
- Returns the most recent bootstrap phase status event
- sent. Specifically, it returns a string starting with either
- "NOTICE BOOTSTRAP ..." or "WARN BOOTSTRAP ...". Controllers should
- use this getinfo when they connect or attach to Tor to learn its
- current bootstrap state.
- "status/version/recommended"
- List of currently recommended versions.
- "status/version/current"
- Status of the current version. One of: new, old, unrecommended,
- recommended, new in series, obsolete, unknown.
- "status/clients-seen"
- A summary of which countries we've seen clients from recently,
- formatted the same as the CLIENTS_SEEN status event described in
- Section 4.1.14. This GETINFO option is currently available only
- for bridge relays.
-
- Examples:
- C: GETINFO version desc/name/moria1
- S: 250+desc/name/moria1=
- S: [Descriptor for moria1]
- S: .
- S: 250-version=Tor 0.1.1.0-alpha-cvs
- S: 250 OK
-
-3.10. EXTENDCIRCUIT
-
- Sent from the client to the server. The format is:
- "EXTENDCIRCUIT" SP CircuitID
- [SP ServerSpec *("," ServerSpec)
- SP "purpose=" Purpose] CRLF
-
- This request takes one of two forms: either the CircuitID is zero, in
- which case it is a request for the server to build a new circuit,
- or the CircuitID is nonzero, in which case it is a request for the
- server to extend an existing circuit with that ID according to the
- specified path.
-
- If the CircuitID is 0, the controller has the option of providing
- a path for Tor to use to build the circuit. If it does not provide
- a path, Tor will select one automatically from high capacity nodes
- according to path-spec.txt.
-
- If CircuitID is 0 and "purpose=" is specified, then the circuit's
- purpose is set. Two choices are recognized: "general" and
- "controller". If not specified, circuits are created as "general".
-
- If the request is successful, the server sends a reply containing a
- message body consisting of the CircuitID of the (maybe newly created)
- circuit. The syntax is "250" SP "EXTENDED" SP CircuitID CRLF.
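A minimal sketch of how a controller might format the EXTENDCIRCUIT request described above. The function name build_extendcircuit is hypothetical. The sketch follows the prose rather than the literal grammar and assumes the path and "purpose=" are independently optional; it only builds the command string and does not send it.

```python
def build_extendcircuit(circuit_id, path=None, purpose=None):
    """Format an EXTENDCIRCUIT command line (hypothetical helper).

    circuit_id: 0 asks Tor to build a new circuit; nonzero extends one.
    path:       optional list of server specs (nicknames or fingerprints).
    purpose:    optional; the text above names "general" and "controller".
    """
    if purpose is not None and purpose not in ("general", "controller"):
        raise ValueError("purpose must be 'general' or 'controller'")
    parts = ["EXTENDCIRCUIT", str(circuit_id)]
    if path:
        # ServerSpec *("," ServerSpec): comma-separated, no spaces
        parts.append(",".join(path))
    if purpose is not None:
        parts.append("purpose=" + purpose)
    return " ".join(parts) + "\r\n"
```

On success the reply to such a request has the form "250 EXTENDED CircuitID", so the caller can read the new circuit's ID from the third field.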
-
-3.11. SETCIRCUITPURPOSE
-
- Sent from the client to the server. The format is:
- "SETCIRCUITPURPOSE" SP CircuitID SP Purpose CRLF
-
- This changes the circuit's purpose. See EXTENDCIRCUIT above for details.
-
-3.12. SETROUTERPURPOSE
-
- Sent from the client to the server. The format is:
- "SETROUTERPURPOSE" SP NicknameOrKey SP Purpose CRLF
-
- This changes the descriptor's purpose. See +POSTDESCRIPTOR below
- for details.
-
- NOTE: This command was disabled and made obsolete as of Tor
- 0.2.0.8-alpha. It doesn't exist anymore, and is listed here only for
- historical interest.
-
-3.13. ATTACHSTREAM
-
- Sent from the client to the server. The syntax is:
- "ATTACHSTREAM" SP StreamID SP CircuitID [SP "HOP=" HopNum] CRLF
-
- This message informs the server that the specified stream should be
- associated with the specified circuit. Each stream may be associated with
- at most one circuit, and multiple streams may share the same circuit.
- Streams can only be attached to completed circuits (that is, circuits that
- have sent a circuit status 'BUILT' event or are listed as built in a
- GETINFO circuit-status request).
-
- If the circuit ID is 0, responsibility for attaching the given stream is
- returned to Tor.
-
- If HOP=HopNum is specified, Tor will choose the HopNumth hop in the
- circuit as the exit node, rather than the last node in the circuit.
- Hops are 1-indexed; generally, it is not permitted to attach to hop 1.
-
- Tor responds with "250 OK" if it can attach the stream, 552 if the circuit
- or stream didn't exist, or 551 if the stream couldn't be attached for
- another reason.
-
- {Implementation note: Tor will close unattached streams by itself,
- roughly two minutes after they are born. Let the developers know if
- that turns out to be a problem.}
-
- {Implementation note: By default, Tor automatically attaches streams to
- circuits itself, unless the configuration variable
- "__LeaveStreamsUnattached" is set to "1". Attempting to attach streams
- via TC when "__LeaveStreamsUnattached" is false may cause a race between
- Tor and the controller, as both attempt to attach streams to circuits.}
-
- {Implementation note: You can try to attachstream to a stream that
- has already sent a connect or resolve request but hasn't succeeded
- yet, in which case Tor will detach the stream from its current circuit
- before proceeding with the new attach request.}
-
-3.14. POSTDESCRIPTOR
-
- Sent from the client to the server. The syntax is:
- "+POSTDESCRIPTOR" [SP "purpose=" Purpose] [SP "cache=" Cache]
- CRLF Descriptor CRLF "." CRLF
-
- This message informs the server about a new descriptor. If Purpose is
- specified, it must be either "general", "controller", or "bridge",
- else we return a 552 error. The default is "general".
-
- If Cache is specified, it must be either "no" or "yes", else we
- return a 552 error. If Cache is not specified, Tor will decide for
- itself whether it wants to cache the descriptor, and controllers
- must not rely on its choice.
-
- The descriptor, when parsed, must contain a number of well-specified
- fields, including fields for its nickname and identity.
-
- If there is an error in parsing the descriptor, the server must send a
- "554 Invalid descriptor" reply. If the descriptor is well-formed but
- the server chooses not to add it, it must reply with a 251 message
- whose body explains why the server was not added. If the descriptor
- is added, Tor replies with "250 OK".
-
-3.15. REDIRECTSTREAM
-
- Sent from the client to the server. The syntax is:
- "REDIRECTSTREAM" SP StreamID SP Address [SP Port] CRLF
-
- Tells the server to change the exit address on the specified stream. If
- Port is specified, changes the destination port as well. No remapping
- is performed on the new provided address.
-
- To be sure that the modified address will be used, this command must be sent
- after a new stream event is received, and before attaching this stream to
- a circuit.
-
- Tor replies with "250 OK" on success.
-
-3.16. CLOSESTREAM
-
- Sent from the client to the server. The syntax is:
-
- "CLOSESTREAM" SP StreamID SP Reason *(SP Flag) CRLF
-
- Tells the server to close the specified stream. The reason should be one
- of the Tor RELAY_END reasons given in tor-spec.txt, as a decimal. Flags is
- not used currently; Tor servers SHOULD ignore unrecognized flags. Tor may
- hold the stream open for a while to flush any data that is pending.
-
- Tor replies with "250 OK" on success, or a 512 if there aren't enough
- arguments, or a 552 if it doesn't recognize the StreamID or reason.
-
-3.17. CLOSECIRCUIT
-
- The syntax is:
- "CLOSECIRCUIT" SP CircuitID *(SP Flag) CRLF
- Flag = "IfUnused"
-
- Tells the server to close the specified circuit. If "IfUnused" is
- provided, do not close the circuit unless it is unused.
-
- Other flags may be defined in the future; Tor SHOULD ignore unrecognized
- flags.
-
- Tor replies with "250 OK" on success, or a 512 if there aren't enough
- arguments, or a 552 if it doesn't recognize the CircuitID.
-
-3.18. QUIT
-
- Tells the server to hang up on this controller connection. This command
- can be used before authenticating.
-
-3.19. USEFEATURE
-
- Adding additional features to the control protocol sometimes will break
- backwards compatibility. Initially such features are added into Tor and
- disabled by default. USEFEATURE can enable these additional features.
-
- The syntax is:
-
- "USEFEATURE" *(SP FeatureName) CRLF
- FeatureName = 1*(ALPHA / DIGIT / "_" / "-")
-
- Feature names are case-insensitive.
-
- Once enabled, a feature stays enabled for the duration of the connection
- to the controller. A new connection to the controller must be opened to
- disable an enabled feature.
-
- Features are a forward-compatibility mechanism; each feature will eventually
- become a standard part of the control protocol. Once a feature becomes part
- of the protocol, it is always-on. Each feature documents the version in
- which it was introduced as a feature and the version in which it became
- part of the protocol.
-
- Tor will ignore a request to use any feature that is always-on. Tor will give
- a 552 error in response to an unrecognized feature.
-
- EXTENDED_EVENTS
-
- Same as passing 'EXTENDED' to SETEVENTS; this is the preferred way to
- request the extended event syntax.
-
- This feature was first introduced in 0.1.2.3-alpha. It is always-on
- and part of the protocol in Tor 0.2.2.1-alpha and later.
-
- VERBOSE_NAMES
-
- Replaces ServerID with LongName in events and GETINFO results. LongName
- provides a Fingerprint for all routers, an indication of Named status,
- and a Nickname if one is known. LongName is strictly more informative
- than ServerID, which only provides either a Fingerprint or a Nickname.
-
- This feature was first introduced in 0.1.2.2-alpha. It is always-on and
- part of the protocol in Tor 0.2.2.1-alpha and later.
-
-3.20. RESOLVE
-
- The syntax is
- "RESOLVE" *Option *Address CRLF
- Option = "mode=reverse"
- Address = a hostname or IPv4 address
-
- This command launches a remote hostname lookup request for every specified
- address (or a reverse lookup if "mode=reverse" is specified). Note that the
- request is done in the background: to see the answers, your controller will
- need to listen for ADDRMAP events; see 4.1.7 below.
-
- [Added in Tor 0.2.0.3-alpha]
-
-3.21. PROTOCOLINFO
-
- The syntax is:
- "PROTOCOLINFO" *(SP PIVERSION) CRLF
-
- The server reply format is:
- "250-PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF
-
- InfoLine = AuthLine / VersionLine / OtherLine
-
- AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod)
- *(SP "COOKIEFILE=" AuthCookieFile) CRLF
- VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF
-
- AuthMethod =
- "NULL" / ; No authentication is required
- "HASHEDPASSWORD" / ; A controller must supply the original password
- "COOKIE" ; A controller must supply the contents of a cookie
-
- AuthCookieFile = QuotedString
- TorVersion = QuotedString
-
- OtherLine = "250-" Keyword [SP Arguments] CRLF
-
- PIVERSION: 1*DIGIT
-
- Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines
- with keywords they do not recognize. Controllers MUST ignore extraneous
- data on any InfoLine.
-
- PIVERSION is there in case we drastically change the syntax one day. For
- now it should always be "1". Controllers MAY provide a list of the
- protocolinfo versions they support; Tor MAY select a version that the
- controller does not support.
-
- AuthMethod is used to specify one or more control authentication
- methods that Tor currently accepts.
-
- AuthCookieFile specifies the absolute path and filename of the
- authentication cookie that Tor is expecting and is provided iff
- the METHODS field contains the method "COOKIE". Controllers MUST handle
- escape sequences inside this string.
-
- The VERSION line contains the Tor version.
-
- [Unlike other commands besides AUTHENTICATE, PROTOCOLINFO may be used (but
- only once!) before AUTHENTICATE.]
-
- [PROTOCOLINFO was not supported before Tor 0.2.0.5-alpha.]
-
-4. Replies
-
- Reply codes follow the same 3-character format as used by SMTP, with the
- first character defining a status, the second character defining a
- subsystem, and the third designating fine-grained information.
-
- The TC protocol currently uses the following first characters:
-
- 2yz Positive Completion Reply
- The command was successful; a new request can be started.
-
- 4yz Temporary Negative Completion Reply
- The command was unsuccessful but might be reattempted later.
-
- 5yz Permanent Negative Completion Reply
- The command was unsuccessful; the client should not try exactly
- that sequence of commands again.
-
- 6yz Asynchronous Reply
- Sent out-of-order in response to an earlier SETEVENTS command.
-
- The following second characters are used:
-
- x0z Syntax
- Sent in response to ill-formed or nonsensical commands.
-
- x1z Protocol
- Refers to operations of the Tor Control protocol.
-
- x5z Tor
- Refers to actual operations of the Tor system.
-
- The following codes are defined:
-
- 250 OK
- 251 Operation was unnecessary
- [Tor has declined to perform the operation, but no harm was done.]
-
- 451 Resource exhausted
-
- 500 Syntax error: protocol
-
- 510 Unrecognized command
- 511 Unimplemented command
- 512 Syntax error in command argument
- 513 Unrecognized command argument
- 514 Authentication required
- 515 Bad authentication
-
- 550 Unspecified Tor error
-
- 551 Internal error
- [Something went wrong inside Tor, so that the client's
- request couldn't be fulfilled.]
-
- 552 Unrecognized entity
- [A configuration key, stream ID, circuit ID, or event
- mentioned in the command did not actually exist.]
-
- 553 Invalid configuration value
- [The client tried to set a configuration option to an
- incorrect, ill-formed, or impossible value.]
-
- 554 Invalid descriptor
-
- 555 Unmanaged entity
-
- 650 Asynchronous event notification
-
- Unless specified to have specific contents, the human-readable messages
- in error replies should not be relied upon to match those in this document.
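Because human-readable messages are not stable, a controller would normally branch on the numeric code alone. The sketch below splits a reply code into the status and subsystem digits listed above; the table names and classify_reply are hypothetical, and digits not listed in this section are reported as "unknown".

```python
# First digit: overall status of the reply.
STATUS = {
    "2": "positive completion",
    "4": "temporary negative completion",
    "5": "permanent negative completion",
    "6": "asynchronous",
}

# Second digit: subsystem the reply refers to.
SUBSYSTEM = {
    "0": "syntax",
    "1": "protocol",
    "5": "tor",
}


def classify_reply(line):
    """Return (status, subsystem) for a reply line such as '552 ...'."""
    code = line[:3]
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("reply does not start with a 3-digit code: %r" % line)
    return (STATUS.get(code[0], "unknown"),
            SUBSYSTEM.get(code[1], "unknown"))
```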
-
-4.1. Asynchronous events
-
- These replies can be sent after a corresponding SETEVENTS command has been
- received. They will not be interleaved with other Reply elements, but they
- can appear between a command and its corresponding reply. For example,
- this sequence is possible:
-
- C: SETEVENTS CIRC
- S: 250 OK
- C: GETCONF SOCKSPORT ORPORT
- S: 650 CIRC 1000 EXTENDED moria1,moria2
- S: 250-SOCKSPORT=9050
- S: 250 ORPORT=0
-
- But this sequence is disallowed:
- C: SETEVENTS CIRC
- S: 250 OK
- C: GETCONF SOCKSPORT ORPORT
- S: 250-SOCKSPORT=9050
- S: 650 CIRC 1000 EXTENDED moria1,moria2
- S: 250 ORPORT=0
-
- Clients MUST tolerate more arguments in an asynchronous reply than
- expected, and MUST tolerate more lines in an asynchronous reply than
- expected. For instance, a client that expects a CIRC message like:
- 650 CIRC 1000 EXTENDED moria1,moria2
- must tolerate:
- 650-CIRC 1000 EXTENDED moria1,moria2 0xBEEF
- 650-EXTRAMAGIC=99
- 650 ANONYMITY=high
-
- If clients ask for extended events, then each event line as specified below
- will be followed by additional extensions. Additional lines will be of the
- form
- "650" ("-"/" ") KEYWORD ["=" ARGUMENTS] CRLF
- Additional arguments will be of the form
- SP KEYWORD ["=" ( QuotedString / * NonSpDquote ) ]
- Such clients MUST tolerate lines with keywords they do not recognize.
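The tolerance rules above can be sketched as a parser that keeps whatever it finds instead of rejecting surprises. This is an illustration, not Tor's or any controller library's code: parse_event is a hypothetical name, keywords are assumed to be upper-case, data replies ("650+") are out of scope, and a backslash inside a quoted value is assumed to escape the next character.

```python
import re

# Either KEYWORD="quoted value" (may contain spaces) or a run of non-spaces.
_TOKEN = re.compile(r'[^\s"=]+="(?:[^"\\]|\\.)*"|\S+')
_KEYWORD = re.compile(r'([A-Z][A-Z0-9_]*)=(.*)', re.S)


def parse_event(lines):
    """Parse the "650-"/"650 " lines of one asynchronous reply.

    Returns (event_name, positional_args, keyword_args). Extra positional
    arguments, extra lines and unknown keywords are kept, never rejected.
    """
    event, positional, keywords = None, [], {}
    for i, raw in enumerate(lines):
        line = raw.rstrip("\r\n")
        if line[:3] != "650" or line[3:4] not in ("-", " "):
            raise ValueError("not an asynchronous reply line: %r" % line)
        tokens = _TOKEN.findall(line[4:])
        if i == 0 and tokens:
            event = tokens.pop(0)
        for tok in tokens:
            match = _KEYWORD.fullmatch(tok)
            if match:
                value = match.group(2)
                if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
                    value = re.sub(r'\\(.)', r'\1', value[1:-1])
                keywords[match.group(1)] = value
            elif i == 0:
                positional.append(tok)
            # Bare words on continuation lines are skipped in this sketch.
    return event, positional, keywords
```

A caller that expects only "CircuitID CircStatus Path" for CIRC would read the first three positional arguments and leave the rest alone.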
-
-4.1.1. Circuit status changed
-
- The syntax is:
-
- "650" SP "CIRC" SP CircuitID SP CircStatus [SP Path]
- [SP "REASON=" Reason [SP "REMOTE_REASON=" Reason]] CRLF
-
- CircStatus =
- "LAUNCHED" / ; circuit ID assigned to new circuit
- "BUILT" / ; all hops finished, can now accept streams
- "EXTENDED" / ; one more hop has been completed
- "FAILED" / ; circuit closed (was not built)
- "CLOSED" ; circuit closed (was built)
-
- Path = LongName *("," LongName)
- ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
- ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, Path
- ; is as follows:
- Path = ServerID *("," ServerID)
-
- Reason = "NONE" / "TORPROTOCOL" / "INTERNAL" / "REQUESTED" /
- "HIBERNATING" / "RESOURCELIMIT" / "CONNECTFAILED" /
- "OR_IDENTITY" / "OR_CONN_CLOSED" / "TIMEOUT" /
- "FINISHED" / "DESTROYED" / "NOPATH" / "NOSUCHSERVICE" /
- "MEASUREMENT_EXPIRED"
-
- The path is provided only when the circuit has been extended at least one
- hop.
-
- The "REASON" field is provided only for FAILED and CLOSED events, and only
- if extended events are enabled (see 3.19). Clients MUST accept reasons
- not listed above. Reasons are as given in tor-spec.txt, except for:
-
- NOPATH (Not enough nodes to make circuit)
-
- The "REMOTE_REASON" field is provided only when we receive a DESTROY or
- TRUNCATE cell, and only if extended events are enabled. It contains the
- actual reason given by the remote OR for closing the circuit. Clients MUST
- accept reasons not listed above. Reasons are as listed in tor-spec.txt.
-
-4.1.2. Stream status changed
-
- The syntax is:
-
- "650" SP "STREAM" SP StreamID SP StreamStatus SP CircID SP Target
- [SP "REASON=" Reason [ SP "REMOTE_REASON=" Reason ]]
- [SP "SOURCE=" Source] [ SP "SOURCE_ADDR=" Address ":" Port ]
- [SP "PURPOSE=" Purpose]
- CRLF
-
- StreamStatus =
- "NEW" / ; New request to connect
- "NEWRESOLVE" / ; New request to resolve an address
- "REMAP" / ; Address re-mapped to another
- "SENTCONNECT" / ; Sent a connect cell along a circuit
- "SENTRESOLVE" / ; Sent a resolve cell along a circuit
- "SUCCEEDED" / ; Received a reply; stream established
- "FAILED" / ; Stream failed and not retriable
- "CLOSED" / ; Stream closed
- "DETACHED" ; Detached from circuit; still retriable
-
- Target = Address ":" Port
-
- The circuit ID designates which circuit this stream is attached to. If
- the stream is unattached, the circuit ID "0" is given.
-
- Reason = "MISC" / "RESOLVEFAILED" / "CONNECTREFUSED" /
- "EXITPOLICY" / "DESTROY" / "DONE" / "TIMEOUT" /
- "NOROUTE" / "HIBERNATING" / "INTERNAL"/ "RESOURCELIMIT" /
- "CONNRESET" / "TORPROTOCOL" / "NOTDIRECTORY" / "END" /
- "PRIVATE_ADDR"
-
- The "REASON" field is provided only for FAILED, CLOSED, and DETACHED
- events, and only if extended events are enabled (see 3.19). Clients MUST
- accept reasons not listed above. Reasons are as given in tor-spec.txt,
- except for:
-
- END (We received a RELAY_END cell from the other side of this
- stream.)
- PRIVATE_ADDR (The client tried to connect to a private address like
- 127.0.0.1 or 10.0.0.1 over Tor.)
- [XXXX document more. -NM]
-
-
- The "REMOTE_REASON" field is provided only when we receive a RELAY_END
- cell, and only if extended events are enabled. It contains the actual
- reason given by the remote OR for closing the stream. Clients MUST accept
- reasons not listed above. Reasons are as listed in tor-spec.txt.
-
- "REMAP" events include a Source if extended events are enabled:
- Source = "CACHE" / "EXIT"
- Clients MUST accept sources not listed above. "CACHE" is given if
- the Tor client decided to remap the address because of a cached
- answer, and "EXIT" is given if the remote node we queried gave us
- the new address as a response.
-
- The "SOURCE_ADDR" field is included with NEW and NEWRESOLVE events if
- extended events are enabled. It indicates the address and port
- that requested the connection, and can be (e.g.) used to look up the
- requesting program.
-
- Purpose = "DIR_FETCH" / "UPLOAD_DESC" / "DNS_REQUEST" /
- "USER" / "DIRPORT_TEST"
-
- The "PURPOSE" field is provided only for NEW and NEWRESOLVE events, and
- only if extended events are enabled (see 3.19). Clients MUST accept
- purposes not listed above.
-
-4.1.3. OR Connection status changed
-
- The syntax is:
-
- "650" SP "ORCONN" SP (LongName / Target) SP ORStatus [ SP "REASON="
- Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF
-
- ORStatus = "NEW" / "LAUNCHED" / "CONNECTED" / "FAILED" / "CLOSED"
-
- ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
- ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, OR
- ; Connection is as follows:
- "650" SP "ORCONN" SP (ServerID / Target) SP ORStatus [ SP "REASON="
- Reason ] [ SP "NCIRCS=" NumCircuits ] CRLF
-
- NEW is for incoming connections, and LAUNCHED is for outgoing
- connections. CONNECTED means the TLS handshake has finished (in
- either direction). FAILED means a connection is being closed that
- hasn't finished its handshake, and CLOSED is for connections that
- have completed their handshake.
-
- A LongName or ServerID is specified unless it's a NEW connection, in
- which case we don't know what server it is yet, so we use Address:Port.
-
- If extended events are enabled (see 3.19), optional reason and
- circuit counting information is provided for CLOSED and FAILED
- events.
-
- Reason = "MISC" / "DONE" / "CONNECTREFUSED" /
- "IDENTITY" / "CONNECTRESET" / "TIMEOUT" / "NOROUTE" /
- "IOERROR" / "RESOURCELIMIT"
-
- NumCircuits counts both established and pending circuits.
-
-4.1.4. Bandwidth used in the last second
-
- The syntax is:
- "650" SP "BW" SP BytesRead SP BytesWritten *(SP Type "=" Num) CRLF
- BytesRead = 1*DIGIT
- BytesWritten = 1*DIGIT
- Type = "DIR" / "OR" / "EXIT" / "APP" / ...
- Num = 1*DIGIT
-
- BytesRead and BytesWritten are the totals. [In a future Tor version,
- we may also include a breakdown of the connection types that used
- bandwidth this second (not implemented yet).]
-
-4.1.5. Log messages
-
- The syntax is:
- "650" SP Severity SP ReplyText CRLF
- or
- "650+" Severity CRLF Data "650" SP "OK" CRLF
-
- Severity = "DEBUG" / "INFO" / "NOTICE" / "WARN"/ "ERR"
-
-4.1.6. New descriptors available
-
- Syntax:
- "650" SP "NEWDESC" 1*(SP LongName) CRLF
- ; In Tor versions 0.1.2.2-alpha through 0.2.2.1-alpha with feature
- ; VERBOSE_NAMES turned off and before version 0.1.2.2-alpha, it
- ; is as follows:
- "650" SP "NEWDESC" 1*(SP ServerID) CRLF
-
-4.1.7. New Address mapping
-
- Syntax:
- "650" SP "ADDRMAP" SP Address SP NewAddress SP Expiry
- [SP Error] SP GMTExpiry CRLF
-
- NewAddress = Address / "<error>"
- Expiry = DQUOTE ISOTime DQUOTE / "NEVER"
-
- Error = "error=" ErrorCode
- ErrorCode = XXXX
- GMTExpiry = "EXPIRES=" DQUOTE ISOTime DQUOTE
-
- Error and GMTExpiry are only provided if extended events are enabled.
-
- Expiry is expressed as the local time (rather than GMT). This is a bug,
- left in for backward compatibility; new code should look at GMTExpiry
- instead.
-
- These events are generated when a new address mapping is entered in the
- cache, or when the answer for a RESOLVE command is found.
-
-4.1.8. Descriptors uploaded to us in our role as authoritative dirserver
-
- Syntax:
- "650" "+" "AUTHDIR_NEWDESCS" CRLF Action CRLF Message CRLF
- Descriptor CRLF "." CRLF "650" SP "OK" CRLF
- Action = "ACCEPTED" / "DROPPED" / "REJECTED"
- Message = Text
-
-4.1.9. Our descriptor changed
-
- Syntax:
- "650" SP "DESCCHANGED" CRLF
-
- [First added in 0.1.2.2-alpha.]
-
-4.1.10. Status events
-
- Status events (STATUS_GENERAL, STATUS_CLIENT, and STATUS_SERVER) are sent
- based on occurrences in the Tor process pertaining to the general state of
- the program. Generally, they correspond to log messages of severity Notice
- or higher. They differ from log messages in that their format is a
- specified interface.
-
- Syntax:
- "650" SP StatusType SP StatusSeverity SP StatusAction
- [SP StatusArguments] CRLF
-
- StatusType = "STATUS_GENERAL" / "STATUS_CLIENT" / "STATUS_SERVER"
- StatusSeverity = "NOTICE" / "WARN" / "ERR"
- StatusAction = 1*ALPHA
- StatusArguments = StatusArgument *(SP StatusArgument)
- StatusArgument = StatusKeyword '=' StatusValue
- StatusKeyword = 1*(ALNUM / "_")
- StatusValue = 1*(ALNUM / '_') / QuotedString
-
- Action is a string, and Arguments is a series of keyword=value
- pairs on the same line. Values may be space-terminated strings,
- or quoted strings.
-
- These events are always produced with EXTENDED_EVENTS and
- VERBOSE_NAMES; see the explanations in the USEFEATURE section
- for details.
-
- Controllers MUST tolerate unrecognized actions, MUST tolerate
- unrecognized arguments, MUST tolerate missing arguments, and MUST
- tolerate arguments that arrive in any order.
-
- Each event description below is accompanied by a recommendation for
- controllers. These recommendations are suggestions only; no controller
- is required to implement them.
-
- Compatibility note: versions of Tor before 0.2.0.22-rc incorrectly
- generated "STATUS_SERVER" as "STATUS_SEVER". To be compatible with those
- versions, tools should accept both.
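A small sketch of a status-event parser that follows the rules above: arguments may be missing, unrecognized, or in any order, and the misspelled "STATUS_SEVER" is accepted. The name parse_status_event is hypothetical. Using Python's shlex to split quoted values is an approximation chosen for brevity; its quoting rules are those of a POSIX shell, not necessarily those of QuotedString.

```python
import shlex


def parse_status_event(line):
    """Split a one-line status event into (type, severity, action, args)."""
    # shlex keeps KEY="a b" together as one field and strips the quotes.
    fields = shlex.split(line.rstrip("\r\n"))
    if len(fields) < 4 or fields[0] != "650":
        raise ValueError("not a status event: %r" % line)
    status_type, severity, action = fields[1:4]
    # Compatibility: old Tors misspelled STATUS_SERVER as STATUS_SEVER.
    if status_type == "STATUS_SEVER":
        status_type = "STATUS_SERVER"
    args = {}
    for field in fields[4:]:
        key, sep, value = field.partition("=")
        if sep:
            args[key] = value  # unknown keys are kept, not rejected
    return status_type, severity, action, args
```

Callers would then read arguments with args.get(...), so that a missing argument yields a default instead of an error.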
-
- Actions for STATUS_GENERAL events can be as follows:
-
- CLOCK_JUMPED
- "TIME=NUM"
- Tor spent enough time without CPU cycles that it has closed all
- its circuits and will establish them anew. This typically
- happens when a laptop goes to sleep and then wakes up again. It
- also happens when the system is swapping so heavily that Tor is
- starving. The "time" argument specifies the number of seconds Tor
- thinks it was unconscious for (or alternatively, the number of
- seconds it went back in time).
-
- This status event is sent as NOTICE severity normally, but WARN
- severity if Tor is acting as a server currently.
-
- {Recommendation for controller: ignore it, since we don't really
- know what the user should do anyway. Hm.}
-
- DANGEROUS_VERSION
- "CURRENT=version"
- "REASON=NEW/OBSOLETE/UNRECOMMENDED"
- "RECOMMENDED=\"version, version, ...\""
- Tor has found that directory servers don't recommend its version of
- the Tor software. RECOMMENDED is a comma-and-space-separated string
- of Tor versions that are recommended. REASON is NEW if this version
- of Tor is newer than any recommended version, OBSOLETE if
- this version of Tor is older than any recommended version, and
- UNRECOMMENDED if some recommended versions of Tor are newer and
- some are older than this version. (The "OBSOLETE" reason was called
- "OLD" from Tor 0.1.2.3-alpha up to and including 0.2.0.12-alpha.)
-
- {Controllers may want to suggest that the user upgrade OBSOLETE or
- UNRECOMMENDED versions. NEW versions may be known-insecure, or may
- simply be development versions.}
-
- TOO_MANY_CONNECTIONS
- "CURRENT=NUM"
- Tor has reached its ulimit -n or whatever the native limit is on file
- descriptors or sockets. The "current" argument is the number of
- sockets Tor currently has open. The user should really do something
- about this.
-
- {Controllers may recommend that the user increase the limit, or
- increase it for them. Recommendations should be phrased in an
- OS-appropriate way and automated when possible.}
-
- BUG
- "REASON=STRING"
- Tor has encountered a situation that its developers never expected,
- and the developers would like to learn that it happened. Perhaps
- the controller can explain this to the user and encourage her to
- file a bug report?
-
- {Controllers should log bugs, but shouldn't annoy the user in case a
- bug appears frequently.}
-
- CLOCK_SKEW
- SKEW="+" / "-" SECONDS
- MIN_SKEW="+" / "-" SECONDS
- SOURCE="DIRSERV:" IP ":" Port /
- "NETWORKSTATUS:" IP ":" Port /
- "OR:" IP ":" Port /
- "CONSENSUS"
- If "SKEW" is present, it's an estimate of how far we are from the
- time declared in the source. (In other words, if we're an hour in
- the past, the value is -3600.) If "MIN_SKEW" is present, it's a lower
- bound. If the source is a DIRSERV, we got the current time from a
- connection to a dirserver. If the source is a NETWORKSTATUS, we
- decided we're skewed because we got a v2 networkstatus from far in
- the future. If the source is OR, the skew comes from a NETINFO
- cell from a connection to another relay. If the source is
- CONSENSUS, we decided we're skewed because we got a networkstatus
- consensus from the future.
-
- {Tor should send this message to controllers when it thinks the
- skew is so high that it will interfere with proper Tor operation.
- Controllers shouldn't blindly adjust the clock, since the more
- accurate source of skew info (DIRSERV) is currently
- unauthenticated.}
-
- BAD_LIBEVENT
- "METHOD=" libevent method
- "VERSION=" libevent version
- "BADNESS=" "BROKEN" / "BUGGY" / "SLOW"
- "RECOVERED=" "NO" / "YES"
- Tor knows about bugs in using the configured event method in this
- version of libevent. "BROKEN" libevents won't work at all;
- "BUGGY" libevents might work okay; "SLOW" libevents will work
- fine, but not quickly. If "RECOVERED" is YES, Tor managed to
- switch to a more reliable (but probably slower!) libevent method.
-
- {Controllers may want to warn the user if this event occurs, though
- generally it's the fault of whoever built the Tor binary and there's
- not much the user can do besides upgrade libevent or upgrade the
- binary.}
-
- DIR_ALL_UNREACHABLE
- Tor believes that none of the known directory servers are
- reachable -- this is most likely because the local network is
- down or otherwise not working, and this might help explain to the
- user why Tor appears to be broken.
-
- {Controllers may want to warn the user if this event occurs; further
- action is generally not possible.}
-
- CONSENSUS_ARRIVED
- Tor has received and validated a new consensus networkstatus.
- (This event can be delayed a little while after the consensus
- is received, if Tor needs to fetch certificates.)
-
- Actions for STATUS_CLIENT events can be as follows:
-
- BOOTSTRAP
- "PROGRESS=" num
- "TAG=" Keyword
- "SUMMARY=" String
- ["WARNING=" String
- "REASON=" Keyword
- "COUNT=" num
- "RECOMMENDATION=" Keyword
- ]
-
- Tor has made some progress at establishing a connection to the
- Tor network, fetching directory information, or making its first
- circuit; or it has encountered a problem while bootstrapping. This
- status event is especially useful for users with slow connections
- or with connectivity problems.
-
- "Progress" gives a number between 0 and 100 for how far through
- the bootstrapping process we are. "Summary" is a string that can
- be displayed to the user to describe the *next* task that Tor
- will tackle, i.e., the task it is working on after sending the
- status event. "Tag" is a string that controllers can use to
- recognize bootstrap phases, if they want to do something smarter
- than just blindly displaying the summary string; see Section 5
- for the current tags that Tor issues.
-
- The StatusSeverity describes whether this is a normal bootstrap
- phase (severity notice) or an indication of a bootstrapping
- problem (severity warn).
-
- For bootstrap problems, we include the same progress, tag, and
- summary values as we would for a normal bootstrap event, but we
- also include "warning", "reason", "count", and "recommendation"
- key/value combos. The "count" number tells how many bootstrap
- problems there have been so far at this phase. The "reason"
- string lists one of the reasons allowed in the ORCONN event. The
- "warning" argument is a string with any hints Tor has to offer about
- why it's having trouble bootstrapping.
-
- The "reason" values are long-term-stable controller-facing tags to
- identify particular issues in a bootstrapping step. The warning
- strings, on the other hand, are human-readable. Controllers
- SHOULD NOT rely on the format of any warning string. Currently
- the possible values for "recommendation" are either "ignore" or
- "warn" -- if ignore, the controller can accumulate the string in
- a pile of problems to show the user if the user asks; if warn,
- the controller should alert the user that Tor is pretty sure
- there's a bootstrapping problem.
-
- Currently Tor uses recommendation=ignore for the first
- nine bootstrap problem reports for a given phase, and then
- uses recommendation=warn for subsequent problems at that
- phase. Hopefully this is a good balance between tolerating
- occasional errors and reporting serious problems quickly.
-
- ENOUGH_DIR_INFO
- Tor now knows enough network-status documents and enough server
- descriptors that it's going to start trying to build circuits now.
-
- {Controllers may want to use this event to decide when to indicate
- progress to their users, but should not interrupt the user's browsing
- to tell them so.}
-
- NOT_ENOUGH_DIR_INFO
- We discarded expired statuses and router descriptors to fall
- below the desired threshold of directory information. We won't
- try to build any circuits until ENOUGH_DIR_INFO occurs again.
-
- {Controllers may want to use this event to decide when to indicate
- progress to their users, but should not interrupt the user's browsing
- to tell them so.}
-
- CIRCUIT_ESTABLISHED
- Tor is able to establish circuits for client use. This event will
- only be sent if we just built a circuit that changed our mind --
- that is, prior to this event we didn't know whether we could
- establish circuits.
-
- {Suggested use: controllers can notify their users that Tor is
- ready for use as a client once they see this status event. [Perhaps
- controllers should also have a timeout if too much time passes and
- this event hasn't arrived, to give tips on how to troubleshoot.
- On the other hand, hopefully Tor will send further status events
- if it can identify the problem.]}
-
- CIRCUIT_NOT_ESTABLISHED
- "REASON=" "EXTERNAL_ADDRESS" / "DIR_ALL_UNREACHABLE" / "CLOCK_JUMPED"
- We are no longer confident that we can build circuits. The "reason"
- keyword provides an explanation: which other status event type caused
- our lack of confidence.
-
- {Controllers may want to use this event to decide when to indicate
- progress to their users, but should not interrupt the user's browsing
- to do so.}
- [Note: only REASON=CLOCK_JUMPED is implemented currently.]
-
- DANGEROUS_PORT
- "PORT=" port
- "RESULT=" "REJECT" / "WARN"
- A stream was initiated to a port that's commonly used for
- vulnerable-plaintext protocols. If the Result is "reject", we
- refused the connection; whereas if it's "warn", we allowed it.
-
- {Controllers should warn their users when this occurs, unless they
- happen to know that the application using Tor is in fact doing so
- correctly (e.g., because it is part of a distributed bundle). They
- might also want some sort of interface to let the user configure
- their RejectPlaintextPorts and WarnPlaintextPorts config options.}
-
- DANGEROUS_SOCKS
- "PROTOCOL=" "SOCKS4" / "SOCKS5"
- "ADDRESS=" IP:port
- A connection was made to Tor's SOCKS port using one of the SOCKS
- approaches that doesn't support hostnames -- only raw IP addresses.
- If the client application got this address from gethostbyname(),
- it may be leaking target addresses via DNS.
-
- {Controllers should warn their users when this occurs, unless they
- happen to know that the application using Tor is in fact doing so
- correctly (e.g., because it is part of a distributed bundle).}
-
- SOCKS_UNKNOWN_PROTOCOL
- "DATA=string"
- A connection was made to Tor's SOCKS port that tried to use it
- for something other than the SOCKS protocol. Perhaps the user is
- using Tor as an HTTP proxy? The DATA is the first few characters
- sent to Tor on the SOCKS port.
-
- {Controllers may want to warn their users when this occurs: it
- indicates a misconfigured application.}
-
- SOCKS_BAD_HOSTNAME
- "HOSTNAME=QuotedString"
- Some application gave us a funny-looking hostname. Perhaps
- it is broken? In any case it won't work with Tor and the user
- should know.
-
- {Controllers may want to warn their users when this occurs: it
- usually indicates a misconfigured application.}
-
- Actions for STATUS_SERVER can be as follows:
-
- EXTERNAL_ADDRESS
- "ADDRESS=IP"
- "HOSTNAME=NAME"
- "METHOD=CONFIGURED/DIRSERV/RESOLVED/INTERFACE/GETHOSTNAME"
- Our best idea for our externally visible IP has changed to 'IP'.
- If 'HOSTNAME' is present, we got the new IP by resolving 'NAME'. If the
- method is 'CONFIGURED', the IP was given verbatim as a configuration
- option. If the method is 'RESOLVED', we resolved the Address
- configuration option to get the IP. If the method is 'GETHOSTNAME',
- we resolved our hostname to get the IP. If the method is 'INTERFACE',
- we got the address of one of our network interfaces to get the IP. If
- the method is 'DIRSERV', a directory server told us a guess for what
- our IP might be.
-
- {Controllers may want to record this info and display it to the user.}
-
- CHECKING_REACHABILITY
- "ORADDRESS=IP:port"
- "DIRADDRESS=IP:port"
- We're going to start testing the reachability of our external OR port
- or directory port.
-
- {This event could affect the controller's idea of server status, but
- the controller should not interrupt the user to tell them so.}
-
- REACHABILITY_SUCCEEDED
- "ORADDRESS=IP:port"
- "DIRADDRESS=IP:port"
- We successfully verified the reachability of our external OR port or
- directory port (depending on which of ORADDRESS or DIRADDRESS is
- given).
-
- {This event could affect the controller's idea of server status, but
- the controller should not interrupt the user to tell them so.}
-
- GOOD_SERVER_DESCRIPTOR
- We successfully uploaded our server descriptor to at least one
- of the directory authorities, with no complaints.
-
- {Originally, the goal of this event was to declare "every authority
- has accepted the descriptor, so there will be no complaints
- about it." But since some authorities might be offline, it's
- harder to get certainty than we had thought. As such, this event
- is equivalent to ACCEPTED_SERVER_DESCRIPTOR below. Controllers
- should just look at ACCEPTED_SERVER_DESCRIPTOR and should ignore
- this event for now.}
-
- SERVER_DESCRIPTOR_STATUS
- "STATUS=" "LISTED" / "UNLISTED"
- We just got a new networkstatus consensus, and whether we're in
- it or not in it has changed. Specifically, status is "listed"
- if we're listed in it but previous to this point we didn't know
- we were listed in a consensus; and status is "unlisted" if we
- thought we should have been listed in it (e.g. we were listed in
- the last one), but we're not.
-
- {Moving from listed to unlisted is not necessarily cause for
- alarm. The relay might have failed a few reachability tests,
- or the Internet might have had some routing problems. So this
- feature is mainly to let relay operators know when their relay
- has successfully been listed in the consensus.}
-
- [Not implemented yet. We should do this in 0.2.2.x. -RD]
-
- NAMESERVER_STATUS
- "NS=addr"
- "STATUS=" "UP" / "DOWN"
- "ERR=" message
- One of our nameservers has changed status.
-
- {This event could affect the controller's idea of server status, but
- the controller should not interrupt the user to tell them so.}
-
- NAMESERVER_ALL_DOWN
- All of our nameservers have gone down.
-
- {This is a problem; if it happens often without the nameservers
- coming up again, the user needs to configure more or better
- nameservers.}
-
- DNS_HIJACKED
- Our DNS provider is providing an address when it should be saying
- "NOTFOUND"; Tor will treat the address as a synonym for "NOTFOUND".
-
- {This is an annoyance; controllers may want to tell admins that their
- DNS provider is not to be trusted.}
-
- DNS_USELESS
- Our DNS provider is giving a hijacked address instead of well-known
- websites; Tor will not try to be an exit node.
-
- {Controllers could warn the admin if the server is running as an
- exit server: the admin needs to configure a good DNS server.
- Alternatively, this happens a lot in some restrictive environments
- (hotels, universities, coffeeshops) when the user hasn't registered.}
-
- BAD_SERVER_DESCRIPTOR
- "DIRAUTH=addr:port"
- "REASON=string"
- A directory authority rejected our descriptor. Possible reasons
- include malformed descriptors, incorrect keys, highly skewed clocks,
- and so on.
-
- {Controllers should warn the admin, and try to cope if they can.}
-
- ACCEPTED_SERVER_DESCRIPTOR
- "DIRAUTH=addr:port"
- A single directory authority accepted our descriptor.
- // actually notice
-
- {This event could affect the controller's idea of server status, but
- the controller should not interrupt the user to tell them so.}
-
- REACHABILITY_FAILED
- "ORADDRESS=IP:port"
- "DIRADDRESS=IP:port"
- We failed to connect to our external OR port or directory port
- successfully.
-
- {This event could affect the controller's idea of server status. The
- controller should warn the admin and suggest reasonable steps to take.}
-
-4.1.11. Our set of guard nodes has changed
-
- Syntax:
- "650" SP "GUARD" SP Type SP Name SP Status ... CRLF
- Type = "ENTRY"
- Name = The (possibly verbose) nickname of the guard affected.
- Status = "NEW" | "UP" | "DOWN" | "BAD" | "GOOD" | "DROPPED"
-
- [explain states. XXX]
-
-4.1.12. Network status has changed
-
- Syntax:
- "650" "+" "NS" CRLF 1*NetworkStatus "." CRLF "650" SP "OK" CRLF
-
- The event is used whenever our local view of a relay status changes.
- This happens when we get a new v3 consensus (in which case the entries
- we see are a duplicate of what we see in the NEWCONSENSUS event,
- below), but it also happens when we decide to mark a relay as up or
- down in our local status, for example based on connection attempts.
-
- [First added in 0.1.2.3-alpha]
-
-4.1.13. Bandwidth used on an application stream
-
- The syntax is:
- "650" SP "STREAM_BW" SP StreamID SP BytesWritten SP BytesRead CRLF
- BytesWritten = 1*DIGIT
- BytesRead = 1*DIGIT
-
- BytesWritten and BytesRead are the number of bytes written and read
- by the application since the last STREAM_BW event on this stream.
-
- Note that from Tor's perspective, *reading* a byte on a stream means
- that the application *wrote* the byte. That's why the order of "written"
- vs "read" is opposite for stream_bw events compared to bw events.
-
- These events are generated about once per second per stream; no events
- are generated for streams that have not written or read. These events
- apply only to streams entering Tor (such as on a SOCKSPort, TransPort,
- or so on). They are not generated for exiting streams.
-
-4.1.14. Per-country client stats
-
- The syntax is:
- "650" SP "CLIENTS_SEEN" SP TimeStarted SP CountrySummary CRLF
-
- We just generated a new summary of which countries we've seen clients
- from recently. The controller could display this for the user, e.g.
- in their "relay" configuration window, to give them a sense that they
- are actually being useful.
-
- Currently only bridge relays will receive this event, but once we figure
- out how to sufficiently aggregate and sanitize the client counts on
- main relays, we might start sending these events in other cases too.
-
- TimeStarted is a quoted string indicating when the reported summary
- counts from (in GMT).
-
- The CountrySummary keyword has as its argument a comma-separated,
- possibly empty set of "countrycode=count" pairs. For example (without
- linebreak),
- 650-CLIENTS_SEEN TimeStarted="2008-12-25 23:50:43"
- CountrySummary=us=16,de=8,uk=8
-
-4.1.15. New consensus networkstatus has arrived.
-
- The syntax is:
- "650" "+" "NEWCONSENSUS" CRLF 1*NetworkStatus "." CRLF "650" SP
- "OK" CRLF
-
- A new consensus networkstatus has arrived. We include NS-style lines for
- every relay in the consensus. NEWCONSENSUS is a separate event from the
- NS event, because the list here represents every usable relay: so any
- relay *not* mentioned in this list is implicitly no longer recommended.
-
- [First added in 0.2.1.13-alpha]
-
-4.1.16. New circuit buildtime has been set.
-
- The syntax is:
- "650" SP "BUILDTIMEOUT_SET" SP Type SP "TOTAL_TIMES=" Total SP
- "TIMEOUT_MS=" Timeout SP "XM=" Xm SP "ALPHA=" Alpha SP
- "CUTOFF_QUANTILE=" Quantile SP "TIMEOUT_RATE=" TimeoutRate SP
- "CLOSE_MS=" CloseTimeout SP "CLOSE_RATE=" CloseRate
- CRLF
- Type = "COMPUTED" / "RESET" / "SUSPENDED" / "DISCARD" / "RESUME"
- Total = Integer count of timeouts stored
- Timeout = Integer timeout in milliseconds
- Xm = Estimated integer Pareto parameter Xm in milliseconds
- Alpha = Estimated floating point Pareto parameter alpha
- Quantile = Floating point CDF quantile cutoff point for this timeout
- TimeoutRate = Floating point ratio of circuits that time out
- CloseTimeout = How long to keep measurement circs in milliseconds
- CloseRate = Floating point ratio of measurement circuits that are closed
-
- A new circuit build timeout time has been set. If Type is "COMPUTED",
- Tor has computed the value based on historical data. If Type is "RESET",
- initialization or drastic network changes have caused Tor to reset
- the timeout back to the default, to relearn again. If Type is
- "SUSPENDED", Tor has detected a loss of network connectivity and has
- temporarily changed the timeout value to the default until the network
- recovers. If Type is "DISCARD", Tor has decided to discard timeout
- values that likely happened while the network was down. If Type is
- "RESUME", Tor has decided to resume timeout calculation.
-
- The Total value is the count of circuit build times Tor used in
- computing this value. It is capped internally at the maximum number
- of build times Tor stores (NCIRCUITS_TO_OBSERVE).
-
- The Timeout itself is provided in milliseconds. Internally, Tor rounds
- this value to the nearest second before using it.
-
- [First added in 0.2.2.7-alpha]
-
-4.1.17. Signal received
-
- The syntax is:
- "650" SP "SIGNAL" SP Signal CRLF
-
- Signal = "RELOAD" / "DUMP" / "DEBUG" / "NEWNYM" / "CLEARDNSCACHE"
-
- A signal has been received and actions taken by Tor. The meaning of each
- signal, and the mapping to Unix signals, is as defined in section 3.7.
- Future versions of Tor MAY generate signals other than those listed here;
- controllers MUST be able to accept them.
-
- If Tor chose to ignore a signal (such as NEWNYM), this event will not be
- sent. Note that some options (like ReloadTorrcOnSIGHUP) may affect the
- semantics of the signals here.
-
- Note that the HALT (SIGTERM) and SHUTDOWN (SIGINT) signals do not currently
- generate any event.
-
- [First added in 0.2.3.1-alpha]
-
-5. Implementation notes
-
-5.1. Authentication
-
- If the control port is open and no authentication operation is enabled, Tor
- trusts any local user that connects to the control port. This is generally
- a poor idea.
-
- If the 'CookieAuthentication' option is true, Tor writes a "magic cookie"
- file named "control_auth_cookie" into its data directory. To authenticate,
- the controller must send the contents of this file, encoded in hexadecimal.
-
- If the 'HashedControlPassword' option is set, it must contain the salted
- hash of a secret password. The salted hash is computed according to the
- S2K algorithm in RFC 2440 (OpenPGP), and prefixed with the s2k specifier.
- This is then encoded in hexadecimal, prefixed by the indicator sequence
- "16:". Thus, for example, the password 'foo' could encode to:
- 16:660537E3E1CD49996044A3BF558097A981F539FEA2F9DA662B4626C1C2
- ++++++++++++++++**^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- salt hashed value
- indicator
- You can generate the salted hash of a password by calling
- 'tor --hash-password <password>'
- or by using the example code in the Python and Java controller libraries.
- To authenticate under this scheme, the controller sends Tor the original
- secret that was used to generate the password, either as a quoted string
- or encoded in hexadecimal.
-
-5.2. Don't let the buffer get too big.
-
- If you ask for lots of events, and 16MB of them queue up on the buffer,
- the Tor process will close the socket.
-
-5.3. Backward compatibility with v0 control protocol.
-
- The 'version 0' control protocol was replaced in Tor 0.1.1.x. Support
- was removed in Tor 0.2.0.x. Every non-obsolete version of Tor now
- supports the version 1 control protocol.
-
- For backward compatibility with the "version 0" control protocol,
- Tor used to check whether the third octet of the first command is zero.
- (If it was, Tor assumed that version 0 is in use.)
-
- This compatibility was removed in Tor 0.1.2.16 and 0.2.0.4-alpha.
-
-5.4. Tor config options for use by controllers
-
- Tor provides a few special configuration options for use by controllers.
- These options can be set and examined by the SETCONF and GETCONF commands,
- but are not saved to disk by SAVECONF.
-
- Generally, these options make Tor unusable by disabling a portion of Tor's
- normal operations. Unless a controller provides replacement functionality
- to fill this gap, Tor will not correctly handle user requests.
-
- __AllDirOptionsPrivate
-
- If true, Tor will try to launch all directory operations through
- anonymous connections. (Ordinarily, Tor only tries to anonymize
- requests related to hidden services.) This option will slow down
- directory access, and may stop Tor from working entirely if it does not
- yet have enough directory information to build circuits.
-
- (Boolean. Default: "0".)
-
- __DisablePredictedCircuits
-
- If true, Tor will not launch preemptive "general-purpose" circuits for
- streams to attach to. (It will still launch circuits for testing and
- for hidden services.)
-
- (Boolean. Default: "0".)
-
- __LeaveStreamsUnattached
-
- If true, Tor will not automatically attach new streams to circuits;
- instead, the controller must attach them with ATTACHSTREAM. If the
- controller does not attach the streams, their data will never be routed.
-
- (Boolean. Default: "0".)
-
- __HashedControlSessionPassword
-
- As HashedControlPassword, but is not saved to the torrc file by
- SAVECONF. Added in Tor 0.2.0.20-rc.
-
- __ReloadTorrcOnSIGHUP
-
- If this option is true (the default), we reload the torrc from disk
- every time we get a SIGHUP (from the controller or via a signal).
- Otherwise, we don't. This option exists so that controllers can keep
- their options from getting overwritten when a user sends Tor a HUP for
- some other reason (for example, to rotate the logs).
-
- (Boolean. Default: "1")
-
-5.5. Phases from the Bootstrap status event.
-
- This section describes the various bootstrap phases currently reported
- by Tor. Controllers should not assume that the percentages and tags
- listed here will continue to match up, or even that the tags will stay
- in the same order. Some phases might also be skipped (not reported)
- if the associated bootstrap step is already complete, or if the phase
- is no longer necessary. Only "starting" and "done" are guaranteed to
- exist in all future versions.
-
- Current Tor versions enter these phases in order, monotonically.
- Future Tors MAY revisit earlier stages.
-
- Phase 0:
- tag=starting summary="Starting"
-
- Tor starts out in this phase.
-
- Phase 5:
- tag=conn_dir summary="Connecting to directory mirror"
-
- Tor sends this event as soon as Tor has chosen a directory mirror --
- e.g. one of the authorities if bootstrapping for the first time or
- after a long downtime, or one of the relays listed in its cached
- directory information otherwise.
-
- Tor will stay at this phase until it has successfully established
- a TCP connection with some directory mirror. Problems in this phase
- generally happen because Tor doesn't have a network connection, or
- because the local firewall is dropping SYN packets.
-
- Phase 10:
- tag=handshake_dir summary="Finishing handshake with directory mirror"
-
- This event occurs when Tor establishes a TCP connection with a relay used
- as a directory mirror (or its https proxy if it's using one). Tor remains
- in this phase until the TLS handshake with the relay is finished.
-
- Problems in this phase generally happen because the local firewall is
- mounting a more sophisticated MITM attack on the connection, or doing
- packet-level keyword recognition of Tor's handshake.
-
- Phase 15:
- tag=onehop_create summary="Establishing one-hop circuit for dir info"
-
- Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
- to establish a one-hop circuit for retrieving directory information.
- It will remain in this phase until it receives the CREATED_FAST cell
- back, indicating that the circuit is ready.
-
- Phase 20:
- tag=requesting_status summary="Asking for networkstatus consensus"
-
- Once we've finished our one-hop circuit, we will start a new stream
- for fetching the networkstatus consensus. We'll stay in this phase
- until we get the 'connected' relay cell back, indicating that we've
- established a directory connection.
-
- Phase 25:
- tag=loading_status summary="Loading networkstatus consensus"
-
- Once we've established a directory connection, we will start fetching
- the networkstatus consensus document. This could take a while; this
- phase is a good opportunity for using the "progress" keyword to indicate
- partial progress.
-
- This phase could stall if the directory mirror we picked doesn't
- have a copy of the networkstatus consensus so we have to ask another,
- or it does give us a copy but we don't find it valid.
-
- Phase 40:
- tag=loading_keys summary="Loading authority key certs"
-
- Sometimes when we've finished loading the networkstatus consensus,
- we find that we don't have all the authority key certificates for the
- keys that signed the consensus. At that point we put the consensus we
- fetched on hold and fetch the keys so we can verify the signatures.
-
- Phase 45:
- tag=requesting_descriptors summary="Asking for relay descriptors"
-
- Once we have a valid networkstatus consensus and we've checked all
- its signatures, we start asking for relay descriptors. We stay in this
- phase until we have received a 'connected' relay cell in response to
- a request for descriptors.
-
- Phase 50:
- tag=loading_descriptors summary="Loading relay descriptors"
-
- We will ask for relay descriptors from several different locations,
- so this step will probably make up the bulk of the bootstrapping,
- especially for users with slow connections. We stay in this phase until
- we have descriptors for at least 1/4 of the usable relays listed in
- the networkstatus consensus. This phase is also a good opportunity to
- use the "progress" keyword to indicate partial steps.
-
- Phase 80:
- tag=conn_or summary="Connecting to entry guard"
-
- Once we have a valid consensus and enough relay descriptors, we choose
- some entry guards and start trying to build some circuits. This step
- is similar to the "conn_dir" phase above; the only difference is
- the context.
-
- If a Tor starts with enough recent cached directory information,
- its first bootstrap status event will be for the conn_or phase.
-
- Phase 85:
- tag=handshake_or summary="Finishing handshake with entry guard"
-
- This phase is similar to the "handshake_dir" phase, but it gets reached
- if we finish a TCP connection to a Tor relay and we have already reached
- the "conn_or" phase. We'll stay in this phase until we complete a TLS
- handshake with a Tor relay.
-
- Phase 90:
- tag=circuit_create summary="Establishing circuits"
-
- Once we've finished our TLS handshake with an entry guard, we will
- set about trying to make some 3-hop circuits in case we need them soon.
-
- Phase 100:
- tag=done summary="Done"
-
- A full 3-hop exit circuit has been established. Tor is ready to handle
- application connections now.
-
diff --git a/doc/spec/dir-spec-v1.txt b/doc/spec/dir-spec-v1.txt
deleted file mode 100644
index a92fc7999..000000000
--- a/doc/spec/dir-spec-v1.txt
+++ /dev/null
@@ -1,314 +0,0 @@
-
- Tor Protocol Specification
-
- Roger Dingledine
- Nick Mathewson
-
-0. Preliminaries
-
- THIS SPECIFICATION IS OBSOLETE.
-
- This document specifies the Tor directory protocol as used in version
- 0.1.0.x and earlier. See dir-spec.txt for a current version.
-
-1. Basic operation
-
- There is a small number of directory authorities, and a larger number of
- caches. Clients and servers know public keys for the directory authorities.
- Tor servers periodically upload self-signed "router descriptors" to the
- directory authorities. Each authority publishes a self-signed "directory"
- (containing all the router descriptors it knows, and a statement on which
- are running) and a self-signed "running routers" document containing only
- the statement on which routers are running.
-
- All Tors periodically download these documents, downloading the directory
- less frequently than they do the "running routers" document. Clients
- preferentially download from caches rather than authorities.
-
-1.1. Document format
-
- Router descriptors, directories, and running-routers documents all obey the
- following lightweight extensible information format.
-
- The highest level object is a Document, which consists of one or more
- Items. Every Item begins with a KeywordLine, followed by one or more
- Objects. A KeywordLine begins with a Keyword, optionally followed by
- whitespace and more non-newline characters, and ends with a newline. A
- Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
- An Object is a block of encoded data in pseudo-Open-PGP-style
- armor. (cf. RFC 2440)
-
- More formally:
-
- Document ::= (Item | NL)+
- Item ::= KeywordLine Object*
- KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
- Keyword ::= KeywordChar+
- KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
- ArgumentChar ::= any printing ASCII character except NL.
- WS ::= (SP | TAB)+
- Object ::= BeginLine Base-64-encoded-data EndLine
- BeginLine ::= "-----BEGIN " Keyword "-----" NL
- EndLine ::= "-----END " Keyword "-----" NL
-
- The BeginLine and EndLine of an Object must use the same keyword.
-
- When interpreting a Document, software MUST reject any document containing a
- KeywordLine that starts with a keyword it doesn't recognize.
-
- The "opt" keyword is reserved for non-critical future extensions. All
- implementations MUST ignore any item of the form "opt keyword ....." when
- they would not recognize "keyword ....."; and MUST treat "opt keyword ....."
- as synonymous with "keyword ......" when keyword is recognized.
-
-2. Router descriptor format.
-
- Every router descriptor MUST start with a "router" Item; MUST end with a
- "router-signature" Item and an extra NL; and MUST contain exactly one
- instance of each of the following Items: "published" "onion-key" "link-key"
- "signing-key" "bandwidth". Additionally, a router descriptor MAY contain
- any number of "accept", "reject", "fingerprint", "uptime", and "opt" Items.
- Other than "router" and "router-signature", the items may appear in any
- order.
-
- The items' formats are as follows:
- "router" nickname address ORPort SocksPort DirPort
-
- Indicates the beginning of a router descriptor. "address"
- must be an IPv4 address in dotted-quad format. The last
- three numbers indicate the TCP ports at which this OR exposes
- functionality. ORPort is a port at which this OR accepts TLS
- connections for the main OR protocol; SocksPort is deprecated and
- should always be 0; and DirPort is the port at which this OR accepts
- directory-related HTTP connections. If any port is not supported,
- the value 0 is given instead of a port number.
-
- "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
-
- Estimated bandwidth for this router, in bytes per second. The
- "average" bandwidth is the volume per second that the OR is willing
- to sustain over long periods; the "burst" bandwidth is the volume
- that the OR is willing to sustain in very short intervals. The
- "observed" value is an estimate of the capacity this server can
- handle. The server remembers the maximum sustained output bandwidth
- over any ten-second period in the past day, and likewise the maximum
- sustained input bandwidth. The "observed" value is the lesser of the two.
-
- "platform" string
-
- A human-readable string describing the system on which this OR is
- running. This MAY include the operating system, and SHOULD include
- the name and version of the software implementing the Tor protocol.
-
- "published" YYYY-MM-DD HH:MM:SS
-
- The time, in GMT, when this descriptor was generated.
-
- "fingerprint"
-
- A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key,
- encoded in hex, with a single space after every 4 characters) for this
- router's identity key. A descriptor is considered invalid (and MUST be
- rejected) if the fingerprint line does not match the public key.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
- "hibernating" 0|1
-
- If the value is 1, then the Tor server was hibernating when the
- descriptor was published, and shouldn't be used to build circuits.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
- "uptime"
-
- The number of seconds that this OR process has been running.
-
- "onion-key" NL a public key in PEM format
-
- This key is used to encrypt EXTEND cells for this OR. The key MUST
- be accepted for at least XXXX hours after any new key is published in
- a subsequent descriptor.
-
- "signing-key" NL a public key in PEM format
-
- The OR's long-term identity key.
-
- "accept" exitpattern
- "reject" exitpattern
-
- These lines, in order, describe the rules that an OR follows when
- deciding whether to allow a new stream to a given address. The
- 'exitpattern' syntax is described below.
-
- "router-signature" NL Signature NL
-
- The "SIGNATURE" object contains a signature of the PKCS1-padded
- hash of the entire router descriptor, taken from the beginning of the
- "router" line, through the newline after the "router-signature" line.
- The router descriptor is invalid unless the signature is performed
- with the router's identity key.
-
- "contact" info NL
-
- Describes a way to contact the server's administrator, preferably
- including an email address and a PGP key fingerprint.
-
- "family" names NL
-
- 'Names' is a whitespace-separated list of server nicknames. If two ORs
- list one another in their "family" entries, then OPs should treat them
- as a single OR for the purpose of path selection.
-
- For example, if node A's descriptor contains "family B", and node B's
- descriptor contains "family A", then node A and node B should never
- be used on the same circuit.
-
- "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
-
- Declare how much bandwidth the OR has used recently. Usage is divided
- into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field defines
- the end of the most recent interval. The numbers are the number of
- bytes used in the most recent intervals, ordered from oldest to newest.
-
- [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
-2.1. Nonterminals in router descriptors
-
- nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
-
- exitpattern ::= addrspec ":" portspec
- portspec ::= "*" | port | port "-" port
- port ::= an integer between 1 and 65535, inclusive.
- addrspec ::= "*" | ip4spec | ip6spec
- ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
- ip4 ::= an IPv4 address in dotted-quad format
- ip4mask ::= an IPv4 mask in dotted-quad format
- num_ip4_bits ::= an integer between 0 and 32
- ip6spec ::= ip6 | ip6 "/" num_ip6_bits
- ip6 ::= an IPv6 address, surrounded by square brackets.
- num_ip6_bits ::= an integer between 0 and 128
-
- Ports are required; if they are not included in the router
- line, they must appear in the "ports" lines.
-
-3. Directory format
-
- A Directory begins with a "signed-directory" item, followed by one each of
- the following, in any order: "recommended-software", "published",
- "router-status", "dir-signing-key". It may include any number of "opt"
- items. After these items, a directory includes any number of router
- descriptors, and a single "directory-signature" item.
-
- "signed-directory"
-
- Indicates the start of a directory.
-
- "published" YYYY-MM-DD HH:MM:SS
-
- The time at which this directory was generated and signed, in GMT.
-
- "dir-signing-key"
-
- The key used to sign this directory; see "signing-key" for format.
-
- "recommended-software" comma-separated-version-list
-
- A list of which versions of which implementations are currently
- believed to be secure and compatible with the network.
-
- "running-routers" whitespace-separated-list
-
- A description of which routers are currently believed to be up or
- down. Every entry consists of an optional "!", followed by either an
- OR's nickname, or "$" followed by a hexadecimal encoding of the hash
- of an OR's identity key. If the "!" is included, the router is
- believed not to be running; otherwise, it is believed to be running.
- If a router's nickname is given, exactly one router of that nickname
- will appear in the directory, and that router is "approved" by the
- directory server. If a hashed identity key is given, that OR is not
- "approved". [XXXX The 'running-routers' line is only provided for
- backward compatibility. New code should parse 'router-status'
- instead.]
-
- "router-status" whitespace-separated-list
-
- A description of which routers are currently believed to be up or
- down, and which are verified or unverified. Contains one entry for
- every router that the directory server knows. Each entry is of the
- format:
-
- !name=$digest [Verified router, currently not live.]
- name=$digest [Verified router, currently live.]
- !$digest [Unverified router, currently not live.]
- or $digest [Unverified router, currently live.]
-
- (where 'name' is the router's nickname and 'digest' is a hexadecimal
- encoding of the hash of the router's identity key).
-
- When parsing this line, clients should only mark a router as
- 'verified' if its nickname AND digest match the one provided.
-
- "directory-signature" nickname-of-dirserver NL Signature
-
- The signature is computed by taking the digest of the
- directory, from the characters "signed-directory", through the newline
- after "directory-signature". This digest is then padded with PKCS1,
- and signed with the directory server's signing key.
-
- If software encounters an unrecognized keyword in a single router descriptor,
- it MUST reject only that router descriptor, and continue using the
- others. Because this mechanism is used to add 'critical' extensions to
- future versions of the router descriptor format, implementations should treat
- it as a normal occurrence and not, for example, report it to the user as an
- error. [Versions of Tor prior to 0.1.1 did this.]
-
- If software encounters an unrecognized keyword in the directory header,
- it SHOULD reject the entire directory.
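As an illustration of the four "router-status" entry forms listed above, here is a minimal Python sketch of a parser. The names RouterStatus, parse_router_status_entry, and parse_router_status_line are hypothetical and do not come from the Tor source; upper-casing the digest is also an assumption made only to simplify comparison.

```python
from typing import List, NamedTuple, Optional


class RouterStatus(NamedTuple):
    name: Optional[str]  # nickname, or None for an unverified router
    digest: str          # hex identity digest without the leading "$"
    live: bool
    verified: bool


def parse_router_status_entry(entry: str) -> RouterStatus:
    # A leading "!" marks the router as not live.
    live = not entry.startswith("!")
    if not live:
        entry = entry[1:]
    name, sep, rest = entry.partition("=")
    if sep:
        # "name=$digest": a verified router.
        if not name or not rest.startswith("$"):
            raise ValueError("malformed entry: %r" % entry)
        return RouterStatus(name, rest[1:].upper(), live, True)
    # "$digest": an unverified router.
    if not entry.startswith("$"):
        raise ValueError("malformed entry: %r" % entry)
    return RouterStatus(None, entry[1:].upper(), live, False)


def parse_router_status_line(args: str) -> List[RouterStatus]:
    # The argument is a whitespace-separated list of entries.
    return [parse_router_status_entry(e) for e in args.split()]
```

A client following the rule above would mark a router 'verified' only when both the name and the digest of a verified entry match the router it knows.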
-
-4. Network-status descriptor
-
- A "network-status" (a.k.a. "running-routers") document is a truncated
- directory that contains only the current status of a list of nodes, not
- their actual descriptors. It contains exactly one of each of the following
- entries.
-
- "network-status"
-
- Must appear first.
-
- "published" YYYY-MM-DD HH:MM:SS
-
- (see section 3 above)
-
- "router-status" list
-
- (see section 3 above)
-
- "directory-signature" NL signature
-
- (see section 3 above)
-
-5. Behavior of a directory server
-
- lists nodes that are connected currently
- speaks HTTP on a socket, spits out directory on request
-
- Directory servers listen on a certain port (the DirPort), and speak a
- limited version of HTTP 1.0. Clients send either GET or POST commands.
- The basic interactions are:
- "%s %s HTTP/1.0\r\nContent-Length: %lu\r\nHost: %s\r\n\r\n",
- command, url, content-length, host.
- Get "/tor/" to fetch a full directory.
- Get "/tor/dir.z" to fetch a compressed full directory.
- Get "/tor/running-routers" to fetch a network-status descriptor.
- Post "/tor/" to post a server descriptor, with the body of the
- request containing the descriptor.
-
- "host" is used to specify the address:port of the dirserver, so
- the request can survive going through HTTP proxies.
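The request template above can be filled in as follows. This is only a sketch: build_dir_request is a hypothetical helper name, and the host value in the usage note is a documentation address, not a real directory server.

```python
def build_dir_request(command: str, url: str, host: str, body: bytes = b"") -> bytes:
    """Format a request using the HTTP/1.0 template quoted above."""
    if command not in ("GET", "POST"):
        raise ValueError("clients send either GET or POST")
    header = "%s %s HTTP/1.0\r\nContent-Length: %d\r\nHost: %s\r\n\r\n" % (
        command, url, len(body), host)
    # For a POST, the body carries the server descriptor being uploaded.
    return header.encode("ascii") + body
```

For example, build_dir_request("GET", "/tor/dir.z", "192.0.2.1:9030") produces a request for the compressed full directory, and a POST to "/tor/" with the descriptor as the body uploads a server descriptor.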
-
diff --git a/doc/spec/dir-spec-v2.txt b/doc/spec/dir-spec-v2.txt
deleted file mode 100644
index d1be27f3d..000000000
--- a/doc/spec/dir-spec-v2.txt
+++ /dev/null
@@ -1,896 +0,0 @@
-
- Tor directory protocol, version 2
-
-0. Scope and preliminaries
-
- This directory protocol is used by Tor version 0.1.1.x and 0.1.2.x. See
- dir-spec-v1.txt for information on earlier versions, and dir-spec.txt
- for information on later versions.
-
-0.1. Goals and motivation
-
- There were several problems with the way Tor handles directory information
- in version 0.1.0.x and earlier. Here are the problems we try to fix with
- this new design, already implemented in 0.1.1.x:
- 1. Directories were very large and used up a lot of bandwidth: clients
- downloaded descriptors for all routers several times an hour.
- 2. Every directory authority was a trust bottleneck: if a single
- directory authority lied, it could make clients believe for a time an
- arbitrarily distorted view of the Tor network.
- 3. Our current "verified server" system is kind of nonsensical.
-
- 4. Getting more directory authorities would add more points of failure
- and worsen possible partitioning attacks.
-
- There are two problems that remain unaddressed by this design.
- 5. Requiring every client to know about every router won't scale.
- 6. Requiring every directory cache to know every router won't scale.
-
- We attempt to fix 1-4 here, and to build a solution that will work when we
- figure out an answer for 5. We haven't thought at all about what to do
- about 6.
-
-1. Outline
-
- There is a small set (say, around 10) of semi-trusted directory
- authorities. A default list of authorities is shipped with the Tor
- software. Users can change this list, but are encouraged not to do so, in
- order to avoid partitioning attacks.
-
- Routers periodically upload signed "descriptors" to the directory
- authorities describing their keys, capabilities, and other information.
- Routers may act as directory mirrors (also called "caches"), to reduce
- load on the directory authorities. They announce this in their
- descriptors.
-
- Each directory authority periodically generates and signs a compact
- "network status" document that lists that authority's view of the current
- descriptors and status for known routers, but which does not include the
- descriptors themselves.
-
- Directory mirrors download, cache, and re-serve network-status documents
- to clients.
-
- Clients, directory mirrors, and directory authorities all use
- network-status documents to find out when their list of routers is
- out-of-date. If it is, they download any missing router descriptors.
- Clients download missing descriptors from mirrors; mirrors and authorities
- download from authorities. Descriptors are downloaded by the hash of the
- descriptor, not by the server's identity key: this prevents servers from
- attacking clients by giving them descriptors nobody else uses.
-
- All directory information is uploaded and downloaded with HTTP.
-
- Coordination among directory authorities is done client-side: clients
- run a vote-like algorithm over the network-status documents they
- have, and base their decisions on the result.
-
-1.1. What's different from 0.1.0.x?
-
- Clients used to download a signed concatenated set of router descriptors
- (called a "directory") from directory mirrors, regardless of which
- descriptors had changed.
-
- Between downloading directories, clients would download "network-status"
- documents that would list which servers were supposed to be running.
-
- Clients would always believe the most recently published network-status
- document they were served.
-
- Routers used to upload fresh descriptors all the time, whether their keys
- and other information had changed or not.
-
-1.2. Document meta-format
-
- Router descriptors, directories, and running-routers documents all obey the
- following lightweight extensible information format.
-
- The highest level object is a Document, which consists of one or more
- Items. Every Item begins with a KeywordLine, followed by zero or more
- Objects. A KeywordLine begins with a Keyword, optionally followed by
- whitespace and more non-newline characters, and ends with a newline. A
- Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
- An Object is a block of encoded data in pseudo-Open-PGP-style
- armor. (cf. RFC 2440)
-
- More formally:
-
- Document ::= (Item | NL)+
- Item ::= KeywordLine Object*
- KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
- Keyword ::= KeywordChar+
- KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
- ArgumentChar ::= any printing ASCII character except NL.
- WS ::= (SP | TAB)+
- Object ::= BeginLine Base-64-encoded-data EndLine
- BeginLine ::= "-----BEGIN " Keyword "-----" NL
- EndLine ::= "-----END " Keyword "-----" NL
-
- The BeginLine and EndLine of an Object must use the same keyword.
-
- When interpreting a Document, software MUST ignore any KeywordLine that
- starts with a keyword it doesn't recognize; future implementations MUST NOT
- require current clients to understand any KeywordLine not currently
- described.
-
- The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future
- extensions. All implementations MUST ignore any item of the form "opt
- keyword ....." when they would not recognize "keyword ....."; and MUST
- treat "opt keyword ....." as synonymous with "keyword ......" when keyword
- is recognized.
-
- Implementations before 0.1.2.5-alpha rejected any document with a
- KeywordLine that started with a keyword that they didn't recognize.
- Implementations MUST prefix items not recognized by older versions of Tor
- with an "opt" until those versions of Tor are obsolete.
-
- Other implementations that want to extend Tor's directory format MAY
- introduce their own items. The keywords for extension items SHOULD start
- with the characters "x-" or "X-", to guarantee that they will not conflict
- with keywords used by future versions of Tor.
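The meta-format in this section can be read with a small line-oriented parser. The sketch below is an illustration under stated assumptions, not the Tor implementation: Item and parse_document are hypothetical names, the "opt" prefix is simply stripped so that "opt keyword" is treated like "keyword", and the tag on a BEGIN line is allowed to contain spaces because real objects such as "RSA PUBLIC KEY" do, although the grammar above calls it a Keyword.

```python
import re
from typing import List, NamedTuple, Tuple

KEYWORD_RE = re.compile(r"^[A-Za-z0-9-]+$")
BEGIN_RE = re.compile(r"^-----BEGIN (.+)-----$")


class Item(NamedTuple):
    keyword: str
    args: str
    objects: List[Tuple[str, str]]  # (tag, base64 text with newlines removed)


def parse_document(text: str) -> List[Item]:
    lines = text.split("\n")
    items = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if not line.strip():
            continue  # blank lines between items are allowed
        parts = line.split(None, 1)
        keyword = parts[0]
        args = parts[1] if len(parts) > 1 else ""
        if keyword == "opt" and args:
            # "opt keyword ..." is synonymous with "keyword ...".
            parts = args.split(None, 1)
            keyword = parts[0]
            args = parts[1] if len(parts) > 1 else ""
        if not KEYWORD_RE.match(keyword):
            raise ValueError("bad keyword: %r" % keyword)
        objects = []
        # Collect the zero or more Objects that follow the KeywordLine.
        while i < len(lines):
            m = BEGIN_RE.match(lines[i])
            if not m:
                break
            tag = m.group(1)
            end_line = "-----END %s-----" % tag
            try:
                j = lines.index(end_line, i + 1)
            except ValueError:
                raise ValueError("unterminated object: %s" % tag)
            objects.append((tag, "".join(lines[i + 1:j])))
            i = j + 1
        items.append(Item(keyword, args, objects))
    return items
```

Deciding which keywords are recognized is left to the caller, which can skip any Item whose keyword it does not know.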
-
-2. Router operation
-
- ORs SHOULD generate a new router descriptor whenever any of the
- following events have occurred:
-
- - A period of time (18 hrs by default) has passed since the last
- time a descriptor was generated.
-
- - A descriptor field other than bandwidth or uptime has changed.
-
- - Bandwidth has changed by at least a factor of 2 from the last time a
- descriptor was generated, and at least a given interval of time
- (20 mins by default) has passed since then.
-
- - Its uptime has been reset (by restarting).
-
- After generating a descriptor, ORs upload it to every directory
- authority they know, by posting it to the URL
-
- http://<hostname:port>/tor/
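The four regeneration conditions listed above can be condensed into one predicate. This is a sketch only: should_regenerate and its parameters are hypothetical names, times are in seconds, and "changed by at least a factor of 2" is assumed to apply in either direction (doubling or halving).

```python
def should_regenerate(now, last_generated, other_field_changed,
                      old_bandwidth, new_bandwidth, uptime_reset,
                      max_age=18 * 3600, bandwidth_interval=20 * 60):
    """Return True if an OR should generate a fresh descriptor."""
    age = now - last_generated
    if age >= max_age:
        return True  # 18 hours (by default) since the last descriptor
    if other_field_changed or uptime_reset:
        return True  # a non-bandwidth, non-uptime field changed, or a restart
    low, high = sorted((old_bandwidth, new_bandwidth))
    # Assumption: a change to or from zero counts as a large change.
    changed_2x = high >= 2 * low and high != low
    return changed_2x and age >= bandwidth_interval
```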
-
-2.1. Router descriptor format
-
- Every router descriptor MUST start with a "router" Item; MUST end with a
- "router-signature" Item and an extra NL; and MUST contain exactly one
- instance of each of the following Items: "published" "onion-key"
- "signing-key" "bandwidth".
-
- A router descriptor MAY have zero or one of each of the following Items,
- but MUST NOT have more than one: "contact", "uptime", "fingerprint",
- "hibernating", "read-history", "write-history", "eventdns", "platform",
- "family".
-
- Additionally, a router descriptor MAY contain any number of "accept",
- "reject", and "opt" Items. Other than "router" and "router-signature",
- the items may appear in any order.
-
- The items' formats are as follows:
- "router" nickname address ORPort SocksPort DirPort
-
- Indicates the beginning of a router descriptor. "address" must be an
- IPv4 address in dotted-quad format. The last three numbers indicate
- the TCP ports at which this OR exposes functionality. ORPort is a port
- at which this OR accepts TLS connections for the main OR protocol;
- SocksPort is deprecated and should always be 0; and DirPort is the
- port at which this OR accepts directory-related HTTP connections. If
- any port is not supported, the value 0 is given instead of a port
- number.
-
- "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed
-
- Estimated bandwidth for this router, in bytes per second. The
- "average" bandwidth is the volume per second that the OR is willing to
- sustain over long periods; the "burst" bandwidth is the volume that
- the OR is willing to sustain in very short intervals. The "observed"
- value is an estimate of the capacity this server can handle. The
- server remembers the maximum output bandwidth it sustained over any
- ten-second period in the past day, and likewise for input. The
- "observed" value is the lesser of these two numbers.
-
- "platform" string
-
- A human-readable string describing the system on which this OR is
- running. This MAY include the operating system, and SHOULD include
- the name and version of the software implementing the Tor protocol.
-
- "published" YYYY-MM-DD HH:MM:SS
-
- The time, in GMT, when this descriptor was generated.
-
- "fingerprint"
-
- A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key,
- encoded in hex, with a single space after every 4 characters) for this
- router's identity key. A descriptor is considered invalid (and MUST be
- rejected) if the fingerprint line does not match the public key.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
- "hibernating" 0|1
-
- If the value is 1, then the Tor server was hibernating when the
- descriptor was published, and shouldn't be used to build circuits.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be
- marked with "opt" until earlier versions of Tor are obsolete.]
-
- "uptime"
-
- The number of seconds that this OR process has been running.
-
- "onion-key" NL a public key in PEM format
-
- This key is used to encrypt EXTEND cells for this OR. The key MUST be
- accepted for at least 1 week after any new key is published in a
- subsequent descriptor.
-
- "signing-key" NL a public key in PEM format
-
- The OR's long-term identity key.
-
- "accept" exitpattern
- "reject" exitpattern
-
- These lines describe the rules that an OR follows when
- deciding whether to allow a new stream to a given address. The
- 'exitpattern' syntax is described below. The rules are considered in
- order; if no rule matches, the address will be accepted. For clarity,
- the last such entry SHOULD be accept *:* or reject *:*.
-
- "router-signature" NL Signature NL
-
- The "SIGNATURE" object contains a signature of the PKCS1-padded
- hash of the entire router descriptor, taken from the beginning of the
- "router" line, through the newline after the "router-signature" line.
- The router descriptor is invalid unless the signature is performed
- with the router's identity key.
-
- "contact" info NL
-
- Describes a way to contact the server's administrator, preferably
- including an email address and a PGP key fingerprint.
-
- "family" names NL
-
- 'Names' is a space-separated list of server nicknames or
- hexdigests. If two ORs list one another in their "family" entries,
- then OPs should treat them as a single OR for the purpose of path
- selection.
-
- For example, if node A's descriptor contains "family B", and node B's
- descriptor contains "family A", then node A and node B should never
- be used on the same circuit.
-
- "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
-
- Declare how much bandwidth the OR has used recently. Usage is divided
- into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field
- defines the end of the most recent interval. The numbers are the
- number of bytes used in the most recent intervals, ordered from
- oldest to newest.
-
- [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
- "eventdns" bool NL
-
- Declare whether this version of Tor is using the newer enhanced
- dns logic. Versions of Tor without eventdns SHOULD NOT be used for
- reverse hostname lookups.
-
- [All versions of Tor before 0.1.2.2-alpha should be assumed to have
- this option set to 0 if it is not present. All Tor versions at
- 0.1.2.2-alpha or later should be assumed to have this option set to
- 1 if it is not present. Until 0.1.2.1-alpha-dev, this option was
- not generated, even when eventdns was in use. Versions of Tor
- before 0.1.2.1-alpha-dev did not parse this option, so it should be
- marked "opt". With 0.2.0.1-alpha, the old 'dnsworker' logic has
- been removed, rendering this option of historical interest only.]
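The layout of the "fingerprint" item described earlier in this section (hex, with a space after every 4 characters) can be sketched as below. The sketch assumes that the hash is SHA-1, so HASH_LEN is 20 bytes, and that the caller already has the ASN.1 (DER) encoding of the identity key; format_fingerprint is a hypothetical name, and upper-case hex is a presentation choice, not something this section requires.

```python
import hashlib


def format_fingerprint(der_encoded_key: bytes) -> str:
    """Hash an ASN.1-encoded key and group the hex digest in blocks of 4."""
    hexdigest = hashlib.sha1(der_encoded_key).hexdigest().upper()
    return " ".join(hexdigest[i:i + 4] for i in range(0, len(hexdigest), 4))
```

A verifier would recompute this string from the descriptor's "signing-key" and reject the descriptor if it differs from the "fingerprint" line.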
-
-2.2. Nonterminals in router descriptors
-
- nickname ::= between 1 and 19 alphanumeric characters, case-insensitive.
- hexdigest ::= a '$', followed by 40 hexadecimal characters.
- [Represents a server by the digest of its identity key.]
-
- exitpattern ::= addrspec ":" portspec
- portspec ::= "*" | port | port "-" port
- port ::= an integer between 1 and 65535, inclusive.
- [Some implementations incorrectly generate ports with value 0.
- Implementations SHOULD accept this, and SHOULD NOT generate it.]
-
- addrspec ::= "*" | ip4spec | ip6spec
- ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
- ip4 ::= an IPv4 address in dotted-quad format
- ip4mask ::= an IPv4 mask in dotted-quad format
- num_ip4_bits ::= an integer between 0 and 32
- ip6spec ::= ip6 | ip6 "/" num_ip6_bits
- ip6 ::= an IPv6 address, surrounded by square brackets.
- num_ip6_bits ::= an integer between 0 and 128
-
- bool ::= "0" | "1"
-
- Ports are required; if they are not included in the router
- line, they must appear in the "ports" lines.
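To make the exitpattern grammar and the "accept"/"reject" matching rule from section 2.1 concrete, here is a sketch built on Python's standard ipaddress module. parse_exit_rule and exit_allowed are hypothetical names; the default of accepting when no rule matches follows the description of the "accept" and "reject" items in section 2.1. Port 0 is not given any special handling here.

```python
import ipaddress


def parse_exit_rule(action, pattern):
    """Parse one accept/reject item into (accept, network, low, high)."""
    if action not in ("accept", "reject"):
        raise ValueError("unknown action: %r" % action)
    # The port part follows the last ":", so bracketed IPv6 addresses work.
    addrspec, sep, portspec = pattern.rpartition(":")
    if not sep:
        raise ValueError("missing port in %r" % pattern)
    if addrspec == "*":
        network = None  # matches every address
    else:
        # ip6 is written in square brackets; ip_network() wants them removed.
        bare = addrspec.replace("[", "").replace("]", "")
        # Handles a bare address, "/bits", and a dotted-quad "/mask".
        network = ipaddress.ip_network(bare, strict=False)
    if portspec == "*":
        low, high = 1, 65535
    elif "-" in portspec:
        low, high = (int(p) for p in portspec.split("-", 1))
    else:
        low = high = int(portspec)
    return (action == "accept", network, low, high)


def exit_allowed(rules, address, port):
    """Apply rules in order; the first match decides, otherwise accept."""
    ip = ipaddress.ip_address(address)
    for accept, network, low, high in rules:
        if (network is None or ip in network) and low <= port <= high:
            return accept
    return True
```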
-
-3. Network status format
-
- Directory authorities generate, sign, and compress network-status
- documents. Directory servers SHOULD generate a fresh network-status
- document when the contents of such a document would be different from the
- last one generated, and some time (at least one second, possibly longer)
- has passed since the last one was generated.
-
- The network status document contains a preamble, a set of router status
- entries, and a signature, in that order.
-
- We use the same meta-format as used for directories and router descriptors
- in "tor-spec.txt". Implementations MAY insert blank lines
- for clarity between sections; these blank lines are ignored.
- Implementations MUST NOT depend on blank lines in any particular location.
-
- As used here, "whitespace" is a sequence of 1 or more tab or space
- characters.
-
- The preamble contains:
-
- "network-status-version" -- A document format version. For this
- specification, the version is "2".
- "dir-source" -- The authority's hostname, current IP address, and
- directory port, all separated by whitespace.
- "fingerprint" -- A base16-encoded hash of the signing key's
- fingerprint, with no additional spaces added.
- "contact" -- An arbitrary string describing how to contact the
- directory server's administrator. Administrators should include at
- least an email address and a PGP fingerprint.
- "dir-signing-key" -- The directory server's public signing key.
- "client-versions" -- A comma-separated list of recommended client
- versions.
- "server-versions" -- A comma-separated list of recommended server
- versions.
- "published" -- The publication time for this network-status object.
- "dir-options" -- A set of flags, in any order, separated by whitespace:
- "Names" if this directory authority performs name bindings.
- "Versions" if this directory authority recommends software versions.
- "BadExits" if the directory authority flags nodes that it believes
- are performing incorrectly as exit nodes.
- "BadDirectories" if the directory authority flags nodes that it
- believes are performing incorrectly as directory caches.
-
- The dir-options entry is optional. The "-versions" entries are required if
- the "Versions" flag is present. The other entries are required and must
- appear exactly once. The "network-status-version" entry must appear first;
- the others may appear in any order. Implementations MUST ignore
- additional arguments to the items above, and MUST ignore unrecognized
- flags.
-
- For each router, the router entry contains: (This format is designed for
- conciseness.)
-
- "r" -- followed by the following elements, in order, separated by
- whitespace:
- - The OR's nickname,
- - A hash of its identity key, encoded in base64, with trailing =
- signs removed.
- - A hash of its most recent descriptor, encoded in base64, with
- trailing = signs removed. (The hash is calculated as for
- computing the signature of a descriptor.)
- - The publication time of its most recent descriptor, in the form
- YYYY-MM-DD HH:MM:SS, in GMT.
- - An IP address
- - An OR port
- A directory port (or "0" for none)
- "s" -- A series of whitespace-separated status flags, in any order:
- "Authority" if the router is a directory authority.
- "BadExit" if the router is believed to be useless as an exit node
- (because its ISP censors it, because it is behind a restrictive
- proxy, or for some similar reason).
- "BadDirectory" if the router is believed to be useless as a
- directory cache (because its directory port isn't working,
- its bandwidth is always throttled, or for some similar
- reason).
- "Exit" if the router is useful for building general-purpose exit
- circuits.
- "Fast" if the router is suitable for high-bandwidth circuits.
- "Guard" if the router is suitable for use as an entry guard.
- "Named" if the router's identity-nickname mapping is canonical,
- and this authority binds names.
- "Stable" if the router is suitable for long-lived circuits.
- "Running" if the router is currently usable.
- "Valid" if the router has been 'validated'.
- "V2Dir" if the router implements this protocol.
- "v" -- The version of the Tor protocol that this server is running. If
- the value begins with "Tor" SP, the rest of the string is a Tor
- version number, and the protocol is "The Tor protocol as supported
- by the given version of Tor." Otherwise, if the value begins with
- some other string, Tor has upgraded to a more sophisticated
- protocol versioning system, and the protocol is "a version of the
- Tor protocol more recent than any we recognize."
-
- The "r" entry for each router must appear first and is required. The
- "s" entry is optional (see Section 3.1 below for how the flags are
- decided). Unrecognized flags on the "s" line and extra elements
- on the "r" line must be ignored. The "v" line is optional; it was not
- supported until 0.1.2.5-alpha, and it must be preceded with an "opt"
- until all earlier versions of Tor are obsolete.
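Because the hashes on the "r" line are base64 with the trailing "=" signs removed, a parser has to restore the padding before decoding. The following sketch shows one way to do that; parse_r_line and unpad_b64decode are hypothetical names, and returning the hashes as upper-case hex is only a convenience.

```python
import base64
from datetime import datetime


def unpad_b64decode(value: str) -> bytes:
    # Restore the "=" padding that the format strips, then decode.
    return base64.b64decode(value + "=" * (-len(value) % 4))


def parse_r_line(line: str) -> dict:
    parts = line.split()
    if len(parts) < 9 or parts[0] != "r":
        raise ValueError("not a well-formed r line")
    # Extra elements after the directory port are ignored, as required.
    _, nickname, identity, digest, date, time, ip, or_port, dir_port = parts[:9]
    return {
        "nickname": nickname,
        "identity": unpad_b64decode(identity).hex().upper(),
        "digest": unpad_b64decode(digest).hex().upper(),
        "published": datetime.strptime(date + " " + time, "%Y-%m-%d %H:%M:%S"),
        "address": ip,
        "or_port": int(or_port),
        "dir_port": int(dir_port),  # 0 means no directory port
    }
```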
-
- The signature section contains:
-
- "directory-signature" nickname-of-dirserver NL Signature
-
- Signature is a signature of this network-status document
- (the document up until the signature, including the line
- "directory-signature <nick>\n"), using the directory authority's
- signing key.
-
- We compress the network status list with zlib before transmitting it.
-
-3.1. Establishing server status
-
- (This section describes how directory authorities choose which status
- flags to apply to routers, as of Tor 0.1.1.18-rc. Later directory
- authorities MAY do things differently, so long as clients keep working
- well. Clients MUST NOT depend on the exact behaviors in this section.)
-
- In the below definitions, a router is considered "active" if it is
- running, valid, and not hibernating.
-
- "Valid" -- a router is 'Valid' if it is running a version of Tor not
- known to be broken, and the directory authority has not blacklisted
- it as suspicious.
-
- "Named" -- Directory authority administrators may decide to support name
- binding. If they do, then they must maintain a file of
- nickname-to-identity-key mappings, and try to keep this file consistent
- with other directory authorities. If they don't, they act as clients, and
- report bindings made by other directory authorities (name X is bound to
- identity Y if at least one binding directory lists it, and no directory
- binds X to some other Y'.) A router is called 'Named' if the authority
- believes the given name should be bound to the given key.
-
- "Running" -- A router is 'Running' if the authority managed to connect to
- it successfully within the last 30 minutes.
-
- "Stable" -- A router is 'Stable' if it is active, and either its
- uptime is at least the median uptime for known active routers, or
- its uptime is at least 30 days. Routers are never called stable if
- they are running a version of Tor known to drop circuits stupidly.
- (0.1.1.10-alpha through 0.1.1.16-rc are stupid this way.)
-
- "Fast" -- A router is 'Fast' if it is active, and its bandwidth is
- in the top 7/8ths for known active routers.
-
- "Guard" -- A router is a possible 'Guard' if it is 'Stable' and its
- bandwidth is above median for known active routers. If the total
- bandwidth of active non-BadExit Exit servers is less than one third
- of the total bandwidth of all active servers, no Exit is listed as
- a Guard.
-
- "Authority" -- A router is called an 'Authority' if the authority
- generating the network-status document believes it is an authority.
-
- "V2Dir" -- A router supports the v2 directory protocol if it has an open
- directory port, and it is running a version of the directory protocol that
- supports the functionality clients need. (Currently, this is
- 0.1.1.9-alpha or later.)
-
- Directory server administrators may label some servers or IPs as
- blacklisted, and elect not to include them in their network-status lists.
-
- Authorities SHOULD 'disable' any servers in excess of 3 on any single IP.
- When there are more than 3 to choose from, authorities should first prefer
- authorities to non-authorities, then prefer Running to non-Running, and
- then prefer high-bandwidth to low-bandwidth. To 'disable' a server, the
- authority *should* advertise it without the Running or Valid flag.
-
- Thus, the network-status list includes all non-blacklisted,
- non-expired, non-superseded descriptors.
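The "Stable", "Fast", and "Guard" thresholds above can be sketched as follows. This is a simplified illustration, not the authority code: assign_flags is a hypothetical name, each router is a dict with hypothetical keys "active", "uptime" (seconds), and "bandwidth", the "top 7/8ths" cutoff is approximated by dropping the lowest eighth of sorted bandwidths, and the exclusion for broken Tor versions and the Exit/Guard bandwidth rule are left out.

```python
from statistics import median

THIRTY_DAYS = 30 * 24 * 3600


def assign_flags(routers):
    """Return one set of flags per router, in the same order."""
    flags = [set() for _ in routers]
    active = [r for r in routers if r["active"]]
    if not active:
        return flags
    median_uptime = median(r["uptime"] for r in active)
    median_bw = median(r["bandwidth"] for r in active)
    bandwidths = sorted(r["bandwidth"] for r in active)
    # Assumption: routers at or above this value are in the top 7/8ths.
    fast_cutoff = bandwidths[len(bandwidths) // 8]
    for router, f in zip(routers, flags):
        if not router["active"]:
            continue
        if router["uptime"] >= median_uptime or router["uptime"] >= THIRTY_DAYS:
            f.add("Stable")
        if router["bandwidth"] >= fast_cutoff:
            f.add("Fast")
        if "Stable" in f and router["bandwidth"] > median_bw:
            f.add("Guard")
    return flags
```

Since clients MUST NOT depend on the exact behavior in this section, a sketch like this is only useful for reasoning about what an authority of that era might have published.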
-
-4. Directory server operation
-
- All directory authorities and directory mirrors ("directory servers")
- implement this section, except as noted.
-
-4.1. Accepting uploads (authorities only)
-
- When a router posts a signed descriptor to a directory authority, the
- authority first checks whether it is well-formed and correctly
- self-signed. If it is, the authority next verifies that the nickname
- in question is not already assigned to a router with a different
- public key.
- Finally, the authority MAY check that the router is not blacklisted
- because of its key, IP, or another reason.
-
- If the descriptor passes these tests, and the authority does not already
- have a descriptor for a router with this public key, it accepts the
- descriptor and remembers it.
-
- If the authority _does_ have a descriptor with the same public key, the
- newly uploaded descriptor is remembered if its publication time is more
- recent than the most recent old descriptor for that router, and either:
- - There are non-cosmetic differences between the old descriptor and the
- new one.
- - Enough time has passed between the descriptors' publication times.
- (Currently, 12 hours.)
-
- Differences between router descriptors are "non-cosmetic" if they would be
- sufficient to force an upload as described in section 2 above.
-
- Note that the "cosmetic difference" test only applies to uploaded
- descriptors, not to descriptors that the authority downloads from other
- authorities.
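Once a descriptor has passed the well-formedness, nickname, and blacklist checks, the storage decision described above reduces to a short predicate. should_store_upload is a hypothetical name; publication times are assumed to be seconds since the epoch, and the non-cosmetic comparison is assumed to have been done by the caller.

```python
def should_store_upload(new_published, old_published, non_cosmetic,
                        min_interval=12 * 3600):
    """Decide whether an authority remembers a newly uploaded descriptor.

    old_published is None when the authority has no descriptor for this key.
    """
    if old_published is None:
        return True
    if new_published <= old_published:
        return False  # must be more recent than the newest one we hold
    return non_cosmetic or (new_published - old_published) >= min_interval
```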
-
-4.2. Downloading network-status documents (authorities and caches)
-
- All directory servers (authorities and mirrors) try to keep a fresh
- set of network-status documents from every authority. To do so,
- every 5 minutes, each authority asks every other authority for its
- most recent network-status document. Every 15 minutes, each mirror
- picks a random authority and asks it for the most recent network-status
- documents for all the authorities the authority knows about (including
- the chosen authority itself).
-
- Directory servers and mirrors remember and serve the most recent
- network-status document they have from each authority. Other
- network-status documents don't need to be stored. If the most recent
- network-status document is over 10 days old, it is discarded anyway.
- Mirrors SHOULD store and serve network-status documents from authorities
- they don't recognize, but SHOULD NOT use such documents for any other
- purpose. Mirrors SHOULD discard network-status documents older than 48
- hours.
-
-4.3. Downloading and storing router descriptors (authorities and caches)
-
- Periodically (currently, every 10 seconds), directory servers check
- whether there are any specific descriptors (as identified by descriptor
- hash in a network-status document) that they do not have and that they
- are not currently trying to download.
-
- If so, the directory server launches requests to the authorities for these
- descriptors, such that each authority is only asked for descriptors listed
- in its most recent network-status. When more than one authority lists the
- descriptor, we choose which to ask at random.
-
- If one of these downloads fails, we do not try to download that descriptor
- from the authority that failed to serve it again unless we receive a newer
- network-status from that authority that lists the same descriptor.
-
- Directory servers must potentially cache multiple descriptors for each
- router. Servers must not discard any descriptor listed by any current
- network-status document from any authority. If there is enough space to
- store additional descriptors, servers SHOULD try to hold those which
- clients are likely to download the most. (Currently, this is judged
- based on the interval for which each descriptor seemed newest.)
-
- Authorities SHOULD NOT download descriptors for routers that they would
- immediately reject for reasons listed in 3.1.
-
-4.4. HTTP URLs
-
- "Fingerprints" in these URLs are base-16-encoded SHA1 hashes.
-
- The authoritative network-status published by a host should be available at:
- http://<hostname>/tor/status/authority.z
-
- The network-status published by a host with fingerprint
- <F> should be available at:
- http://<hostname>/tor/status/fp/<F>.z
-
- The network-status documents published by hosts with fingerprints
- <F1>,<F2>,<F3> should be available at:
- http://<hostname>/tor/status/fp/<F1>+<F2>+<F3>.z
-
- The most recent network-status documents from all known authorities,
- concatenated, should be available at:
- http://<hostname>/tor/status/all.z
-
- The most recent descriptor for a server whose identity key has a
- fingerprint of <F> should be available at:
- http://<hostname>/tor/server/fp/<F>.z
-
- The most recent descriptors for servers with identity fingerprints
- <F1>,<F2>,<F3> should be available at:
- http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z
-
- (NOTE: Implementations SHOULD NOT download descriptors by identity key
- fingerprint. This allows a corrupted server (in collusion with a cache) to
- provide a unique descriptor to a client, and thereby partition that client
- from the rest of the network.)
-
- The server descriptor with (descriptor) digest <D> (in hex) should be
- available at:
- http://<hostname>/tor/server/d/<D>.z
-
- The most recent descriptors with digests <D1>,<D2>,<D3> should be
- available at:
- http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z
-
- The most recent descriptor for this server should be at:
- http://<hostname>/tor/server/authority.z
- [Nothing in the Tor protocol uses this resource yet, but it is useful
- for debugging purposes. Also, the official Tor implementations
- (starting at 0.1.1.x) use this resource to test whether a server's
- own DirPort is reachable.]
-
- A concatenated set of the most recent descriptors for all known servers
- should be available at:
- http://<hostname>/tor/server/all.z
-
- For debugging, directories SHOULD expose non-compressed objects at URLs like
- the above, but without the final ".z".
- Clients MUST handle compressed concatenated information in two forms:
- - A concatenated list of zlib-compressed objects.
- - A zlib-compressed concatenated list of objects.
- Directory servers MAY generate either format: the former requires less
- CPU, but the latter requires less bandwidth.
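As an illustration of the client requirement above, here is a minimal Python sketch of a decoder that accepts both compressed forms. The function name `decompress_concatenated` is hypothetical and not taken from any Tor implementation; it uses only the standard `zlib` module.

```python
import zlib


def decompress_concatenated(data: bytes) -> bytes:
    """Decode either one zlib stream or several zlib streams back to back."""
    pieces = []
    while data:
        decoder = zlib.decompressobj()
        pieces.append(decoder.decompress(data))
        pieces.append(decoder.flush())
        if not decoder.eof:
            # The stream ended before zlib saw its end-of-stream marker.
            raise ValueError("truncated zlib stream")
        # Bytes after the end of this stream belong to the next object.
        data = decoder.unused_data
    return b"".join(pieces)
```

A single zlib-compressed concatenation is handled by one pass through the loop; a concatenation of separately compressed objects takes one pass per object.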
-
- Clients SHOULD use upper case letters (A-F) when base16-encoding
- fingerprints. Servers MUST accept both upper and lower case fingerprints
- in requests.
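The URL rules in this section can be sketched as a small helper that requests descriptors by digest, joins several digests with "+", and upper-cases the hex. The name `descriptor_url` and the 40-character length check (which assumes SHA-1 digests) are assumptions made for this example.

```python
import re

# Assumption: digests are SHA-1, so 40 hex characters each.
_HEX_DIGEST = re.compile(r"[0-9A-Fa-f]{40}")


def descriptor_url(hostname, digests, compressed=True):
    """Build a /tor/server/d/ URL for one or more descriptor digests."""
    if not digests:
        raise ValueError("need at least one digest")
    for digest in digests:
        if not _HEX_DIGEST.fullmatch(digest):
            raise ValueError(f"not a 40-character hex digest: {digest!r}")
    # Clients SHOULD send upper-case hex; servers accept either case.
    joined = "+".join(digest.upper() for digest in digests)
    suffix = ".z" if compressed else ""
    return f"http://{hostname}/tor/server/d/{joined}{suffix}"
```

Passing `compressed=False` yields the uncompressed debugging form of the same resource.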
-
-5. Client operation: downloading information
-
- Every Tor that is not a directory server (that is, those that do
- not have a DirPort set) implements this section.
-
-5.1. Downloading network-status documents
-
- Each client maintains an ordered list of directory authorities.
- Insofar as possible, clients SHOULD all use the same ordered list.
-
- For each network-status document a client has, it keeps track of its
- publication time *and* the time when the client retrieved it. Clients
- consider a network-status document "live" if it was published within the
- last 24 hours.
-
- Clients try to have a live network-status document from *every*
- authority, and try to periodically get new network-status documents from
- each authority in rotation as follows:
-
- If a client is missing a live network-status document for any
- authority, it tries to fetch it from a directory cache. On failure,
- the client waits briefly, then tries that network-status document
- again from another cache. The client does not build circuits until it
- has live network-status documents from more than half the authorities
- it trusts, and it has descriptors for more than 1/4 of the routers
- that it believes are running.
-
- If the most recently _retrieved_ network-status document is over 30
- minutes old, the client attempts to download a network-status document.
- When choosing which documents to download, clients treat their list of
- directory authorities as a circular ring, and begin with the authority
- appearing immediately after the authority for their most recently
- retrieved network-status document. If this attempt fails (either it
- fails to download at all, or the one it gets is not as good as the
- one it has), the client retries at other caches several times, before
- moving on to the next network-status document in sequence.
-
- Clients discard all network-status documents over 24 hours old.
-
- If enough mirrors (currently 4) claim not to have a given network status,
- we stop trying to download that authority's network-status, until we
- download a new network-status that makes us believe that the authority in
- question is running. Clients should wait a little longer after each
- failure.
-
- Clients SHOULD try to batch as many network-status requests as possible
- into each HTTP GET.
-
- (Note: clients can and should pick caches based on the network-status
- information they have: once they have first fetched network-status info
- from an authority, they should not need to go to the authority directly
- again.)
-
-5.2. Downloading and storing router descriptors
-
- Clients try to have the best descriptor for each router. A descriptor is
- "best" if:
- * It is the most recently published descriptor listed for that router
- by at least two network-status documents.
- OR,
- * No descriptor for that router is listed by two or more
- network-status documents, and it is the most recently published
- descriptor listed by any network-status document.
-
- Periodically (currently every 10 seconds) clients check whether there are
- any "downloadable" descriptors. A descriptor is downloadable if:
- - It is the "best" descriptor for some router.
- - The descriptor was published at least 10 minutes in the past.
- (This prevents clients from trying to fetch descriptors that the
- mirrors have probably not yet retrieved and cached.)
- - The client does not currently have it.
- - The client is not currently trying to download it.
- - The client would not discard it immediately upon receiving it.
- - The client thinks it is running and valid (see 6.1 below).
-
- If at least 16 known routers have downloadable descriptors, or if
- enough time (currently 10 minutes) has passed since the last time the
- client tried to download descriptors, it launches requests for all
- downloadable descriptors, as described in 5.3 below.
-
- When a descriptor download fails, the client notes it, and does not
- consider the descriptor downloadable again until a certain amount of time
- has passed. (Currently 0 seconds for the first failure, 60 seconds for the
- second, 5 minutes for the third, 10 minutes for the fourth, and 1 day
- thereafter.) Periodically (currently once an hour) clients reset the
- failure count.
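The retry schedule quoted above can be written down directly. This is only a sketch; the names `RETRY_SCHEDULE_SECONDS` and `retry_delay` are illustrative, and the hourly reset of the failure count is left to the caller.

```python
# Delays quoted in the text for the first four failures, in seconds.
RETRY_SCHEDULE_SECONDS = [0, 60, 5 * 60, 10 * 60]
ONE_DAY_SECONDS = 24 * 60 * 60


def retry_delay(failure_count):
    """Seconds to wait before a descriptor is downloadable again."""
    if failure_count < 1:
        raise ValueError("failure_count starts at 1")
    if failure_count <= len(RETRY_SCHEDULE_SECONDS):
        return RETRY_SCHEDULE_SECONDS[failure_count - 1]
    # Fifth and later failures: wait one day.
    return ONE_DAY_SECONDS
```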
-
- No descriptors are downloaded until the client has downloaded more than
- half of the network-status documents.
-
- Clients retain the most recent descriptor they have downloaded for each
- router so long as it is not too old (currently, 48 hours), OR so long as
- it is recommended by at least one networkstatus AND no "better"
- descriptor has been downloaded. [Versions of Tor before 0.1.2.3-alpha
- would discard descriptors simply for being published too far in the past.]
- [The code seems to discard descriptors in all cases after they're 5
- days old. True? -RD]
-
-5.3. Managing downloads
-
- When a client has no live network-status documents, it downloads
- network-status documents from a randomly chosen authority. In all other
- cases, the client downloads from mirrors randomly chosen from among those
- believed to be V2 directory servers. (This information comes from the
- network-status documents; see 6 below.)
-
- When downloading multiple router descriptors, the client chooses multiple
- mirrors so that:
- - At least 3 different mirrors are used, except when this would result
- in more than one request for under 4 descriptors.
- - No more than 128 descriptors are requested from a single mirror.
- - Otherwise, as few mirrors as possible are used.
- After choosing mirrors, the client divides the descriptors among them
- randomly.
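One possible reading of the mirror-selection constraints above is sketched below: aim for 3 requests, but never fewer than 4 or more than 128 descriptors per request. The constant and function names are hypothetical, and the exact rounding is an assumption, since the text does not pin it down.

```python
import math
import random

MIN_REQUESTS = 3        # "at least 3 different mirrors"
MIN_PER_REQUEST = 4     # avoid several requests of under 4 descriptors
MAX_PER_REQUEST = 128   # cap per mirror


def split_among_mirrors(digests, rng=random):
    """Return batches of digests; each batch goes to a different mirror."""
    digests = list(digests)
    if not digests:
        return []
    per_request = math.ceil(len(digests) / MIN_REQUESTS)
    per_request = max(MIN_PER_REQUEST, min(MAX_PER_REQUEST, per_request))
    # Divide the descriptors among the mirrors randomly.
    rng.shuffle(digests)
    return [digests[i:i + per_request]
            for i in range(0, len(digests), per_request)]
```

Under this reading, 12 descriptors give three batches of 4, while 3 descriptors give a single batch, because splitting them would produce several very small requests.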
-
- After receiving any response, a client MUST discard any network-status
- documents and descriptors that it did not request.
-
-6. Using directory information
-
- Everyone besides directory authorities uses the approaches in this section
- to decide which servers to use and what their keys are likely to be.
- (Directory authorities just believe their own opinions, as in 3.1 above.)
-
-6.1. Choosing routers for circuits.
-
- Tor implementations only pay attention to "live" network-status documents.
- A network status is "live" if it is the most recently downloaded network
- status document for a given directory server, and the server is a
- directory server trusted by the client, and the network-status document is
- no more than 1 day old.
-
- For time-sensitive information, Tor implementations focus on "recent"
- network-status documents. A network status is "recent" if it is live, and
- if it was published in the last 60 minutes. (If there are fewer
- than 3 such documents, the most recently published 3 are "recent." If
- there are fewer than 3 in all, all are "recent.")
-
- Circuits SHOULD NOT be built until the client has enough directory
- information: network-statuses (or failed attempts to download
- network-statuses) for all authorities, network-statuses for more than
- half of the authorities, and descriptors for at least 1/4 of the servers
- believed to be running.
-
- A server is "listed" if it is included by more than half of the live
- network status documents. Clients SHOULD NOT use unlisted servers.
-
- Clients believe the flags "Valid", "Exit", "Fast", "Guard", "Stable", and
- "V2Dir" about a given router when they are asserted by more than half of
- the live network-status documents. Clients believe the flag "Running" if
- it is listed by more than half of the recent network-status documents.
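The majority rule above can be sketched as follows. The function name `believed_flags` is hypothetical. The caller is assumed to pass one set of flags per document considered: the live documents for most flags, or the recent ones when deciding "Running". A document that does not list the router contributes an empty set.

```python
from collections import Counter


def believed_flags(flag_sets):
    """Return the flags asserted by more than half of the given documents."""
    total = len(flag_sets)
    counts = Counter(flag for flags in flag_sets for flag in set(flags))
    # Strict majority: exactly half is not enough.
    return {flag for flag, count in counts.items() if 2 * count > total}
```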
-
- These flags are used as follows:
-
- - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless
- requested to do so.
-
- - Clients SHOULD NOT use non-'Fast' routers for any purpose other than
- very-low-bandwidth circuits (such as introduction circuits).
-
- - Clients SHOULD NOT use non-'Stable' routers for circuits that are
- likely to need to be open for a very long time (such as those used for
- IRC or SSH connections).
-
- - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard
- nodes.
-
- - Clients SHOULD NOT download directory information from non-'V2Dir'
- caches.
-
-6.2. Managing naming
-
- In order to provide human-memorable names for individual server
- identities, some directory servers bind names to IDs. Clients handle
- names in two ways:
-
- When a client encounters a name it has not mapped before:
-
- If all the live "Naming" network-status documents the client has
- claim that the name binds to some identity ID, and the client has at
- least three live network-status documents, the client maps the name to
- ID.
-
- When a user tries to refer to a router with a name that does not have a
- mapping under the above rules, the implementation SHOULD warn the user.
- After giving the warning, the implementation MAY use a router that at
- least one Naming authority maps the name to, so long as no other naming
- authority maps that name to a different router. If no Naming authority
- maps the name to a router, the implementation MAY use any router that
- advertises the name.
-
- Not every router needs a nickname. When a router doesn't configure a
- nickname, it publishes with the default nickname "Unnamed". Authorities
- SHOULD NOT ever mark a router with this nickname as Named; client software
- SHOULD NOT ever use a router in response to a user request for a router
- called "Unnamed".
-
-6.3. Software versions
-
- An implementation of Tor SHOULD warn when it has fetched (or has
- attempted to fetch and failed four consecutive times) a network-status
- for each authority, and it is running a software version
- not listed on more than half of the live "Versioning" network-status
- documents.
-
-6.4. Warning about a router's status.
-
- If a router tries to publish its descriptor to a Naming authority
- that has its nickname mapped to another key, the router SHOULD
- warn the operator that it is either using the wrong key or is using
- an already claimed nickname.
-
- If a router has fetched (or attempted to fetch and failed four
- consecutive times) a network-status for every authority, and at
- least one of the authorities is "Naming", and no live "Naming"
- authorities publish a binding for the router's nickname, the
- router MAY remind the operator that the chosen nickname is not
- bound to this key at the authorities, and suggest contacting the
- authority operators.
-
- ...
-
-6.5. Router protocol versions
-
- A client should believe that a router supports a given feature if that
- feature is supported by the router or protocol versions in more than half
- of the live networkstatus's "v" entries for that router. In other words,
- if the "v" entries for some router are:
- v Tor 0.0.8pre1 (from authority 1)
- v Tor 0.1.2.11 (from authority 2)
- v FutureProtocolDescription 99 (from authority 3)
- then the client should believe that the router supports any feature
- supported by 0.1.2.11.
-
- This is currently equivalent to believing the median declared version for
- a router in all live networkstatuses.
-
-7. Standards compliance
-
- All clients and servers MUST support HTTP 1.0.
-
-7.1. HTTP headers
-
- Servers MAY set the Content-Length: header. Servers SHOULD set
- Content-Encoding to "deflate" or "identity".
-
- Servers MAY include an X-Your-Address-Is: header, whose value is the
- apparent IP address of the client connecting to them (as a dotted quad).
- For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD
- report the IP from which the circuit carrying the BEGIN_DIR stream reached
- them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all
- BEGIN_DIR-tunneled connections.]
-
- Servers SHOULD disable caching of multiple network statuses or multiple
- router descriptors. Servers MAY enable caching of single descriptors,
- single network statuses, the list of all router descriptors, a v1
- directory, or a v1 running routers document. XXX mention times.
-
-7.2. HTTP status codes
-
- XXX We should write down what return codes dirservers send in what situations.
-
diff --git a/doc/spec/dir-spec.txt b/doc/spec/dir-spec.txt
deleted file mode 100644
index 49b64e8a9..000000000
--- a/doc/spec/dir-spec.txt
+++ /dev/null
@@ -1,2440 +0,0 @@
-
- Tor directory protocol, version 3
-
-0. Scope and preliminaries
-
- This directory protocol is used by Tor version 0.2.0.x-alpha and later.
- See dir-spec-v1.txt for information on the protocol used up to the
- 0.1.0.x series, and dir-spec-v2.txt for information on the protocol
- used by the 0.1.1.x and 0.1.2.x series.
-
- Caches and authorities must still support older versions of the
- directory protocols, until the versions of Tor that require them are
- finally out of commission.
-
- This document merges and supersedes the following proposals:
-
- 101 Voting on the Tor Directory System
- 103 Splitting identity key from regularly used signing key
- 104 Long and Short Router Descriptors
-
- XXX when to download certificates.
- XXX timeline
- XXX fill in XXXXs
-
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
- NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
- "OPTIONAL" in this document are to be interpreted as described in
- RFC 2119.
-
-0.1. History
-
- The earliest versions of Onion Routing shipped with a list of known
- routers and their keys. When the set of routers changed, users needed to
- fetch a new list.
-
- The Version 1 Directory protocol
- --------------------------------
-
- Early versions of Tor (0.0.2) introduced "Directory authorities": servers
- that served signed "directory" documents containing a list of signed
- "router descriptors", along with a short summary of the status of each
- router. Thus, clients could get up-to-date information on the state of
- the network automatically, and be certain that the list they were getting
- was attested by a trusted directory authority.
-
- Later versions (0.0.8) added directory caches, which download
- directories from the authorities and serve them to clients. Non-caches
- fetch from the caches in preference to fetching from the authorities, thus
- distributing bandwidth requirements.
-
- Also added during the version 1 directory protocol were "router status"
- documents: short documents that listed only the up/down status of the
- routers on the network, rather than a complete list of all the
- descriptors. Clients and caches would fetch these documents far more
- frequently than they would fetch full directories.
-
- The Version 2 Directory Protocol
- --------------------------------
-
- During the Tor 0.1.1.x series, Tor revised its handling of directory
- documents in order to address two major problems:
-
- * Directories had grown quite large (over 1MB), and most directory
- downloads consisted mainly of router descriptors that clients
- already had.
-
- * Every directory authority was a trust bottleneck: if a single
- directory authority lied, it could make clients believe for a time
- an arbitrarily distorted view of the Tor network. (Clients
- trusted the most recent signed document they downloaded.) Thus,
- adding more authorities would make the system less secure, not
- more.
-
- To address these, we extended the directory protocol so that
- authorities now published signed "network status" documents. Each
- network status listed, for every router in the network: a hash of its
- identity key, a hash of its most recent descriptor, and a summary of
- what the authority believed about its status. Clients would download
- the authorities' network status documents in turn, and believe
- statements about routers iff they were attested to by more than half of
- the authorities.
-
- Instead of downloading all router descriptors at once, clients
- downloaded only the descriptors that they did not have. Descriptors
- were indexed by their digests, in order to prevent malicious caches
- from giving different versions of a router descriptor to different
- clients.
-
- Routers began working harder to upload new descriptors only when their
- contents were substantially changed.
-
-
-0.2. Goals of the version 3 protocol
-
- Version 3 of the Tor directory protocol tries to solve the following
- issues:
-
- * A great deal of bandwidth used to transmit router descriptors was
- used by two fields that are not actually used by Tor routers
- (namely read-history and write-history). We save about 60% by
- moving them into a separate document that most clients do not
- fetch or use.
-
- * It was possible under certain perverse circumstances for clients
- to download an unusual set of network status documents, thus
- partitioning themselves from clients who have a more recent and/or
- typical set of documents. Even under the best of circumstances,
- clients were sensitive to the ages of the network status documents
- they downloaded. Therefore, instead of having the clients
- correlate multiple network status documents, we have the
- authorities collectively vote on a single consensus network status
- document.
-
- * The most sensitive data in the entire network (the identity keys
- of the directory authorities) needed to be stored unencrypted so
- that the authorities can sign network-status documents on the fly.
- Now, the authorities' identity keys are stored offline, and used
- to certify medium-term signing keys that can be rotated.
-
-0.3. Some Remaining questions
-
- Things we could solve on a v3 timeframe:
-
- The SHA-1 hash is showing its age. We should do something about our
- dependency on it. We could probably future-proof ourselves here in
- this revision, at least so far as documents from the authorities are
- concerned.
-
- Too many things about the authorities are hardcoded by IP.
-
- Perhaps we should start accepting longer identity keys for routers
- too.
-
- Things to solve eventually:
-
- Requiring every client to know about every router won't scale forever.
-
- Requiring every directory cache to know every router won't scale
- forever.
-
-
-1. Outline
-
- There is a small set (say, around 5-10) of semi-trusted directory
- authorities. A default list of authorities is shipped with the Tor
- software. Users can change this list, but are encouraged not to do so,
- in order to avoid partitioning attacks.
-
- Every authority has a very-secret, long-term "Authority Identity Key".
- This is stored encrypted and/or offline, and is used to sign "key
- certificate" documents. Every key certificate contains a medium-term
- (3-12 months) "authority signing key", that is used by the authority to
- sign other directory information. (Note that the authority identity
- key is distinct from the router identity key that the authority uses
- in its role as an ordinary router.)
-
- Routers periodically upload signed "router descriptors" to the
- directory authorities describing their keys, capabilities, and other
- information. Routers may also upload signed "extra info documents"
- containing information that is not required for the Tor protocol.
- Directory authorities serve router descriptors indexed by router
- identity, or by hash of the descriptor.
-
- Routers may act as directory caches to reduce load on the directory
- authorities. They announce this in their descriptors.
-
- Periodically, each directory authority generates a view of
- the current descriptors and status for known routers. They send a
- signed summary of this view (a "status vote") to the other
- authorities. The authorities compute the result of this vote, and sign
- a "consensus status" document containing the result of the vote.
-
- Directory caches download, cache, and re-serve consensus documents.
-
- Clients, directory caches, and directory authorities all use consensus
- documents to find out when their list of routers is out-of-date.
- (Directory authorities also use vote statuses.) If it is, they download
- any missing router descriptors. Clients download missing descriptors
- from caches; caches and authorities download from authorities.
- Descriptors are downloaded by the hash of the descriptor, not by the
- server's identity key: this prevents servers from attacking clients by
- giving them descriptors nobody else uses.
-
- All directory information is uploaded and downloaded with HTTP.
-
- [Authorities also generate and caches also cache documents produced and
- used by earlier versions of this protocol; see dir-spec-v1.txt and
- dir-spec-v2.txt for notes on those versions.]
-
-1.1. What's different from version 2?
-
- Clients used to download multiple network status documents,
- corresponding roughly to "status votes" above. They would compute the
- result of the vote on the client side.
-
- Authorities used to sign documents using the same private keys they used
- for their roles as routers. This forced them to keep these extremely
- sensitive keys in memory unencrypted.
-
- All of the information in extra-info documents used to be kept in the
- main descriptors.
-
-1.2. Document meta-format
-
- Router descriptors, directories, and running-routers documents all obey the
- following lightweight extensible information format.
-
- The highest level object is a Document, which consists of one or more
- Items. Every Item begins with a KeywordLine, followed by zero or more
- Objects. A KeywordLine begins with a Keyword, optionally followed by
- whitespace and more non-newline characters, and ends with a newline. A
- Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
- An Object is a block of encoded data in pseudo-Open-PGP-style
- armor. (cf. RFC 2440)
-
- More formally:
-
- NL = The ascii LF character (hex value 0x0a).
- Document ::= (Item | NL)+
- Item ::= KeywordLine Object*
- KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
- Keyword = KeywordChar+
- KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
- ArgumentChar ::= any printing ASCII character except NL.
- WS = (SP | TAB)+
- Object ::= BeginLine Base-64-encoded-data EndLine
- BeginLine ::= "-----BEGIN " Keyword "-----" NL
- EndLine ::= "-----END " Keyword "-----" NL
-
- The BeginLine and EndLine of an Object must use the same keyword.
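The grammar above is small enough to sketch a parser for it. The Python below is an illustration only: `parse_document` and the dictionary layout are hypothetical, and it does not decode the base-64 data. As an assumption, it also allows spaces in Object labels, since PEM-style labels commonly contain them.

```python
import re

KEYWORD_LINE = re.compile(r"([A-Za-z0-9-]+)(?:[ \t]+(.*))?")
# Assumption: labels may contain spaces, unlike a strict Keyword.
BEGIN_LINE = re.compile(r"-----BEGIN ([A-Za-z0-9 -]+)-----")


def parse_document(text):
    """Split a Document into Items: keyword, argument string, Objects."""
    items = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # the grammar allows bare NLs between Items
        begin = BEGIN_LINE.fullmatch(line)
        if begin:
            if not items:
                raise ValueError("Object before any KeywordLine")
            # BeginLine and EndLine must use the same keyword.
            end_line = f"-----END {begin.group(1)}-----"
            body = []
            while i < len(lines) and lines[i] != end_line:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise ValueError("unterminated Object")
            i += 1  # skip the EndLine
            items[-1]["objects"].append((begin.group(1), "".join(body)))
            continue
        match = KEYWORD_LINE.fullmatch(line)
        if not match:
            raise ValueError(f"malformed KeywordLine: {line!r}")
        items.append({"keyword": match.group(1),
                      "args": match.group(2) or "",
                      "objects": []})
    return items
```

Unrecognized keywords are kept rather than rejected, so a caller can ignore them as the text below requires.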
-
- When interpreting a Document, software MUST ignore any KeywordLine that
- starts with a keyword it doesn't recognize; future implementations MUST NOT
- require current clients to understand any KeywordLine not currently
- described.
-
- The "opt" keyword was used until Tor 0.1.2.5-alpha for non-critical future
- extensions. All implementations MUST ignore any item of the form "opt
- keyword ....." when they would not recognize "keyword ....."; and MUST
- treat "opt keyword ....." as synonymous with "keyword ......" when keyword
- is recognized.
-
- Implementations before 0.1.2.5-alpha rejected any document with a
- KeywordLine that started with a keyword that they didn't recognize.
- When generating documents that need to be read by older versions of Tor,
- implementations MUST prefix items not recognized by older versions of
- Tor with an "opt" until those versions of Tor are obsolete. [Note that
- key certificates, status vote documents, extra info documents, and
- status consensus documents will never be read by older versions of Tor.]
-
- Other implementations that want to extend Tor's directory format MAY
- introduce their own items. The keywords for extension items SHOULD start
- with the characters "x-" or "X-", to guarantee that they will not conflict
- with keywords used by future versions of Tor.
-
- In our document descriptions below, we tag Items with a multiplicity in
- brackets. Possible tags are:
-
- "At start, exactly once": These items MUST occur in every instance of
- the document type, and MUST appear exactly once, and MUST be the
- first item in their documents.
-
- "Exactly once": These items MUST occur exactly one time in every
- instance of the document type.
-
- "At end, exactly once": These items MUST occur in every instance of
- the document type, and MUST appear exactly once, and MUST be the
- last item in their documents.
-
- "At most once": These items MAY occur zero or one times in any
- instance of the document type, but MUST NOT occur more than once.
-
- "Any number": These items MAY occur zero, one, or more times in any
- instance of the document type.
-
- "Once or more": These items MUST occur at least once in any instance
- of the document type, and MAY occur more.
-
-1.3. Signing documents
-
- Every signable document below is signed in a similar manner, using a
- given "Initial Item", a final "Signature Item", a digest algorithm, and
- a signing key.
-
- The Initial Item must be the first item in the document.
-
- The Signature Item has the following format:
-
- <signature item keyword> [arguments] NL SIGNATURE NL
-
- The "SIGNATURE" Object contains a signature (using the signing key) of
- the PKCS1-padded digest of the entire document, taken from the
- beginning of the Initial item, through the newline after the Signature
- Item's keyword and its arguments.
-
- Unless specified otherwise, the digest algorithm is SHA-1.
-
- All documents are invalid unless signed with the correct signing key.
-
- The "Digest" of a document, unless stated otherwise, is its digest *as
- signed by this signature scheme*.
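To make the signed range concrete, here is a hedged Python sketch that computes the SHA-1 digest of a document from its first byte through the newline ending the Signature Item's keyword line. The function name `signed_digest` is hypothetical; the sketch assumes the input starts at the Initial Item, and it neither applies PKCS1 padding nor verifies any signature.

```python
import hashlib


def signed_digest(document: bytes, signature_keyword: bytes) -> bytes:
    """SHA-1 over the document up to and including the signature keyword line."""
    search_from = 0
    while True:
        pos = document.find(b"\n" + signature_keyword, search_from)
        if pos < 0:
            raise ValueError("no signature item found")
        after = pos + 1 + len(signature_keyword)
        # Require a full keyword match, not just a shared prefix.
        if document[after:after + 1] in (b"\n", b" ", b"\t"):
            break
        search_from = pos + 1
    end = document.find(b"\n", after)
    if end < 0:
        raise ValueError("signature keyword line is not terminated")
    return hashlib.sha1(document[:end + 1]).digest()
```

For a router descriptor, `signature_keyword` would be `b"router-signature"`; the SIGNATURE Object that follows that line is excluded from the digest.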
-
-1.4. Voting timeline
-
- Every consensus document has a "valid-after" (VA) time, a "fresh-until"
- (FU) time and a "valid-until" (VU) time. VA MUST precede FU, which MUST
- in turn precede VU. Times are chosen so that every consensus will be
- "fresh" until the next consensus becomes valid, and "valid" for a while
- after. At least 3 consensuses should be valid at any given time.
-
- The timeline for a given consensus is as follows:
-
- VA-DistSeconds-VoteSeconds: The authorities exchange votes.
-
- VA-DistSeconds-VoteSeconds/2: The authorities try to download any
- votes they don't have.
-
- VA-DistSeconds: The authorities calculate the consensus and exchange
- signatures.
-
- VA-DistSeconds/2: The authorities try to download any signatures
- they don't have.
-
- VA: All authorities have a multiply signed consensus.
-
- VA ... FU: Caches download the consensus. (Note that since caches have
- no way of telling what VA and FU are until they have downloaded
- the consensus, they assume that the present consensus's VA is
- equal to the previous one's FU, and that its FU is one interval after
- that.)
-
- FU: The consensus is no longer the freshest consensus.
-
- FU ... (the current consensus's VU): Clients download the consensus.
- (See note above: clients guess that the next consensus's FU will be
- two intervals after the current VA.)
-
- VU: The consensus is no longer valid.
-
- VoteSeconds and DistSeconds MUST each be at least 20 seconds; FU-VA and
- VU-FU MUST each be at least 5 minutes.
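The pre-consensus part of the timeline above is simple arithmetic on VA, VoteSeconds, and DistSeconds. The following sketch computes those instants; the function and key names are illustrative only.

```python
from datetime import datetime, timedelta


def voting_timeline(valid_after, vote_seconds, dist_seconds):
    """Return the instants leading up to a consensus's valid-after time."""
    if vote_seconds < 20 or dist_seconds < 20:
        raise ValueError("VoteSeconds and DistSeconds must be at least 20")
    vote = timedelta(seconds=vote_seconds)
    dist = timedelta(seconds=dist_seconds)
    return {
        "exchange_votes": valid_after - dist - vote,
        "fetch_missing_votes": valid_after - dist - vote / 2,
        "compute_consensus": valid_after - dist,
        "fetch_missing_signatures": valid_after - dist / 2,
        "consensus_valid": valid_after,
    }
```

For example, with VA at 12:00:00 and both intervals set to 300 seconds (values chosen only for illustration), votes are exchanged at 11:50:00 and the consensus is computed at 11:55:00.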
-
-2. Router operation and formats
-
- ORs SHOULD generate a new router descriptor and a new extra-info
- document whenever any of the following events have occurred:
-
- - A period of time (18 hrs by default) has passed since the last
- time a descriptor was generated.
-
- - A descriptor field other than bandwidth or uptime has changed.
-
- - Bandwidth has changed by a factor of 2 from the last time a
- descriptor was generated, and at least a given interval of time
- (20 mins by default) has passed since then.
-
- - Its uptime has been reset (by restarting).
-
- [XXX this list is incomplete; see router_differences_are_cosmetic()
- in routerlist.c for others]
-
- ORs SHOULD NOT publish a new router descriptor or extra-info document
- if none of the above events have occurred and not much time has passed
- (12 hours by default).
-
- After generating a descriptor, ORs upload it to every directory
- authority they know, by posting it (in order) to the URL
-
- http://<hostname:port>/tor/
-
-2.1. Router descriptor format
-
- Router descriptors consist of the following items. For backward
- compatibility, there should be an extra NL at the end of each router
- descriptor.
-
- In lines that take multiple arguments, extra arguments SHOULD be
- accepted and ignored. Many of the nonterminals below are defined in
- section 2.3.
-
- "router" nickname address ORPort SOCKSPort DirPort NL
-
- [At start, exactly once.]
-
- Indicates the beginning of a router descriptor. "nickname" must be a
- valid router nickname as specified in 2.3. "address" must be an IPv4
- address in dotted-quad format. The last three numbers indicate the
- TCP ports at which this OR exposes functionality. ORPort is a port at
- which this OR accepts TLS connections for the main OR protocol;
- SOCKSPort is deprecated and should always be 0; and DirPort is the
- port at which this OR accepts directory-related HTTP connections. If
- any port is not supported, the value 0 is given instead of a port
- number. (At least one of DirPort and ORPort SHOULD be set;
- authorities MAY reject any descriptor with both DirPort and ORPort of
- 0.)
-
- "bandwidth" bandwidth-avg bandwidth-burst bandwidth-observed NL
-
- [Exactly once]
-
- Estimated bandwidth for this router, in bytes per second. The
- "average" bandwidth is the volume per second that the OR is willing to
- sustain over long periods; the "burst" bandwidth is the volume that
- the OR is willing to sustain in very short intervals. The "observed"
- value is an estimate of the capacity this server can handle. The
- server remembers the maximum sustained output bandwidth over any
- ten-second period in the past day, and likewise the maximum sustained
- input bandwidth. The "observed" value is the lesser of these two numbers.
-
- "platform" string NL
-
- [At most once]
-
- A human-readable string describing the system on which this OR is
- running. This MAY include the operating system, and SHOULD include
- the name and version of the software implementing the Tor protocol.
-
- "published" YYYY-MM-DD HH:MM:SS NL
-
- [Exactly once]
-
- The time, in GMT, when this descriptor (and its corresponding
- extra-info document if any) was generated.
-
- "fingerprint" fingerprint NL
-
- [At most once]
-
- A fingerprint (a HASH_LEN-byte hash of the ASN.1-encoded public key,
- encoded in hex, with a single space after every 4 characters) for this
- router's identity key. A descriptor is considered invalid (and MUST be
- rejected) if the fingerprint line does not match the public key.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should
- be marked with "opt" until earlier versions of Tor are obsolete.]
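The fingerprint format above can be sketched as follows. This is a minimal illustration, not Tor's implementation: it assumes the hash is SHA-1 (so HASH_LEN is 20 bytes), that the caller already holds the ASN.1 (DER) encoding of the identity key as bytes, and that hex digits are written in upper case. The function names are hypothetical.

```python
import hashlib

HASH_LEN = 20  # assumption: SHA-1 digest length in bytes


def format_fingerprint(der_key: bytes) -> str:
    """Hash the encoded key and group the hex digits four at a time."""
    digest = hashlib.sha1(der_key).hexdigest().upper()
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))


def fingerprint_matches(line_value: str, der_key: bytes) -> bool:
    """Compare a "fingerprint" line value against the key, ignoring spacing."""
    expected = hashlib.sha1(der_key).hexdigest().upper()
    return line_value.replace(" ", "").upper() == expected
```

A descriptor whose fingerprint line fails fingerprint_matches() would be rejected under the rule above.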
-
- "hibernating" bool NL
-
- [At most once]
-
- If the value is 1, then the Tor server was hibernating when the
- descriptor was published, and shouldn't be used to build circuits.
-
- [We didn't start parsing this line until Tor 0.1.0.6-rc; it should be
- marked with "opt" until earlier versions of Tor are obsolete.]
-
- "uptime" number NL
-
- [At most once]
-
- The number of seconds that this OR process has been running.
-
- "onion-key" NL a public key in PEM format
-
- [Exactly once]
-
- This key is used to encrypt EXTEND cells for this OR. The key MUST be
- accepted for at least 1 week after any new key is published in a
- subsequent descriptor. It MUST be 1024 bits.
-
- "signing-key" NL a public key in PEM format
-
- [Exactly once]
-
- The OR's long-term identity key. It MUST be 1024 bits.
-
- "accept" exitpattern NL
- "reject" exitpattern NL
-
- [Any number]
-
- These lines describe an "exit policy": the rules that an OR follows
- when deciding whether to allow a new stream to a given address. The
- 'exitpattern' syntax is described below. There MUST be at least one
- such entry. The rules are considered in order; if no rule matches,
- the address will be accepted. For clarity, the last such entry SHOULD
- be accept *:* or reject *:*.
-
- "router-signature" NL Signature NL
-
- [At end, exactly once]
-
- The "SIGNATURE" object contains a signature of the PKCS1-padded
- hash of the entire router descriptor, taken from the beginning of the
- "router" line, through the newline after the "router-signature" line.
- The router descriptor is invalid unless the signature is performed
- with the router's identity key.
-
- "contact" info NL
-
- [At most once]
-
- Describes a way to contact the server's administrator, preferably
- including an email address and a PGP key fingerprint.
-
- "family" names NL
-
- [At most once]
-
- 'Names' is a space-separated list of server nicknames or
- hexdigests. If two ORs list one another in their "family" entries,
- then OPs should treat them as a single OR for the purpose of path
- selection.
-
- For example, if node A's descriptor contains "family B", and node B's
- descriptor contains "family A", then node A and node B should never
- be used on the same circuit.
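The mutual-listing rule can be sketched as below. The helper name and the dictionary layout (a name mapped to the entries on that router's "family" line) are assumptions made for illustration; names are compared case-insensitively, as nicknames are case-insensitive per section 2.3.

```python
def in_same_family(a, b, family_lists):
    """Return True only if a lists b AND b lists a in their "family" lines."""
    # Normalize case once so "A" and "a" refer to the same router.
    norm = {name.lower(): {n.lower() for n in listed}
            for name, listed in family_lists.items()}
    a, b = a.lower(), b.lower()
    return b in norm.get(a, set()) and a in norm.get(b, set())
```

A one-sided listing (B names C, but C does not name B) is not enough to place two routers in the same family.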
-
- "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- [At most once]
- "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- [At most once]
-
- Declare how much bandwidth the OR has used recently. Usage is divided
- into intervals of NSEC seconds. The YYYY-MM-DD HH:MM:SS field
- defines the end of the most recent interval. The numbers are the
- number of bytes used in the most recent intervals, ordered from
- oldest to newest.
-
- [We didn't start parsing these lines until Tor 0.1.0.6-rc; they should
- be marked with "opt" until earlier versions of Tor are obsolete.]
-
- [See also migration notes in section 2.2.1.]
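As a rough sketch of how a tool might read the value of a "read-history" or "write-history" line (everything after the keyword), the function below recovers the start and end of each interval from the end timestamp and NSEC. The function name and the returned tuple layout are assumptions.

```python
from datetime import datetime, timedelta


def parse_history(value):
    """Parse 'YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,...' into intervals.

    Returns a list of (start, end, bytes) tuples, oldest first.
    """
    date, clock, rest = value.split(" ", 2)
    end = datetime.strptime(date + " " + clock, "%Y-%m-%d %H:%M:%S")
    interval, _, numbers = rest.partition(") ")
    nsec = int(interval.lstrip("(").split()[0])
    counts = [int(n) for n in numbers.split(",")] if numbers else []
    total = len(counts)
    result = []
    for i, count in enumerate(counts):
        # The last number covers the interval that ends at `end`.
        stop = end - timedelta(seconds=nsec * (total - i - 1))
        result.append((stop - timedelta(seconds=nsec), stop, count))
    return result
```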
-
- "eventdns" bool NL
-
- [At most once]
-
- Declare whether this version of Tor is using the newer enhanced
- dns logic. Versions of Tor with this field set to false SHOULD NOT
- be used for reverse hostname lookups.
-
- [This option is obsolete. All current Tor servers should be presumed
- to have the evdns backend.]
-
- "caches-extra-info" NL
-
- [At most once.]
-
- Present only if this router is a directory cache that provides
- extra-info documents.
-
- [Versions before 0.2.0.1-alpha don't recognize this, and versions
- before 0.1.2.5-alpha will reject descriptors containing it unless
- it is prefixed with "opt"; it should be so prefixed until these
- versions are obsolete.]
-
- "extra-info-digest" digest NL
-
- [At most once]
-
- "Digest" is a hex-encoded digest (using upper-case characters) of the
- router's extra-info document, as signed in the router's extra-info
- (that is, not including the signature). (If this field is absent, the
- router is not uploading a corresponding extra-info document.)
-
- [Versions before 0.2.0.1-alpha don't recognize this, and versions
- before 0.1.2.5-alpha will reject descriptors containing it unless
- it is prefixed with "opt"; it should be so prefixed until these
- versions are obsolete.]
-
- "hidden-service-dir" *(SP VersionNum) NL
-
- [At most once.]
-
- Present only if this router stores and serves hidden service
- descriptors. If any VersionNum(s) are specified, this router
- supports those descriptor versions. If none are specified, it
- defaults to version 2 descriptors.
-
- [Versions of Tor before 0.1.2.5-alpha rejected router descriptors
- with unrecognized items; the hidden-service-dir line should be
- preceded with an "opt" until these Tors are obsolete.]
-
- "protocols" SP "Link" SP LINK-VERSION-LIST SP "Circuit" SP
- CIRCUIT-VERSION-LIST NL
-
- [At most once.]
-
- Both lists are space-separated sequences of numbers, to indicate which
- protocols the server supports. As of 30 Mar 2008, specified
- protocols are "Link 1 2 Circuit 1". See section 4.1 of tor-spec.txt
- for more information about link protocol versions.
-
- [Versions of Tor before 0.1.2.5-alpha rejected router descriptors
- with unrecognized items; the protocols line should be preceded with
- an "opt" until these Tors are obsolete.]
-
- "allow-single-hop-exits" NL
-
- [At most once.]
-
- Present only if the router allows single-hop circuits to make exit
- connections. Most Tor servers do not support this: this is
- included for specialized controllers designed to support perspective
- access and such.
-
-
-2.2. Extra-info documents
-
- Extra-info documents consist of the following items:
-
- "extra-info" Nickname Fingerprint NL
- [At start, exactly once.]
-
- Identifies what router this is an extra info descriptor for.
- Fingerprint is encoded in hex (using upper-case letters), with
- no spaces.
-
- "published" YYYY-MM-DD HH:MM:SS NL
-
- [Exactly once.]
-
- The time, in GMT, when this document (and its corresponding router
- descriptor if any) was generated. It MUST match the published time
- in the corresponding router descriptor.
-
- "read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- [At most once.]
- "write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM,NUM,NUM... NL
- [At most once.]
-
- As documented in 2.1 above. See migration notes in section 2.2.1.
-
- "geoip-db-digest" Digest NL
- [At most once.]
-
- SHA1 digest of the GeoIP database file that is used to resolve IP
- addresses to country codes.
-
- ("geoip-start" YYYY-MM-DD HH:MM:SS NL)
- ("geoip-client-origins" CC=N,CC=N,... NL)
-
- Only generated by bridge routers (see blocking.pdf), and only
- when they have been configured with a geoip database.
- Non-bridges SHOULD NOT generate these fields. Contains a list
- of mappings from two-letter country codes (CC) to the number
- of clients that have connected to that bridge from that
- country (approximate, and rounded up to the nearest multiple of 8
- in order to hamper traffic analysis). A country is included
- only if it has at least one address. The time in
- "geoip-start" is the time at which we began collecting geoip
- statistics.
-
- "geoip-start" and "geoip-client-origins" have been replaced by
- "bridge-stats-end" and "bridge-ips" in 0.2.2.4-alpha. The
- reason is that the measurement interval with "geoip-stats" as
- determined by subtracting "geoip-start" from "published" could
- have had a variable length, whereas the measurement interval in
- 0.2.2.4-alpha and later is set to be exactly 24 hours long. In
- order to clearly distinguish the new measurement intervals from
- the old ones, the new keywords have been introduced.
-
- "bridge-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- A "bridge-stats-end" line, as well as any other "bridge-*" line,
- is only added when the relay has been running as a bridge for at
- least 24 hours.
-
- "bridge-ips" CC=N,CC=N,... NL
- [At most once.]
-
- List of mappings from two-letter country codes to the number of
- unique IP addresses that have connected from that country to the
- bridge and which are not known relays, rounded up to the nearest
- multiple of 8.
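The rounding and the CC=N,CC=N,... layout can be illustrated as follows. This is a sketch only: the helper names are hypothetical, and sorting the output by country code is an assumption, since the text above does not specify an order.

```python
def round_up(n, multiple=8):
    """Round n up to the nearest multiple (ceiling division, integers only)."""
    return -(-n // multiple) * multiple


def format_country_counts(counts):
    """Render {country_code: unique_ips} as 'CC=N,CC=N,...'.

    Countries with no addresses are left out; the rest are rounded up.
    """
    return ",".join("%s=%d" % (cc, round_up(n))
                    for cc, n in sorted(counts.items()) if n > 0)


def parse_country_counts(value):
    """Inverse of the formatter: 'de=16,us=8' -> {'de': 16, 'us': 8}."""
    pairs = (item.split("=", 1) for item in value.split(",") if item)
    return {cc: int(n) for cc, n in pairs}
```

Because of the rounding, a reported value of 8 only says that between 1 and 8 addresses were seen.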
-
- "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
- is only added when the relay has opened its Dir port and after 24
- hours of measuring directory requests.
-
- "dirreq-v2-ips" CC=N,CC=N,... NL
- [At most once.]
- "dirreq-v3-ips" CC=N,CC=N,... NL
- [At most once.]
-
- List of mappings from two-letter country codes to the number of
- unique IP addresses that have connected from that country to
- request a v2/v3 network status, rounded up to the nearest multiple
- of 8. Only those IP addresses are counted that the directory can
- answer with a 200 OK status code.
-
- "dirreq-v2-reqs" CC=N,CC=N,... NL
- [At most once.]
- "dirreq-v3-reqs" CC=N,CC=N,... NL
- [At most once.]
-
- List of mappings from two-letter country codes to the number of
- requests for v2/v3 network statuses from that country, rounded up
- to the nearest multiple of 8. Only those requests are counted that
- the directory can answer with a 200 OK status code.
-
- "dirreq-v2-share" num% NL
- [At most once.]
- "dirreq-v3-share" num% NL
- [At most once.]
-
- The share of v2/v3 network status requests that the directory
- expects to receive from clients based on its advertised bandwidth
- compared to the overall network bandwidth capacity. Shares are
- formatted in percent with two decimal places. Shares are
- calculated as means over the whole 24-hour interval.
-
- "dirreq-v2-resp" status=num,... NL
- [At most once.]
- "dirreq-v3-resp" status=num,... NL
- [At most once.]
-
- List of mappings from response statuses to the number of requests
- for v2/v3 network statuses that were answered with that response
- status, rounded up to the nearest multiple of 4. Only response
- statuses with at least 1 response are reported. New response
- statuses can be added at any time. The current list of response
- statuses is as follows:
-
- "ok": a network status request is answered; this number
- corresponds to the sum of all requests as reported in
- "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
- rounding up.
- "not-enough-sigs": a version 3 network status is not signed by a
- sufficient number of requested authorities.
- "unavailable": a requested network status object is unavailable.
- "not-found": a requested network status is not found.
- "not-modified": a network status has not been modified since the
- If-Modified-Since time that is included in the request.
- "busy": the directory is busy.
-
- "dirreq-v2-direct-dl" key=val,... NL
- [At most once.]
- "dirreq-v3-direct-dl" key=val,... NL
- [At most once.]
- "dirreq-v2-tunneled-dl" key=val,... NL
- [At most once.]
- "dirreq-v3-tunneled-dl" key=val,... NL
- [At most once.]
-
- List of statistics about possible failures in the download process
- of v2/v3 network statuses. Requests are either "direct"
- HTTP-encoded requests over the relay's directory port, or
- "tunneled" requests using a BEGIN_DIR cell over the relay's OR
- port. The list of possible statistics can change, and statistics
- can be left out from reporting. The current list of statistics is
- as follows:
-
- Successful downloads and failures:
-
- "complete": a client has finished the download successfully.
- "timeout": a download did not finish within 10 minutes after
- starting to send the response.
- "running": a download is still running at the end of the
- measurement period for less than 10 minutes after starting to
- send the response.
-
- Download times:
-
- "min", "max": smallest and largest measured bandwidth in B/s.
- "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
- bandwidth in B/s. For a given decile i, i/10 of all downloads
- had a smaller bandwidth than di, and (10-i)/10 of all downloads
- had a larger bandwidth than di.
- "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
- fourth of all downloads had a smaller bandwidth than q1, one
- fourth of all downloads had a larger bandwidth than q3, and the
- remaining half of all downloads had a bandwidth between q1 and
- q3.
- "md": median of measured bandwidth in B/s. Half of the downloads
- had a smaller bandwidth than md, the other half had a larger
- bandwidth than md.
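A consumer of these lines might parse the key=val list and check that the reported quantiles are ordered consistently. The sketch below assumes all values are integers and uses hypothetical function names; keys that are left out of a report are skipped, since statistics may be omitted.

```python
# Quantile keys from smallest to largest share of downloads below them:
# q1 (25%) sits between d2 and d3, md is the 5th decile, q3 (75%) sits
# between d7 and d8.
QUANTILE_ORDER = ["min", "d1", "d2", "q1", "d3", "d4", "md",
                  "d6", "d7", "q3", "d8", "d9", "max"]


def parse_dl_stats(value):
    """Parse 'key=val,key=val,...' into a dict of integers."""
    return {k: int(v) for k, v in
            (item.split("=", 1) for item in value.split(","))}


def quantiles_consistent(stats):
    """True if the quantiles that are present never decrease."""
    present = [stats[k] for k in QUANTILE_ORDER if k in stats]
    return all(a <= b for a, b in zip(present, present[1:]))
```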
-
- "dirreq-read-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
- [At most once]
- "dirreq-write-history" YYYY-MM-DD HH:MM:SS (NSEC s) NUM,NUM,NUM... NL
- [At most once]
-
- Declare how much bandwidth the OR has spent on answering directory
- requests. Usage is divided into intervals of NSEC seconds. The
- YYYY-MM-DD HH:MM:SS field defines the end of the most recent
- interval. The numbers are the number of bytes used in the most
- recent intervals, ordered from oldest to newest.
-
- "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- An "entry-stats-end" line, as well as any other "entry-*"
- line, is first added after the relay has been running for at least
- 24 hours.
-
- "entry-ips" CC=N,CC=N,... NL
- [At most once.]
-
- List of mappings from two-letter country codes to the number of
- unique IP addresses that have connected from that country to the
- relay and which are not known to be other relays, rounded up to the
- nearest multiple of 8.
-
- "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- A "cell-stats-end" line, as well as any other "cell-*" line,
- is first added after the relay has been running for at least 24
- hours.
-
- "cell-processed-cells" num,...,num NL
- [At most once.]
-
- Mean number of processed cells per circuit, subdivided into
- deciles of circuits by the number of cells they have processed in
- descending order from loudest to quietest circuits.
-
- "cell-queued-cells" num,...,num NL
- [At most once.]
-
- Mean number of cells contained in queues by circuit decile. These
- means are calculated by 1) determining the mean number of cells in
- a single circuit between its creation and its termination and 2)
- calculating the mean for all circuits in a given decile as
- determined in "cell-processed-cells". Numbers have a precision of
- two decimal places.
-
- "cell-time-in-queue" num,...,num NL
- [At most once.]
-
- Mean time cells spend in circuit queues in milliseconds. Times are
- calculated by 1) determining the mean time cells spend in the
- queue of a single circuit and 2) calculating the mean for all
- circuits in a given decile as determined in
- "cell-processed-cells".
-
- "cell-circuits-per-decile" num NL
- [At most once.]
-
- Mean number of circuits that are included in any of the deciles,
- rounded up to the next integer.
-
- "conn-bi-direct" YYYY-MM-DD HH:MM:SS (NSEC s) BELOW,READ,WRITE,BOTH NL
- [At most once]
-
- Number of connections, split into 10-second intervals, that are
- used uni-directionally or bi-directionally as observed in the NSEC
- seconds (usually 86400 seconds) before YYYY-MM-DD HH:MM:SS. Every
- 10 seconds, we determine for every connection whether we read and
- wrote less than a threshold of 20 KiB (BELOW), read at least 10
- times more than we wrote (READ), wrote at least 10 times more than
- we read (WRITE), or read and wrote more than the threshold, but
- not 10 times more in either direction (BOTH). After classifying a
- connection, read and write counters are reset for the next
- 10-second interval.
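The per-interval classification can be sketched as below. One point is an assumption here: the 20 KiB threshold is applied to the sum of bytes read and written in the interval, which the text above does not state explicitly. Function names are hypothetical.

```python
from collections import Counter

THRESHOLD = 20 * 1024  # 20 KiB, applied to read + written (assumption)
FACTOR = 10


def classify(read, written):
    """Classify one connection for one 10-second interval."""
    if read + written < THRESHOLD:
        return "BELOW"
    if read >= FACTOR * written:
        return "READ"
    if written >= FACTOR * read:
        return "WRITE"
    return "BOTH"


def format_counts(connections):
    """Tally (read, written) pairs into 'BELOW,READ,WRITE,BOTH'."""
    tally = Counter(classify(r, w) for r, w in connections)
    return ",".join(str(tally[k]) for k in ("BELOW", "READ", "WRITE", "BOTH"))
```

The counters passed in are per interval; a caller would reset them after each classification, as described above.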
-
- "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- An "exit-stats-end" line, as well as any other "exit-*" line, is
- first added after the relay has been running for at least 24 hours
- and only if the relay permits exiting (where exiting to a single
- port and IP address is sufficient).
-
- "exit-kibibytes-written" port=N,port=N,... NL
- [At most once.]
- "exit-kibibytes-read" port=N,port=N,... NL
- [At most once.]
-
- List of mappings from ports to the number of kibibytes that the
- relay has written to or read from exit connections to that port,
- rounded up to the next full kibibyte.
-
- "exit-streams-opened" port=N,port=N,... NL
- [At most once.]
-
- List of mappings from ports to the number of opened exit streams
- to that port, rounded up to the nearest multiple of 4.
-
- "router-signature" NL Signature NL
- [At end, exactly once.]
-
- A document signature as documented in section 1.3, using the
- initial item "extra-info" and the final item "router-signature",
- signed with the router's identity key.
-
-2.2.1. Moving history fields to extra-info documents.
-
- Tools that want to use the read-history and write-history values SHOULD
- download extra-info documents as well as router descriptors. Such
- tools SHOULD accept history values from both sources; if they appear in
- both documents, the values in the extra-info documents are authoritative.
-
- New versions of Tor no longer generate router descriptors
- containing read-history or write-history. Tools should continue to
- accept read-history and write-history values in router descriptors
- produced by older versions of Tor until all Tor versions earlier
- than 0.2.0.x are obsolete.
-
-2.3. Nonterminals in router descriptors
-
- nickname ::= between 1 and 19 alphanumeric characters ([A-Za-z0-9]),
- case-insensitive.
- hexdigest ::= a '$', followed by 40 hexadecimal characters
- ([A-Fa-f0-9]). [Represents a server by the digest of its identity
- key.]
-
- exitpattern ::= addrspec ":" portspec
- portspec ::= "*" | port | port "-" port
- port ::= an integer between 1 and 65535, inclusive.
-
- [Some implementations incorrectly generate ports with value 0.
- Implementations SHOULD accept this, and SHOULD NOT generate it.
- Connections to port 0 are never permitted.]
-
- addrspec ::= "*" | ip4spec | ip6spec
- ip4spec ::= ip4 | ip4 "/" num_ip4_bits | ip4 "/" ip4mask
- ip4 ::= an IPv4 address in dotted-quad format
- ip4mask ::= an IPv4 mask in dotted-quad format
- num_ip4_bits ::= an integer between 0 and 32
- ip6spec ::= ip6 | ip6 "/" num_ip6_bits
- ip6 ::= an IPv6 address, surrounded by square brackets.
- num_ip6_bits ::= an integer between 0 and 128
-
- bool ::= "0" | "1"
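To illustrate the exitpattern grammar together with the in-order matching rule from section 2.1 (first matching rule wins; an address is accepted if no rule matches), here is a minimal sketch built on Python's standard ipaddress module. The function names are hypothetical, and treating "*" as matching both IPv4 and IPv6 addresses is an assumption of this sketch.

```python
import ipaddress


def parse_portspec(spec):
    """'*' | port | port '-' port  ->  (low, high), inclusive."""
    if spec == "*":
        return (1, 65535)
    low, sep, high = spec.partition("-")
    return (int(low), int(high) if sep else int(low))


def parse_addrspec(spec):
    """Return an ip_network, or None for the wildcard '*'."""
    if spec == "*":
        return None
    if spec.startswith("["):  # ip6 | ip6 "/" num_ip6_bits
        addr, _, bits = spec[1:].partition("]")
        return ipaddress.ip_network(addr + "/" + (bits.lstrip("/") or "128"),
                                    strict=False)
    # ip4, ip4/bits and ip4/mask are all understood by ip_network.
    return ipaddress.ip_network(spec, strict=False)


def exit_policy_allows(lines, address, port):
    """Apply 'accept'/'reject' lines in order to one address and port."""
    ip = ipaddress.ip_address(address)
    for line in lines:
        action, pattern = line.split(None, 1)
        # Split on the LAST colon, since IPv6 addresses contain colons.
        addrspec, _, portspec = pattern.rpartition(":")
        net = parse_addrspec(addrspec)
        low, high = parse_portspec(portspec)
        if net is not None and (ip.version != net.version or ip not in net):
            continue
        if not low <= port <= high:
            continue
        return action == "accept"
    return True  # no rule matched
```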
-
-3. Formats produced by directory authorities.
-
- Every authority has two keys used in this protocol: a signing key, and
- an authority identity key. (Authorities also have a router identity
- key used in their role as a router and by earlier versions of the
- directory protocol.) The identity key is used from time to time to
- sign new key certificates using new signing keys; it is very sensitive.
- The signing key is used to sign key certificates and status documents.
-
- There are three kinds of documents generated by directory authorities:
-
- Key certificates
- Status votes
- Status consensuses
-
- Each is discussed below.
-
-3.1. Key certificates
-
- Key certificates consist of the following items:
-
- "dir-key-certificate-version" version NL
-
- [At start, exactly once.]
-
- Determines the version of the key certificate. MUST be "3" for
- the protocol described in this document. Implementations MUST
- reject formats they don't understand.
-
- "dir-address" IPPort NL
- [At most once]
-
- An IP:Port for this authority's directory port.
-
- "fingerprint" fingerprint NL
-
- [Exactly once.]
-
- Hexadecimal encoding without spaces based on the authority's
- identity key.
-
- "dir-identity-key" NL a public key in PEM format
-
- [Exactly once.]
-
- The long-term authority identity key for this authority. This key
- SHOULD be at least 2048 bits long; it MUST NOT be shorter than
- 1024 bits.
-
- "dir-key-published" YYYY-MM-DD HH:MM:SS NL
-
- [Exactly once.]
-
- The time (in GMT) when this document and corresponding key were
- last generated.
-
- "dir-key-expires" YYYY-MM-DD HH:MM:SS NL
-
- [Exactly once.]
-
- A time (in GMT) after which this key is no longer valid.
-
- "dir-signing-key" NL a key in PEM format
-
- [Exactly once.]
-
- The directory server's public signing key. This key MUST be at
- least 1024 bits, and MAY be longer.
-
- "dir-key-crosscert" NL CrossSignature NL
-
- [At most once.]
-
- NOTE: Authorities MUST include this field in all newly generated
- certificates. A future version of this specification will make
- the field required.
-
- CrossSignature is a signature, made using the certificate's signing
- key, of the digest of the PKCS1-padded hash of the certificate's
- identity key. For backward compatibility with broken versions of the
- parser, we wrap the base64-encoded signature in -----BEGIN ID
- SIGNATURE----- and -----END ID SIGNATURE----- tags. Implementations
- MUST allow the "ID " portion to be omitted, however.
-
- When encountering a certificate with a dir-key-crosscert entry,
- implementations MUST verify that the signature is a correct signature
- of the hash of the identity key using the signing key.
-
- "dir-key-certification" NL Signature NL
-
- [At end, exactly once.]
-
- A document signature as documented in section 1.3, using the
- initial item "dir-key-certificate-version" and the final item
- "dir-key-certification", signed with the authority identity key.
-
- Authorities MUST generate a new signing key and corresponding
- certificate before the key expires.
-
-3.2. Vote and consensus status documents
-
- Votes and consensuses are more strictly formatted than other documents
- in this specification, since different authorities must be able to
- generate exactly the same consensus given the same set of votes.
-
- The procedure for deciding when to generate vote and consensus status
- documents is described in section 1.4 on the voting timeline.
-
- Status documents contain a preamble, an authority section, a list of
- router status entries, and one or more footer signatures, in that order.
-
- Unlike other formats described above, a SP in these documents must be a
- single space character (hex 20).
-
- Some items appear only in votes, and some items appear only in
- consensuses. Unless specified, items occur in both.
-
- The preamble contains the following items. They MUST occur in the
- order given here:
-
- "network-status-version" SP version NL
-
- [At start, exactly once.]
-
- A document format version. For this specification, the version is
- "3".
-
- "vote-status" SP type NL
-
- [Exactly once.]
-
- The status MUST be "vote" or "consensus", depending on the type of
- the document.
-
- "consensus-methods" SP IntegerList NL
-
- [Exactly once for votes; does not occur in consensuses.]
-
- A space-separated list of supported methods for generating
- consensuses from votes. See section 3.4.1 for details. Method "1"
- MUST be included.
-
- "consensus-method" SP Integer NL
-
- [Exactly once for consensuses; does not occur in votes.]
-
- See section 3.4.1 for details.
-
- (Only included when the vote is generated with consensus-method 2 or
- later.)
-
- "published" SP YYYY-MM-DD SP HH:MM:SS NL
-
- [Exactly once for votes; does not occur in consensuses.]
-
- The publication time for this status document (if a vote).
-
- "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL
-
- [Exactly once.]
-
- The start of the Interval for this vote. Before this time, the
- consensus document produced from this vote should not be used.
- See 1.4 for voting timeline information.
-
- "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL
-
- [Exactly once.]
-
- The time at which the next consensus should be produced; before this
- time, there is no point in downloading another consensus, since there
- won't be a new one. See 1.4 for voting timeline information.
-
- "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL
-
- [Exactly once.]
-
- The end of the Interval for this vote. After this time, the
- consensus produced by this vote should not be used. See 1.4 for
- voting timeline information.
-
- "voting-delay" SP VoteSeconds SP DistSeconds NL
-
- [Exactly once.]
-
- VoteSeconds is the number of seconds that we will allow to collect
- votes from all authorities; DistSeconds is the number of seconds
- we'll allow to collect signatures from all authorities. See 1.4 for
- voting timeline information.
-
- "client-versions" SP VersionList NL
-
- [At most once.]
-
- A comma-separated list of recommended client versions, in
- ascending order. If absent, no opinion is held about client
- versions.
-
- "server-versions" SP VersionList NL
-
- [At most once.]
-
- A comma-separated list of recommended server versions, in
- ascending order. If absent, no opinion is held about server
- versions.
-
- "known-flags" SP FlagList NL
-
- [Exactly once.]
-
- A space-separated list of all of the flags that this document
- might contain. A flag is "known" either because the authority
- knows about it and might set it (if in a vote), or because
- enough votes were counted for the consensus for an authoritative
- opinion to have been formed about its status.
-
- "params" SP [Parameters] NL
-
- [At most once]
-
- Parameter ::= Keyword '=' Int32
- Int32 ::= A decimal integer between -2147483648 and 2147483647.
- Parameters ::= Parameter | Parameters SP Parameter
-
- The parameters list, if present, contains a space-separated list of
- case-sensitive key-value pairs, sorted in lexical order by
- their keyword. Each parameter has its own meaning.
-
- (Only included when the vote is generated with consensus-method 7 or
- later.)
-
- Commonly used "param" arguments at this point include:
-
- "circwindow" -- the default package window that circuits should
- be established with. It started out at 1000 cells, but some
- research indicates that a lower value would mean fewer cells in
- transit in the network at any given time. Obeyed by Tor 0.2.1.20
- and later.
- Min: 100, Max: 1000
-
- "CircuitPriorityHalflifeMsec" -- the halflife parameter used when
- weighting which circuit will send the next cell. Obeyed by Tor
- 0.2.2.10-alpha and later. (Versions of Tor between 0.2.2.7-alpha
- and 0.2.2.10-alpha recognized a "CircPriorityHalflifeMsec" parameter,
- but mishandled it badly.)
- Min: -1, Max: 2147483647 (INT32_MAX)
-
- "perconnbwrate" and "perconnbwburst" -- if set, each relay sets
- up a separate token bucket for every client OR connection,
- and rate limits that connection independently. Typically left
- unset, except when used for performance experiments around trac
- entry 1750. Only honored by relays running Tor 0.2.2.16-alpha
- and later. (Note that relays running 0.2.2.7-alpha through
- 0.2.2.14-alpha looked for bwconnrate and bwconnburst, but then
- did the wrong thing with them; see bug 1830 for details.)
- Min: 1, Max: 2147483647 (INT32_MAX)
-
- "refuseunknownexits" -- if set to one, exit relays look at
- the previous hop of circuits that ask to open an exit stream,
- and refuse to exit if they don't recognize it as a relay. The
- goal is to make it harder for people to use them as one-hop
- proxies. See trac entry 1751 for details.
- Min: 0, Max: 1
-
- "cbtdisabled", "cbtnummodes", "cbtrecentcount", "cbtmaxtimeouts",
- "cbtmincircs", "cbtquantile", "cbtclosequantile", "cbttestfreq",
- "cbtmintimeout", and "cbtinitialtimeout" -- see "2.4.5. Consensus
- parameters governing behavior" in path-spec.txt for a series of
- circuit build time related consensus params.
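A reader of the "params" line might parse it as sketched below, checking the Int32 range and the lexical ordering of keywords. The function names are hypothetical. Clamping an out-of-range value to the parameter's Min/Max is an assumption of this sketch, not something the text above requires.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def parse_params(value):
    """Parse 'key=int key=int ...' (the text after "params" SP)."""
    params = {}
    previous = None
    if not value:
        return params  # the Parameters list is optional
    for item in value.split(" "):
        key, sep, raw = item.partition("=")
        if not key or not sep:
            raise ValueError("malformed parameter: %r" % item)
        number = int(raw)
        if not INT32_MIN <= number <= INT32_MAX:
            raise ValueError("value out of Int32 range: %r" % item)
        if previous is not None and key <= previous:
            raise ValueError("keywords not in lexical order: %r" % key)
        params[key] = number
        previous = key
    return params


def get_param(params, name, default, low, high):
    """Look up a parameter, falling back to a default, clamped to [low, high]."""
    return max(low, min(high, params.get(name, default)))
```

Keywords are case-sensitive, so "CircuitPriorityHalflifeMsec" sorts before "circwindow" in a plain byte-wise comparison.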
-
- The authority section of a vote contains the following items, followed
- in turn by the authority's current key certificate:
-
- "dir-source" SP nickname SP identity SP address SP IP SP dirport SP
- orport NL
-
- [Exactly once, at start]
-
- Describes this authority. The nickname is a convenient identifier
- for the authority. The identity is an uppercase hex fingerprint of
- the authority's current (v3 authority) identity key. The address is
- the server's hostname. The IP is the server's current IP address,
- and dirport is its current directory port. XXXXorport
-
- "contact" SP string NL
-
- [At most once.]
-
- An arbitrary string describing how to contact the directory
- server's administrator. Administrators should include at least an
- email address and a PGP fingerprint.
-
- "legacy-key" SP FINGERPRINT NL
-
- [At most once]
-
- Lists a fingerprint for an obsolete _identity_ key still used
- by this authority to keep older clients working. This option
- is used to keep the key around for a little while in case the
- authorities need to migrate many identity keys at once.
- (Generally, this would only happen because of a security
- vulnerability that affected multiple authorities, like the
- Debian OpenSSL RNG bug of May 2008.)
-
- The authority section of a consensus contains groups of the following items,
- in the order given, with one group for each authority that contributed to
- the consensus, with groups sorted by authority identity digest:
-
- "dir-source" SP nickname SP identity SP address SP IP SP dirport SP
- orport NL
-
- [Exactly once, at start]
-
- As in the authority section of a vote.
-
- "contact" SP string NL
-
- [At most once.]
-
- As in the authority section of a vote.
-
- "vote-digest" SP digest NL
-
- [Exactly once.]
-
- A digest of the vote from the authority that contributed to this
- consensus, as signed (that is, not including the signature).
- (Hex, upper-case.)
-
- Each router status entry contains the following items. Router status
- entries are sorted in ascending order by identity digest.
-
- "r" SP nickname SP identity SP digest SP publication SP IP SP ORPort
- SP DirPort NL
-
- [At start, exactly once.]
-
- "Nickname" is the OR's nickname. "Identity" is a hash of its
- identity key, encoded in base64, with trailing equals sign(s)
- removed. "Digest" is a hash of its most recent descriptor as
- signed (that is, not including the signature), encoded in base64.
- "Publication" is the publication time of its most recent
- descriptor, in the form YYYY-MM-DD HH:MM:SS, in GMT. "IP" is its
- current IP address; "ORPort" is its current OR port; "DirPort" is
- its current directory port, or "0" for "none".
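As an illustration of the "r" line layout, the sketch below splits the line on single spaces and restores the base64 padding that was stripped from the identity. The function names and the returned dictionary keys are assumptions; digests are rendered as upper-case hex only for readability.

```python
import base64
from datetime import datetime


def unpadded_b64_to_hex(text):
    """Restore stripped '=' padding, decode, and return upper-case hex."""
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded).hex().upper()


def parse_r_line(line):
    """Parse 'r nickname identity digest YYYY-MM-DD HH:MM:SS IP ORPort DirPort'."""
    parts = line.split(" ")
    if len(parts) != 9 or parts[0] != "r":
        raise ValueError("not a well-formed r line")
    _, nickname, identity, digest, date, clock, ip, orport, dirport = parts
    return {
        "nickname": nickname,
        "identity": unpadded_b64_to_hex(identity),
        "digest": unpadded_b64_to_hex(digest),
        "published": datetime.strptime(date + " " + clock, "%Y-%m-%d %H:%M:%S"),
        "ip": ip,
        "orport": int(orport),
        "dirport": int(dirport),  # 0 means "none"
    }
```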
-
- "s" SP Flags NL
-
- [At most once.]
-
- A series of space-separated status flags, in alphabetical order.
- Currently documented flags are:
-
- "Authority" if the router is a directory authority.
- "BadExit" if the router is believed to be useless as an exit node
- (because its ISP censors it, because it is behind a restrictive
- proxy, or for some similar reason).
- "BadDirectory" if the router is believed to be useless as a
- directory cache (because its directory port isn't working,
- its bandwidth is always throttled, or for some similar
- reason).
- "Exit" if the router is more useful for building
- general-purpose exit circuits than for relay circuits. The
- path building algorithm uses this flag; see path-spec.txt.
- "Fast" if the router is suitable for high-bandwidth circuits.
- "Guard" if the router is suitable for use as an entry guard.
- "HSDir" if the router is considered a v2 hidden service directory.
- "Named" if the router's identity-nickname mapping is canonical,
- and this authority binds names.
- "Stable" if the router is suitable for long-lived circuits.
- "Running" if the router is currently usable.
- "Unnamed" if another router has bound the name used by this
- router, and this authority binds names.
- "Valid" if the router has been 'validated'.
- "V2Dir" if the router implements the v2 directory protocol.
- "V3Dir" if the router implements this protocol.
-
- "v" SP version NL
-
- [At most once.]
-
- The version of the Tor protocol that this server is running. If
- the value begins with "Tor" SP, the rest of the string is a Tor
- version number, and the protocol is "The Tor protocol as supported
- by the given version of Tor." Otherwise, if the value begins with
- some other string, Tor has upgraded to a more sophisticated
- protocol versioning system, and the protocol is "a version of the
- Tor protocol more recent than any we recognize."
-
- Directory authorities SHOULD omit version strings they receive from
- descriptors if they would cause "v" lines to be over 128 characters
- long.
-
- "w" SP "Bandwidth=" INT [SP "Measured=" INT] NL
-
- [At most once.]
-
- An estimate of the bandwidth of this server, in an arbitrary
- unit (currently kilobytes per second). Used to weight router
- selection.
-
- Additionally, the Measured= keyword is present in votes by
- participating bandwidth measurement authorities to indicate
- a measured bandwidth currently produced by measuring stream
- capacities.
-
- Other weighting keywords may be added later.
- Clients MUST ignore keywords they do not recognize.
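
A minimal sketch of reading a "w" line as described above, in Python. The function name `parse_w_line` and the dict return shape are assumptions made for illustration; only the "Bandwidth" and "Measured" keywords come from the text, and anything else is skipped, following the "MUST ignore" rule.

```python
def parse_w_line(line):
    """Parse a "w" line into a dict of integer values.

    Hypothetical helper: only the keywords named in the text are kept.
    """
    known = {"Bandwidth", "Measured"}
    parts = line.rstrip("\n").split(" ")
    if parts[0] != "w":
        raise ValueError("not a 'w' line")
    values = {}
    for item in parts[1:]:
        key, sep, raw = item.partition("=")
        if not sep or key not in known:
            # Unrecognized weighting keywords are ignored, not rejected.
            continue
        values[key] = int(raw)
    return values
```

Keeping the set of known keywords explicit means a keyword added later is dropped silently instead of causing a parse failure.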
-
- "p" SP ("accept" / "reject") SP PortList NL
-
- [At most once.]
-
- PortList = PortOrRange
- PortList = PortList "," PortOrRange
- PortOrRange = INT "-" INT / INT
-
- A list of those ports that this router supports (if 'accept')
- or does not support (if 'reject') for exit to "most
- addresses".
-
- The footer section is delineated in all votes and consensuses supporting
- consensus method 9 and above with the following:
-
- "directory-footer" NL
-
- It contains two subsections, a bandwidth-weights line and a
- directory-signature.
-
- The bandwidth-weights line appears At Most Once for a consensus. It does
- not appear in votes.
-
- "bandwidth-weights" SP
- "Wbd=" INT SP "Wbe=" INT SP "Wbg=" INT SP "Wbm=" INT SP
- "Wdb=" INT SP
- "Web=" INT SP "Wed=" INT SP "Wee=" INT SP "Weg=" INT SP "Wem=" INT SP
- "Wgb=" INT SP "Wgd=" INT SP "Wgg=" INT SP "Wgm=" INT SP
- "Wmb=" INT SP "Wmd=" INT SP "Wme=" INT SP "Wmg=" INT SP "Wmm=" INT NL
-
- These values represent the weights to apply to router bandwidths during
- path selection. They are sorted in alphabetical order in the list. The
- integer values are divided by BW_WEIGHT_SCALE=10000 or the consensus
- param "bwweightscale". They are:
-
- Wgg - Weight for Guard-flagged nodes in the guard position
- Wgm - Weight for non-flagged nodes in the guard position
- Wgd - Weight for Guard+Exit-flagged nodes in the guard position
-
- Wmg - Weight for Guard-flagged nodes in the middle position
- Wmm - Weight for non-flagged nodes in the middle position
- Wme - Weight for Exit-flagged nodes in the middle position
- Wmd - Weight for Guard+Exit-flagged nodes in the middle position
-
- Weg - Weight for Guard-flagged nodes in the exit position
- Wem - Weight for non-flagged nodes in the exit position
- Wee - Weight for Exit-flagged nodes in the exit position
- Wed - Weight for Guard+Exit-flagged nodes in the exit position
-
- Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
- Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
- Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
- Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
-
- Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests
- Wbm - Weight for non-flagged nodes for BEGIN_DIR requests
- Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
- Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
-
- These values are calculated as specified in Section 3.4.3.
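
As a rough illustration of how a client might read this line, the sketch below splits a "bandwidth-weights" line into keyword/value pairs and divides each integer by the scale. The function name and the float result are assumptions; a real client may well keep the values as integers. The default of 10000 is the BW_WEIGHT_SCALE value given above, and a caller could pass the "bwweightscale" consensus parameter instead.

```python
BW_WEIGHT_SCALE = 10000  # default scale stated in the text


def parse_bandwidth_weights(line, scale=BW_WEIGHT_SCALE):
    """Return {keyword: fractional weight} for a bandwidth-weights line.

    Hypothetical helper for illustration only.
    """
    parts = line.split()
    if not parts or parts[0] != "bandwidth-weights":
        raise ValueError("not a bandwidth-weights line")
    weights = {}
    for item in parts[1:]:
        key, _, raw = item.partition("=")
        weights[key] = int(raw) / scale
    return weights
```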
-
- The signature contains the following item, which appears Exactly Once
- for a vote, and At Least Once for a consensus.
-
- "directory-signature" SP identity SP signing-key-digest NL Signature
-
- This is a signature of the status document, with the initial item
- "network-status-version", and the signature item
- "directory-signature", using the signing key. (In this case, we take
- the hash through the _space_ after directory-signature, not the
- newline: this ensures that all authorities sign the same thing.)
- "identity" is the hex-encoded digest of the authority identity key of
- the signing authority, and "signing-key-digest" is the hex-encoded
- digest of the current authority signing key of the signing authority.
-
-3.3. Assigning flags in a vote
-
- (This section describes how directory authorities choose which status
- flags to apply to routers, as of Tor 0.2.0.0-alpha-dev. Later directory
- authorities MAY do things differently, so long as clients keep working
- well. Clients MUST NOT depend on the exact behaviors in this section.)
-
- In the below definitions, a router is considered "active" if it is
- running, valid, and not hibernating.
-
- "Valid" -- a router is 'Valid' if it is running a version of Tor not
- known to be broken, and the directory authority has not blacklisted
- it as suspicious.
-
- "Named" -- Directory authority administrators may decide to support name
- binding. If they do, then they must maintain a file of
- nickname-to-identity-key mappings, and try to keep this file consistent
- with other directory authorities. If they don't, they act as clients, and
- report bindings made by other directory authorities (name X is bound to
- identity Y if at least one binding directory lists it, and no directory
- binds X to some other Y'.) A router is called 'Named' if the router
- believes the given name should be bound to the given key.
-
- Two strategies exist on the current network for deciding on
- values for the Named flag. In the original version, server
- operators were asked to send nickname-identity pairs to a
- mailing list of Naming directory authority operators. The
- operators were then supposed to add the pairs to their
- mapping files; in practice, they didn't get to this often.
-
- Newer Naming authorities run a script that registers routers
- in their mapping files once the routers have been online at
- least two weeks, no other router has that nickname, and no
- other router has wanted the nickname for a month. If a router
- has not been online for six months, the router is removed.
-
- "Unnamed" -- Directory authorities that support naming should vote for a
- router to be 'Unnamed' if its given nickname is mapped to a different
- identity.
-
- "Running" -- A router is 'Running' if the authority managed to connect to
- it successfully within the last 30 minutes.
-
- "Stable" -- A router is 'Stable' if it is active, and either its Weighted
- MTBF is at least the median for known active routers or its Weighted MTBF
- corresponds to at least 7 days. Routers are never called Stable if they are
- running a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha
- through 0.1.1.16-rc are stupid this way.)
-
- To calculate weighted MTBF, compute the weighted mean of the lengths
- of all intervals when the router was observed to be up, weighting
- intervals by $\alpha^n$, where $n$ is the amount of time that has
- passed since the interval ended, and $\alpha$ is chosen so that
- measurements over approximately one month old no longer influence the
- weighted MTBF much.
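
The weighted mean described above could be sketched as follows. The text does not fix the value of alpha or the unit of n, so both are assumptions here: n is measured in days and alpha defaults to 0.9, which gives a month-old interval a weight of roughly 0.04. The function name and the (start, end) tuple layout are also hypothetical.

```python
def weighted_mtbf(intervals, now, alpha=0.9, unit=86400):
    """Weighted mean length of uptime intervals.

    intervals: list of (start, end) timestamps in seconds.
    alpha and unit (seconds per step of n) are illustrative assumptions.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for start, end in intervals:
        age = (now - end) / unit      # n: time since the interval ended
        weight = alpha ** age         # older intervals count for less
        weighted_sum += weight * (end - start)
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0
```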
-
- [XXXX what happens when we have less than 4 days of MTBF info.]
-
- "Exit" -- A router is called an 'Exit' iff it allows exits to at
- least two of the ports 80, 443, and 6667 and allows exits to at
- least one /8 address space.
-
- "Fast" -- A router is 'Fast' if it is active, and its bandwidth is
- either in the top 7/8ths for known active routers or at least 20KB/s.
-
- "Guard" -- A router is a possible 'Guard' if its Weighted Fractional
- Uptime is at least the median for "familiar" active routers, and if
- its bandwidth is at least median or at least 250KB/s.
-
- To calculate weighted fractional uptime, compute the fraction
- of time that the router is up in any given day, weighting so that
- downtime and uptime in the past counts less.
-
- A node is 'familiar' if 1/8 of all active nodes have appeared more
- recently than it, OR it has been around for a few weeks.
-
- "Authority" -- A router is called an 'Authority' if the authority
- generating the network-status document believes it is an authority.
-
- "V2Dir" -- A router supports the v2 directory protocol if it has an open
- directory port, and it is running a version of the directory protocol that
- supports the functionality clients need. (Currently, this is
- 0.1.1.9-alpha or later.)
-
- "V3Dir" -- A router supports the v3 directory protocol if it has an open
- directory port, and it is running a version of the directory protocol that
- supports the functionality clients need. (Currently, this is
- 0.2.0.?????-alpha or later.)
-
- "HSDir" -- A router is a v2 hidden service directory if it stores and
- serves v2 hidden service descriptors and the authority managed to connect
- to it successfully within the last 24 hours.
-
- Directory server administrators may label some servers or IPs as
- blacklisted, and elect not to include them in their network-status lists.
-
- Authorities SHOULD 'disable' any servers in excess of 3 on any single IP.
- When there are more than 3 to choose from, authorities should first prefer
- authorities to non-authorities, then prefer Running to non-Running, and
- then prefer high-bandwidth to low-bandwidth. To 'disable' a server, the
- authority *should* advertise it without the Running or Valid flag.
-
- Thus, the network-status vote includes all non-blacklisted,
- non-expired, non-superseded descriptors.
-
- The bandwidth in a "w" line should be taken as the best estimate
- of the router's actual capacity that the authority has. For now,
- this should be the lesser of the observed bandwidth and bandwidth
- rate limit from the router descriptor. It is given in kilobytes
- per second, and capped at some arbitrary value (currently 10 MB/s).
-
- The Measured= keyword on a "w" line vote is currently computed
- by multiplying the previous published consensus bandwidth by the
- ratio of the measured average node stream capacity to the network
- average. If 3 or more authorities provide a Measured= keyword for
- a router, the authorities produce a consensus containing a "w"
- Bandwidth= keyword equal to the median of the Measured= votes.
-
- The ports listed in a "p" line should be taken as those ports for
- which the router's exit policy permits 'most' addresses, ignoring any
- accept rule that does not cover all addresses and ignoring all reject
- rules for private netblocks. "Most" addresses are permitted if no more
- than 2^25 IPv4 addresses (two /8 networks) are blocked. The list is
- encoded as described in 3.4.2.
-
-3.4. Computing a consensus from a set of votes
-
- Given a set of votes, authorities compute the contents of the consensus
- document as follows:
-
- The "valid-after", "valid-until", and "fresh-until" times are taken as
- the median of the respective values from all the votes.
-
- The times in the "voting-delay" line are taken as the median of the
- VoteSeconds and DistSeconds times in the votes.
-
- Known-flags is the union of all flags known by any voter.
-
- Entries are given on the "params" line for every keyword on which any
- authority voted. The values given are the low-median of all votes on
- that keyword.
-
- "client-versions" and "server-versions" are sorted in ascending
- order. A version is recommended in the consensus if it is recommended
- by more than half of the voting authorities that included a
- client-versions or server-versions line in their votes.
-
- The authority item groups (dir-source, contact, fingerprint,
- vote-digest) are taken from the votes of the voting
- authorities. These groups are sorted by the digests of the
- authorities' identity keys, in ascending order. If the consensus
- method is 3 or later, a dir-source line must be included for
- every vote with a legacy-key entry, using the legacy-key's
- fingerprint, the voter's ordinary nickname with the string
- "-legacy" appended, and all other fields as from the original
- vote's dir-source line.
-
- A router status entry:
- * is included in the result if some router status entry with the same
- identity is included by more than half of the authorities (total
- authorities, not just those whose votes we have).
-
- * For any given identity, we include at most one router status entry.
-
- * A router entry has a flag set if that is included by more than half
- of the authorities who care about that flag.
-
- * Two router entries are "the same" if they have the same
- <descriptor digest, published time, nickname, IP, ports> tuple.
- We choose the tuple for a given router as whichever tuple appears
- for that router in the most votes. We break ties first in favor of
- the more recently published, then in favor of smaller server
- descriptor digest.
-
- * The Named flag appears if it is included for this routerstatus by
- _any_ authority, and if all authorities that list it list the same
- nickname. However, if consensus-method 2 or later is in use, and
- any authority calls this identity/nickname pair Unnamed, then
- this routerstatus does not get the Named flag.
-
- * If consensus-method 2 or later is in use, the Unnamed flag is
- set for a routerstatus if any authority has voted for a different
- identity to be Named with that nickname, or if any authority
- lists that nickname/ID pair as Unnamed.
-
- (With consensus-method 1, Unnamed is set like any other flag.)
-
- * The version is given as whichever version is listed by the most
- voters, with ties decided in favor of more recent versions.
-
- * If consensus-method 4 or later is in use, then routers that
- do not have the Running flag are not listed at all.
-
- * If consensus-method 5 or later is in use, then the "w" line
- is generated using a low-median of the bandwidth values from
- the votes that included "w" lines for this router.
-
- * If consensus-method 5 or later is in use, then the "p" line
- is taken from the votes that have the same policy summary
- for the descriptor we are listing. (They should all be the
- same. If they are not, we pick the most commonly listed
- one, breaking ties in favor of the lexicographically larger
- vote.) The port list is encoded as specified in 3.4.2.
-
- * If consensus-method 6 or later is in use and if 3 or more
- authorities provide a Measured= keyword in their votes for
- a router, the authorities produce a consensus containing a
- Bandwidth= keyword equal to the median of the Measured= votes.
-
- * If consensus-method 7 or later is in use, the params line is
- included in the output.
-
- * If the consensus method is under 11, bad exits are considered as
- possible exits when computing bandwidth weights. Otherwise, if
- method 11 or later is in use, any router that is determined to get
- the BadExit flag doesn't count when we're calculating weights.
-
- The signatures at the end of a consensus document are sorted in
- ascending order by identity digest.
-
- All ties in computing medians are broken in favor of the smaller or
- earlier item.
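
The tie-breaking rule above amounts to a "low-median": with an even number of votes, the lower of the two middle values is taken. A small sketch (the function name is an assumption):

```python
def low_median(values):
    """Median with ties broken toward the smaller (or earlier) item."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values to take a median of")
    # For an even count this picks the lower of the two middle entries.
    return ordered[(len(ordered) - 1) // 2]
```

The same function works for integers (bandwidths, params) and for any other comparable values, such as timestamps.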
-
-3.4.1. Forward compatibility
-
- Future versions of Tor will need to include new information in the
- consensus documents, but it is important that all authorities (or at least
- half) generate and sign the same signed consensus.
-
- To achieve this, authorities list in their votes their supported methods
- for generating consensuses from votes. Later methods will be assigned
- higher numbers. Currently recognized methods:
- "1" -- The first implemented version.
- "2" -- Added support for the Unnamed flag.
- "3" -- Added legacy ID key support to aid in authority ID key rollovers
- "4" -- No longer list routers that are not running in the consensus
- "5" -- Added support for "w" and "p" lines.
- "6" -- Prefers measured bandwidth values rather than advertised
- "7" -- Provides keyword=integer pairs of consensus parameters
- "8" -- Provides microdescriptor summaries
- "9" -- Provides weights for selecting flagged routers in paths
- "10" -- Fixes edge case bugs in router flag selection weights
-
- Before generating a consensus, an authority must decide which consensus
- method to use. To do this, it looks for the highest version number
- supported by more than 2/3 of the authorities voting. If it supports this
- method, then it uses it. Otherwise, it falls back to method 1.
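
One way to read the selection rule above, sketched in Python. The function name and the representation of each vote as a set of integer method numbers are assumptions; the "more than 2/3" test is done with integer arithmetic to avoid rounding.

```python
def choose_consensus_method(vote_methods, my_methods):
    """Pick the consensus method to use.

    vote_methods: one set of supported method numbers per vote.
    my_methods: set of methods this authority implements.
    """
    n = len(vote_methods)
    candidates = set().union(*vote_methods)
    widely_supported = [
        m for m in candidates
        if 3 * sum(m in v for v in vote_methods) > 2 * n  # more than 2/3
    ]
    if not widely_supported:
        return 1
    best = max(widely_supported)
    # Fall back to method 1 if we do not implement the chosen method.
    return best if best in my_methods else 1
```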
-
- (The consensuses generated by new methods must be parsable by
- implementations that only understand the old methods, and must not cause
- those implementations to compromise their anonymity. This is a means for
- making changes in the contents of consensus; not for making
- backward-incompatible changes in their format.)
-
-3.4.2. Encoding port lists
-
- Whether the summary shows the list of accepted ports or the list of
- rejected ports depends on which list is shorter (has a shorter string
- representation). In case of ties we choose the list of accepted
- ports. As an exception to this rule an allow-all policy is
- represented as "accept 1-65535" instead of "reject " and a reject-all
- policy is similarly given as "reject 1-65535".
-
- Summary items are compressed, that is instead of "80-88,89-100" there
- only is a single item of "80-100", similarly instead of "20,21" a
- summary will say "20-21".
-
- Port lists are sorted in ascending order.
-
- The maximum allowed length of a policy summary (including the "accept "
- or "reject ") is 1000 characters. If a summary exceeds that length we
- use an accept-style summary and list as much of the port list as is
- possible within these 1000 bytes. [XXXX be more specific.]
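
The encoding rules above can be sketched roughly as follows. Both function names are hypothetical, the input is assumed to be the set of ports for which 'most' addresses are accepted, and the 1000-character truncation is deliberately left out because the text marks it as underspecified.

```python
def compress_ports(ports):
    """Render ports as a sorted, range-compressed list, e.g. "20-21,80"."""
    ranges = []
    for port in sorted(set(ports)):
        if ranges and port == ranges[-1][1] + 1:
            ranges[-1][1] = port          # extend the current run
        else:
            ranges.append([port, port])   # start a new run
    return ",".join(
        str(lo) if lo == hi else "%d-%d" % (lo, hi) for lo, hi in ranges
    )


def policy_summary(accepted):
    """Choose the shorter of the accept and reject forms (ties: accept)."""
    accepted = set(accepted)
    all_ports = set(range(1, 65536))
    if accepted == all_ports:
        return "accept 1-65535"
    if not accepted:
        return "reject 1-65535"
    accept_list = compress_ports(accepted)
    reject_list = compress_ports(all_ports - accepted)
    if len(accept_list) <= len(reject_list):
        return "accept " + accept_list
    return "reject " + reject_list
```

Comparing the two list strings directly is enough, since "accept " and "reject " have the same length.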
-
-3.4.3. Computing Bandwidth Weights
-
- Let weight_scale = 10000
-
- Let G be the total bandwidth for Guard-flagged nodes.
- Let M be the total bandwidth for non-flagged nodes.
- Let E be the total bandwidth for Exit-flagged nodes.
- Let D be the total bandwidth for Guard+Exit-flagged nodes.
- Let T = G+M+E+D
-
- Let Wgd be the weight for choosing a Guard+Exit for the guard position.
- Let Wmd be the weight for choosing a Guard+Exit for the middle position.
- Let Wed be the weight for choosing a Guard+Exit for the exit position.
-
- Let Wme be the weight for choosing an Exit for the middle position.
- Let Wmg be the weight for choosing a Guard for the middle position.
-
- Let Wgg be the weight for choosing a Guard for the guard position.
- Let Wee be the weight for choosing an Exit for the exit position.
-
- Balanced network conditions then arise from solutions to the following
- system of equations:
-
- Wgg*G + Wgd*D == M + Wmd*D + Wme*E + Wmg*G (guard bw = middle bw)
- Wgg*G + Wgd*D == Wee*E + Wed*D (guard bw = exit bw)
- Wed*D + Wmd*D + Wgd*D == D (aka: Wed+Wmd+Wgd = 1)
- Wmg*G + Wgg*G == G (aka: Wgg = 1-Wmg)
- Wme*E + Wee*E == E (aka: Wee = 1-Wme)
-
- We are short 2 constraints with the above set. The remaining constraints
- come from examining different cases of network load. The following
- constraints are used in consensus method 10 and above. Another,
- incorrect and obsolete, set of constraints was used for these same cases
- in consensus method 9. For those, see dir-spec.txt in Tor 0.2.2.10-alpha
- to 0.2.2.16-alpha.
-
- Case 1: E >= T/3 && G >= T/3 (Neither Exit nor Guard Scarce)
-
- In this case, the additional two constraints are: Wmg == Wmd,
- Wed == 1/3.
-
- This leads to the solution:
- Wgd = weight_scale/3
- Wed = weight_scale/3
- Wmd = weight_scale/3
- Wee = (weight_scale*(E+G+M))/(3*E)
- Wme = weight_scale - Wee
- Wmg = (weight_scale*(2*G-E-M))/(3*G)
- Wgg = weight_scale - Wmg
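
A worked sketch of the Case 1 solution using integer math, under the assumption that "/" in the formulas means integer division. The function name and dict return value are illustrative. In Case 1 the numerator 2*G-E-M cannot be negative (since G >= T/3), so Python's floor division agrees with truncating division here.

```python
def case1_weights(G, M, E, D, weight_scale=10000):
    """Bandwidth weights for Case 1 (neither Exit nor Guard scarce)."""
    T = G + M + E + D
    if T <= 0 or 3 * E < T or 3 * G < T:
        raise ValueError("Case 1 requires E >= T/3 and G >= T/3")
    w = {}
    w["Wgd"] = w["Wed"] = w["Wmd"] = weight_scale // 3
    w["Wee"] = (weight_scale * (E + G + M)) // (3 * E)
    w["Wme"] = weight_scale - w["Wee"]
    w["Wmg"] = (weight_scale * (2 * G - E - M)) // (3 * G)
    w["Wgg"] = weight_scale - w["Wmg"]
    return w
```

For example, with G=4000, M=3000, E=4000, D=1000 (so T/3 = 4000), the guard, middle, and exit positions each end up with roughly 4000 units of weighted bandwidth, up to integer rounding.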
-
- Case 2: E < T/3 && G < T/3 (Both are scarce)
-
- Let R denote the more scarce class (Rare) between Guard vs Exit.
- Let S denote the less scarce class.
-
- Subcase a: R+D < S
-
- In this subcase, we simply devote all of D bandwidth to the
- scarce class.
-
- Wgg = Wee = weight_scale
- Wmg = Wme = Wmd = 0;
- if E < G:
- Wed = weight_scale
- Wgd = 0
- else:
- Wed = 0
- Wgd = weight_scale
-
- Subcase b: R+D >= S
-
- In this case, if M <= T/3, we have enough bandwidth to try to achieve
- a balancing condition.
-
- Add constraints Wgg = 1, Wmd == Wgd to maximize bandwidth in the guard
- position while still allowing exits to be used as middle nodes:
-
- Wee = (weight_scale*(E - G + M))/E
- Wed = (weight_scale*(D - 2*E + 4*G - 2*M))/(3*D)
- Wme = (weight_scale*(G-M))/E
- Wmg = 0
- Wgg = weight_scale
- Wmd = (weight_scale - Wed)/2
- Wgd = (weight_scale - Wed)/2
-
- If this system ends up with any values out of range (i.e., negative, or
- above weight_scale), use the constraints Wgg == 1 and Wee == 1, since
- both those positions are scarce:
-
- Wgg = weight_scale
- Wee = weight_scale
- Wed = (weight_scale*(D - 2*E + G + M))/(3*D)
- Wmd = (weight_scale*(D - 2*M + G + E))/(3*D)
- Wme = 0
- Wmg = 0
- Wgd = weight_scale - Wed - Wmd
-
- If M > T/3, then the Wmd weight above will become negative. Set it to 0
- in this case:
- Wmd = 0
- Wgd = weight_scale - Wed
-
- Case 3: One of E < T/3 or G < T/3
-
- Let S be the scarce class (of E or G).
-
- Subcase a: (S+D) < T/3:
- if G=S:
- Wgg = Wgd = weight_scale;
- Wmd = Wed = Wmg = 0;
- // Minor subcase, if E is more scarce than M,
- // keep its bandwidth in place.
- if (E < M) Wme = 0;
- else Wme = (weight_scale*(E-M))/(2*E);
- Wee = weight_scale-Wme;
- if E=S:
- Wee = Wed = weight_scale;
- Wmd = Wgd = Wme = 0;
- // Minor subcase, if G is more scarce than M,
- // keep its bandwidth in place.
- if (G < M) Wmg = 0;
- else Wmg = (weight_scale*(G-M))/(2*G);
- Wgg = weight_scale-Wmg;
-
- Subcase b: (S+D) >= T/3
- if G=S:
- Add constraints Wgg = 1, Wmd == Wed to maximize bandwidth
- in the guard position, while still allowing exits to be
- used as middle nodes:
- Wgg = weight_scale
- Wgd = (weight_scale*(D - 2*G + E + M))/(3*D)
- Wmg = 0
- Wee = (weight_scale*(E+M))/(2*E)
- Wme = weight_scale - Wee
- Wmd = (weight_scale - Wgd)/2
- Wed = (weight_scale - Wgd)/2
- if E=S:
- Add constraints Wee == 1, Wmd == Wgd to maximize bandwidth
- in the exit position:
- Wee = weight_scale;
- Wed = (weight_scale*(D - 2*E + G + M))/(3*D);
- Wme = 0;
- Wgg = (weight_scale*(G+M))/(2*G);
- Wmg = weight_scale - Wgg;
- Wmd = (weight_scale - Wed)/2;
- Wgd = (weight_scale - Wed)/2;
-
- To ensure consensus, all calculations are performed using integer math
- with a fixed precision determined by the bwweightscale consensus
- parameter (default: 10000, min: 1, max: INT32_MAX).
-
- For future balancing improvements, Tor clients support 11 additional weights
- for directory requests and middle weighting. These weights are currently
- set at weight_scale, with the exception of the following groups of
- assignments:
-
- Directory requests use middle weights:
- Wbd=Wmd, Wbg=Wmg, Wbe=Wme, Wbm=Wmm
-
- Handle bridges and strange exit policies:
- Wgm=Wgg, Wem=Wee, Weg=Wed
-
-3.5. Detached signatures
-
- Assuming full connectivity, every authority should compute and sign the
- same consensus directory in each period. Therefore, it isn't necessary to
- download the consensus computed by each authority; instead, the
- authorities only push/fetch each others' signatures. A "detached
- signature" document contains items as follows:
-
- "consensus-digest" SP Digest NL
-
- [At start, at most once.]
-
- The digest of the consensus being signed.
-
- "valid-after" SP YYYY-MM-DD SP HH:MM:SS NL
- "fresh-until" SP YYYY-MM-DD SP HH:MM:SS NL
- "valid-until" SP YYYY-MM-DD SP HH:MM:SS NL
-
- [As in the consensus]
-
- "directory-signature"
-
- [As in the consensus; the signature object is the same as in the
- consensus document.]
-
-
-4. Directory server operation
-
- All directory authorities and directory caches ("directory servers")
- implement this section, except as noted.
-
-4.1. Accepting uploads (authorities only)
-
- When a router posts a signed descriptor to a directory authority, the
- authority first checks whether it is well-formed and correctly
- self-signed. If it is, the authority next verifies that the nickname
- in question is not already assigned to a router with a different
- public key.
- Finally, the authority MAY check that the router is not blacklisted
- because of its key, IP, or another reason.
-
- If the descriptor passes these tests, and the authority does not already
- have a descriptor for a router with this public key, it accepts the
- descriptor and remembers it.
-
- If the authority _does_ have a descriptor with the same public key, the
- newly uploaded descriptor is remembered if its publication time is more
- recent than the most recent old descriptor for that router, and either:
- - There are non-cosmetic differences between the old descriptor and the
- new one.
- - Enough time has passed between the descriptors' publication times.
- (Currently, 12 hours.)
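
The replacement rule above might be expressed as the following sketch. The function name is hypothetical, the "non-cosmetic" test is taken as a precomputed boolean, and treating exactly 12 hours as "enough time" is an assumption the text does not settle.

```python
from datetime import datetime, timedelta

MIN_REPLACE_INTERVAL = timedelta(hours=12)  # "Currently, 12 hours."


def should_remember(old_published, new_published, non_cosmetic_change):
    """Decide whether a newly uploaded descriptor replaces the old one."""
    if new_published <= old_published:
        return False  # must be strictly more recent
    enough_time = (new_published - old_published) >= MIN_REPLACE_INTERVAL
    return non_cosmetic_change or enough_time
```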
-
- Differences between router descriptors are "non-cosmetic" if they would be
- sufficient to force an upload as described in section 2 above.
-
- Note that the "cosmetic difference" test only applies to uploaded
- descriptors, not to descriptors that the authority downloads from other
- authorities.
-
- When a router posts a signed extra-info document to a directory authority,
- the authority again checks it for well-formedness and correct signature,
- and checks that its digest matches the extra-info-digest in some router
- descriptor that it believes is currently useful. If so, it accepts it and
- stores it and serves it as requested. If not, it drops it.
-
-4.2. Voting (authorities only)
-
- Authorities divide time into Intervals. Authority administrators SHOULD
- try to all pick the same interval length, and SHOULD pick intervals that
- are commonly used divisions of time (e.g., 5 minutes, 15 minutes, 30
- minutes, 60 minutes, 90 minutes). Voting intervals SHOULD be chosen to
- divide evenly into a 24-hour day.
-
- Authorities SHOULD act according to interval and delays in the
- latest consensus. Lacking a latest consensus, they SHOULD default to a
- 30-minute Interval, a 5 minute VotingDelay, and a 5 minute DistDelay.
-
- Authorities MUST take pains to ensure that their clocks remain accurate
- within a few seconds. (Running NTP is usually sufficient.)
-
- The first voting period of each day begins at 00:00 (midnight) GMT. If
- the last period of the day would be truncated by one-half or more, it is
- merged with the second-to-last period.
-
- An authority SHOULD publish its vote immediately at the start of each voting
- period (minus VoteSeconds+DistSeconds). It does this by making it
- available at
- http://<hostname>/tor/status-vote/next/authority.z
- and sending it in an HTTP POST request to each other authority at the URL
- http://<hostname>/tor/post/vote
-
- If, at the start of the voting period, minus DistSeconds, an authority
- does not have a current statement from another authority, the first
- authority downloads the other's statement.
-
- Once an authority has a vote from another authority, it makes it available
- at
- http://<hostname>/tor/status-vote/next/<fp>.z
- where <fp> is the fingerprint of the other authority's identity key.
- And at
- http://<hostname>/tor/status-vote/next/d/<d>.z
- where <d> is the digest of the vote document.
-
- The consensus status, along with as many signatures as the server
- currently knows, should be available at
- http://<hostname>/tor/status-vote/next/consensus.z
- All of the detached signatures it knows for consensus status should be
- available at:
- http://<hostname>/tor/status-vote/next/consensus-signatures.z
-
- Once there are enough signatures, or once the voting period starts,
- these documents are available at
- http://<hostname>/tor/status-vote/current/consensus.z
- and
- http://<hostname>/tor/status-vote/current/consensus-signatures.z
- [XXX current/consensus-signatures is not currently implemented, as it
- is not used in the voting protocol.]
-
- The other vote documents are analogously made available under
- http://<hostname>/tor/status-vote/current/authority.z
- http://<hostname>/tor/status-vote/current/<fp>.z
- http://<hostname>/tor/status-vote/current/d/<d>.z
- once the consensus is complete.
-
- Once an authority has computed and signed a consensus network status, it
- should send its detached signature to each other authority in an HTTP POST
- request to the URL:
- http://<hostname>/tor/post/consensus-signature
-
- [XXX Note why we support push-and-then-pull.]
-
- [XXX possible future features include support for downloading old
- consensuses.]
-
-4.3. Downloading consensus status documents (caches only)
-
- All directory servers (authorities and caches) try to keep a recent
- network-status consensus document to serve to clients. A cache ALWAYS
- downloads a network-status consensus if any of the following are true:
- - The cache has no consensus document.
- - The cache's consensus document is no longer valid.
- Otherwise, the cache downloads a new consensus document at a randomly
- chosen time in the first half-interval after its current consensus
- stops being fresh. (This time is chosen at random to avoid swarming
- the authorities at the start of each period. The interval size is
- inferred from the difference between the valid-after time and the
- fresh-until time on the consensus.)
-
- [For example, if a cache has a consensus that became valid at 1:00,
- and is fresh until 2:00, that cache will fetch a new consensus at
- a random time between 2:00 and 2:30.]
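
A small sketch of the scheduling rule above, assuming a uniform distribution over the first half-interval (the text only says "randomly chosen"). The function name and the injectable `rng` argument are assumptions made so the example can be tested deterministically.

```python
import random
from datetime import datetime, timedelta


def next_consensus_fetch(valid_after, fresh_until, rng=random):
    """Pick a fetch time in the first half-interval after fresh-until."""
    # The interval size is inferred from valid-after and fresh-until.
    interval = fresh_until - valid_after
    offset = rng.uniform(0, interval.total_seconds() / 2)
    return fresh_until + timedelta(seconds=offset)
```

With valid-after at 1:00 and fresh-until at 2:00, the result falls between 2:00 and 2:30, matching the example in the text.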
-
-4.4. Downloading and storing router descriptors (authorities and caches)
-
- Periodically (currently, every 10 seconds), directory servers check
- whether there are any specific descriptors that they do not have and that
- they are not currently trying to download. Caches identify these
- descriptors by hash in the recent network-status consensus documents;
- authorities identify them by hash in vote (if publication date is more
- recent than the descriptor we currently have).
-
- [XXXX need a way to fetch descriptors ahead of the vote? v2 status docs can
- do that for now.]
-
- If so, the directory server launches requests to the authorities for these
- descriptors, such that each authority is only asked for descriptors listed
- in its most recent vote (if the requester is an authority) or in the
- consensus (if the requester is a cache). If we're an authority, and more
- than one authority lists the descriptor, we choose which to ask at random.
-
- If one of these downloads fails, we do not try to download that descriptor
- from the authority that failed to serve it again unless we receive a newer
- network-status (consensus or vote) from that authority that lists the same
- descriptor.
-
- Directory servers must potentially cache multiple descriptors for each
- router. Servers must not discard any descriptor listed by any recent
- consensus. If there is enough space to store additional descriptors,
- servers SHOULD try to hold those which clients are likely to download the
- most. (Currently, this is judged based on the interval for which each
- descriptor seemed newest.)
-[XXXX define recent]
-
- Authorities SHOULD NOT download descriptors for routers that they would
- immediately reject for reasons listed in 3.1.
-
-4.5. Downloading and storing extra-info documents
-
- All authorities, and any cache that chooses to cache extra-info documents,
- and any client that uses extra-info documents, should implement this
- section.
-
- Note that generally, clients don't need extra-info documents.
-
- Periodically, the Tor instance checks whether it is missing any extra-info
- documents: in other words, if it has any router descriptors with an
- extra-info-digest field that does not match any of the extra-info
- documents currently held. If so, it downloads whatever extra-info
- documents are missing. Caches download from authorities; non-caches try
- to download from caches. We follow the same splitting and back-off rules
- as in 4.4 (if a cache) or 5.3 (if a client).
-
-4.6. General-use HTTP URLs
-
- "Fingerprints" in these URLs are base-16-encoded SHA1 hashes.
-
- The most recent v3 consensus should be available at:
- http://<hostname>/tor/status-vote/current/consensus.z
-
- Starting with Tor version 0.2.1.1-alpha, it is also available at:
- http://<hostname>/tor/status-vote/current/consensus/<F1>+<F2>+<F3>.z
-
- Where F1, F2, etc. are authority identity fingerprints the client trusts.
- Servers will only return a consensus if more than half of the requested
- authorities have signed the document, otherwise a 404 error will be sent
- back. The fingerprints can be shortened to a length of any multiple of
- two, using only the leftmost part of the encoded fingerprint. Tor uses
- 3 bytes (6 hex characters) of the fingerprint.
-
- Clients SHOULD sort the fingerprints in ascending order. Servers MUST
- accept any order.
-
- Clients SHOULD use this format when requesting consensus documents from
- directory authority servers and from caches running a version of Tor
- that is known to support this URL format.
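
As an illustration of the fingerprint-qualified URL described above, here is a minimal sketch of how a client might assemble the request. The function name and the example fingerprints used with it are hypothetical; the 6-hex-character prefix, the ascending sort, and the upper-case encoding follow the text of this section.

```python
def consensus_url(hostname, fingerprints, prefix_hex_len=6):
    """Build the /tor/status-vote/current/consensus/<F1>+<F2>.z URL.

    `fingerprints` are hex-encoded authority identity fingerprints.
    This helper is an illustrative sketch, not Tor's own code.
    """
    if prefix_hex_len <= 0 or prefix_hex_len % 2 != 0:
        raise ValueError("prefix length must be a positive multiple of two")
    # Keep only the leftmost part, upper-cased, and sort ascending.
    shortened = sorted({fp.upper()[:prefix_hex_len] for fp in fingerprints})
    return "http://%s/tor/status-vote/current/consensus/%s.z" % (
        hostname, "+".join(shortened))
```

Because servers accept any order, the sort only matters for making requests from different clients look alike.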
-
- A concatenated set of all the current key certificates should be available
- at:
- http://<hostname>/tor/keys/all.z
-
- The key certificate for this server (if it is an authority) should be
- available at:
- http://<hostname>/tor/keys/authority.z
-
- The key certificate for an authority whose authority identity fingerprint
- is <F> should be available at:
- http://<hostname>/tor/keys/fp/<F>.z
-
- The key certificate whose signing key fingerprint is <F> should be
- available at:
- http://<hostname>/tor/keys/sk/<F>.z
-
- The key certificate whose identity key fingerprint is <F> and whose signing
- key fingerprint is <S> should be available at:
-
- http://<hostname>/tor/keys/fp-sk/<F>-<S>.z
-
- (As usual, clients may request multiple certificates using:
- http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z )
- [The above fp-sk format was not supported before Tor 0.2.1.9-alpha.]
-
- The most recent descriptor for a server whose identity key has a
- fingerprint of <F> should be available at:
- http://<hostname>/tor/server/fp/<F>.z
-
- The most recent descriptors for servers with identity fingerprints
- <F1>,<F2>,<F3> should be available at:
- http://<hostname>/tor/server/fp/<F1>+<F2>+<F3>.z
-
- (NOTE: Implementations SHOULD NOT download descriptors by identity key
- fingerprint. This allows a corrupted server (in collusion with a cache) to
- provide a unique descriptor to a client, and thereby partition that client
- from the rest of the network.)
-
- The server descriptor with (descriptor) digest <D> (in hex) should be
- available at:
- http://<hostname>/tor/server/d/<D>.z
-
- The most recent descriptors with digests <D1>,<D2>,<D3> should be
- available at:
- http://<hostname>/tor/server/d/<D1>+<D2>+<D3>.z
-
- The most recent descriptor for this server should be at:
- http://<hostname>/tor/server/authority.z
- [Nothing in the Tor protocol uses this resource yet, but it is useful
- for debugging purposes. Also, the official Tor implementations
- (starting at 0.1.1.x) use this resource to test whether a server's
- own DirPort is reachable.]
-
- A concatenated set of the most recent descriptors for all known servers
- should be available at:
- http://<hostname>/tor/server/all.z
-
- Extra-info documents are available at the URLs
- http://<hostname>/tor/extra/d/...
- http://<hostname>/tor/extra/fp/...
- http://<hostname>/tor/extra/all[.z]
- http://<hostname>/tor/extra/authority[.z]
- (As for /tor/server/ URLs: supports fetching extra-info
- documents by their digest, by the fingerprint of their servers,
- or all at once. When serving by fingerprint, we serve the
- extra-info that corresponds to the descriptor we would serve by
- that fingerprint. Only directory authorities of version
- 0.2.0.1-alpha or later are guaranteed to support the first
- three classes of URLs. Caches may support them, and MUST
- support them if they have advertised "caches-extra-info".)
-
- For debugging, directories SHOULD expose non-compressed objects at URLs like
- the above, but without the final ".z".
- Clients MUST handle compressed concatenated information in two forms:
- - A concatenated list of zlib-compressed objects.
- - A zlib-compressed concatenated list of objects.
- Directory servers MAY generate either format: the former requires less
- CPU, but the latter requires less bandwidth.
-
- Clients SHOULD use upper case letters (A-F) when base16-encoding
- fingerprints. Servers MUST accept both upper and lower case fingerprints
- in requests.
-
-5. Client operation: downloading information
-
- Every Tor that is not a directory server (that is, those that do
- not have a DirPort set) implements this section.
-
-5.1. Downloading network-status documents
-
- Each client maintains a list of directory authorities. Insofar as
- possible, clients SHOULD all use the same list.
-
- Clients try to have a live consensus network-status document at all times.
- A network-status document is "live" if the time in its valid-until field
- has not passed.
-
- If a client is missing a live network-status document, it tries to fetch
- it from a directory cache (or from an authority if it knows no caches).
- On failure, the client waits briefly, then tries that network-status
- document again from another cache. The client does not build circuits
- until it has a live network-status consensus document, and it has
- descriptors for more than 1/4 of the routers that it believes are running.
-
- (Note: clients can and should pick caches based on the network-status
- information they have: once they have first fetched network-status info
- from an authority, they should not need to go to the authority directly
- again.)
-
- To avoid swarming the caches whenever a consensus expires, the
- clients download new consensuses at a randomly chosen time after the
- caches are expected to have a fresh consensus, but before their
- consensus will expire. (This time is chosen uniformly at random from
- the interval between the time 3/4 into the first interval after the
- consensus is no longer fresh, and 7/8 of the time remaining after
- that before the consensus is invalid.)
-
- [For example, if a cache has a consensus that became valid at 1:00,
- and is fresh until 2:00, and expires at 4:00, that cache will fetch
- a new consensus at a random time between 2:45 and 3:50, since 3/4
- of the one-hour interval is 45 minutes, and 7/8 of the remaining 75
- minutes is 65 minutes.]
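
The window arithmetic in the example above can be sketched as follows. Times are plain seconds; the function names are hypothetical, and reading "the first interval" as (fresh-until minus valid-after) is an assumption drawn from the worked example.

```python
import random

def client_fetch_window(valid_after, fresh_until, valid_until):
    """Return (earliest, latest) times at which to fetch a new consensus."""
    interval = fresh_until - valid_after
    # Start 3/4 of one interval after the consensus stops being fresh.
    start = fresh_until + interval * 3 / 4
    # End 7/8 of the way through the time remaining until expiry.
    end = start + (valid_until - start) * 7 / 8
    return start, end

def pick_fetch_time(valid_after, fresh_until, valid_until, rng=random):
    """Choose a fetch time uniformly at random within the window."""
    start, end = client_fetch_window(valid_after, fresh_until, valid_until)
    return rng.uniform(start, end)
```

With valid-after 1:00, fresh-until 2:00 and valid-until 4:00 expressed as seconds since midnight (3600, 7200, 14400), the window runs from 9900 (2:45) to 13837.5 (just after 3:50).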
-
-5.2. Downloading and storing router descriptors
-
- Clients try to have the best descriptor for each router. A descriptor is
- "best" if:
- * It is listed in the consensus network-status document.
-
- Periodically (currently every 10 seconds) clients check whether there are
- any "downloadable" descriptors. A descriptor is downloadable if:
- - It is the "best" descriptor for some router.
- - The descriptor was published at least 10 minutes in the past.
- (This prevents clients from trying to fetch descriptors that the
- mirrors have probably not yet retrieved and cached.)
- - The client does not currently have it.
- - The client is not currently trying to download it.
- - The client would not discard it immediately upon receiving it.
- - The client thinks it is running and valid (see 6.1 below).
-
- If at least 16 known routers have downloadable descriptors, or if
- enough time (currently 10 minutes) has passed since the last time the
- client tried to download descriptors, it launches requests for all
- downloadable descriptors, as described in 5.3 below.
-
- When a descriptor download fails, the client notes it, and does not
- consider the descriptor downloadable again until a certain amount of time
- has passed. (Currently 0 seconds for the first failure, 60 seconds for the
- second, 5 minutes for the third, 10 minutes for the fourth, and 1 day
- thereafter.) Periodically (currently once an hour) clients reset the
- failure count.
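
The retry schedule above can be written as a small lookup. This is a sketch only; the names are hypothetical and the delays are the "currently" values quoted in the text.

```python
# Delay (in seconds) before a descriptor is downloadable again, indexed
# by how many times its download has failed so far.
RETRY_DELAYS = (0, 60, 5 * 60, 10 * 60)
ONE_DAY = 24 * 60 * 60

def retry_delay(failure_count):
    """Seconds to wait after the Nth failure (N >= 1)."""
    if failure_count < 1:
        raise ValueError("failure_count must be at least 1")
    if failure_count <= len(RETRY_DELAYS):
        return RETRY_DELAYS[failure_count - 1]
    return ONE_DAY  # fifth and later failures

def downloadable_again_at(last_failure_time, failure_count):
    """Earliest time at which the descriptor may be retried."""
    return last_failure_time + retry_delay(failure_count)
```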
-
- Clients retain the most recent descriptor they have downloaded for each
- router so long as it is not too old (currently, 48 hours), OR so long as
- no better descriptor has been downloaded for the same router.
-
- [Versions of Tor before 0.1.2.3-alpha would discard descriptors simply for
- being published too far in the past.] [The code seems to discard
- descriptors in all cases after they're 5 days old. True? -RD]
-
-5.3. Managing downloads
-
- When a client has no consensus network-status document, it downloads it
- from a randomly chosen authority. In all other cases, the client
- downloads from caches randomly chosen from among those believed to be V2
- directory servers. (This information comes from the network-status
- documents; see 6 below.)
-
- When downloading multiple router descriptors, the client chooses multiple
- mirrors so that:
- - At least 3 different mirrors are used, except when this would result
- in more than one request for under 4 descriptors.
- - No more than 128 descriptors are requested from a single mirror.
- - Otherwise, as few mirrors as possible are used.
- After choosing mirrors, the client divides the descriptors among them
- randomly.
-
- After receiving any response, a client MUST discard any network-status
- documents and descriptors that it did not request.
-
-6. Using directory information
-
- Everyone besides directory authorities uses the approaches in this section
- to decide which servers to use and what their keys are likely to be.
- (Directory authorities just believe their own opinions, as in 3.1 above.)
-
-6.1. Choosing routers for circuits.
-
- Circuits SHOULD NOT be built until the client has enough directory
- information: a live consensus network status [XXXX fallback?] and
- descriptors for at least 1/4 of the servers believed to be running.
-
- A server is "listed" if it is included by the consensus network-status
- document. Clients SHOULD NOT use unlisted servers.
-
- These flags are used as follows:
-
- - Clients SHOULD NOT use non-'Valid' or non-'Running' routers unless
- requested to do so.
-
- - Clients SHOULD NOT use non-'Fast' routers for any purpose other than
- very-low-bandwidth circuits (such as introduction circuits).
-
- - Clients SHOULD NOT use non-'Stable' routers for circuits that are
- likely to need to be open for a very long time (such as those used for
- IRC or SSH connections).
-
- - Clients SHOULD NOT choose non-'Guard' nodes when picking entry guard
- nodes.
-
- - Clients SHOULD NOT download directory information from non-'V2Dir'
- caches.
-
- See the "path-spec.txt" document for more details.
-
-6.2. Managing naming
-
- In order to provide human-memorable names for individual server
- identities, some directory servers bind names to IDs. Clients handle
- names in two ways:
-
- When a client encounters a name it has not mapped before:
-
- If the consensus lists any router with that name as "Named", or if
- consensus-method 2 or later is in use and the consensus lists any
- router with that name as having the "Unnamed" flag, then the name is
- bound. (It's bound to the ID listed in the entry with the Named flag,
- or to an unknown ID if no such entry is found.)
-
- When the user refers to a bound name, the implementation SHOULD provide
- only the router with ID bound to that name, and no other router, even
- if the router with the right ID can't be found.
-
- When a user tries to refer to a non-bound name, the implementation SHOULD
- warn the user. After warning the user, the implementation MAY use any
- router that advertises the name.
-
- Not every router needs a nickname. When a router doesn't configure a
- nickname, it publishes with the default nickname "Unnamed". Authorities
- SHOULD NOT ever mark a router with this nickname as Named; client software
- SHOULD NOT ever use a router in response to a user request for a router
- called "Unnamed".
-
-6.3. Software versions
-
- An implementation of Tor SHOULD warn when it has fetched a consensus
- network-status, and it is running a software version not listed.
-
-6.4. Warning about a router's status.
-
- If a router tries to publish its descriptor to a Naming authority
- that has its nickname mapped to another key, the router SHOULD
- warn the operator that it is either using the wrong key or is using
- an already claimed nickname.
-
- If a router has fetched a consensus document, and the
- authorities do not publish a binding for the router's nickname, the
- router MAY remind the operator that the chosen nickname is not
- bound to this key at the authorities, and suggest contacting the
- authority operators.
-
- ...
-
-6.5. Router protocol versions
-
- A client should believe that a router supports a given feature if that
- feature is supported by the router or protocol versions in more than half
- of the live networkstatuses' "v" entries for that router. In other words,
- if the "v" entries for some router are:
- v Tor 0.0.8pre1 (from authority 1)
- v Tor 0.1.2.11 (from authority 2)
- v FutureProtocolDescription 99 (from authority 3)
- then the client should believe that the router supports any feature
- supported by 0.1.2.11.
-
- This is currently equivalent to believing the median declared version for
- a router in all live networkstatuses.
-
-7. Standards compliance
-
- All clients and servers MUST support HTTP 1.0. Clients and servers MAY
- support later versions of HTTP as well.
-
-7.1. HTTP headers
-
- Servers MAY set the Content-Length: header. Servers SHOULD set
- Content-Encoding to "deflate" or "identity".
-
- Servers MAY include an X-Your-Address-Is: header, whose value is the
- apparent IP address of the client connecting to them (as a dotted quad).
- For directory connections tunneled over a BEGIN_DIR stream, servers SHOULD
- report the IP from which the circuit carrying the BEGIN_DIR stream reached
- them. [Servers before version 0.1.2.5-alpha reported 127.0.0.1 for all
- BEGIN_DIR-tunneled connections.]
-
- Servers SHOULD disable caching of multiple network statuses or multiple
- router descriptors. Servers MAY enable caching of single descriptors,
- single network statuses, the list of all router descriptors, a v1
- directory, or a v1 running routers document. XXX mention times.
-
-7.2. HTTP status codes
-
- Tor delivers the following status codes. Some were chosen without much
- thought; other code SHOULD NOT rely on specific status codes yet.
-
- 200 -- the operation completed successfully
- -- the user requested statuses or serverdescs, and none of the ones
- requested were found (0.2.0.4-alpha and earlier).
-
- 304 -- the client specified an if-modified-since time, and none of the
- requested resources have changed since that time.
-
- 400 -- the request is malformed, or
- -- the URL is for a malformed variation of one of the URLs we support,
- or
- -- the client tried to post to a non-authority, or
- -- the authority rejected a malformed posted document.
-
- 404 -- the requested document was not found.
- -- the user requested statuses or serverdescs, and none of the ones
- requested were found (0.2.0.5-alpha and later).
-
- 503 -- we are declining the request in order to save bandwidth
- -- user requested some items that we ordinarily generate or store,
- but we do not have any available.
-
-9. Backward compatibility and migration plans
-
- Until Tor versions before 0.1.1.x are completely obsolete, directory
- authorities should generate, and mirrors should download and cache, v1
- directories and running-routers lists, and allow old clients to download
- them. These documents and the rules for retrieving, serving, and caching
- them are described in dir-spec-v1.txt.
-
- Until Tor versions before 0.2.0.x are completely obsolete, directory
- authorities should generate, and mirrors should download and cache, v2
- network-status documents, and allow old clients to download them.
- Additionally, all directory servers and caches should download, store, and
- serve any router descriptor that is required because of v2 network-status
- documents. These documents and the rules for retrieving, serving, and
- caching them are described in dir-spec-v2.txt.
-
-A. Consensus-negotiation timeline.
-
- Period begins: this is the Published time.
- Everybody sends votes
- Reconciliation: everybody tries to fetch missing votes.
- consensus may exist at this point.
- End of voting period:
- everyone swaps signatures.
- Now it's okay for caches to download
- Now it's okay for clients to download.
-
- Valid-after/valid-until switchover
-
diff --git a/doc/spec/path-spec.txt b/doc/spec/path-spec.txt
deleted file mode 100644
index 7c313f8ab..000000000
--- a/doc/spec/path-spec.txt
+++ /dev/null
@@ -1,657 +0,0 @@
-
- Tor Path Specification
-
- Roger Dingledine
- Nick Mathewson
-
-Note: This is an attempt to specify Tor as currently implemented. Future
-versions of Tor will implement improved algorithms.
-
-This document tries to cover how Tor chooses to build circuits and assign
-streams to circuits. Other implementations MAY take other approaches, but
-implementors should be aware of the anonymity and load-balancing implications
-of their choices.
-
- THIS SPEC ISN'T DONE YET.
-
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
- NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
- "OPTIONAL" in this document are to be interpreted as described in
- RFC 2119.
-
-1. General operation
-
- Tor begins building circuits as soon as it has enough directory
- information to do so (see section 5 of dir-spec.txt). Some circuits are
- built preemptively because we expect to need them later (for user
- traffic), and some are built because of immediate need (for user traffic
- that no current circuit can handle, for testing the network or our
- reachability, and so on).
-
- When a client application creates a new stream (by opening a SOCKS
- connection or launching a resolve request), we attach it to an appropriate
- open circuit if one exists, or wait if an appropriate circuit is
- in-progress. We launch a new circuit only
- if no current circuit can handle the request. We rotate circuits over
- time to avoid some profiling attacks.
-
- To build a circuit, we choose all the nodes we want to use, and then
- construct the circuit. Sometimes, when we want a circuit that ends at a
- given hop, and we have an appropriate unused circuit, we "cannibalize" the
- existing circuit and extend it to the new terminus.
-
- These processes are described in more detail below.
-
- This document describes Tor's automatic path selection logic only; path
- selection can be overridden by a controller (with the EXTENDCIRCUIT and
- ATTACHSTREAM commands). Paths constructed through these means may
- violate some constraints given below.
-
-1.1. Terminology
-
- A "path" is an ordered sequence of nodes, not yet built as a circuit.
-
- A "clean" circuit is one that has not yet been used for any traffic.
-
- A "fast" or "stable" or "valid" node is one that has the 'Fast' or
- 'Stable' or 'Valid' flag
- set respectively, based on our current directory information. A "fast"
- or "stable" circuit is one consisting only of "fast" or "stable" nodes.
-
- In an "exit" circuit, the final node is chosen based on waiting stream
- requests if any, and in any case it avoids nodes with exit policy of
- "reject *:*". An "internal" circuit, on the other hand, is one where
- the final node is chosen just like a middle node (ignoring its exit
- policy).
-
- A "request" is a client-side stream or DNS resolve that needs to be
- served by a circuit.
-
- A "pending" circuit is one that we have started to build, but which has
- not yet completed.
-
- A circuit or path "supports" a request if it is okay to use the
- circuit/path to fulfill the request, according to the rules given below.
- A circuit or path "might support" a request if some aspect of the request
- is unknown (usually its target IP), but we believe the path probably
- supports the request according to the rules given below.
-
-1.2. A server's bandwidth
-
- Old versions of Tor did not report bandwidths in network status
- documents, so clients had to learn them from the routers' advertised
- server descriptors.
-
- For versions of Tor prior to 0.2.1.17-rc, everywhere below where we
- refer to a server's "bandwidth", we mean its clipped advertised
- bandwidth, computed by taking the smaller of the 'rate' and
- 'observed' arguments to the "bandwidth" element in the server's
- descriptor. If a router's advertised bandwidth is greater than
- MAX_BELIEVABLE_BANDWIDTH (currently 10 MB/s), we clipped to that
- value.
-
- For more recent versions of Tor, we take the bandwidth value declared
- in the consensus, and fall back to the clipped advertised bandwidth
- only if the consensus does not have bandwidths listed.
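
A minimal sketch of the bandwidth rule above. The function names are hypothetical. Treating "10 MB/s" as 10,000,000 bytes per second is an assumption, as is the requirement that the caller supplies all values in the same unit.

```python
# Assumed byte value for the "10 MB/s" limit quoted in the text.
MAX_BELIEVABLE_BANDWIDTH = 10 * 1000 * 1000

def clipped_advertised_bandwidth(rate, observed):
    """Smaller of 'rate' and 'observed', clipped to the believable maximum."""
    return min(rate, observed, MAX_BELIEVABLE_BANDWIDTH)

def server_bandwidth(rate, observed, consensus_bandwidth=None):
    """Prefer the consensus value; fall back to the clipped advertised one.

    Assumes the caller has already converted every value to one unit.
    """
    if consensus_bandwidth is not None:
        return consensus_bandwidth
    return clipped_advertised_bandwidth(rate, observed)
```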
-
-2. Building circuits
-
-2.1. When we build
-
-2.1.1. Clients build circuits preemptively
-
- When running as a client, Tor tries to maintain at least a certain
- number of clean circuits, so that new streams can be handled
- quickly. To increase the likelihood of success, Tor tries to
- predict what circuits will be useful by choosing from among nodes
- that support the ports we have used in the recent past (by default
- one hour). Specifically, on startup Tor tries to maintain one clean
- fast exit circuit that allows connections to port 80, and at least
- two fast clean stable internal circuits in case we get a resolve
- request or hidden service request (at least three if we _run_ a
- hidden service).
-
- After that, Tor will adapt the circuits that it preemptively builds
- based on the requests it sees from the user: it tries to have two fast
- clean exit circuits available for every port seen within the past hour
- (each circuit can be adequate for many predicted ports -- it doesn't
- need two separate circuits for each port), and it tries to have the
- above internal circuits available if we've seen resolves or hidden
- service activity within the past hour. If there are 12 or more clean
- circuits open, it doesn't open more even if it has more predictions.
-
- Only stable circuits can "cover" a port that is listed in the
- LongLivedPorts config option. Similarly, hidden service requests
- to ports listed in LongLivedPorts make us create stable internal
- circuits.
-
- Note that if there are no requests from the user for an hour, Tor
- will predict no use and build no preemptive circuits.
-
- The Tor client SHOULD NOT store its list of predicted requests to a
- persistent medium.
-
-2.1.2. Clients build circuits on demand
-
- Additionally, when a client request exists that no circuit (built or
- pending) might support, we create a new circuit to support the request.
- For exit connections, we pick an exit node that will handle the
- most pending requests (choosing arbitrarily among ties), launch a
- circuit to end there, and repeat until every unattached request
- might be supported by a pending or built circuit. For internal
- circuits, we pick an arbitrary acceptable path, repeating as needed.
-
- In some cases we can reuse an already established circuit if it's
- clean; see Section 2.3 (cannibalizing circuits) for details.
-
-2.1.3. Servers build circuits for testing reachability and bandwidth
-
- Tor servers test reachability of their ORPort once they have
- successfully built a circuit (on start and whenever their IP address
- changes). They build an ordinary fast internal circuit with themselves
- as the last hop. As soon as any testing circuit succeeds, the Tor
- server decides it's reachable and is willing to publish a descriptor.
-
- We launch multiple testing circuits (one at a time), until we
- have NUM_PARALLEL_TESTING_CIRC (4) such circuits open. Then we
- do a "bandwidth test" by sending a certain number of relay drop
- cells down each circuit: BandwidthRate * 10 / CELL_NETWORK_SIZE
- total cells divided across the four circuits, but never more than
- CIRCWINDOW_START (1000) cells total. This exercises both outgoing and
- incoming bandwidth, and helps to jumpstart the observed bandwidth
- (see dir-spec.txt).
-
- Tor servers also test reachability of their DirPort once they have
- established a circuit, but they use an ordinary exit circuit for
- this purpose.
-
-2.1.4. Hidden-service circuits
-
- See section 4 below.
-
-2.1.5. Rate limiting of failed circuits
-
- If we fail to build a circuit N times in an X-second period (see Section
- 2.3 for how this works), we stop building circuits until the X seconds
- have elapsed.
- XXXX
-
-2.1.6. When to tear down circuits
-
- XXXX
-
-
-2.2. Path selection and constraints
-
- We choose the path for each new circuit before we build it. We choose the
- exit node first, followed by the other nodes in the circuit. All paths
- we generate obey the following constraints:
- - We do not choose the same router twice for the same path.
- - We do not choose any router in the same family as another in the same
- path.
- - We do not choose more than one router in a given /16 subnet
- (unless EnforceDistinctSubnets is 0).
- - We don't choose any non-running or non-valid router unless we have
- been configured to do so. By default, we are configured to allow
- non-valid routers in "middle" and "rendezvous" positions.
- - If we're using Guard nodes, the first node must be a Guard (see 5
- below)
- - XXXX Choosing the length
-
- For "fast" circuits, we only choose nodes with the Fast flag. For
- non-"fast" circuits, all nodes are eligible.
-
- For all circuits, we weight node selection according to router bandwidth.
-
- We also weight the bandwidth of Exit and Guard flagged nodes depending on
- the fraction of total bandwidth that they make up and depending upon the
- position they are being selected for.
-
- These weights are published in the consensus, and are computed as described
- in Section 3.4.3 of dir-spec.txt. They are:
-
- Wgg - Weight for Guard-flagged nodes in the guard position
- Wgm - Weight for non-flagged nodes in the guard Position
- Wgd - Weight for Guard+Exit-flagged nodes in the guard Position
-
- Wmg - Weight for Guard-flagged nodes in the middle Position
- Wmm - Weight for non-flagged nodes in the middle Position
- Wme - Weight for Exit-flagged nodes in the middle Position
- Wmd - Weight for Guard+Exit flagged nodes in the middle Position
-
- Weg - Weight for Guard flagged nodes in the exit Position
- Wem - Weight for non-flagged nodes in the exit Position
- Wee - Weight for Exit-flagged nodes in the exit Position
- Wed - Weight for Guard+Exit-flagged nodes in the exit Position
-
- Wgb - Weight for BEGIN_DIR-supporting Guard-flagged nodes
- Wmb - Weight for BEGIN_DIR-supporting non-flagged nodes
- Web - Weight for BEGIN_DIR-supporting Exit-flagged nodes
- Wdb - Weight for BEGIN_DIR-supporting Guard+Exit-flagged nodes
-
- Wbg - Weight for Guard-flagged nodes for BEGIN_DIR requests
- Wbm - Weight for non-flagged nodes for BEGIN_DIR requests
- Wbe - Weight for Exit-flagged nodes for BEGIN_DIR requests
- Wbd - Weight for Guard+Exit-flagged nodes for BEGIN_DIR requests
-
- Additionally, we may be building circuits with one or more requests in
- mind. Each kind of request puts certain constraints on paths:
-
- - All service-side introduction circuits and all rendezvous paths
- should be Stable.
- - All connection requests for connections that we think will need to
- stay open a long time require Stable circuits. Currently, Tor decides
- this by examining the request's target port, and comparing it to a
- list of "long-lived" ports. (Default: 21, 22, 706, 1863, 5050,
- 5190, 5222, 5223, 6667, 6697, 8300.)
- - DNS resolves require an exit node whose exit policy is not equivalent
- to "reject *:*".
- - Reverse DNS resolves require a version of Tor with advertised eventdns
- support (available in Tor 0.1.2.1-alpha-dev and later).
- - All connection requests require an exit node whose exit policy
- supports their target address and port (if known), or which "might
- support it" (if the address isn't known). See 2.2.1.
- - Rules for Fast? XXXXX
-
-2.2.1. Choosing an exit
-
- If we know what IP address we want to connect to or resolve, we can
- trivially tell whether a given router will support it by simulating
- its declared exit policy.
-
- Because we often connect to addresses of the form hostname:port, we do not
- always know the target IP address when we select an exit node. In these
- cases, we need to pick an exit node that "might support" connections to a
- given port at an unknown address. An exit node "might support"
- such a connection if any clause that accepts any connections to that port
- precedes all clauses (if any) that reject all connections to that port.
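
The "might support" rule can be sketched against a simplified policy model. The model is an assumption made for illustration: a policy is an ordered list of (action, address_pattern, port_low, port_high) tuples, with "*" meaning every address. Returning False when no clause mentions the port is also an assumption, not something the text specifies.

```python
def might_support(policy, port):
    """True if an accepting clause for `port` precedes every clause
    that rejects all connections to `port`.

    `policy` is an ordered list of
    (action, address_pattern, port_low, port_high) tuples.
    """
    for action, address_pattern, port_low, port_high in policy:
        if not port_low <= port <= port_high:
            continue  # clause says nothing about this port
        if action == "accept":
            return True   # accepts some connections to the port
        if action == "reject" and address_pattern == "*":
            return False  # rejects every connection to the port
        # A reject for only some addresses does not settle the question.
    return False
```

For example, with the policy "reject 10.0.0.0/8:*, accept *:80, reject *:*", port 80 might be supported and port 25 is not.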
-
- Unless requested to do so by the user, we never choose an exit server
- flagged as "BadExit" by more than half of the authorities who advertise
- themselves as listing bad exits.
-
-2.2.2. User configuration
-
- Users can alter the default behavior for path selection with configuration
- options.
-
- - If "ExitNodes" is provided, then every request requires an exit node on
- the ExitNodes list. (If a request is supported by no nodes on that list,
- and StrictExitNodes is false, then Tor treats that request as if
- ExitNodes were not provided.)
-
- - "EntryNodes" and "StrictEntryNodes" behave analogously.
-
- - If a user tries to connect to or resolve a hostname of the form
- <target>.<servername>.exit, the request is rewritten to a request for
- <target>, and the request is only supported by the exit whose nickname
- or fingerprint is <servername>.
-
-2.3. Cannibalizing circuits
-
- If we need a circuit and have a clean one already established, in
- some cases we can adapt the clean circuit for our new
- purpose. Specifically,
-
- For hidden service interactions, we can "cannibalize" a clean internal
- circuit if one is available, so we don't need to build those circuits
- from scratch on demand.
-
- We can also cannibalize clean circuits when the client asks to exit
- at a given node -- either via the ".exit" notation or because the
- destination is running at the same location as an exit node.
-
-2.4. Learning when to give up ("timeout") on circuit construction
-
- Since version 0.2.2.8-alpha, Tor attempts to learn when to give up on
- circuits based on network conditions.
-
-2.4.1 Distribution choice and parameter estimation
-
- Based on studies of build times, we found that the distribution of
- circuit build times appears to be a Frechet distribution. However,
- estimators and quantile functions of the Frechet distribution are
- difficult to work with and slow to converge. So instead, since we
- are only interested in the accuracy of the tail, we approximate
- the tail of the distribution with a Pareto curve.
-
- We calculate the parameters for a Pareto distribution fitting the data
- using the estimators in equation 4 from:
- http://portal.acm.org/citation.cfm?id=1647962.1648139
-
- This is:
-
- alpha_m = s/(ln(U(X)/Xm^n))
-
- where s is the total number of completed circuits we have seen, and
-
- U(X) = x_max^u * Prod_s{x_i}
-
- with x_i as our i-th completed circuit time, x_max as the longest
- completed circuit build time we have yet observed, u as the
- number of unobserved timeouts that have no exact value recorded,
- and n as u+s, the total number of circuits that either time out or
- complete.
-
- Using log laws, we compute this as the sum of logs to avoid
- overflow and ln(1.0+epsilon) precision issues:
-
- alpha_m = s/(u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm))
-
- This estimator is closely related to the parameters present in:
- http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation
- except they are adjusted to handle the fact that our samples are
- right-censored at the timeout cutoff.
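
The sum-of-logs form of the estimator can be sketched directly from the formula above. The function and argument names are hypothetical, and the sketch assumes every recorded build time is at least Xm and that not all of them equal Xm (otherwise the denominator is zero).

```python
import math

def pareto_alpha(build_times, unobserved_timeouts, xm):
    """alpha_m = s / (u*ln(x_max) + Sum_s{ln(x_i)} - n*ln(Xm))."""
    s = len(build_times)
    if s == 0:
        raise ValueError("need at least one completed circuit build time")
    u = unobserved_timeouts
    n = u + s
    x_max = max(build_times)
    # Summing logarithms avoids overflowing the product of build times.
    denominator = (u * math.log(x_max)
                   + sum(math.log(x) for x in build_times)
                   - n * math.log(xm))
    return s / denominator
```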
-
- Additionally, because this is not a true Pareto distribution, we alter
- how Xm is computed. The Xm parameter is computed as the midpoint of the most
- frequently occurring 50ms histogram bin, until the point where 1000
- circuits are recorded. After this point, the weighted average of the top
- 'cbtnummodes' (default: 3) midpoint modes is used as Xm. All times below
- this value are counted as having the midpoint value of this weighted average bin.
-
- The timeout itself is calculated by using the Pareto Quantile function (the
- inverted CDF) to give us the value on the CDF such that 80% of the mass
- of the distribution is below the timeout value.
-
- Thus, we expect that the Tor client will accept the fastest 80% of
- the total number of paths on the network.
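For reference, the Pareto CDF is F(x) = 1 - (Xm/x)^alpha for x >= Xm, so its inverse at quantile q is Xm/(1-q)^(1/alpha). A minimal sketch of the timeout computation, with a hypothetical function name and the quantile passed as a percent as in 'cbtquantile':

```python
def pareto_timeout_ms(xm, alpha, quantile_percent=80):
    """Value below which quantile_percent of the Pareto mass lies."""
    if not 0 < quantile_percent < 100:
        raise ValueError("quantile must be strictly between 0 and 100")
    if alpha <= 0 or xm <= 0:
        raise ValueError("Xm and alpha must be positive")
    q = quantile_percent / 100.0
    # Inverse of F(x) = 1 - (xm / x) ** alpha
    return xm / (1.0 - q) ** (1.0 / alpha)
```

With Xm = 1000 ms and alpha = 1, the 80% point sits at about 5000 ms; a larger alpha (a thinner tail) pulls the timeout closer to Xm.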
-
-2.4.2. How much data to record
-
- From our observations, the minimum number of circuit build times for a
- reasonable fit appears to be on the order of 100. However, to keep a
- good fit over the long term, we store the 1000 most recent circuit build times
- in a circular array.
-
- The Tor client should build test circuits at a rate of one per
- minute up until 100 circuits are built. This allows a fresh Tor to have
- a CircuitBuildTimeout estimated within 1.5 hours after install,
- upgrade, or network change (see below).
-
- Timeouts are stored on disk in a histogram of 50ms bin width, the same
- width used to calculate the Xm value above. This histogram must be shuffled
- after being read from disk, to preserve a proper expiration of old values
- after restart.
-
-2.4.3. How to record timeouts
-
- Circuits that pass the timeout threshold should be allowed to continue
- building until a time corresponding to the point 'cbtclosequantile'
- (default 95) on the Pareto curve, or 60 seconds, whichever is greater.
-
- The actual completion times for these circuits should be recorded.
- Implementations should completely abandon a circuit and record a value
- as an 'unknown' timeout if the total build time exceeds this threshold.
-
- The reason for this is that right-censored Pareto estimators begin to lose
- their accuracy if more than approximately 5% of the values are censored.
- Since we wish to set the cutoff at 20%, we must allow circuits to continue
- building past this cutoff point up to the 95th percentile.
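The recording rule can be sketched as below. The names `close_cutoff_ms` and `classify`, and the three labels returned, are hypothetical and chosen only for illustration; times are assumed to be in milliseconds.

```python
def close_cutoff_ms(xm, alpha, close_quantile=95, floor_ms=60000):
    """Point at which a still-building circuit is abandoned."""
    q = close_quantile / 100.0
    on_curve = xm / (1.0 - q) ** (1.0 / alpha)  # Pareto quantile function
    # "whichever is greater": the point on the curve, or 60 seconds
    return max(on_curve, floor_ms)

def classify(build_time_ms, timeout_ms, cutoff_ms):
    """Decide how one circuit's build time is recorded."""
    if build_time_ms <= timeout_ms:
        return "completed"   # finished within the timeout
    if build_time_ms <= cutoff_ms:
        return "late"        # past the timeout, exact time still recorded
    return "unknown"         # abandoned; recorded as a censored value
```

Only the "unknown" cases count toward u in the censored estimator, which is how the fraction of censored values is kept near 5% rather than 20%.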
-
-2.4.4. Detecting Changing Network Conditions
-
- We attempt to detect both network connectivity loss and drastic
- changes in the timeout characteristics.
-
- We assume that we've had network connectivity loss if 3 circuits
- time out and we've received no cells or TLS handshakes since those
- circuits began. We then temporarily set the timeout to 60 seconds
- and stop counting timeouts.
-
- If 3 more circuits time out and the network still has not been
- live within this new 60 second timeout window, we then discard
- the previous timeouts during this period from our history.
-
- To detect changing network conditions, we keep a history of
- the timeout or non-timeout status of the past 20 circuits that
- successfully completed at least one hop. If more than 90% of
- these circuits time out, we discard all buildtimes history, reset
- the timeout to 60 seconds, and then begin recomputing the timeout.
-
- If the timeout was already 60 seconds or higher, we double the timeout.
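A sketch of the recent-circuit history check, under assumptions: the class and method names are hypothetical, and the trigger is taken to be "at least 18 timeouts among the last 20 circuits", matching the 'cbtmaxtimeouts' and 'cbtrecentcount' defaults; the prose above says "more than 90%", so the exact boundary is an interpretation.

```python
from collections import deque

class RecentCircuits:
    """Tracks timeout status of recent circuits that completed one hop."""

    def __init__(self, recent_count=20, max_timeouts=18,
                 initial_timeout_ms=60000):
        self.recent = deque(maxlen=recent_count)  # True means "timed out"
        self.max_timeouts = max_timeouts
        self.initial_timeout_ms = initial_timeout_ms

    def record(self, timed_out, current_timeout_ms):
        """Record one circuit; return a new timeout if a reset is triggered.

        Returns None when no reset is needed. On a reset the caller is
        expected to discard its build-time history as well.
        """
        self.recent.append(bool(timed_out))
        if sum(self.recent) < self.max_timeouts:
            return None
        self.recent.clear()
        if current_timeout_ms >= self.initial_timeout_ms:
            return current_timeout_ms * 2   # already at 60s or more: double
        return self.initial_timeout_ms      # otherwise reset to 60s
```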
-
-2.4.5. Consensus parameters governing behavior
-
- Clients that implement circuit build timeout learning should obey the
- following consensus parameters that govern behavior, in order to allow
- us to handle bugs or other emergent behaviors due to client circuit
- construction. If these parameters are not present in the consensus,
- the listed default values should be used instead.
-
- cbtdisabled
- Default: 0
- Min: 0
- Max: 1
- Effect: If 1, all CircuitBuildTime learning code should be
- disabled and history should be discarded. For use in
- emergency situations only.
-
- cbtnummodes
- Default: 3
- Min: 1
- Max: 20
- Effect: This value governs how many modes to use in the weighted
- average calculation of Pareto parameter Xm. A value of 3 introduces
- some bias (2-5% of CDF) under ideal conditions, but allows for better
- performance in the event that a client chooses guard nodes of radically
- different performance characteristics.
-
- cbtrecentcount
- Default: 20
- Min: 3
- Max: 1000
- Effect: This is the number of recent circuit attempts to keep track of
- when counting timeouts for 'cbtmaxtimeouts' below.
-
- cbtmaxtimeouts
- Default: 18
- Min: 3
- Max: 10000
- Effect: When this many timeouts happen in the last 'cbtrecentcount'
- circuit attempts, the client should discard all of its
- history and begin learning a fresh timeout value.
-
- cbtmincircs
- Default: 100
- Min: 1
- Max: 10000
- Effect: This is the minimum number of circuits to build before
- computing a timeout.
-
- cbtquantile
- Default: 80
- Min: 10
- Max: 99
- Effect: This is the position on the quantile curve to use to set the
- timeout value. It is a percent (10-99).
-
- cbtclosequantile
- Default: 95
- Min: Value of cbtquantile parameter
- Max: 99
- Effect: This is the position on the quantile curve to use to set the
- timeout value to use to actually close circuits. It is a percent
- (0-99).
-
- cbttestfreq
- Default: 60
- Min: 1
- Max: 2147483647 (INT32_MAX)
- Effect: Describes how often in seconds to build a test circuit to
- gather timeout values. Only applies if fewer than 'cbtmincircs'
- circuits have been recorded.
-
- cbtmintimeout
- Default: 2000
- Min: 500
- Max: 2147483647 (INT32_MAX)
- Effect: This is the minimum allowed timeout value in milliseconds.
- The minimum is to prevent rounding to 0 (we only check once
- per second).
-
- cbtinitialtimeout
- Default: 60000
- Min: Value of cbtmintimeout
- Max: 2147483647 (INT32_MAX)
- Effect: This is the timeout value to use before computing a timeout,
- in milliseconds.
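The parameter table above can be captured in a small lookup, sketched here in Python. The names `CBT_PARAMS` and `cbt_param` are hypothetical, the consensus is modelled as a plain dict, and clamping out-of-range values to the listed Min/Max is an assumption: the text lists the bounds but does not say how a client should treat a value outside them.

```python
INT32_MAX = 2147483647

# name: (default, min, max); a string min refers to another parameter's value
CBT_PARAMS = {
    "cbtdisabled":       (0, 0, 1),
    "cbtnummodes":       (3, 1, 20),
    "cbtrecentcount":    (20, 3, 1000),
    "cbtmaxtimeouts":    (18, 3, 10000),
    "cbtmincircs":       (100, 1, 10000),
    "cbtquantile":       (80, 10, 99),
    "cbtclosequantile":  (95, "cbtquantile", 99),
    "cbttestfreq":       (60, 1, INT32_MAX),
    "cbtmintimeout":     (2000, 500, INT32_MAX),
    "cbtinitialtimeout": (60000, "cbtmintimeout", INT32_MAX),
}

def cbt_param(consensus, name):
    """Look up one parameter, falling back to its default when absent."""
    default, low, high = CBT_PARAMS[name]
    if isinstance(low, str):
        # The minimum is the effective value of another parameter.
        low = cbt_param(consensus, low)
    value = consensus.get(name, default)
    return max(low, min(high, value))  # clamp (assumed behavior)
```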
-
-
-2.5. Handling failure
-
- If an attempt to extend a circuit fails (either because the first create
- failed or a subsequent extend failed) then the circuit is torn down and is
- no longer pending. (XXXX really?) Requests that might have been
- supported by the pending circuit thus become unsupported, and a new
- circuit needs to be constructed.
-
- If a stream "begin" attempt fails with an EXITPOLICY error, we
- decide that the exit node's exit policy is not correctly advertised,
- so we treat the exit node as if it were a non-exit until we retrieve
- a fresh descriptor for it.
-
- XXXX
-
-3. Attaching streams to circuits
-
- When a circuit that might support a request is built, Tor tries to attach
- the request's stream to the circuit and sends a BEGIN, BEGIN_DIR, or
- RESOLVE relay cell as appropriate. If the request completes
- unsuccessfully, Tor considers the reason given in the CLOSE relay cell.
- [XXX yes, and?]
-
-
- After a request has remained unattached for SocksTimeout (2 minutes
- by default), Tor abandons the attempt and signals an error to the
- client as appropriate (e.g., by closing the SOCKS connection).
-
- XXX Timeouts and when Tor auto-retries.
- * What stream-end-reasons are appropriate for retrying.
-
- If there is no reply to BEGIN/RESOLVE, the stream will time out and fail.
-
-4. Hidden-service related circuits
-
- XXX Tracking expected hidden service use (client-side and hidserv-side)
-
-5. Guard nodes
-
- We use Guard nodes (also called "helper nodes" in the literature) to
- prevent certain profiling attacks. Here's the risk: if we choose entry and
- exit nodes at random, and an attacker controls C out of N servers
- (ignoring bandwidth), then the attacker will control the entry and exit
- node of any given circuit with probability (C/N)^2. But as we make many
- different circuits over time, the probability that the attacker will see
- a sample of about (C/N)^2 of our traffic goes to 1. Since statistical
- sampling works, the attacker can be sure of learning a profile of our
- behavior.
-
- If, on the other hand, we picked an entry node and held it fixed, we would
- have probability C/N of choosing a bad entry and being profiled, and
- probability (N-C)/N of choosing a good entry and not being profiled.
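The arithmetic behind this comparison can be checked with a few lines of Python. The function names are hypothetical, and the sketch assumes each circuit picks its entry and exit independently and uniformly at random (bandwidth ignored, as in the text).

```python
def p_circuit_compromised(c, n):
    """Chance that one random circuit has a bad entry and a bad exit."""
    return (c / n) ** 2

def p_some_circuit_compromised(c, n, circuits):
    """Chance that at least one of many independent circuits is compromised."""
    return 1.0 - (1.0 - p_circuit_compromised(c, n)) ** circuits
```

With C = 10 and N = 100, a single circuit is compromised with probability 0.01, but after 1000 circuits at least one compromise is nearly certain. With a fixed entry node, the chance of ever being profiled stays at C/N = 0.1.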
-
- When guard nodes are enabled, Tor maintains an ordered list of entry nodes
- as our chosen guards, and stores this list persistently to disk. If a Guard
- node becomes unusable, rather than replacing it, Tor adds new guards to the
- end of the list. When choosing the first hop of a circuit, Tor chooses at
- random from among the first NumEntryGuards (default 3) usable guards on the
- list. If there are not at least 2 usable guards on the list, Tor adds
- routers until there are, or until there are no more usable routers to add.
-
- A guard is unusable if any of the following hold:
- - it is not marked as a Guard by the networkstatuses,
- - it is not marked Valid (and the user hasn't set AllowInvalid entry),
- - it is not marked Running, or
- - Tor couldn't reach it the last time it tried to connect.
-
- A guard is unusable for a particular circuit if any of the rules for path
- selection in 2.2 are not met. In particular, if the circuit is "fast"
- and the guard is not Fast, or if the circuit is "stable" and the guard is
- not Stable, or if the guard has already been chosen as the exit node in
- that circuit, Tor can't use it as a guard node for that circuit.
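The first-hop choice can be sketched as follows. `choose_guard` and the `is_usable` callback are hypothetical names; the callback stands in for both the general and the per-circuit usability rules listed above, and adding new guards when too few are usable is left out of the sketch.

```python
import random

def choose_guard(guard_list, is_usable, num_entry_guards=3, rng=random):
    """Pick a first hop from the ordered, persistent guard list.

    guard_list -- guards in the order they were added
    is_usable  -- callable returning True if a guard may be used
    """
    # Keep list order, skip unusable guards, and take the first few.
    candidates = [g for g in guard_list if is_usable(g)][:num_entry_guards]
    if not candidates:
        return None  # caller would need to add new guards to the list
    return rng.choice(candidates)
```

Because unusable guards are skipped rather than removed, a guard that becomes usable again regains its place near the front of the list.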
-
- If the guard is excluded because of its status in the networkstatuses for
- over 30 days, Tor removes it from the list entirely, preserving order.
-
- If Tor fails to connect to an otherwise usable guard, it retries
- periodically: every hour for six hours, every 4 hours for 3 days, every
- 18 hours for a week, and every 36 hours thereafter. Additionally, Tor
- retries unreachable guards the first time it adds a new guard to the list,
- since it is possible that the old guards were only marked as unreachable
- because the network was unreachable or down.
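The retry schedule reads naturally as a step function of how long the guard has been unreachable. The sketch below uses a hypothetical function name and assumes the periods ("six hours", "3 days", "a week") are all measured from the moment the guard first became unreachable; the text does not say so explicitly.

```python
HOUR = 3600
DAY = 24 * HOUR

def guard_retry_interval(unreachable_for_s):
    """Seconds to wait between connection attempts to an unreachable guard."""
    if unreachable_for_s < 6 * HOUR:
        return 1 * HOUR    # every hour for six hours
    if unreachable_for_s < 3 * DAY:
        return 4 * HOUR    # every 4 hours for 3 days
    if unreachable_for_s < 7 * DAY:
        return 18 * HOUR   # every 18 hours for a week
    return 36 * HOUR       # every 36 hours thereafter
```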
-
- Tor does not add a guard persistently to the list until the first time we
- have connected to it successfully.
-
-6. Router descriptor purposes
-
- There are currently three "purposes" supported for router descriptors:
- general, controller, and bridge. Most descriptors are of type general
- -- these are the ones listed in the consensus, and the ones fetched
- and used in normal cases.
-
- Controller-purpose descriptors are those delivered by the controller
- and labelled as such: they will be kept around (and expire like
- normal descriptors), and they can be used by the controller in its
- EXTENDCIRCUIT commands. Otherwise they are ignored by Tor when it
- chooses paths.
-
- Bridge-purpose descriptors are for routers that are used as bridges. See
- doc/design-paper/blocking.pdf for more design explanation, or proposal
- 125 for specific details. Currently bridge descriptors are used in place
- of normal entry guards, for Tor clients that have UseBridges enabled.
-
-
-X. Old notes
-
-X.1. Do we actually do this?
-
-How to deal with network down.
- - While all helpers are down/unreachable and there are no established
- or on-the-way testing circuits, launch a testing circuit. (Do this
- periodically in the same way we try to establish normal circuits
- when things are working normally.)
- (Testing circuits are a special type of circuit, that streams won't
- attach to by accident.)
- - When a testing circuit succeeds, mark all helpers up and hold
- the testing circuit open.
- - If a connection to a helper succeeds, close all testing circuits.
- Else mark that helper down and try another.
- - If the last helper is marked down and we already have a testing
- circuit established, then add the first hop of that testing circuit
- to the end of our helper node list, close that testing circuit,
- and go back to square one. (Actually, rather than closing the
- testing circuit, can we get away with converting it to a normal
- circuit and beginning to use it immediately?)
-
- [Do we actually do any of the above? If so, let's spec it. If not, let's
- remove it. -NM]
-
-X.2. A thing we could do to deal with reachability.
-
-And as a bonus, it leads to an answer to Nick's attack ("If I pick
-my helper nodes all on 18.0.0.0:*, then I move, you'll know where I
-bootstrapped") -- the answer is to pick your original three helper nodes
-without regard for reachability. Then the above algorithm will add some
-more that are reachable for you, and if you move somewhere, it's more
-likely (though not certain) that some of the originals will become useful.
-Is that smart or just complex?
-
-X.3. Some stuff that worries me about entry guards. 2006 Jun, Nickm.
-
- It is unlikely for two users to have the same set of entry guards.
- Observing a user is sufficient to learn its entry guards. So, as we move
- around, entry guards make us linkable. If we want to change guards when
- our location (IP? subnet?) changes, we have two bad options. We could
- - Drop the old guards. But if we go back to our old location,
- we'll not use our old guards. For a laptop that sometimes gets used
- from work and sometimes from home, this is pretty fatal.
- - Remember the old guards as associated with the old location, and use
- them again if we ever go back to the old location. This would be
- nasty, since it would force us to record where we've been.
-
- [Do we do any of this now? If not, this should move into 099-misc or
- 098-todo. -NM]
-
diff --git a/doc/spec/proposals/000-index.txt b/doc/spec/proposals/000-index.txt
deleted file mode 100644
index 580ce36fa..000000000
--- a/doc/spec/proposals/000-index.txt
+++ /dev/null
@@ -1,196 +0,0 @@
-Filename: 000-index.txt
-Title: Index of Tor Proposals
-Author: Nick Mathewson
-Created: 26-Jan-2007
-Status: Meta
-
-Overview:
-
- This document provides an index to Tor proposals.
-
- This is an informational document.
-
- Everything in this document below the line of '=' signs is automatically
- generated by reindex.py; do not edit by hand.
-
-============================================================
-Proposals by number:
-
-000 Index of Tor Proposals [META]
-001 The Tor Proposal Process [META]
-098 Proposals that should be written [META]
-099 Miscellaneous proposals [META]
-100 Tor Unreliable Datagram Extension Proposal [DEAD]
-101 Voting on the Tor Directory System [CLOSED]
-102 Dropping "opt" from the directory format [CLOSED]
-103 Splitting identity key from regularly used signing key [CLOSED]
-104 Long and Short Router Descriptors [CLOSED]
-105 Version negotiation for the Tor protocol [CLOSED]
-106 Checking fewer things during TLS handshakes [CLOSED]
-107 Uptime Sanity Checking [CLOSED]
-108 Base "Stable" Flag on Mean Time Between Failures [CLOSED]
-109 No more than one server per IP address [CLOSED]
-110 Avoiding infinite length circuits [ACCEPTED]
-111 Prioritizing local traffic over relayed traffic [CLOSED]
-112 Bring Back Pathlen Coin Weight [SUPERSEDED]
-113 Simplifying directory authority administration [SUPERSEDED]
-114 Distributed Storage for Tor Hidden Service Descriptors [CLOSED]
-115 Two Hop Paths [DEAD]
-116 Two hop paths from entry guards [DEAD]
-117 IPv6 exits [ACCEPTED]
-118 Advertising multiple ORPorts at once [ACCEPTED]
-119 New PROTOCOLINFO command for controllers [CLOSED]
-120 Shutdown descriptors when Tor servers stop [DEAD]
-121 Hidden Service Authentication [FINISHED]
-122 Network status entries need a new Unnamed flag [CLOSED]
-123 Naming authorities automatically create bindings [CLOSED]
-124 Blocking resistant TLS certificate usage [SUPERSEDED]
-125 Behavior for bridge users, bridge relays, and bridge authorities [CLOSED]
-126 Getting GeoIP data and publishing usage summaries [CLOSED]
-127 Relaying dirport requests to Tor download site / website [DRAFT]
-128 Families of private bridges [DEAD]
-129 Block Insecure Protocols by Default [CLOSED]
-130 Version 2 Tor connection protocol [CLOSED]
-131 Help users to verify they are using Tor [NEEDS-REVISION]
-132 A Tor Web Service For Verifying Correct Browser Configuration [DRAFT]
-133 Incorporate Unreachable ORs into the Tor Network [DRAFT]
-134 More robust consensus voting with diverse authority sets [REJECTED]
-135 Simplify Configuration of Private Tor Networks [CLOSED]
-136 Mass authority migration with legacy keys [CLOSED]
-137 Keep controllers informed as Tor bootstraps [CLOSED]
-138 Remove routers that are not Running from consensus documents [CLOSED]
-139 Download consensus documents only when it will be trusted [CLOSED]
-140 Provide diffs between consensuses [ACCEPTED]
-141 Download server descriptors on demand [DRAFT]
-142 Combine Introduction and Rendezvous Points [DEAD]
-143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [OPEN]
-144 Increase the diversity of circuits by detecting nodes belonging the same provider [DRAFT]
-145 Separate "suitable as a guard" from "suitable as a new guard" [OPEN]
-146 Add new flag to reflect long-term stability [OPEN]
-147 Eliminate the need for v2 directories in generating v3 directories [ACCEPTED]
-148 Stream end reasons from the client side should be uniform [CLOSED]
-149 Using data from NETINFO cells [OPEN]
-150 Exclude Exit Nodes from a circuit [CLOSED]
-151 Improving Tor Path Selection [FINISHED]
-152 Optionally allow exit from single-hop circuits [CLOSED]
-153 Automatic software update protocol [SUPERSEDED]
-154 Automatic Software Update Protocol [SUPERSEDED]
-155 Four Improvements of Hidden Service Performance [FINISHED]
-156 Tracking blocked ports on the client side [OPEN]
-157 Make certificate downloads specific [ACCEPTED]
-158 Clients download consensus + microdescriptors [OPEN]
-159 Exit Scanning [OPEN]
-160 Authorities vote for bandwidth offsets in consensus [FINISHED]
-161 Computing Bandwidth Adjustments [FINISHED]
-162 Publish the consensus in multiple flavors [OPEN]
-163 Detecting whether a connection comes from a client [OPEN]
-164 Reporting the status of server votes [OPEN]
-165 Easy migration for voting authority sets [OPEN]
-166 Including Network Statistics in Extra-Info Documents [ACCEPTED]
-167 Vote on network parameters in consensus [CLOSED]
-168 Reduce default circuit window [OPEN]
-169 Eliminate TLS renegotiation for the Tor connection handshake [DRAFT]
-170 Configuration options regarding circuit building [DRAFT]
-171 Separate streams across circuits by connection metadata [OPEN]
-172 GETINFO controller option for circuit information [ACCEPTED]
-173 GETINFO Option Expansion [ACCEPTED]
-174 Optimistic Data for Tor: Server Side [OPEN]
-175 Automatically promoting Tor clients to nodes [DRAFT]
-176 Proposed version-3 link handshake for Tor [DRAFT]
-177 Abstaining from votes on individual flags [DRAFT]
-
-
-Proposals by status:
-
- DRAFT:
- 127 Relaying dirport requests to Tor download site / website
- 132 A Tor Web Service For Verifying Correct Browser Configuration
- 133 Incorporate Unreachable ORs into the Tor Network
- 141 Download server descriptors on demand
- 144 Increase the diversity of circuits by detecting nodes belonging the same provider
- 169 Eliminate TLS renegotiation for the Tor connection handshake [for 0.2.2]
- 170 Configuration options regarding circuit building
- 175 Automatically promoting Tor clients to nodes
- 176 Proposed version-3 link handshake for Tor [for 0.2.3]
- 177 Abstaining from votes on individual flags
- NEEDS-REVISION:
- 131 Help users to verify they are using Tor
- OPEN:
- 143 Improvements of Distributed Storage for Tor Hidden Service Descriptors [for 0.2.1.x]
- 145 Separate "suitable as a guard" from "suitable as a new guard" [for 0.2.1.x]
- 146 Add new flag to reflect long-term stability [for 0.2.1.x]
- 149 Using data from NETINFO cells [for 0.2.1.x]
- 156 Tracking blocked ports on the client side [for 0.2.?]
- 158 Clients download consensus + microdescriptors
- 159 Exit Scanning
- 162 Publish the consensus in multiple flavors [for 0.2.2]
- 163 Detecting whether a connection comes from a client [for 0.2.2]
- 164 Reporting the status of server votes [for 0.2.2]
- 165 Easy migration for voting authority sets
- 168 Reduce default circuit window [for 0.2.2]
- 171 Separate streams across circuits by connection metadata
- 174 Optimistic Data for Tor: Server Side
- ACCEPTED:
- 110 Avoiding infinite length circuits [for 0.2.1.x] [in 0.2.1.3-alpha]
- 117 IPv6 exits [for 0.2.1.x]
- 118 Advertising multiple ORPorts at once [for 0.2.1.x]
- 140 Provide diffs between consensuses [for 0.2.2.x]
- 147 Eliminate the need for v2 directories in generating v3 directories [for 0.2.1.x]
- 157 Make certificate downloads specific [for 0.2.1.x]
- 166 Including Network Statistics in Extra-Info Documents [for 0.2.2]
- 172 GETINFO controller option for circuit information
- 173 GETINFO Option Expansion
- META:
- 000 Index of Tor Proposals
- 001 The Tor Proposal Process
- 098 Proposals that should be written
- 099 Miscellaneous proposals
- FINISHED:
- 121 Hidden Service Authentication [in 0.2.1.x]
- 151 Improving Tor Path Selection
- 155 Four Improvements of Hidden Service Performance [in 0.2.1.x]
- 160 Authorities vote for bandwidth offsets in consensus [for 0.2.2.x]
- 161 Computing Bandwidth Adjustments [for 0.2.2.x]
- CLOSED:
- 101 Voting on the Tor Directory System [in 0.2.0.x]
- 102 Dropping "opt" from the directory format [in 0.2.0.x]
- 103 Splitting identity key from regularly used signing key [in 0.2.0.x]
- 104 Long and Short Router Descriptors [in 0.2.0.x]
- 105 Version negotiation for the Tor protocol [in 0.2.0.x]
- 106 Checking fewer things during TLS handshakes [in 0.2.0.x]
- 107 Uptime Sanity Checking [in 0.2.0.x]
- 108 Base "Stable" Flag on Mean Time Between Failures [in 0.2.0.x]
- 109 No more than one server per IP address [in 0.2.0.x]
- 111 Prioritizing local traffic over relayed traffic [in 0.2.0.x]
- 114 Distributed Storage for Tor Hidden Service Descriptors [in 0.2.0.x]
- 119 New PROTOCOLINFO command for controllers [in 0.2.0.x]
- 122 Network status entries need a new Unnamed flag [in 0.2.0.x]
- 123 Naming authorities automatically create bindings [in 0.2.0.x]
- 125 Behavior for bridge users, bridge relays, and bridge authorities [in 0.2.0.x]
- 126 Getting GeoIP data and publishing usage summaries [in 0.2.0.x]
- 129 Block Insecure Protocols by Default [in 0.2.0.x]
- 130 Version 2 Tor connection protocol [in 0.2.0.x]
- 135 Simplify Configuration of Private Tor Networks [for 0.2.1.x] [in 0.2.1.2-alpha]
- 136 Mass authority migration with legacy keys [in 0.2.0.x]
- 137 Keep controllers informed as Tor bootstraps [in 0.2.1.x]
- 138 Remove routers that are not Running from consensus documents [in 0.2.1.2-alpha]
- 139 Download consensus documents only when it will be trusted [in 0.2.1.x]
- 148 Stream end reasons from the client side should be uniform [in 0.2.1.9-alpha]
- 150 Exclude Exit Nodes from a circuit [in 0.2.1.3-alpha]
- 152 Optionally allow exit from single-hop circuits [in 0.2.1.6-alpha]
- 167 Vote on network parameters in consensus [in 0.2.2]
- SUPERSEDED:
- 112 Bring Back Pathlen Coin Weight
- 113 Simplifying directory authority administration
- 124 Blocking resistant TLS certificate usage
- 153 Automatic software update protocol
- 154 Automatic Software Update Protocol
- DEAD:
- 100 Tor Unreliable Datagram Extension Proposal
- 115 Two Hop Paths
- 116 Two hop paths from entry guards
- 120 Shutdown descriptors when Tor servers stop
- 128 Families of private bridges
- 142 Combine Introduction and Rendezvous Points
- REJECTED:
- 134 More robust consensus voting with diverse authority sets
diff --git a/doc/spec/proposals/001-process.txt b/doc/spec/proposals/001-process.txt
deleted file mode 100644
index 53ad32ba1..000000000
--- a/doc/spec/proposals/001-process.txt
+++ /dev/null
@@ -1,184 +0,0 @@
-Filename: 001-process.txt
-Title: The Tor Proposal Process
-Author: Nick Mathewson
-Created: 30-Jan-2007
-Status: Meta
-
-Overview:
-
- This document describes how to change the Tor specifications, how Tor
- proposals work, and the relationship between Tor proposals and the
- specifications.
-
- This is an informational document.
-
-Motivation:
-
- Previously, our process for updating the Tor specifications was maximally
- informal: we'd patch the specification (sometimes forking first, and
- sometimes not), then discuss the patches, reach consensus, and implement
- the changes.
-
- This had a few problems.
-
- First, even at its most efficient, the old process would often have the
- spec out of sync with the code. The worst cases were those where
- implementation was deferred: the spec and code could stay out of sync for
- versions at a time.
-
- Second, it was hard to participate in discussion, since you had to know
- which portions of the spec were a proposal, and which were already
- implemented.
-
- Third, it littered the specifications with too many inline comments.
- [This was a real problem -NM]
- [Especially when it went to multiple levels! -NM]
- [XXXX especially when they weren't signed and talked about that
- thing that you can't remember after a year]
-
-How to change the specs now:
-
- First, somebody writes a proposal document. It should describe the change
- that should be made in detail, and give some idea of how to implement it.
- Once it's fleshed out enough, it becomes a proposal.
-
- Like an RFC, every proposal gets a number. Unlike RFCs, proposals can
- change over time and keep the same number, until they are finally
- accepted or rejected. The history for each proposal
- will be stored in the Tor repository.
-
- Once a proposal is in the repository, we should discuss and improve it
- until we've reached consensus that it's a good idea, and that it's
- detailed enough to implement. When this happens, we implement the
- proposal and incorporate it into the specifications. Thus, the specs
- remain the canonical documentation for the Tor protocol: no proposal is
- ever the canonical documentation for an implemented feature.
-
- (This process is pretty similar to the Python Enhancement Process, with
- the major exception that Tor proposals get re-integrated into the specs
- after implementation, whereas PEPs _become_ the new spec.)
-
- {It's still okay to make small changes directly to the spec if the code
- can be written more or less immediately, or cosmetic changes if no code
- change is required. This document reflects the current developers' _intent_, not
- a permanent promise to always use this process in the future: we reserve
- the right to get really excited and run off and implement something in a
- caffeine-or-m&m-fueled all-night hacking session.}
-
-How new proposals get added:
-
- Once an idea has been proposed on the development list, a properly formatted
- (see below) draft exists, and rough consensus within the active development
- community exists that this idea warrants consideration, the proposal editor
- will officially add the proposal.
-
- To get your proposal in, send it to or-dev.
-
- The current proposal editors are Nick Mathewson and Jacob Appelbaum.
-
-What should go in a proposal:
-
- Every proposal should have a header containing these fields:
- Filename, Title, Author, Created, Status.
-
- These fields are optional but recommended:
- Target, Implemented-In.
- The Target field should describe which version the proposal is hoped to be
- implemented in (if it's Open or Accepted). The Implemented-In field
- should describe which version the proposal was implemented in (if it's
- Finished or Closed).
-
- The body of the proposal should start with an Overview section explaining
- what the proposal's about, what it does, and about what state it's in.
-
- After the Overview, the proposal becomes more free-form. Depending on its
- length and complexity, the proposal can break into sections as
- appropriate, or follow a short discursive format. Every proposal should
- contain at least the following information before it is "ACCEPTED",
- though the information does not need to be in sections with these names.
-
- Motivation: What problem is the proposal trying to solve? Why does
- this problem matter? If several approaches are possible, why take this
- one?
-
- Design: A high-level view of what the new or modified features are, how
- the new or modified features work, how they interoperate with each
- other, and how they interact with the rest of Tor. This is the main
- body of the proposal. Some proposals will start out with only a
- Motivation and a Design, and wait for a specification until the
- Design seems approximately right.
-
- Security implications: What effects the proposed changes might have on
- anonymity, how well understood these effects are, and so on.
-
- Specification: A detailed description of what needs to be added to the
- Tor specifications in order to implement the proposal. This should
- be in about as much detail as the specifications will eventually
- contain: it should be possible for independent programmers to write
- mutually compatible implementations of the proposal based on its
- specifications.
-
- Compatibility: Will versions of Tor that follow the proposal be
- compatible with versions that do not? If so, how will compatibility
- be achieved? Generally, we try to not drop compatibility if at
- all possible; we haven't made a "flag day" change since May 2004,
- and we don't want to do another one.
-
- Implementation: If the proposal will be tricky to implement in Tor's
- current architecture, the document can contain some discussion of how
- to go about making it work. Actual patches should go on public git
- branches, or be uploaded to trac.
-
- Performance and scalability notes: If the feature will have an effect
- on performance (in RAM, CPU, bandwidth) or scalability, there should
- be some analysis on how significant this effect will be, so that we
- can avoid really expensive performance regressions, and so we can
- avoid wasting time on insignificant gains.
-
-Proposal status:
-
- Open: A proposal under discussion.
-
- Accepted: The proposal is complete, and we intend to implement it.
- After this point, substantive changes to the proposal should be
- avoided, and regarded as a sign of the process having failed
- somewhere.
-
- Finished: The proposal has been accepted and implemented. After this
- point, the proposal should not be changed.
-
- Closed: The proposal has been accepted, implemented, and merged into the
- main specification documents. The proposal should not be changed after
- this point.
-
- Rejected: We're not going to implement the feature as described here,
- though we might do some other version. See comments in the document
- for details. The proposal should not be changed after this point;
- to bring up some other version of the idea, write a new proposal.
-
- Draft: This isn't a complete proposal yet; there are definite missing
- pieces. Please don't add any new proposals with this status; put them
- in the "ideas" sub-directory instead.
-
- Needs-Revision: The idea for the proposal is a good one, but the proposal
- as it stands has serious problems that keep it from being accepted.
- See comments in the document for details.
-
- Dead: The proposal hasn't been touched in a long time, and it doesn't look
- like anybody is going to complete it soon. It can become "Open" again
- if it gets a new proponent.
-
- Needs-Research: There are research problems that need to be solved before
- it's clear whether the proposal is a good idea.
-
- Meta: This is not a proposal, but a document about proposals.
-
-
- The editor maintains the correct status of proposals, based on rough
- consensus and his own discretion.
-
-Proposal numbering:
-
- Numbers 000-099 are reserved for special and meta-proposals. 100 and up
- are used for actual proposals. Numbers aren't recycled.
diff --git a/doc/spec/proposals/098-todo.txt b/doc/spec/proposals/098-todo.txt
deleted file mode 100644
index a0bbbeb56..000000000
--- a/doc/spec/proposals/098-todo.txt
+++ /dev/null
@@ -1,107 +0,0 @@
-Filename: 098-todo.txt
-Title: Proposals that should be written
-Author: Nick Mathewson, Roger Dingledine
-Created: 26-Jan-2007
-Status: Meta
-
-Overview:
-
- This document lists ideas that various people have had for improving the
- Tor protocol. These should be implemented and specified if they're
- trivial, or written up as proposals if they're not.
-
- This is an active document, to be edited as proposals are written and as
- we come up with new ideas for proposals. We should take stuff out as it
- seems irrelevant.
-
-
-For some later protocol version.
-
- - It would be great to get smarter about identity and linkability.
- It's not crazy to say, "Never use the same circuit for my SSH
- connections and my web browsing." How far can/should we take this?
- See ideas/xxx-separate-streams-by-port.txt for a start.
-
- - Fix onionskin handshake scheme to be more mainstream, less nutty.
- Can we just do
- E(HMAC(g^x), g^x) rather than just E(g^x) ?
- No, that has the same flaws as before. We should send
- E(g^x, C) with random C and expect g^y, HMAC_C(K=g^xy).
- Better ask Ian; probably Stephen too.
-
- - Length on CREATE and friends
-
- - Versioning on circuits and create cells, so we have a clear path
- to improve the circuit protocol.
-
- - SHA1 is showing its age. We should get a design for upgrading our
- hash once the AHS competition is done, or even sooner.
-
- - Not being able to upgrade ciphersuites or increase key lengths is
- lame.
- - Paul has some ideas about circuit creation; read his PET paper once it's
- out.
-
-Any time:
-
- - Some ideas for revising the directory protocol:
- - Extend the "r" line in network-status to give a set of buckets (say,
- comma-separated) for that router.
- - Buckets are deterministic based on IP address.
- - Then clients can choose a bucket (or set of buckets) to
- download and use.
- - We need a way for the authorities to declare that nodes are in a
- family. Also, it kinda sucks that family declarations use O(N^2) space
- in the descriptors.
- - REASON_CONNECTFAILED should include an IP.
- - Spec should incorporate some prose from tor-design to be more readable.
- - Spec when we should rotate which keys
- - Spec how to publish descriptors less often
- - Describe pros and cons of non-deterministic path lengths
-
- - We should use a variable-length path length by default -- 3 +/- some
- distribution. Need to think harder about allowing values less than 3,
- and there's a tradeoff between having a wide variance and performance.
-
- - Clients currently use certs during TLS. Is this wise? It does make it
- easier for servers to tell which NATted client is which. We could use a
- separate set of certs for each guard, I suppose, but generating so many
- certs could get expensive. Omitting them entirely would make OP->OR
- easier to tell from OR->OR.
-
-Things that should change...
-
-B.1. ... but which will require backward-incompatible change
-
- - Circuit IDs should be longer.
- . IPv6 everywhere.
- - Maybe, keys should be longer.
- - Maybe, key-length should be adjustable. How to do this without
- making anonymity suck?
- - Drop backward compatibility.
- - We should use a 128-bit subgroup of our DH prime.
- - Handshake should use HMAC.
- - Multiple cell lengths.
- - Ability to split circuits across paths (If this is useful.)
- - SENDME windows should be dynamic.
-
- - Directory
- - Stop ever mentioning socks ports
-
-B.2. ... and that will require no changes
-
- - Advertised outbound IP?
- - Migrate streams across circuits.
- - Fix bug 469 by limiting the number of simultaneous connections per IP.
-
-B.3. ... and that we have no idea how to do.
-
- - UDP (as transport)
- - UDP (as content)
- - Use a better AES mode that has built-in integrity checking,
- doesn't grow with the number of hops, is not patented, and
- is implemented and maintained by smart people.
-
-Let onion keys be not just RSA but maybe DH too, for Paul's reply onion
-design.
-
diff --git a/doc/spec/proposals/099-misc.txt b/doc/spec/proposals/099-misc.txt
deleted file mode 100644
index a3621dd25..000000000
--- a/doc/spec/proposals/099-misc.txt
+++ /dev/null
@@ -1,28 +0,0 @@
-Filename: 099-misc.txt
-Title: Miscellaneous proposals
-Author: Various
-Created: 26-Jan-2007
-Status: Meta
-
-Overview:
-
- This document is for small proposal ideas that are about one paragraph in
- length. From here, ideas can be rejected outright, expanded into full
- proposals, or specified and implemented as-is.
-
-Proposals
-
-1. Directory compression.
-
- Gzip would be easier to work with than zlib; bzip2 would result in smaller
- data lengths. [Concretely, we're looking at about 10-15% space savings at
- the expense of 3-5x longer compression time for using bzip2.] Doing
- on-the-fly gzip requires zlib 1.2 or later; doing bzip2 requires bzlib.
- Pre-compressing status documents in multiple formats would force us to use
- more memory to hold them.
-
- Status: Open
-
- -- Nick Mathewson
-
-
diff --git a/doc/spec/proposals/100-tor-spec-udp.txt b/doc/spec/proposals/100-tor-spec-udp.txt
deleted file mode 100644
index 7f062222c..000000000
--- a/doc/spec/proposals/100-tor-spec-udp.txt
+++ /dev/null
@@ -1,422 +0,0 @@
-Filename: 100-tor-spec-udp.txt
-Title: Tor Unreliable Datagram Extension Proposal
-Author: Marc Liberatore
-Created: 23 Feb 2006
-Status: Dead
-
-Overview:
-
- This is a modified version of the Tor specification written by Marc
- Liberatore to add UDP support to Tor. For each TLS link, it adds a
- corresponding DTLS link: control messages and TCP data flow over TLS, and
- UDP data flows over DTLS.
-
- This proposal is not likely to be accepted as-is; see comments at the end
- of the document.
-
-
-Contents
-
-0. Introduction
-
- Tor is a distributed overlay network designed to anonymize low-latency
- TCP-based applications. The current tor specification supports only
- TCP-based traffic. This limitation prevents the use of tor to anonymize
- other important applications, notably voice over IP software. This document
- is a proposal to extend the tor specification to support UDP traffic.
-
- The basic design philosophy of this extension is to add support for
- tunneling unreliable datagrams through tor with as few modifications to the
- protocol as possible. As currently specified, tor cannot directly support
- such tunneling, as connections between nodes are built using transport layer
- security (TLS) atop TCP. The latency incurred by TCP is likely unacceptable
- to the operation of most UDP-based application level protocols.
-
- Thus, we propose the addition of links between nodes using datagram
- transport layer security (DTLS). These links allow packets to traverse a
- route through tor quickly, but their unreliable nature requires minor
- changes to the tor protocol. This proposal outlines the necessary
- additions and changes to the tor specification to support UDP traffic.
-
- We note that a separate set of DTLS links between nodes creates a second
- overlay, distinct from that composed of TLS links. This separation and
- resulting decrease in each anonymity set's size will make certain attacks
- easier. However, it is our belief that VoIP support in tor will
- dramatically increase its appeal, and correspondingly, the size of its user
- base, number of deployed nodes, and total traffic relayed. These increases
- should help offset the loss of anonymity that two distinct networks imply.
-
-1. Overview of Tor-UDP and its complications
-
- As described above, this proposal extends the Tor specification to support
- UDP with as few changes as possible. Tor's overlay network is managed
- through TLS based connections; we will re-use this control plane to set up
- and tear down circuits that relay UDP traffic. These circuits will be built atop
- DTLS, in a fashion analogous to how Tor currently sends TCP traffic over
- TLS.
-
- The unreliability of DTLS circuits creates problems for Tor at two levels:
-
- 1. Tor's encryption of the relay layer does not allow independent
- decryption of individual records. If record N is not received, then
- record N+1 will not decrypt correctly, as the counter for AES/CTR is
- maintained implicitly.
-
- 2. Tor's end-to-end integrity checking works under the assumption that
- all RELAY cells are delivered. This assumption is invalid when cells
- are sent over DTLS.
-
- The fix for the first problem is straightforward: add an explicit sequence
- number to each cell. To fix the second problem, we introduce a
- system of nonces and hashes to RELAY packets.
-
- In the following sections, we mirror the layout of the Tor Protocol
- Specification, presenting the necessary modifications to the Tor protocol as
- a series of deltas.
-
-2. Connections
-
- Tor-UDP uses DTLS for encryption of some links. All DTLS links must have
- corresponding TLS links, as all control messages are sent over TLS. All
- implementations MUST support the DTLS ciphersuite "[TODO]".
-
- DTLS connections are formed using the same protocol as TLS connections.
- This occurs upon request, following a CREATE_UDP or CREATE_FAST_UDP cell,
- as detailed in section 4.6.
-
- Once a paired TLS/DTLS connection is established, the two sides send cells
- to one another. All but two types of cells are sent over TLS links. RELAY
- cells containing the commands RELAY_DATA_UDP and RELAY_DROP_UDP, specified
- below, are sent over DTLS links. [Should all cells still be 512 bytes long?
- Perhaps upon completion of a preliminary implementation, we should do a
- performance evaluation for some class of UDP traffic, such as VoIP. - ML]
- Cells may be sent embedded in TLS or DTLS records of any size or divided
- across such records. The framing of these records MUST NOT leak any more
- information than the above differentiation on the basis of cell type. [I am
- uncomfortable with this leakage, but don't see any simple, elegant way
- around it. -ML]
-
- As with TLS connections, DTLS connections are not permanent.
-
-3. Cell format
-
- Each cell contains the following fields:
-
- CircID [2 bytes]
- Command [1 byte]
- Sequence Number [2 bytes]
- Payload (padded with 0 bytes) [507 bytes]
- [Total size: 512 bytes]
-
- The 'Command' field holds one of the following values:
- 0 -- PADDING (Padding) (See Sec 6.2)
- 1 -- CREATE (Create a circuit) (See Sec 4)
- 2 -- CREATED (Acknowledge create) (See Sec 4)
- 3 -- RELAY (End-to-end data) (See Sec 5)
- 4 -- DESTROY (Stop using a circuit) (See Sec 4)
- 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 4)
- 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 4)
- 7 -- CREATE_UDP (Create a UDP circuit) (See Sec 4)
- 8 -- CREATED_UDP (Acknowledge UDP create) (See Sec 4)
- 9 -- CREATE_FAST_UDP (Create a UDP circuit, no PK) (See Sec 4)
- 10 -- CREATED_FAST_UDP (UDP circuit created, no PK) (See Sec 4)
-
- The sequence number allows for AES/CTR decryption of RELAY cells
- independently of one another; this functionality is required to support
- cells sent over DTLS. The sequence number is described in more detail in
- section 4.5.
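
The cell layout above can be sketched in a few lines of Python. This is only an illustration: the helper names `pack_cell` and `unpack_cell` are hypothetical, and network (big-endian) byte order for the multi-byte fields is an assumption, since the text does not state it.

```python
import struct

CELL_LEN = 512       # total cell size given in the field list
PAYLOAD_LEN = 507    # payload size once the 2-byte sequence number is added
CMD_RELAY = 3        # command value from the table

def pack_cell(circ_id, command, seq, payload):
    """Build one fixed-size cell: CircID, Command, Sequence Number, Payload."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload does not fit in one cell")
    # "!HBH" = 2-byte CircID, 1-byte Command, 2-byte Sequence Number,
    # big-endian with no alignment padding (byte order is an assumption).
    header = struct.pack("!HBH", circ_id, command, seq)
    # The payload is padded with 0 bytes, as the field list specifies.
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")

def unpack_cell(cell):
    """Split a cell back into (circ_id, command, seq, padded_payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cells are exactly 512 bytes long")
    circ_id, command, seq = struct.unpack("!HBH", cell[:5])
    return circ_id, command, seq, cell[5:]
```

Because the sequence number travels in every cell, a receiver can read it from the header before it touches the payload.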
-
- [Should the sequence number only appear in RELAY packets? The overhead is
- small, and I'm hesitant to force more code paths on the implementor. -ML]
- [There's already a separate relay header that has other material in it,
- so it wouldn't be the end of the world to move it there if it's
- appropriate. -RD]
-
- [Having separate commands for UDP circuits seems necessary, unless we can
- assume a flag day event for a large number of tor nodes. -ML]
-
-4. Circuit management
-
-4.2. Setting circuit keys
-
- Keys are set up for UDP circuits in the same fashion as for TCP circuits.
- Each UDP circuit shares keys with its corresponding TCP circuit.
-
- [If the keys are used for both TCP and UDP connections, how does it
- work to mix sequence-number-less cells with sequence-numbered cells --
- how do you know you have the encryption order right? -RD]
-
-4.3. Creating circuits
-
- UDP circuits are created in the same way as TCP circuits, using the
- *_UDP cells as appropriate.
-
-4.4. Tearing down circuits
-
- UDP circuits are torn down in the same way as TCP circuits, using the
- *_UDP cells as appropriate.
-
-4.5. Routing relay cells
-
- When an OR receives a RELAY cell, it checks the cell's circID and
- determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the RELAY cell.
-
- Otherwise, if the OR is not at the OP edge of the circuit (that is,
- either an 'exit node' or a non-edge node), it de/encrypts the payload
- with AES/CTR, as follows:
- 'Forward' relay cell (same direction as CREATE):
- Use Kf as key; decrypt, using sequence number to synchronize
- ciphertext and keystream.
- 'Back' relay cell (opposite direction from CREATE):
- Use Kb as key; encrypt, using sequence number to synchronize
- ciphertext and keystream.
- Note that in counter mode, decrypt and encrypt are the same operation.
- [Since the sequence number is only 2 bytes, what do you do when it
- rolls over? -RD]
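
The idea of synchronizing the keystream from the sequence number can be sketched as follows. Two things here are assumptions, not part of the proposal: the mapping from sequence number to counter block (`seq * BLOCKS_PER_CELL`), and the use of SHA-256 as a stand-in block function so that the example needs only the standard library. A real implementation would use AES. The sketch also ignores the rollover question raised above.

```python
import hashlib

BLOCK = 16                                    # AES block size in bytes
PAYLOAD_LEN = 507                             # cell payload size from section 3
BLOCKS_PER_CELL = -(-PAYLOAD_LEN // BLOCK)    # ceiling division: 32 blocks

def keystream(key, seq, length):
    """Keystream for the cell with sequence number `seq` (stand-in PRF)."""
    out = b""
    first_block = seq * BLOCKS_PER_CELL       # assumed counter derivation
    i = 0
    while len(out) < length:
        counter = (first_block + i).to_bytes(16, "big")
        # Stand-in for AES_encrypt(key, counter); NOT the real cipher.
        out += hashlib.sha256(key + counter).digest()[:BLOCK]
        i += 1
    return out[:length]

def crypt(key, seq, data):
    """Counter mode: encryption and decryption are the same XOR."""
    ks = keystream(key, seq, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))
```

With an explicit counter, a cell with sequence number N+1 decrypts correctly even if cell N was never received, which is the property the implicit counter lacks.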
-
- Each stream encrypted by a Kf or Kb has a corresponding unique state,
- captured by a sequence number; the originator of each such stream chooses
- the initial sequence number randomly, and increments it only with RELAY
- cells. [This counts cells; unlike, say, TCP, tor uses fixed-size cells, so
- there's no need for counting bytes directly. Right? - ML]
- [I believe this is true. You'll find out for sure when you try to
- build it. ;) -RD]
-
- The OR then decides whether it recognizes the relay cell, by
- inspecting the payload as described in section 5.1 below. If the OR
- recognizes the cell, it processes the contents of the relay cell.
- Otherwise, it passes the decrypted relay cell along the circuit if
- the circuit continues. If the OR at the end of the circuit
- encounters an unrecognized relay cell, an error has occurred: the OR
- sends a DESTROY cell to tear down the circuit.
-
- When a relay cell arrives at an OP, the OP decrypts the payload
- with AES/CTR as follows:
- OP receives data cell:
- For I=N...1,
- Decrypt with Kb_I, using the sequence number as above. If the
- payload is recognized (see section 5.1), then stop and process
- the payload.
-
- For more information, see section 5 below.
-
-4.6. CREATE_UDP and CREATED_UDP cells
-
- Users set up UDP circuits incrementally. The procedure is similar to that
- for TCP circuits, as described in section 4.1. In addition to the TLS
- connection to the first node, the OP also attempts to open a DTLS
- connection. If this succeeds, the OP sends a CREATE_UDP cell, with a
- payload in the same format as a CREATE cell. To extend a UDP circuit past
- the first hop, the OP sends an EXTEND_UDP relay cell (see section 5) which
- instructs the last node in the circuit to send a CREATE_UDP cell to extend
- the circuit.
-
- The relay payload for an EXTEND_UDP relay cell consists of:
- Address [4 bytes]
- TCP port [2 bytes]
- UDP port [2 bytes]
- Onion skin [186 bytes]
- Identity fingerprint [20 bytes]
-
- The address field and ports denote the IPv4 address and ports of the next OR
- in the circuit.
-
- The payload for a CREATED_UDP cell or the relay payload for a
- RELAY_EXTENDED_UDP cell is identical to that of the corresponding CREATED or
- RELAY_EXTENDED cell. Both circuits are established using the same key.
-
- Note that the existence of a UDP circuit implies the
- existence of a corresponding TCP circuit, sharing keys, sequence numbers,
- and any other relevant state.
-
-4.6.1 CREATE_FAST_UDP/CREATED_FAST_UDP cells
-
- As above, the OP must successfully connect using DTLS before attempting to
- send a CREATE_FAST_UDP cell. Otherwise, the procedure is the same as in
- section 4.1.1.
-
-5. Application connections and stream management
-
-5.1. Relay cells
-
- Within a circuit, the OP and the exit node use the contents of RELAY cells
- to tunnel end-to-end commands, TCP connections ("Streams"), and UDP packets
- across circuits. End-to-end commands and UDP packets can be initiated by
- either edge; streams are initiated by the OP.
-
- The payload of each unencrypted RELAY cell consists of:
- Relay command [1 byte]
- 'Recognized' [2 bytes]
- StreamID [2 bytes]
- Digest [4 bytes]
- Length [2 bytes]
- Data [498 bytes]
-
- The relay commands are:
- 1 -- RELAY_BEGIN [forward]
- 2 -- RELAY_DATA [forward or backward]
- 3 -- RELAY_END [forward or backward]
- 4 -- RELAY_CONNECTED [backward]
- 5 -- RELAY_SENDME [forward or backward]
- 6 -- RELAY_EXTEND [forward]
- 7 -- RELAY_EXTENDED [backward]
- 8 -- RELAY_TRUNCATE [forward]
- 9 -- RELAY_TRUNCATED [backward]
- 10 -- RELAY_DROP [forward or backward]
- 11 -- RELAY_RESOLVE [forward]
- 12 -- RELAY_RESOLVED [backward]
- 13 -- RELAY_BEGIN_UDP [forward]
- 14 -- RELAY_DATA_UDP [forward or backward]
- 15 -- RELAY_EXTEND_UDP [forward]
- 16 -- RELAY_EXTENDED_UDP [backward]
- 17 -- RELAY_DROP_UDP [forward or backward]
-
- Commands labelled as "forward" must only be sent by the originator
- of the circuit. Commands labelled as "backward" must only be sent by
- other nodes in the circuit back to the originator. Commands marked
- as either can be sent either by the originator or other nodes.
-
- The 'recognized' field in any unencrypted relay payload is always set to
- zero.
-
- The 'digest' field can have two meanings. For all cells sent over TLS
- connections (that is, all commands and all non-UDP RELAY data), it is
- computed as the first four bytes of the running SHA-1 digest of all the
- bytes that have been sent reliably and have been destined for this hop of
- the circuit or originated from this hop of the circuit, seeded from Df or Db
- respectively (obtained in section 4.2 above), and including this RELAY
- cell's entire payload (taken with the digest field set to zero). Cells sent
- over DTLS connections do not affect this running digest. Each cell sent
- over DTLS (that is, RELAY_DATA_UDP and RELAY_DROP_UDP) has the digest field
- set to the SHA-1 digest of the current RELAY cell's entire payload, with the
- digest field set to zero. Coupled with a randomly-chosen streamID, this
- provides per-cell integrity checking on UDP cells.
- [If you drop malformed UDP relay cells but don't close the circuit,
- then this 8 bytes of digest is not as strong as what we get in the
- TCP-circuit side. Is this a problem? -RD]
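
The per-cell digest for UDP relay cells might be computed as in the sketch below. It uses the field widths exactly as listed in section 5.1. Truncating the SHA-1 output to its first four bytes is an assumption made so that it fits the 4-byte digest field. The function names and the big-endian byte order are likewise hypothetical.

```python
import hashlib
import struct

RELAY_DATA_UDP = 14   # relay command value from the table above
DATA_LEN = 498        # data field width as listed in section 5.1

def build_udp_relay_payload(stream_id, data):
    """Build an unencrypted RELAY_DATA_UDP payload with a per-cell digest."""
    if stream_id == 0:
        raise ValueError("UDP tunnels may not use streamID zero")
    if len(data) > DATA_LEN:
        raise ValueError("data does not fit in one relay cell")

    def encode(digest):
        # command, 'recognized' (always zero), streamID, digest, length, data
        head = struct.pack("!BHH4sH", RELAY_DATA_UDP, 0, stream_id,
                           digest, len(data))
        return head + data.ljust(DATA_LEN, b"\x00")

    # Hash the whole payload with the digest field zeroed, then fill it in.
    zeroed = encode(b"\x00" * 4)
    return encode(hashlib.sha1(zeroed).digest()[:4])

def check_udp_relay_payload(payload):
    """Recompute the digest over the payload with bytes 5..8 zeroed."""
    zeroed = payload[:5] + b"\x00" * 4 + payload[9:]
    return hashlib.sha1(zeroed).digest()[:4] == payload[5:9]
```

Unlike the running digest used on TLS links, this check depends on nothing but the cell itself, so a lost cell does not affect the verification of later ones.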
-
- When the 'recognized' field of a RELAY cell is zero, and the digest
- is correct, the cell is considered "recognized" for the purposes of
- decryption (see section 4.5 above).
-
- (The digest does not include any bytes from relay cells that do
- not start or end at this hop of the circuit. That is, it does not
- include forwarded data. Therefore if 'recognized' is zero but the
- digest does not match, the running digest at that node should
- not be updated, and the cell should be forwarded on.)
-
- All RELAY cells pertaining to the same tunneled TCP stream have the
- same streamID. Such streamIDs are chosen arbitrarily by the OP. RELAY
- cells that affect the entire circuit rather than a particular
- stream use a StreamID of zero.
-
- All RELAY cells pertaining to the same UDP tunnel have the same streamID.
- This streamID is chosen randomly by the OP, but cannot be zero.
-
- The 'Length' field of a relay cell contains the number of bytes in
- the relay payload which contain real payload data. The remainder of
- the payload is padded with NUL bytes.
-
- If the RELAY cell is recognized but the relay command is not
- understood, the cell must be dropped and ignored. Its contents
- still count with respect to the digests, though. [Before
- 0.1.1.10, Tor closed circuits when it received an unknown relay
- command. Perhaps this will be more forward-compatible. -RD]
-
-5.2.1. Opening UDP tunnels and transferring data
-
- To open a new anonymized UDP connection, the OP chooses an open
- circuit to an exit that may be able to connect to the destination
- address, selects a random streamID not yet used on that circuit,
- and constructs a RELAY_BEGIN_UDP cell with a payload encoding the address
- and port of the destination host. The payload format is:
-
- ADDRESS | ':' | PORT | [00]
-
- where ADDRESS can be a DNS hostname, or an IPv4 address in
- dotted-quad format, or an IPv6 address surrounded by square brackets;
- and where PORT is encoded in decimal.
-
- [What is the [00] for? -NM]
- [It's so the payload is easy to parse out with string funcs -RD]
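
A minimal sketch of encoding and parsing the `ADDRESS | ':' | PORT | [00]` payload follows. The function names are hypothetical, and adding the square brackets around a bare IPv6 address is a convenience of this example and not something the text requires of callers.

```python
def encode_begin_udp(address, port):
    """Encode ADDRESS ':' PORT followed by a single NUL byte."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # IPv6 addresses must be surrounded by square brackets.
    if ":" in address and not address.startswith("["):
        address = "[" + address + "]"
    return f"{address}:{port}".encode("ascii") + b"\x00"

def decode_begin_udp(payload):
    """Parse the payload back into (address, port)."""
    text, nul, _ = payload.partition(b"\x00")
    if not nul:
        raise ValueError("missing NUL terminator")
    # Split on the last ':' so bracketed IPv6 addresses stay intact.
    address, colon, port = text.decode("ascii").rpartition(":")
    if not colon or not port.isdigit():
        raise ValueError("malformed ADDRESS:PORT")
    return address, int(port)
```

The trailing NUL lets the receiver treat the start of the padded relay payload as a terminated string.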
-
- Upon receiving this cell, the exit node resolves the address as necessary.
- If the address cannot be resolved, the exit node replies with a RELAY_END
- cell. (See 5.4 below.) Otherwise, the exit node replies with a
- RELAY_CONNECTED cell, whose payload is in one of the following formats:
- The IPv4 address to which the connection was made [4 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- or
- Four zero-valued octets [4 octets]
- An address type (6) [1 octet]
- The IPv6 address to which the connection was made [16 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- [XXXX Versions of Tor before 0.1.1.6 ignore and do not generate the TTL
- field. No version of Tor currently generates the IPv6 format.]
-
- The OP waits for a RELAY_CONNECTED cell before sending any data.
- Once a connection has been established, the OP and exit node
- package UDP data in RELAY_DATA_UDP cells, and upon receiving such
- cells, echo their contents to the corresponding socket.
- RELAY_DATA_UDP cells sent to unrecognized streams are dropped.
-
- RELAY_DROP_UDP cells are long-range dummies; upon receiving such
- a cell, the OR or OP must drop it.
-
-5.3. Closing streams
-
- UDP tunnels are closed in a fashion corresponding to TCP connections.
-
-6. Flow Control
-
- UDP streams are not subject to flow control.
-
-7.2. Router descriptor format.
-
-The items' formats are as follows:
- "router" nickname address ORPort SocksPort DirPort UDPPort
-
- Indicates the beginning of a router descriptor. "address" must be
- an IPv4 address in dotted-quad format. The last four numbers
- indicate the ports at which this OR exposes
- functionality. ORPort is a port at which this OR accepts TLS
- connections for the main OR protocol; SocksPort is deprecated and
- should always be 0; DirPort is the port at which this OR accepts
- directory-related HTTP connections; and UDPPort is a port at which
- this OR accepts DTLS connections for UDP data. If any port is not
- supported, the value 0 is given instead of a port number.
-
-Other sections:
-
-What changes need to happen to each node's exit policy to support this? -RD
-
-Switching to UDP means managing the queues of incoming packets better,
-so we don't miss packets. How does this interact with doing large public
-key operations (handshakes) in the same thread? -RD
-
-========================================================================
-COMMENTS
-========================================================================
-
-[16 May 2006]
-
-I don't favor this approach; it makes packet traffic partitioned from
-stream traffic end-to-end. The architecture I'd like to see is:
-
- A *All* Tor-to-Tor traffic is UDP/DTLS, unless we need to fall back on
- TCP/TLS for firewall penetration or something. (This also gives us an
- upgrade path for routing through legacy servers.)
-
- B Stream traffic is handled with end-to-end per-stream acks/naks and
- retries. On failure, the data is retransmitted in a new RELAY_DATA cell;
- a cell isn't retransmitted.
-
-We'll need to do A anyway, to fix our behavior on packet-loss. Once we've
-done so, B is more or less inevitable, and we can support end-to-end UDP
-traffic "for free".
-
-(Also, there are some details that this draft spec doesn't address. For
-example, what happens when a UDP packet doesn't fit in a single cell?)
-
--NM
diff --git a/doc/spec/proposals/101-dir-voting.txt b/doc/spec/proposals/101-dir-voting.txt
deleted file mode 100644
index 634d3f194..000000000
--- a/doc/spec/proposals/101-dir-voting.txt
+++ /dev/null
@@ -1,283 +0,0 @@
-Filename: 101-dir-voting.txt
-Title: Voting on the Tor Directory System
-Author: Nick Mathewson
-Created: Nov 2006
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview
-
- This document describes a consensus voting scheme for Tor directories;
- instead of publishing different network statuses, directories would vote on
- and publish a single "consensus" network status document.
-
- This is an open proposal.
-
-Proposal:
-
-0. Scope and preliminaries
-
- This document describes a consensus voting scheme for Tor directories.
- Once it's accepted, it should be merged with dir-spec.txt. Some
- preliminaries for authority and caching support should be done during
- the 0.1.2.x series; the main deployment should come during the 0.2.0.x
- series.
-
-0.1. Goals and motivation: voting.
-
- The current directory system relies on clients downloading separate
- network status statements from the caches signed by each directory.
- Clients download a new statement every 30 minutes or so, choosing to
- replace the oldest statement they currently have.
-
- This creates a partitioning problem: different clients have different
- "most recent" networkstatus sources, and different versions of each
- (since authorities change their statements often).
-
- It also creates a scaling problem: most of the downloaded networkstatus
- are probably quite similar, and the redundancy grows as we add more
- authorities.
-
- So if we have clients only download a single multiply signed consensus
- network status statement, we can:
- - Save bandwidth.
- - Reduce client partitioning
- - Reduce client-side and cache-side storage
- - Simplify client-side voting code (by moving voting away from the
- client)
-
- We should try to do this without:
- - Assuming that client-side or cache-side clocks are more correct
- than we assume now.
- - Assuming that authority clocks are perfectly correct.
- - Degrading badly if a few authorities die or are offline for a bit.
-
- We do not have to perform well if:
- - No clique of more than half the authorities can agree about who
- the authorities are.
-
-1. The idea.
-
- Instead of publishing a network status whenever something changes,
- each authority instead publishes a fresh network status only once per
- "period" (say, 60 minutes). Authorities either upload this network
- status (or "vote") to every other authority, or download every other
- authority's "vote" (see 3.1 below for discussion on push vs pull).
-
- After an authority has (or has become convinced that it won't be able to
- get) every other authority's vote, it deterministically computes a
- consensus networkstatus, and signs it. Authorities download (or are
- uploaded; see 3.1) one another's signatures, and form a multiply signed
- consensus. This multiply-signed consensus is what caches cache and what
- clients download.
-
- If an authority is down, authorities vote based on what they *can*
- download/get uploaded.
-
- If an authority is "a little" down and only some authorities can reach
- it, authorities try to get its info from other authorities.
-
- If an authority computes the vote wrong, its signature isn't included on
- the consensus.
-
- Clients use a consensus if it is "trusted": signed by more than half the
- authorities they recognize. If clients can't find any such consensus,
- they use the most recent trusted consensus they have. If they don't
- have any trusted consensus, they warn the user and refuse to operate
- (and if DirServers is not the default, beg the user to adapt the list
- of authorities).
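
The "trusted" test reduces to a strict-majority check over the authorities the client recognizes. In this sketch, `is_trusted` is a hypothetical helper and authorities are identified by arbitrary strings. It assumes that the signatures have already been verified.

```python
def is_trusted(signers, recognized):
    """True if more than half of the recognized authorities signed."""
    recognized = set(recognized)
    # Signatures from authorities the client does not recognize do not count.
    valid = set(signers) & recognized
    return 2 * len(valid) > len(recognized)
```

With four recognized authorities, two valid signatures are not enough and three are, because the rule is "more than half" and not "at least half".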
-
-2. Details.
-
-2.0. Versioning
-
- All documents generated here have version "3" given in their
- network-status-version entries.
-
-2.1. Vote specifications
-
- Votes in v3 are similar to v2 network status documents. We add these
- fields to the preamble:
-
- "vote-status" -- the word "vote".
-
- "valid-until" -- the time when this authority expects to publish its
- next vote.
-
- "known-flags" -- a space-separated list of flags that will sometimes
- be included on "s" lines later in the vote.
-
- "dir-source" -- as before, except the "hostname" part MUST be the
- authority's nickname, which MUST be unique among authorities, and
- MUST match the nickname in the "directory-signature" entry.
-
- Authorities SHOULD cache their most recently generated votes so they
- can persist them across restarts. Authorities SHOULD NOT generate
- another document until valid-until has passed.
-
- Router entries in the vote MUST be sorted in ascending order by router
- identity digest. The flags in "s" lines MUST appear in alphabetical
- order.
-
- Votes SHOULD be synchronized to half-hour publication intervals (one
- hour? XXX say more; be more precise.)
-
- XXXX some way to request older networkstatus docs?
-
-2.2. Consensus directory specifications
-
- Consensuses are like v3 votes, except for the following fields:
-
- "vote-status" -- the word "consensus".
-
- "published" is the latest of all the published times on the votes.
-
- "valid-until" is the earliest of all the valid-until times on the
- votes.
-
- "dir-source" and "fingerprint" and "dir-signing-key" and "contact"
- are included for each authority that contributed to the vote.
-
- "vote-digest" for each authority that contributed to the vote,
- calculated as for the digest in the signature on the vote. [XXX
- re-English this sentence]
-
- "client-versions" and "server-versions" are sorted in ascending
- order based on version-spec.txt.
-
- "dir-options" and "known-flags" are not included.
-[XXX really? why not list the ones that are used in the consensus?
-For example, right now BadExit is in use, but no servers would be
-labelled BadExit, and it's still worth knowing that it was considered
-by the authorities. -RD]
-
- The fields MUST occur in the following order:
- "network-status-version"
- "vote-status"
- "published"
- "valid-until"
- For each authority, sorted in ascending order of nickname, case-
- insensitively:
- "dir-source", "fingerprint", "contact", "dir-signing-key",
- "vote-digest".
- "client-versions"
- "server-versions"
-
- The signatures at the end of the document appear as multiple instances
- of directory-signature, sorted in ascending order by nickname,
- case-insensitively.
-
- A router entry should be included in the result if it is included by more
- than half of the authorities (total authorities, not just those whose votes
- we have). A router entry has a flag set if it is included by more than
- half of the authorities who care about that flag. [XXXX this creates an
- incentive for attackers to DOS authorities whose votes they don't like.
- Can we remember what flags people set the last time we saw them? -NM]
- [Which 'we' are we talking here? The end-users never learn which
- authority sets which flags. So you're thinking the authorities
- should record the last vote they saw from each authority and if it's
- within a week or so, count all the flags that it advertised as 'no'
- votes? Plausible. -RD]
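
The inclusion and flag rules above can be sketched as follows. The vote representation (a dict with `known_flags` and `routers`) is invented for illustration. The sketch reads "authorities who care about that flag" as those that list the flag in their "known-flags" line, which is an assumption about the text, not something it states outright.

```python
def compute_consensus(votes, total_authorities):
    """Return {router_id: sorted flag list} for routers that make the cut.

    votes: {authority: {"known_flags": set, "routers": {router_id: set}}}
    total_authorities: count of all authorities, not just those that voted.
    """
    listed = {}
    for vote in votes.values():
        for router_id in vote["routers"]:
            listed[router_id] = listed.get(router_id, 0) + 1

    all_flags = set().union(*(v["known_flags"] for v in votes.values()))
    result = {}
    for router_id in sorted(listed):
        # Must be listed by more than half of *all* authorities.
        if 2 * listed[router_id] <= total_authorities:
            continue
        flags = []
        for flag in sorted(all_flags):
            # Only authorities that know the flag get a say on it.
            carers = [v for v in votes.values() if flag in v["known_flags"]]
            yes = sum(1 for v in carers
                      if flag in v["routers"].get(router_id, ()))
            if 2 * yes > len(carers):
                flags.append(flag)
        result[router_id] = flags
    return result
```

Note the asymmetry: the inclusion threshold uses the total number of authorities, while the flag threshold uses only the authorities that voted and know the flag. The second denominator is the one the bracketed comment above worries about.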
-
- The signature hash covers from the "network-status-version" line through
- the characters "directory-signature" in the first "directory-signature"
- line.
-
- Consensus directories SHOULD be rejected if they are not signed by more
- than half of the known authorities.
-
-2.2.1. Detached signatures
-
- Assuming full connectivity, every authority should compute and sign the
- same consensus directory in each period. Therefore, it isn't necessary to
- download the consensus computed by each authority; instead, the authorities
- only push/fetch each other's signatures. A "detached signature" document
- contains a single "consensus-digest" entry and one or more
- directory-signature entries. [XXXX specify more.]
-
-2.3. URLs and timelines
-
-2.3.1. URLs and timeline used for agreement
-
- An authority SHOULD publish its vote immediately at the start of each voting
- period. It does this by making it available at
- http://<hostname>/tor/status-vote/current/authority.z
- and sending it in an HTTP POST request to each other authority at the URL
- http://<hostname>/tor/post/vote
-
- If, N minutes after the voting period has begun, an authority does not have
- a current statement from another authority, the first authority retrieves
- the other's statement.
-
- Once an authority has a vote from another authority, it makes it available
- at
- http://<hostname>/tor/status-vote/current/<fp>.z
- where <fp> is the fingerprint of the other authority's identity key.
-
- The consensus network status, along with as many signatures as the server
- currently knows, should be available at
- http://<hostname>/tor/status-vote/current/consensus.z
- All of the detached signatures it knows for consensus status should be
- available at:
- http://<hostname>/tor/status-vote/current/consensus-signatures.z
-
- Once an authority has computed and signed a consensus network status, it
- should send its detached signature to each other authority in an HTTP POST
- request to the URL:
- http://<hostname>/tor/post/consensus-signature
-
-
- [XXXX Store votes to disk.]
-
-2.3.2. Serving a consensus directory
-
- Once the authority is done getting signatures on the consensus directory,
- it should serve it from:
- http://<hostname>/tor/status/consensus.z
-
- Caches SHOULD download consensus directories from an authority and serve
- them from the same URL.
-
-2.3.3. Timeline and synchronization
-
- [XXXX]
-
-2.4. Distributing routerdescs between authorities
-
- Consensus will be more meaningful if authorities take steps to make sure
- that they all have the same set of descriptors _before_ the voting
- starts. This is safe, since all descriptors are self-certified and
- timestamped: it's always okay to replace a signed descriptor with a more
- recent one signed by the same identity.
-
- In the long run, we might want some kind of sophisticated process here.
- For now, since authorities already download one another's networkstatus
- documents and use them to determine what descriptors to download from one
- another, we can rely on this existing mechanism to keep authorities up to
- date.
-
- [We should do a thorough read-through of dir-spec again to make sure
- that the authorities converge on which descriptor to "prefer" for
- each router. Right now the decision happens at the client, which is
- no longer the right place for it. -RD]
-
-3. Questions and concerns
-
-3.1. Push or pull?
-
- The URLs above define a push mechanism for publishing votes and consensus
- signatures via HTTP POST requests, and a pull mechanism for downloading
- these documents via HTTP GET requests. As specified, every authority will
- post to every other. The "download if no copy has been received" mechanism
- exists only as a fallback.
-
-4. Migration
-
- * It would be cool if caches could get ready to download consensus
- status docs, verify enough signatures, and serve them now. That way
- once stuff works all we need to do is upgrade the authorities. Caches
- don't need to verify the correctness of the format so long as it's
- signed (or maybe multisigned?). We need to make sure that caches back
- off very quickly from downloading consensus docs until they're
- actually implemented.
-
diff --git a/doc/spec/proposals/102-drop-opt.txt b/doc/spec/proposals/102-drop-opt.txt
deleted file mode 100644
index 490376bb5..000000000
--- a/doc/spec/proposals/102-drop-opt.txt
+++ /dev/null
@@ -1,38 +0,0 @@
-Filename: 102-drop-opt.txt
-Title: Dropping "opt" from the directory format
-Author: Nick Mathewson
-Created: Jan 2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document proposes a change in the format used to transmit router and
- directory information.
-
- This proposal has been accepted, implemented, and merged into dir-spec.txt.
-
-Proposal:
-
- The "opt" keyword in Tor's directory formats was originally intended to
- mean, "it is okay to ignore this entry if you don't understand it"; the
- default behavior has been "discard a routerdesc if it contains entries you
- don't recognize."
-
- But so far, every new flag we have added has been marked 'opt'. It would
- probably make sense to change the default behavior to "ignore unrecognized
- fields", and add the statement that clients SHOULD ignore fields they don't
- recognize. As a meta-principle, we should say that clients and servers
- MUST NOT have to understand new fields in order to use directory documents
- correctly.
-
- Of course, this will make it impossible to say, "The format has changed a
- lot; discard this quietly if you don't understand it." We could do that by
- adding a version field.
-
-Status:
-
- * We stopped requiring it as of 0.1.2.5-alpha. We'll stop generating it
- once earlier formats are obsolete.
-
-
diff --git a/doc/spec/proposals/103-multilevel-keys.txt b/doc/spec/proposals/103-multilevel-keys.txt
deleted file mode 100644
index c8a7a6677..000000000
--- a/doc/spec/proposals/103-multilevel-keys.txt
+++ /dev/null
@@ -1,204 +0,0 @@
-Filename: 103-multilevel-keys.txt
-Title: Splitting identity key from regularly used signing key.
-Author: Nick Mathewson
-Created: Jan 2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document proposes a change in the way identity keys are used, so that
- highly sensitive keys can be password-protected and seldom loaded into RAM.
-
- It presents options; it is not yet a complete proposal.
-
-Proposal:
-
- Replacing a directory authority's identity key in the event of a compromise
- would be tremendously annoying. We'd need to tell every client to switch
- their configuration, or update to a new version with an uploaded list. So
- long as some weren't upgraded, they'd be at risk from whoever had
- compromised the key.
-
- With this in mind, it's a shame that our current protocol forces us to
- store identity keys unencrypted in RAM. We need some kind of signing key
- stored unencrypted, since we need to generate new descriptors/directories
- and rotate link and onion keys regularly. (And since, of course, we can't
- ask server operators to be on-hand to enter a passphrase every time we
- want to rotate keys or sign a descriptor.)
-
- The obvious solution seems to be to have a signing-only key that lives
- indefinitely (months or longer) and signs descriptors and link keys, and a
- separate identity key that's used to sign the signing key. Tor servers
- could run in one of several modes:
- 1. Identity key stored encrypted. You need to pick a passphrase when
- you enable this mode, and re-enter this passphrase every time you
- rotate the signing key.
- 1'. Identity key stored separate. You save your identity key to a
- floppy, and use the floppy when you need to rotate the signing key.
- 2. All keys stored unencrypted. In this case, we might not want to even
- *have* a separate signing key. (We'll need to support no-separate-
- signing-key mode anyway to keep old servers working.)
- 3. All keys stored encrypted. You need to enter a passphrase to start
- Tor.
- (Of course, we might not want to implement all of these.)
-
- Case 1 is probably most usable and secure, if we assume that people don't
- forget their passphrases or lose their floppies. We could mitigate this a
- bit by encouraging people to PGP-encrypt their passphrases to themselves,
- or keep a cleartext copy of their secret key secret-split into a few
- pieces, or something like that.
-
- Migration presents another difficulty, especially with the authorities. If
- we use the current set of identity keys as the new identity keys, we're in
- the position of having sensitive keys that have been stored on
- media-of-dubious-encryption up to now. Also, we need to keep old clients
- (who will expect descriptors to be signed by the identity keys they know
- and love, and who will not understand signing keys) happy.
-
-A possible solution:
-
- One thing to consider is that router identity keys are not very sensitive:
- if an OR disappears and reappears with a new key, the network treats it as
- though an old router had disappeared and a new one had joined the network.
- The Tor network continues unharmed; this isn't a disaster.
-
- Thus, the ideas above are mostly relevant for authorities.
-
- The most straightforward solution for the authorities is probably to take
- advantage of the protocol transition that will come with proposal 101, and
- introduce a new set of signing _and_ identity keys used only to sign votes
- and consensus network-status documents. Signing and identity keys could be
- delivered to users in a separate, rarely changing "keys" document, so that
- the consensus network-status documents wouldn't need to include N signing
- keys, N identity keys, and N certifications.
-
- Note also that there is no reason that the identity/signing keys used by
- directory authorities would necessarily have to be the same as the identity
- keys those authorities use in their capacity as routers. Decoupling these
- keys would give directory authorities the following set of keys:
-
- Directory authority identity:
- Highly confidential; stored encrypted and/or offline. Used to
- identify directory authorities. Shipped with clients. Used to
- sign Directory authority signing keys.
-
- Directory authority signing key:
- Stored online, accessible to regular Tor process. Used to sign
- votes and consensus directories. Downloaded as part of a "keys"
- document.
-
- [Administrators SHOULD rotate their signing keys every month or
- two, just to keep in practice and keep from forgetting the
- password to the authority identity.]
-
- V1-V2 directory authority identity:
- Stored online, never changed. Used to sign legacy network-status
- and directory documents.
-
- Router identity:
- Stored online, seldom changed. Used to sign server descriptors
- for this authority in its role as a router. Implicitly certified
- by being listed in network-status documents.
-
- Onion key, link key:
- As in tor-spec.txt
-
-
-Extensions to Proposal 101.
-
- Define a new document type, "Key certificate". It contains the
- following fields, in order:
-
- "dir-key-certificate-version": As network-status-version. Must be
- "3".
- "fingerprint": Hex fingerprint, with spaces, based on the directory
- authority's identity key.
- "dir-identity-key": The long-term identity key for this authority.
- "dir-key-published": The time when this directory's signing key was
- last changed.
- "dir-key-expires": A time after which this key is no longer valid.
- "dir-signing-key": As in proposal 101.
- "dir-key-certification": A signature of the above fields, in order.
- The signed material extends from the beginning of
- "dir-key-certificate-version" through the newline after
- "dir-key-certification". The identity key is used to generate
- this signature.
-
- These elements together constitute a "key certificate". These are
- generated offline when starting a v3 authority. Private identity
- keys SHOULD be stored offline, encrypted, or both. A running
- authority only needs access to the signing key.
-
- Unlike other keys currently used by Tor, the authority identity
- keys and directory signing keys MAY be longer than 1024 bits.
- (They SHOULD be 2048 bits or longer; they MUST NOT be shorter than
- 1024.)
-
- Vote documents change as follows:
-
- A key certificate MUST be included in-line in every vote document. With
- the exception of "fingerprint", its elements MUST NOT appear in consensus
- documents.
-
- Consensus network statuses change as follows:
-
- Remove dir-signing-key.
-
- Change "directory-signature" to take a fingerprint of the authority's
- identity key and a fingerprint of the authority's current signing key
- rather than the authority's nickname.
-
- Change "dir-source" to take a fingerprint of the authority's
- identity key rather than the authority's nickname or hostname.
-
- Add a new document type:
-
- A "keys" document contains all currently known key certificates.
- All authorities serve it at
-
- http://<hostname>/tor/status/keys.z
-
- Caches and clients download the keys document whenever they receive a
- consensus vote that uses a key they do not recognize. Caches download
- from authorities; clients download from caches.
-
- Processing votes:
-
- When receiving a vote, authorities check to see if the key
- certificate for the voter is different from the one they have. If
- the key certificate _is_ different, and its dir-key-published is
- more recent than the most recently known one, and it is
- well-formed and correctly signed with the correct identity key,
- then authorities remember it as the new canonical key certificate
- for that voter.
-
- A key certificate is invalid if any of the following hold:
- * The version is unrecognized.
- * The fingerprint does not match the identity key.
- * The identity key or the signing key is ill-formed.
- * The published date is very far in the past or future.
-
- * The signature is not a valid signature of the key certificate
- generated with the identity key.
-
- When processing the signatures on consensus, clients and caches act as
- follows:
-
- 1. Only consider the directory-signature entries whose identity
- key hashes match trusted authorities.
-
- 2. If any such entries have signing key hashes that match unknown
- signing keys, download a new keys document.
-
- 3. For every entry with a known (identity key,signing key) pair,
- check the signature on the document.
-
- 4. If the document has been signed by more than half of the
- authorities the client recognizes, treat the consensus as
- correctly signed.
-
- If not, but the number of entries with known identity keys but
- unknown signing keys might be enough to make the consensus
- correctly signed, do not use the consensus, but do not discard
- it until we have a new keys document.
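
Reviewer's sketch of the four client/cache steps above, not code from Tor itself. The names `DirSignature`, `check_consensus`, and the `verify` callback are hypothetical, and the three return strings are labels invented for this example. Signature verification is left to the caller, since the proposal does not fix an API for it.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Set, Tuple


@dataclass(frozen=True)
class DirSignature:
    """One directory-signature entry (hypothetical in-memory form)."""
    identity_fp: str   # fingerprint of the authority's identity key
    signing_fp: str    # fingerprint of the authority's signing key
    signature: bytes


def check_consensus(
    sigs: Iterable[DirSignature],
    trusted_identities: Set[str],
    known_pairs: Set[Tuple[str, str]],
    verify: Callable[[DirSignature], bool],
) -> str:
    """Return "valid", "fetch-keys", or "invalid" for a consensus."""
    good: Set[str] = set()
    unknown: Set[str] = set()
    for sig in sigs:
        # Step 1: ignore entries from authorities we do not trust.
        if sig.identity_fp not in trusted_identities:
            continue
        if (sig.identity_fp, sig.signing_fp) not in known_pairs:
            # Step 2: trusted identity, unknown signing key.
            unknown.add(sig.identity_fp)
        elif verify(sig):
            # Step 3: known pair and the signature checks out.
            good.add(sig.identity_fp)
    # Step 4: "more than half" of the recognized authorities.
    needed = len(trusted_identities) // 2 + 1
    if len(good) >= needed:
        return "valid"
    # Entries with unknown signing keys might still tip the balance, so
    # keep the consensus around and fetch a new keys document.
    if len(good | unknown) >= needed:
        return "fetch-keys"
    return "invalid"
```

Counting authorities rather than entries keeps a duplicated directory-signature line from being counted twice; that is a design choice of this sketch, not something the proposal states.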
diff --git a/doc/spec/proposals/104-short-descriptors.txt b/doc/spec/proposals/104-short-descriptors.txt
deleted file mode 100644
index 90e0764fe..000000000
--- a/doc/spec/proposals/104-short-descriptors.txt
+++ /dev/null
@@ -1,181 +0,0 @@
-Filename: 104-short-descriptors.txt
-Title: Long and Short Router Descriptors
-Author: Nick Mathewson
-Created: Jan 2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document proposes moving unused-by-clients information from regular
- router descriptors into a new "extra info" router descriptor.
-
-Proposal:
-
- Some of the costliest fields in the current directory protocol are ones
- that no client actually uses. In particular, the "read-history" and
- "write-history" fields are used only by the authorities for monitoring the
- status of the network. If we took them out, the size of a compressed list
- of all the routers would fall by about 60%. (No other disposable field
- would save much more than 2%.)
-
- We propose to remove these fields from descriptors, and have them
- uploaded as a part of a separate signed "extra info" to the authorities.
- This document will be signed. A hash of this document will be included in
- the regular descriptors.
-
- (We considered another design, where routers would generate and upload a
- short-form and a long-form descriptor. Only the short-form descriptor would
- ever be used by anybody for routing. The long-form descriptor would be
- used only for analytics and other tools. We decided against this because
- well-behaved tools would need to download short-form descriptors too (as
- these would be the only ones indexed), and hence get redundant info. Badly
- behaved tools would download only long-form descriptors, and expose
- themselves to partitioning attacks.)
-
-Other disposable fields:
-
- Clients don't need these fields, but removing them doesn't help bandwidth
- enough to be worthwhile.
- contact (save about 1%)
- fingerprint (save about 3%)
-
- We could represent these fields more succinctly, but removing them would
- only save 1%. (!)
- reject
- accept
- (Apparently, exit policies are highly compressible.)
-
- [Does size-on-disk matter to anybody? Some clients and servers don't
- have much disk, or have really slow disk (e.g. USB). And we don't
- store caches compressed right now. -RD]
-
-Specification:
-
- 1. Extra Info Format.
-
- An "extra info" descriptor contains the following fields:
-
- "extra-info" Nickname Fingerprint
- Identifies what router this is an extra info descriptor for.
- Fingerprint is encoded in hex (using upper-case letters), with
- no spaces.
-
- "published" As currently documented in dir-spec.txt. It MUST match the
- "published" field of the descriptor published at the same time.
-
- "read-history"
- "write-history"
- As currently documented in dir-spec.txt. Optional.
-
- "router-signature" NL Signature NL
-
- A signature of the PKCS1-padded hash of the entire extra info
- document, taken from the beginning of the "extra-info" line, through
- the newline after the "router-signature" line. An extra info
- document is not valid unless the signature is performed with the
- identity key whose digest matches FINGERPRINT.
-
- The "extra-info" field is required and MUST appear first. The
- router-signature field is required and MUST appear last. All others are
- optional. As for other documents, unrecognized fields must be ignored.
-
- 2. Existing formats
-
- Implementations that use "read-history" and "write-history" SHOULD
- continue accepting router descriptors that contain them. (Prior to
- 0.2.0.x, this information was encoded in ordinary router descriptors;
- in any case they have always been listed as opt, so they should be
- accepted anyway.)
-
- Add these fields to router descriptors:
-
- "extra-info-digest" Digest
- "Digest" is a hex-encoded digest (using upper-case characters)
- of the router's extra-info document, as signed in the router's
- extra-info. (If this field is absent, no extra-info-digest
- exists.)
-
- "caches-extra-info"
- Present if this router is a directory cache that provides
- extra-info documents, or an authority that handles extra-info
- documents.
-
- (Since implementations before 0.1.2.5-alpha required that the "opt"
- keyword precede any unrecognized entry, these keys MUST be preceded
- with "opt" until 0.1.2.5-alpha is obsolete.)
-
- 3. New communications rules
-
- Servers SHOULD generate and upload one extra-info document after each
- descriptor they generate and upload; no more, no less. Servers MUST
- upload the new descriptor before they upload the new extra-info.
-
- Authorities receiving an extra-info document SHOULD verify all of the
- following:
- * They have a router descriptor for some server with a matching
- nickname and identity fingerprint.
- * That server's identity key has been used to sign the extra-info
- document.
- * The extra-info-digest field in the router descriptor matches
- the digest of the extra-info document.
- * The published fields in the two documents match.
-
- Authorities SHOULD drop extra-info documents that do not meet these
- criteria.
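
Reviewer's sketch of the four authority-side checks listed above. The record types `RouterDescriptor` and `ExtraInfo`, the function `should_keep_extra_info`, and the `signed_by_identity` callback are all hypothetical names. The digest is assumed to be computed already, and signature checking is delegated to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass(frozen=True)
class RouterDescriptor:
    nickname: str
    fingerprint: str        # hex identity fingerprint, no spaces
    published: str          # "published" field, kept as text here
    extra_info_digest: str  # hex digest from "extra-info-digest"


@dataclass(frozen=True)
class ExtraInfo:
    nickname: str
    fingerprint: str
    published: str
    digest: str             # hex digest of the extra-info document


def should_keep_extra_info(
    extra: ExtraInfo,
    descriptors: Dict[Tuple[str, str], RouterDescriptor],
    signed_by_identity: Callable[[ExtraInfo], bool],
) -> bool:
    """Return True only if all four checks from the proposal pass."""
    # Check 1: a descriptor with matching nickname and fingerprint exists.
    desc = descriptors.get((extra.nickname, extra.fingerprint))
    if desc is None:
        return False
    # Check 2: the server's identity key signed the extra-info document.
    if not signed_by_identity(extra):
        return False
    # Check 3: the digest declared in the descriptor matches.
    if desc.extra_info_digest.upper() != extra.digest.upper():
        return False
    # Check 4: the published fields agree.
    return desc.published == extra.published
```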
-
- Extra-info documents MAY be uploaded as part of the same HTTP post as
- the router descriptor, or separately. Authorities MUST accept both
- methods.
-
- Authorities SHOULD try to fetch extra-info documents from one another if
- they do not have one matching the digest declared in a router
- descriptor.
-
- Caches that are running locally with a tool that needs to use extra-info
- documents MAY download and store extra-info documents. They should do
- so when they notice that the recommended descriptor has an
- extra-info-digest not matching any extra-info document they currently
- have. (Caches not running on a host that needs to use extra-info
- documents SHOULD NOT download or cache them.)
-
- 4. New URLs
-
- http://<hostname>/tor/extra/d/...
- http://<hostname>/tor/extra/fp/...
- http://<hostname>/tor/extra/all[.z]
- (As for /tor/server/ URLs: supports fetching extra-info documents
- by their digest, by the fingerprint of their servers, or all
- at once. When serving by fingerprint, we serve the extra-info
- that corresponds to the descriptor we would serve by that
- fingerprint. Only directory authorities are guaranteed to support
- these URLs.)
-
- http://<hostname>/tor/extra/authority[.z]
- (The extra-info document for this router.)
-
- Extra-info documents are uploaded to the same URLs as regular
- router descriptors.
-
-Migration:
-
- For extra info approach:
- * First:
- * Authorities should accept extra info, and support serving it.
- * Routers should upload extra info once authorities accept it.
- * Caches should support an option to download and cache it, once
- authorities serve it.
- * Tools should be updated to use locally cached information.
- These tools include:
- lefkada's exit.py script.
- tor26's noreply script and general directory cache.
- https://nighteffect.us/tns/ for its graphs
- and check with or-talk for the rest, once it's time.
-
- * Set a cutoff time for including bandwidth in router descriptors, so
- that tools that use bandwidth info know that they will need to fetch
- extra info documents.
-
- * Once tools that want bandwidth info support fetching extra info:
- * Have routers stop including bandwidth info in their router
- descriptors.
diff --git a/doc/spec/proposals/105-handshake-revision.txt b/doc/spec/proposals/105-handshake-revision.txt
deleted file mode 100644
index 791a016c2..000000000
--- a/doc/spec/proposals/105-handshake-revision.txt
+++ /dev/null
@@ -1,323 +0,0 @@
-Filename: 105-handshake-revision.txt
-Title: Version negotiation for the Tor protocol.
-Author: Nick Mathewson, Roger Dingledine
-Created: Jan 2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document was extracted from a modified version of tor-spec.txt that we
- had written before the proposal system went into place. It adds two new
- cell types to the Tor link connection setup handshake: one used for
- version negotiation, and another to prevent MITM attacks.
-
- This proposal is partially implemented, and partially superseded by
- proposal 130.
-
-Motivation: Tor versions
-
- Our *current* approach to versioning the Tor protocol(s) has been as
- follows:
- - All changes must be backward compatible.
- - It's okay to add new cell types, if they would be ignored by previous
- versions of Tor.
- - It's okay to add new data elements to cells, if they would be
- ignored by previous versions of Tor.
- - For forward compatibility, Tor must ignore cell types it doesn't
- recognize, and ignore data in those cells it doesn't expect.
- - Clients can inspect the version of Tor declared in the platform line
- of a router's descriptor, and use that to learn whether a server
- supports a given feature. Servers, however, aren't assumed to all
- know about each other, and so don't know the version of who they're
- talking to.
-
- This system has these problems:
- - It's very hard to change fundamental aspects of the protocol, like the
- cell format, the link protocol, any of the various encryption schemes,
- and so on.
- - The router-to-router link protocol has remained more-or-less frozen
- for a long time, since we can't easily have an OR use new features
- unless it knows the other OR will understand them.
-
- We need to resolve these problems because:
- - Our cipher suite is showing its age: SHA1/AES128/RSA1024/DH1024 will
- not seem like the best idea for all time.
- - There are many ideas circulating for multiple cell sizes; while it's
- not obvious whether these are safe, we can't do them at all without a
- mechanism to permit them.
- - There are many ideas circulating for alternative circuit building and
- cell relay rules: they don't work unless they can coexist in the
- current network.
- - If our protocol changes a lot, it's hard to describe any coherent
- version of it: we need to say "the version that Tor versions W through
- X use when talking to versions Y through Z". This makes analysis
- harder.
-
-Motivation: Preventing MITM attacks
-
- TLS prevents a man-in-the-middle attacker from reading or changing the
- contents of a communication. It does not, however, prevent such an
- attacker from observing timing information. Since timing attacks are some
- of the most effective against low-latency anonymity nets like Tor, we
- should take more care to make sure that we're not only talking to who
- we think we're talking to, but that we're using the network path we
- believe we're using.
-
-Motivation: Signed clock information
-
- It's very useful for Tor instances to know how skewed they are relative
- to one another. The only way to find out currently has been to download
- directory information, and check the Date header--but this is not
- authenticated, and hence subject to modification on the wire. Using
- BEGIN_DIR to create an authenticated directory stream through an existing
- circuit is better, but that's an extra step and it might be nicer to
- learn the information in the course of the regular protocol.
-
-Proposal:
-
-1.0. Version numbers
-
- The node-to-node TLS-based "OR connection" protocol and the multi-hop
- "circuit" protocol are versioned quasi-independently.
-
- Of course, some dependencies will continue to exist: Certain versions
- of the circuit protocol may require a minimum version of the connection
- protocol to be used. The connection protocol affects:
- - Initial connection setup, link encryption, transport guarantees,
- etc.
- - The allowable set of cell commands
- - Allowable formats for cells.
-
- The circuit protocol determines:
- - How circuits are established and maintained
- - How cells are decrypted and relayed
- - How streams are established and maintained.
-
- Version numbers are incremented for backward-incompatible protocol changes
- only. Backward-compatible changes are generally implemented by adding
- additional fields to existing structures; implementations MUST ignore
- fields they do not expect. Unused portions of cells MUST be set to zero.
-
- Though versioning the protocol will make it easier to maintain backward
- compatibility with older versions of Tor, we will nevertheless continue to
- periodically drop support for older protocols,
- - to keep the implementation from growing without bound,
- - to limit the maintenance burden of patching bugs in obsolete Tors,
- - to limit the testing burden of verifying that many old protocol
- versions continue to be implemented properly, and
- - to limit the exposure of the network to protocol versions that are
- expensive to support.
-
- The Tor protocol as implemented through the 0.1.2.x Tor series will be
- called "version 1" in its link protocol and "version 1" in its relay
- protocol. Versions of the Tor protocol so old as to be incompatible with
- Tor 0.1.2.x can be considered to be version 0 of each, and are not
- supported.
-
-2.1. VERSIONS cells
-
- When a Tor connection is established, both parties normally send a
- VERSIONS cell before sending any other cells. (But see below.)
-
- VersionsLen [2 bytes]
- Versions [VersionsLen bytes]
-
- "Versions" is a sequence of VersionsLen bytes. Each value between 1 and
- 127 inclusive represents a single version; current implementations MUST
- ignore other bytes. Parties should list all of the versions which they
- are able and willing to support. Parties can only communicate if they
- have some connection protocol version in common.
-
- Version 0.2.0.x-alpha and earlier don't understand VERSIONS cells,
- and therefore don't support version negotiation. Thus, waiting until
- the other side has sent a VERSIONS cell won't work for these servers:
- if the other side sends no cells back, it is impossible to tell
- whether they have sent a VERSIONS cell that has been stalled, or
- whether they have dropped our own VERSIONS cell as unrecognized.
- Therefore, we'll
- change the TLS negotiation parameters so that old parties can still
- negotiate, but new parties can recognize each other. Immediately
- after a TLS connection has been established, the parties check
- whether the other side negotiated the connection in an "old" way or a
- "new" way. If either party negotiated in the "old" way, we assume a
- v1 connection. Otherwise, both parties send VERSIONS cells listing
- all their supported versions. Upon receiving the other party's
- VERSIONS cell, the implementation begins using the highest-valued
- version common to both cells. If the first cell from the other party
- has a recognized command, and is _not_ a VERSIONS cell, we assume a
- v1 protocol.
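
Reviewer's sketch of the "highest-valued version common to both cells" rule, assuming the payload layout given above (VersionsLen, then one byte per version). The function names are hypothetical, and big-endian byte order for VersionsLen is an assumption; this section does not state it.

```python
from typing import Iterable, Optional, Set


def parse_versions(payload: bytes) -> Set[int]:
    """Return the versions listed in a VERSIONS cell payload.

    Assumed layout: 2-byte VersionsLen (big-endian, an assumption),
    followed by VersionsLen bytes, one version per byte.
    """
    if len(payload) < 2:
        raise ValueError("payload too short for VersionsLen")
    length = int.from_bytes(payload[:2], "big")
    body = payload[2:2 + length]
    if len(body) < length:
        raise ValueError("payload shorter than VersionsLen")
    # Only values 1..127 name a version; other bytes are ignored.
    return {b for b in body if 1 <= b <= 127}


def negotiate(ours: Iterable[int], their_payload: bytes) -> Optional[int]:
    """Pick the highest version both sides list, or None if none match."""
    common = set(ours) & parse_versions(their_payload)
    return max(common) if common else None
```

For example, `negotiate([1, 2, 3], b"\x00\x03\x01\x02\x80")` ignores the byte 0x80 and settles on version 2.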
-
- (For more detail on the TLS protocol change, see forthcoming draft
- proposals from Steven Murdoch.)
-
- Implementations MUST discard VERSIONS cells that are not the first
- recognized cells sent on a connection.
-
- The VERSIONS cell must be sent as a v1 cell (2 bytes of circuitID, 1
- byte of command, 509 bytes of payload).
-
- [NOTE: The VERSIONS cell is assigned the command number 7.]
-
-2.2. MITM-prevention and time checking
-
- If we negotiate a v2 connection or higher, the second cell we send SHOULD
- be a NETINFO cell. Implementations SHOULD NOT send NETINFO cells at other
- times.
-
- A NETINFO cell contains:
- Timestamp [4 bytes]
- Other OR's address [variable]
- Number of addresses [1 byte]
- This OR's addresses [variable]
-
- Timestamp is the OR's current Unix time, in seconds since the epoch. If
- an implementation receives time values from many ORs that
- indicate that its clock is skewed, it SHOULD try to warn the
- administrator. (We leave the definition of 'many' intentionally vague
- for now.)
-
- Before believing the timestamp in a NETINFO cell, implementations
- SHOULD compare the time at which they received the cell to the time
- when they sent their VERSIONS cell. If the difference is very large,
- it is likely that the cell was delayed long enough that its
- contents are out of date.
-
- Each address contains Type/Length/Value as used in Section 6.4 of
- tor-spec.txt. The first address is the one that the party sending
- the NETINFO cell believes the other has -- it can be used to learn
- what your IP address is if you have no other hints.
- The rest of the addresses are the advertised addresses of the party
- sending the NETINFO cell -- we include them
- to block a man-in-the-middle attack on TLS that lets an attacker bounce
- traffic through his own computers to enable timing and packet-counting
- attacks.
-
- A Tor instance should use the other Tor's reported address
- information as part of logic to decide whether to treat a given
- connection as suitable for extending circuits to a given address/ID
- combination. When we get an extend request, we use an
- existing OR connection if the ID matches, and ANY of the following
- conditions hold:
- - The IP matches the requested IP.
- - We know that the IP we're using is canonical because it was
- listed in the NETINFO cell.
- - We know that the IP we're using is canonical because it was
- listed in the server descriptor.
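
Reviewer's sketch of the connection-reuse rule for extend requests described above. `OrConnection` and `can_reuse_for_extend` are hypothetical names, and addresses are compared as plain strings for simplicity.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class OrConnection:
    """An existing OR connection (hypothetical in-memory form)."""
    peer_id: str                 # identity digest of the other OR
    peer_ip: str                 # address we are actually connected to
    netinfo_addrs: Set[str] = field(default_factory=set)     # from NETINFO
    descriptor_addrs: Set[str] = field(default_factory=set)  # from descriptor


def can_reuse_for_extend(conn: OrConnection,
                         requested_id: str,
                         requested_ip: str) -> bool:
    """Return True if conn may carry an extend to (requested_ip, requested_id)."""
    # The identity must always match.
    if conn.peer_id != requested_id:
        return False
    # Any one of the three address conditions is enough.
    return (
        conn.peer_ip == requested_ip
        or conn.peer_ip in conn.netinfo_addrs
        or conn.peer_ip in conn.descriptor_addrs
    )
```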
-
- [NOTE: The NETINFO cell is assigned the command number 8.]
-
-Discussion: Versions versus feature lists
-
- Many protocols negotiate lists of available features instead of (or in
- addition to) protocol versions. While it's possible that some amount of
- feature negotiation could be supported in a later Tor, we should prefer to
- use protocol versions whenever possible, for reasons discussed in
- the "Anonymity Loves Company" paper.
-
-Discussion: Bytes per version, versions per cell
-
- This document provides for a one-byte count of how many versions a Tor
- supports, and allows one byte per version. Thus, it can support only
- 254 more versions of the protocol beyond the unallocated v0 and the
- current v1. If we ever need to split the protocol into 255 incompatible
- versions, we've probably screwed up badly somewhere.
-
- Nevertheless, here are two ways we could support more versions:
- - Change the version count to a two-byte field that counts the number of
- _bytes_ used, and use a UTF8-style encoding: versions 0 through 127
- take one byte to encode, versions 128 through 2047 take two bytes to
- encode, and so on. We wouldn't need to parse any version higher than
- 127 right now, since all bytes used to encode higher versions would
- have their high bit set.
-
- We'd still have a limit of 380 simultaneous versions that could be
- declared in any version. This is probably okay.
-
- - Decide that if we need to support more versions, we can add a
- MOREVERSIONS cell that gets sent before the VERSIONS cell. The spec
- above requires Tors to ignore unrecognized cell types that they get
- before the first VERSIONS cell, and still allows version negotiation
- to succeed.
-
- [Resolution: Reserve the high bit and the v0 value for later use. If
- we ever have more live versions than we can fit in a cell, we've made a
- bad design decision somewhere along the line.]
-
-Discussion: Reducing round-trips
-
- It might be appealing to see if we can cram more information in the
- initial VERSIONS cell. For example, the contents of NETINFO will pretty
- soon be sent by everybody before any more information is exchanged, but
- decoupling them from the version exchange increases round-trips.
-
- Instead, we could speculatively include handshaking information at
- the end of a VERSIONS cell, wrapped in a marker to indicate, "if we wind
- up speaking VERSION 2, here's the NETINFO I'll send. Otherwise, ignore
- this." This could be extended to opportunistically reduce round trips
- when possible for future versions when we guess the versions right.
-
- Of course, we'd need to be careful about using a feature like this:
- - We don't want to include things that are expensive to compute,
- like PK signatures or proof-of-work.
- - We don't want to speculate as a mobile client: it may leak our
- experience with the server in question.
-
-Discussion: Advertising versions in routerdescs and networkstatuses.
-
- In network-statuses:
-
- The networkstatus "v" line now has the format:
- "v" IMPLEMENTATION IMPL-VERSION "Link" LINK-VERSION-LIST
- "Circuit" CIRCUIT-VERSION-LIST NL
-
- LINK-VERSION-LIST and CIRCUIT-VERSION-LIST are comma-separated lists of
- supported version numbers. IMPLEMENTATION is the name of the
- implementation of the Tor protocol (e.g., "Tor"), and IMPL-VERSION is the
- version of the implementation.
-
- Examples:
- v Tor 0.2.5.1-alpha Link 1,2,3 Circuit 2,5
-
- v OtherOR 2000+ Link 3 Circuit 5
-
- Implementations that release independently of the Tor codebase SHOULD NOT
- use "Tor" as the value of their IMPLEMENTATION.
-
- Additional fields on the "v" line MUST be ignored.
-
- In router descriptors:
-
- The router descriptor should contain a line of the form,
- "protocols" "Link" LINK-VERSION-LIST "Circuit" CIRCUIT-VERSION-LIST
-
- Additional fields on the "protocols" line MUST be ignored.
-
- [Versions of Tor before 0.1.2.5-alpha rejected router descriptors with
- unrecognized items; the protocols line should be preceded with an "opt"
- until these Tors are obsolete.]
-
-Security issues:
-
- Client partitioning is the big danger when we introduce new versions; if a
- client supports some very unusual set of protocol versions, it will stand
- out from others no matter where it goes. If a server supports an unusual
- version, it will get a disproportionate amount of traffic from clients who
- prefer that version. We can mitigate this somewhat as follows:
-
- - Do not have clients prefer any protocol version by default until that
- version is widespread. (First introduce the new version to servers,
- and have clients admit to using it only when configured to do so for
- testing. Then, once many servers are running the new protocol
- version, enable its use by default.)
-
- - Do not multiply protocol versions needlessly.
-
- - Encourage protocol implementors to implement the same protocol version
- sets as some popular version of Tor.
-
- - Disrecommend very old/unpopular versions of Tor via the directory
- authorities' RecommendedVersions mechanism, even if it is still
- technically possible to use them.
-
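[Illustration, not part of the proposal: a minimal sketch of how a parser
might read the networkstatus "v" line format described above. The function
name parse_v_line and the returned dictionary layout are hypothetical; the
only behavior taken from the text is the field order, the comma-separated
version lists, and the rule that additional fields are ignored.]

```python
def parse_v_line(line):
    """Parse a networkstatus "v" line as sketched in proposal 105.

    Hypothetical helper: returns a dict with the implementation name,
    its version string, and the Link and Circuit version lists.
    """
    tokens = line.split()
    if len(tokens) < 3 or tokens[0] != "v":
        raise ValueError("not a v line")
    result = {
        "implementation": tokens[1],
        "impl_version": tokens[2],
        "Link": [],
        "Circuit": [],
    }
    rest = tokens[3:]
    for idx, tok in enumerate(rest):
        # Only the "Link" and "Circuit" keywords are recognized here;
        # every other field is skipped, since extra fields must be ignored.
        if tok in ("Link", "Circuit") and idx + 1 < len(rest):
            result[tok] = [int(v) for v in rest[idx + 1].split(",")]
    return result
```

For the first example line in the text, this sketch would yield Link
versions 1, 2, 3 and Circuit versions 2, 5.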
diff --git a/doc/spec/proposals/106-less-tls-constraint.txt b/doc/spec/proposals/106-less-tls-constraint.txt
deleted file mode 100644
index 7e7621df6..000000000
--- a/doc/spec/proposals/106-less-tls-constraint.txt
+++ /dev/null
@@ -1,111 +0,0 @@
-Filename: 106-less-tls-constraint.txt
-Title: Checking fewer things during TLS handshakes
-Author: Nick Mathewson
-Created: 9-Feb-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document proposes that we relax our requirements on the context of
- X.509 certificates during initial TLS handshakes.
-
-Motivation:
-
- Later, we want to try harder to avoid protocol fingerprinting attacks.
- This means that we'll need to make our connection handshake look closer
- to a regular HTTPS connection: one certificate on the server side and
- zero certificates on the client side. For now, about the best we
- can do is to stop requiring things during handshake that we don't
- actually use.
-
-What we check now, and where we check it:
-
- tor_tls_check_lifetime:
- peer has certificate
- notBefore <= now <= notAfter
-
- tor_tls_verify:
- peer has at least one certificate
- There is at least one certificate in the chain
- At least one of the certificates in the chain is not the one used to
- negotiate the connection. (The "identity cert".)
- The certificate _not_ used to negotiate the connection has signed the
- link cert
-
- tor_tls_get_peer_cert_nickname:
- peer has a certificate.
- certificate has a subjectName.
- subjectName has a commonName.
- commonName consists only of characters in LEGAL_NICKNAME_CHARACTERS. [2]
-
- tor_tls_peer_has_cert:
- peer has a certificate.
-
- connection_or_check_valid_handshake:
- tor_tls_peer_has_cert [1]
- tor_tls_get_peer_cert_nickname [1]
- tor_tls_verify [1]
- If nickname in cert is a known, named router, then its identity digest
- must be as expected.
- If we initiated the connection, then we got the identity digest we
- expected.
-
- USEFUL THINGS WE COULD DO:
-
- [1] We could just not force clients to have any certificate at all, let alone
- an identity certificate. Internally to the code, we could assign the
- identity_digest field of these or_connections to a random number, or even
- not add them to the identity_digest->or_conn map.
- [so if somebody connects with no certs, we let them. and mark them as
- a client and don't treat them as a server. great. -rd]
-
- [2] Instead of using a restricted nickname character set that makes our
- commonName structure look unlike typical SSL certificates, we could treat
- the nickname as extending from the start of the commonName up to but not
- including the first non-nickname character.
-
- Alternatively, we could stop checking commonNames entirely. We don't
- actually _do_ anything based on the nickname in the certificate, so
- there's really no harm in letting every router have any commonName it
- wants.
- [this is the better choice -rd]
- [agreed. -nm]
-
-REMAINING WAYS TO RECOGNIZE CLIENT->SERVER CONNECTIONS:
-
- Assuming that we removed the above requirements, and then (in a later
- release) had clients not send certificates and started making our DNs
- a little less formulaic, client->server OR connections would still be
- recognizable by:
- having a two-certificate chain sent by the server
- using a particular set of ciphersuites
- traffic patterns
- probing the server later
-
-OTHER IMPLICATIONS:
-
- If we stop verifying the above requirements:
-
- It will be slightly (but only slightly) more common to connect to a non-Tor
- server running TLS, and believe that you're talking to a Tor server (until
- you send the first cell).
-
- It will be far easier for non-Tor SSL clients to accidentally connect to
- Tor servers and speak HTTPS or whatever to them.
-
- If, in a later release, we have clients not send certificates, and we make
- DNs less recognizable:
-
- If clients don't send certs, servers don't need to verify them: win!
-
- If we remove these restrictions, it will be easier for people to write
- clients to fuzz our protocol: sorta win!
-
- If clients don't send certs, they look slightly less like servers.
-
-OTHER SPEC CHANGES:
-
- When a client doesn't give us an identity, we should never extend any
- circuits to it (duh), and we should allow it to set circuit ID however it
- wants.
diff --git a/doc/spec/proposals/107-uptime-sanity-checking.txt b/doc/spec/proposals/107-uptime-sanity-checking.txt
deleted file mode 100644
index 922129b21..000000000
--- a/doc/spec/proposals/107-uptime-sanity-checking.txt
+++ /dev/null
@@ -1,54 +0,0 @@
-Filename: 107-uptime-sanity-checking.txt
-Title: Uptime Sanity Checking
-Author: Kevin Bauer & Damon McCoy
-Created: 8-March-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document describes how to cap the uptime that is used when computing
- which routers are marked as stable such that highly stable routers cannot
- be displaced by malicious routers that report extremely high uptime
- values.
-
- This is similar to how bandwidth is capped at 1.5MB/s.
-
-Motivation:
-
- It has been pointed out that an attacker can displace all stable nodes and
- entry guard nodes by reporting high uptimes. This is an easy fix that will
- prevent highly stable nodes from being displaced.
-
-Security implications:
-
- It should decrease the effectiveness of routing attacks that report high
- uptimes while not impacting the normal routing algorithms.
-
-Specification:
-
- So we could patch Section 3.1 of dir-spec.txt to say:
-
- "Stable" -- A router is 'Stable' if it is running, valid, not
- hibernating, and either its uptime is at least the median uptime for
- known running, valid, non-hibernating routers, or its uptime is at
- least 30 days. Routers are never called stable if they are running
- a version of Tor known to drop circuits stupidly. (0.1.1.10-alpha
- through 0.1.1.16-rc are stupid this way.)
-
-Compatibility:
-
- There should be no compatibility issues due to uptime capping.
-
-Implementation:
-
- Implemented and merged into dir-spec in 0.2.0.0-alpha-dev (r9788).
-
-Discussion:
-
- Initially, this proposal set the maximum at 60 days, not 30; the 30 day
- limit and spec wording was suggested by Roger in an or-dev post on 9 March
- 2007.
-
- This proposal also led to 108-mtbf-based-stability.txt
-
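[Illustration, not part of the proposal: a minimal sketch of the capped
"Stable" rule from proposal 107. It assumes the input holds only routers
that are already running, valid, and not hibernating, and it leaves out
the check for Tor versions known to drop circuits. The names
stable_routers and UPTIME_CAP are hypothetical.]

```python
from statistics import median

UPTIME_CAP = 30 * 24 * 60 * 60  # 30 days, in seconds


def stable_routers(uptimes):
    """Return the routers that would be called Stable.

    uptimes maps a router identifier to its declared uptime in seconds,
    for routers that are running, valid, and not hibernating.
    """
    if not uptimes:
        return set()
    med = median(uptimes.values())
    # A router qualifies by reaching the median, or by reaching the cap.
    # The cap is what keeps honest long-lived routers from being displaced
    # when attackers inflate the median with huge declared uptimes.
    return {name for name, up in uptimes.items()
            if up >= med or up >= UPTIME_CAP}
```

With three routers declaring absurdly high uptimes, a router that has
honestly been up for 40 days still qualifies, while one that has been up
for 10 days does not.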
diff --git a/doc/spec/proposals/108-mtbf-based-stability.txt b/doc/spec/proposals/108-mtbf-based-stability.txt
deleted file mode 100644
index 294103760..000000000
--- a/doc/spec/proposals/108-mtbf-based-stability.txt
+++ /dev/null
@@ -1,88 +0,0 @@
-Filename: 108-mtbf-based-stability.txt
-Title: Base "Stable" Flag on Mean Time Between Failures
-Author: Nick Mathewson
-Created: 10-Mar-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document proposes that we change how directory authorities set the
- stability flag from inspection of a router's declared Uptime to the
- authorities' perceived mean time between failure for the router.
-
-Motivation:
-
- Clients prefer nodes that the authorities call Stable. This flag is (as
- of 0.2.0.0-alpha-dev) set entirely based on the node's declared value for
- uptime. This creates an opportunity for malicious nodes to declare
- falsely high uptimes in order to get more traffic.
-
-Spec changes:
-
- Replace the current rule for setting the Stable flag with:
-
- "Stable" -- A router is 'Stable' if it is active and its observed Stability
- for the past month is at or above the median Stability for active routers.
- Routers are never called stable if they are running a version of Tor
- known to drop circuits stupidly. (0.1.1.10-alpha through 0.1.1.16-rc
- are stupid this way.)
-
- Stability shall be defined as the weighted mean length of the runs
- observed by a given directory authority. A run begins when an authority
- decides that the server is Running, and ends when the authority decides
- that the server is not Running. In-progress runs are counted when
- measuring Stability. When calculating the mean, runs are weighted by
- $\alpha ^ t$, where $t$ is time elapsed since the end of the run, and
- $0 < \alpha < 1$. Time when an authority is down does not count toward the
- length of the run.
-
-Rejected Alternative:
-
- "A router's Stability shall be defined as the sum of $\alpha ^ d$ for every
- $d$ such that the router was considered reachable for the entire day
- $d$ days ago.
-
- This allows a simpler implementation: every day, we multiply
- yesterday's Stability by alpha, and if the router was observed to be
- available every time we looked today, we add 1.
-
- Instead of "day", we could pick an arbitrary time unit. We should
- pick alpha to be high enough that long-term stability counts, but low
- enough that the distant past is eventually forgotten. Something
- between .8 and .95 seems right.
-
- (By requiring that routers be up for an entire day to get their
- stability increased, instead of counting fractions of a day, we
- capture the notion that stability is more like "probability of
- staying up for the next hour" than it is like "probability of being
- up at some randomly chosen time over the next hour." The former
- notion of stability is far more relevant for long-lived circuits.)
-
-Limitations:
-
- Authorities can have false positives and false negatives when trying to
- tell whether a router is up or down. So long as these aren't terribly
- wrong, and so long as they aren't significantly biased, we should be able
- to use them to estimate stability pretty well.
-
- Probing approaches like the above could miss short incidents of
- downtime. If we use the router's declared uptime, we could detect
- these: but doing so would penalize routers who reported their uptime
- accurately.
-
-Implementation:
-
- For now, the easiest way to store this information at authorities
- would probably be in some kind of periodically flushed flat file.
- Later, we could move to Berkeley db or something if we really had to.
-
- For each router, an authority will need to store:
- The router ID.
- Whether the router is up.
- The time when the current run started, if the router is up.
- The weighted sum length of all previous runs.
- The time at which the weighted sum length was last weighted down.
-
- Authorities should probe at random intervals to test whether servers are
- running.
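[Illustration, not part of the proposal: a minimal sketch of the weighted
mean run length that proposal 108 calls Stability. The proposal only
requires 0 < alpha < 1; the default alpha of 0.9, the choice of one day as
the time unit for the exponent, and the function name stability are all
assumptions. Authority downtime is not modeled here.]

```python
def stability(runs, now, alpha=0.9, unit=86400.0):
    """Weighted mean length of observed runs, in seconds.

    runs is a list of (start, end) timestamps; end is None for a run
    that is still in progress. Each run is weighted by alpha ** t, where
    t is the time since the run ended, measured in `unit` seconds.
    """
    weighted_len = 0.0
    total_weight = 0.0
    for start, end in runs:
        finish = now if end is None else end
        # An in-progress run ends "now", so its weight is alpha ** 0 == 1.
        weight = alpha ** ((now - finish) / unit)
        weighted_len += weight * (finish - start)
        total_weight += weight
    return weighted_len / total_weight if total_weight else 0.0
```

Older runs contribute less to the mean, so a router that was flaky long
ago but has been up continuously since converges toward the length of its
current run.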
diff --git a/doc/spec/proposals/109-no-sharing-ips.txt b/doc/spec/proposals/109-no-sharing-ips.txt
deleted file mode 100644
index 5438cf049..000000000
--- a/doc/spec/proposals/109-no-sharing-ips.txt
+++ /dev/null
@@ -1,90 +0,0 @@
-Filename: 109-no-sharing-ips.txt
-Title: No more than one server per IP address.
-Author: Kevin Bauer & Damon McCoy
-Created: 9-March-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
- This document describes a solution to a Sybil attack vulnerability in the
- directory servers. Currently, it is possible for a single IP address to
- host an arbitrarily high number of Tor routers. We propose that the
- directory servers limit the number of Tor routers that may be registered at
- a particular IP address to some small (fixed) number, perhaps just one Tor
- router per IP address.
-
- While Tor never uses more than one server from a given /16 in the same
- circuit, an attacker with multiple servers in the same place is still
- dangerous because he can get around the per-server bandwidth cap that is
- designed to prevent a single server from attracting too much of the overall
- traffic.
-
-Motivation:
- Since it is possible for an attacker to register an arbitrarily large
- number of Tor routers, it is possible for malicious parties to do this
- as part of a traffic analysis attack.
-
-Security implications:
- This countermeasure will increase the number of IP addresses that an
- attacker must control in order to carry out traffic analysis.
-
-Specification:
-
- For each IP address, each directory authority tracks the number of routers
- using that IP address, along with their total observed bandwidth. If there
- are more than MAX_SERVERS_PER_IP servers at some IP, the authority should
- "disable" all but MAX_SERVERS_PER_IP servers. When choosing which servers
- to disable, the authority should first disable non-Running servers in
- increasing order of observed bandwidth, and then should disable Running
- servers in increasing order of bandwidth.
-
- [[ We don't actually do this part here. -NM
-
- If the total observed
- bandwidth of the remaining non-"disabled" servers exceeds MAX_BW_PER_IP,
- the authority should "disable" some of the remaining servers until only one
- server remains, or until the remaining observed bandwidth of non-"disabled"
- servers is under MAX_BW_PER_IP.
- ]]
-
- Servers that are "disabled" MUST be marked as non-Valid and non-Running.
-
- MAX_SERVERS_PER_IP is 3.
-
- MAX_BW_PER_IP is 8 MB per s.
-
-Compatibility:
-
- Upon inspection of a directory server, we found that the following IP
- addresses have more than one Tor router:
-
- Scruples 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 443
- WiseUp 68.5.113.81 ip68-5-113-81.oc.oc.cox.net 9001
- Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
- Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
- Unnamed 62.1.196.71 pc01-megabyte-net-arkadiou.megabyte.gr 9001
- aurel 85.180.62.138 e180062138.adsl.alicedsl.de 9001
- sokrates 85.180.62.138 e180062138.adsl.alicedsl.de 9001
- moria1 18.244.0.188 moria.mit.edu 9001
- peacetime 18.244.0.188 moria.mit.edu 9100
-
- There may exist compatibility issues with this proposed fix. Reasons why
- more than one server would share an IP address include:
-
- * Testing. moria1, moria2, peacetime, and other morias all run on one
- computer at MIT, because that way we get testing. Moria1 and moria2 are
- run by Roger, and peacetime is run by Nick.
- * NAT. If there are several servers but they port-forward through the same
- IP address, ... we can hope that the operators coordinate with each
- other. Also, we should recognize that while they help the network in
- terms of increased capacity, they don't help as much as they could in
- terms of location diversity. But our approach so far has been to take
- what we can get.
- * People who have more than 1.5MB/s and want to help out more. For
- example, for a while Tonga was offering 10MB/s and its Tor server
- would only make use of a bit of it. So Roger suggested that he run
- two Tor servers, to use more.
-
-[Note Roger's tweak to this behavior, in
-http://archives.seul.org/or/cvs/Oct-2007/msg00118.html]
-
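[Illustration, not part of the proposal: a minimal sketch of the
per-address disabling rule from proposal 109, without the MAX_BW_PER_IP
step that the proposal notes is not actually done. The record layout
(dicts with "id", "ip", "running", and "bandwidth" keys) and the function
name servers_to_disable are hypothetical.]

```python
from collections import defaultdict

MAX_SERVERS_PER_IP = 3


def servers_to_disable(servers):
    """Return the ids of servers an authority would "disable".

    servers is a list of dicts with keys "id", "ip", "running" (bool),
    and "bandwidth" (observed bandwidth, any consistent unit).
    """
    by_ip = defaultdict(list)
    for s in servers:
        by_ip[s["ip"]].append(s)
    disabled = set()
    for group in by_ip.values():
        excess = len(group) - MAX_SERVERS_PER_IP
        if excess <= 0:
            continue
        # Non-Running servers sort first (False < True), and within each
        # class servers are ordered by increasing observed bandwidth.
        order = sorted(group, key=lambda s: (s["running"], s["bandwidth"]))
        disabled.update(s["id"] for s in order[:excess])
    return disabled
```

Servers returned by this sketch would then be marked non-Valid and
non-Running, as the specification section requires.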
diff --git a/doc/spec/proposals/110-avoid-infinite-circuits.txt b/doc/spec/proposals/110-avoid-infinite-circuits.txt
deleted file mode 100644
index fffc41c25..000000000
--- a/doc/spec/proposals/110-avoid-infinite-circuits.txt
+++ /dev/null
@@ -1,120 +0,0 @@
-Filename: 110-avoid-infinite-circuits.txt
-Title: Avoiding infinite length circuits
-Author: Roger Dingledine
-Created: 13-Mar-2007
-Status: Accepted
-Target: 0.2.1.x
-Implemented-In: 0.2.1.3-alpha
-
-History:
-
- Revised 28 July 2008 by nickm: set K.
- Revised 3 July 2008 by nickm: rename from relay_extend to
- relay_early. Revise to current migration plan. Allow K cells
- over circuit lifetime, not just at start.
-
-Overview:
-
- Right now, an attacker can add load to the Tor network by extending a
- circuit an arbitrary number of times. Every cell that goes down the
- circuit then adds N times that amount of load in overall bandwidth
- use. This vulnerability arises because servers don't know their position
- on the path, so they can't tell how many nodes there are before them
- on the path.
-
- We propose a new set of relay cells that are distinguishable by
- intermediate hops as permitting extend cells. This approach will allow
- us to put an upper bound on circuit length relative to the number of
- colluding adversary nodes; but there are some downsides too.
-
-Motivation:
-
- The above attack can be used to generally increase load all across the
- network, or it can be used to target specific servers: by building a
- circuit back and forth between two victim servers, even a low-bandwidth
- attacker can soak up all the bandwidth offered by the fastest Tor
- servers.
-
- The general attacks could be used as a demonstration that Tor isn't
- perfect (leading to yet more media articles about "breaking" Tor), and
- the targeted attacks will come into play once we have a reputation
- system -- it will be trivial to DoS a server so it can't pass its
- reputation checks, in turn impacting security.
-
-Design:
-
- We should split RELAY cells into two types: RELAY and RELAY_EARLY.
-
- Only K (say, 10) Relay_early cells can be sent across a circuit, and
- only relay_early cells are allowed to contain extend requests. We
- still support obscuring the length of the circuit (if more research
- shows us what to do), because Alice can choose how many of the K to
- mark as relay_early. Note that relay_early cells *can* contain any
- sort of data cell; so in effect it's actually the relay type cells
- that are restricted. By default, she would just send the first K
- data cells over the stream as relay_early cells, regardless of their
- actual type.
-
- (Note that a circuit that is out of relay_early cells MUST NOT be
- cannibalized later, since it can't extend. Note also that it's always okay
- to use regular RELAY cells when sending non-EXTEND commands targeted at
- the first hop of a circuit, since there is no intermediate hop to try to
- learn the relay command type.)
-
- Each intermediate server would pass on the same type of cell that it
- received (either relay or relay_early), and the cell's destination
- will be able to learn whether it's allowed to contain an Extend request.
-
- If an intermediate server receives more than K relay_early cells, or
- if it sees a relay cell that contains an extend request, then it
- tears down the circuit (protocol violation).
-
-Security implications:
-
- The upside is that this limits the bandwidth amplification factor to
- K: for an individual circuit to become arbitrary-length, the attacker
- would need an adversary-controlled node every K hops, and at that
- point the attack is no worse than if the attacker creates N/K separate
- K-hop circuits.
-
- On the other hand, we want to pick a large enough value of K that we
- don't mind the cap.
-
- If we ever want to take steps to hide the number of hops in the circuit
- or a node's position in the circuit, this design probably makes that
- more complex.
-
-Migration:
-
- In 0.2.0, servers speaking v2 or later of the link protocol accept
- RELAY_EARLY cells, and pass them on. If the next OR in the circuit
- is not speaking the v2 link protocol, the server relays the cell as
- a RELAY cell.
-
- In 0.2.1.3-alpha, clients begin using RELAY_EARLY cells on v2
- connections. This functionality can be safely backported to
- 0.2.0.x. Clients should pick a random number between (say) K and
- K-2 to send.
-
- In 0.2.1.3-alpha, servers close any circuit in which more than K
- relay_early cells are sent.
-
- Once all versions that do not send RELAY_EARLY cells are obsolete,
- servers can begin to reject any EXTEND requests not sent in a
- RELAY_EARLY cell.
-
-Parameters:
-
- Let K = 8, for no terribly good reason.
-
-Spec:
-
- [We can formalize this part once we think the design is a good one.]
-
-Acknowledgements:
-
- This design has been kicking around since Christian Grothoff and I came
- up with it at PET 2004. (Nathan Evans, Christian Grothoff's student,
- is working on implementing a fix based on this design in the summer
- 2007 timeframe.)
-
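[Illustration, not part of the proposal: a minimal sketch of the per-circuit
check that the Design section of proposal 110 describes, using K = 8 from the
Parameters section. It models the final, strict stage of the migration, in
which an extend request inside a plain RELAY cell is a protocol violation.
The class CircuitState and the flag is_extend_for_us are hypothetical; the
flag stands for "this hop is the cell's destination and the cell carries an
extend request".]

```python
MAX_RELAY_EARLY = 8  # K, from the Parameters section


class CircuitState:
    """Tracks RELAY_EARLY usage for one circuit at one server."""

    def __init__(self):
        self.relay_early_seen = 0
        self.closed = False

    def handle_cell(self, cell_type, is_extend_for_us=False):
        """Return True if the circuit stays open, False if torn down."""
        if self.closed:
            return False
        if cell_type == "RELAY_EARLY":
            self.relay_early_seen += 1
            # More than K RELAY_EARLY cells is a protocol violation.
            if self.relay_early_seen > MAX_RELAY_EARLY:
                self.closed = True
        elif cell_type == "RELAY" and is_extend_for_us:
            # Extend requests are only allowed inside RELAY_EARLY cells.
            self.closed = True
        return not self.closed
```

Because each server applies the same limit, a circuit can grow by at most K
extends unless the attacker controls a node that resets the count.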
diff --git a/doc/spec/proposals/111-local-traffic-priority.txt b/doc/spec/proposals/111-local-traffic-priority.txt
deleted file mode 100644
index 9411463c2..000000000
--- a/doc/spec/proposals/111-local-traffic-priority.txt
+++ /dev/null
@@ -1,151 +0,0 @@
-Filename: 111-local-traffic-priority.txt
-Title: Prioritizing local traffic over relayed traffic
-Author: Roger Dingledine
-Created: 14-Mar-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- We describe some ways to let Tor users operate as a relay and enforce
- rate limiting for relayed traffic without impacting their locally
- initiated traffic.
-
-Motivation:
-
- Right now we encourage people who use Tor as a client to configure it
- as a relay too ("just click the button in Vidalia"). Most of these users
- are on asymmetric links, meaning they have a lot more download capacity
- than upload capacity. But if they enable rate limiting too, suddenly
- they're limited to the same download capacity as upload capacity. And
- they have to enable rate limiting, or their upstream pipe gets filled
- up, starts dropping packets, and now their net connection doesn't work
- even for non-Tor stuff. So they end up turning off the relaying part
- so they can use Tor (and other applications) again.
-
- So far this hasn't mattered that much: most of our fast relays are
- being operated only in relay mode, so the rate limiting makes sense
- for them. But if we want to be able to attract many more relays in
- the future, we need to let ordinary users act as relays too.
-
- Further, as we begin to deploy the blocking-resistance design and we
- rely on ordinary users to click the "Tor for Freedom" button, this
- limitation will become a serious stumbling block to getting volunteers
- to act as bridges.
-
-The problem:
-
- Tor implements its rate limiting on the 'read' side by only reading
- a certain number of bytes from the network in each second. If it has
- emptied its token bucket, it doesn't read any more from the network;
- eventually TCP notices and stalls until we resume reading. But if we
- want to have two classes of service, we can't know what class a given
- incoming cell will be until we look at it, at which point we've already
- read it.
-
-Some options:
-
- Option 1: read when our token bucket is full enough, and if it turns
- out that what we read was local traffic, then add the tokens back into
- the token bucket. This will work when local traffic load alternates
- with relayed traffic load; but it's a poor option in general, because
- when we're receiving both local and relayed traffic, there are plenty
- of cases where we'll end up with an empty token bucket, and then we're
- back where we were before.
-
- More generally, notice that our problem is easy when a given TCP
- connection either has entirely local circuits or entirely relayed
- circuits. In fact, even if they are both present, if one class is
- entirely idle (none of its circuits have sent or received in the past
- N seconds), we can ignore that class until it wakes up again. So it
- only gets complex when a single connection contains active circuits
- of both classes.
-
- Next, notice that local traffic uses only the entry guards, whereas
- relayed traffic likely doesn't. So if we're a bridge handling just
- a few users, the expected number of overlapping connections would be
- almost zero, and even if we're a full relay the number of overlapping
- connections will be quite small.
-
- Option 2: build separate TCP connections for local traffic and for
- relayed traffic. In practice this will actually only require a few
- extra TCP connections: we would only need redundant TCP connections
- to at most the number of entry guards in use.
-
- However, this approach has some drawbacks. First, if the remote side
- wants to extend a circuit to you, how does it know which TCP connection
- to send it on? We would need some extra scheme to label some connections
- "client-only" during construction. Perhaps we could do this by seeing
- whether any circuit was made via CREATE_FAST; but this still opens
- up a race condition where the other side sends a create request
- immediately. The only ways I can imagine to avoid the race entirely
- are to specify our preference in the VERSIONS cell, or to add some
- sort of "nope, not this connection, why don't you try another rather
- than failing" response to create cells, or to forbid create cells on
- connections that you didn't initiate and on which you haven't seen
- any circuit creation requests yet -- this last one would lead to a bit
- more connection bloat but doesn't seem so bad. And we already accept
- this race for the case where directory authorities establish new TCP
- connections periodically to check reachability, and then hope to hang
- up on them soon after. (In any case this issue is moot for bridges,
- since each destination will be one-way with respect to extend requests:
- either receiving extend requests from bridge users or sending extend
- requests to the Tor server, never both.)
-
- The second problem with option 2 is that using two TCP connections
- reveals that there are two classes of traffic (and probably quickly
- reveals which is which, based on throughput). Now, it's unclear whether
- this information is already available to the other relay -- he would
- easily be able to tell that some circuits are fast and some are rate
- limited, after all -- but it would be nice to not add even more ways to
- leak that information. Also, it's less clear that an external observer
- already has this information if the circuits are all bundled together,
- and for this case it's worth trying to protect it.
-
- Option 3: tell the other side about our rate limiting rules. When we
- establish the TCP connection, specify the different policy classes we
- have configured. Each time we extend a circuit, specify which policy
- class that circuit should be part of. Then hope the other side obeys
- our wishes. (If he doesn't, hang up on him.) Besides the design and
- coordination hassles involved in this approach, there's a big problem:
- our rate limiting classes apply to all our connections, not just
- pairwise connections. How does one server we're connected to know how
- much of our bucket has already been spent by another? I could imagine
- a complex and inefficient "ok, now you can send me those two more cells
- that you've got queued" protocol. I'm not sure how else we could do it.
-
- (Gosh. How could UDP designs possibly be compatible with rate limiting
- with multiple bucket sizes?)
-
- Option 4: put both classes of circuits over a single connection, and
- keep track of the last time we read or wrote a high-priority cell. If
- it's been less than N seconds, give the whole connection high priority,
- else give the whole connection low priority.
-
- Option 5: put both classes of circuits over a single connection, and
- play a complex juggling game by periodically telling the remote side
- what rate limits to set for that connection, so you end up giving
- priority to the right connections but still stick to roughly your
- intended bandwidthrate and relaybandwidthrate.
-
- Option 6: ?
-
-Prognosis:
-
- Nick really didn't like option 2 because of the partitioning questions.
-
- I've put option 4 into place as of Tor 0.2.0.3-alpha.
-
- In terms of implementation, it will be easy: just add a time_t to
- or_connection_t that specifies client_used (used by the initiator
- of the connection to rate limit it differently depending on how
- recently the time_t was reset). We currently update client_used
- in three places:
- - command_process_relay_cell() when we receive a relay cell for
- an origin circuit.
- - relay_send_command_from_edge() when we send a relay cell for
- an origin circuit.
- - circuit_deliver_create_cell() when we send a create cell.
- We could probably remove the third case and it would still work,
- but hey.
-
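[Illustration, not part of the proposal: a minimal sketch of option 4 from
proposal 111, in which a connection gets high priority if a locally
originated cell was read or written on it within the last N seconds. The
proposal does not give a value for N, so the default of 10 seconds is an
assumption, as are the class and method names. Only the client_used
timestamp comes from the text.]

```python
import time


class OrConnection:
    """Per-connection state for the "recently used by a client" rule."""

    def __init__(self, priority_window=10):
        self.client_used = 0.0          # last time a local cell was seen
        self.priority_window = priority_window  # N seconds (assumed value)

    def note_local_cell(self, now=None):
        """Record that a cell for an origin circuit was sent or received."""
        self.client_used = time.time() if now is None else now

    def is_high_priority(self, now=None):
        """True if local traffic used this connection under N seconds ago."""
        now = time.time() if now is None else now
        return (now - self.client_used) < self.priority_window
```

A rate limiter could consult is_high_priority() to decide whether the
connection draws from the global bucket or from the stricter relayed
traffic bucket.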
diff --git a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt b/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
deleted file mode 100644
index 3f6c3376f..000000000
--- a/doc/spec/proposals/112-bring-back-pathlencoinweight.txt
+++ /dev/null
@@ -1,163 +0,0 @@
-Filename: 112-bring-back-pathlencoinweight.txt
-Title: Bring Back Pathlen Coin Weight
-Author: Mike Perry
-Created:
-Status: Superseded
-Superseded-By: 115
-
-
-Overview:
-
- The idea is that users should be able to choose a weight which
- probabilistically chooses their path lengths to be 2 or 3 hops. This
- weight will essentially be a biased coin that indicates an
- additional hop (beyond 2) with probability P. The user should be
- allowed to choose 0 for this weight to always get 2 hops and 1 to
- always get 3.
-
- This value should be modifiable from the controller, and should be
- available from Vidalia.
-
-
-Motivation:
-
- The Tor network is slow and overloaded. Increasingly often I hear
- stories about friends and friends of friends who are behind firewalls,
- annoying censorware, or under surveillance that interferes with their
- productivity and Internet usage, or chills their speech. These people
- know about Tor, but they choose to put up with the censorship because
- Tor is too slow to be usable for them. In fact, to download a fresh,
- complete copy of levine-timing.pdf for the Anonymity Implications
- section of this proposal over Tor took me 3 tries.
-
- There are many ways to improve the speed problem, and of course we
- should and will implement as many as we can. Johannes's GSoC project
- and my reputation system are longer term, higher-effort things that
- will still provide benefit independent of this proposal.
-
- However, reducing the path length to 2 for those who do not need the
- (questionable) extra anonymity 3 hops provide not only improves
- their Tor experience but also reduces their load on the Tor network by
- 33%, and can be done in less than 10 lines of code. That's not just
- Win-Win, it's Win-Win-Win.
-
- Furthermore, when blocking resistance measures insert an extra relay
- hop into the equation, 4 hops will certainly be completely unusable
- for these users, especially since it will be considerably more
- difficult to balance the load across a dark relay net than balancing
- the load on Tor itself (which today is still not without its flaws).
-
-
-Anonymity Implications:
-
- It has long been established that timing attacks against mixed
- networks are extremely effective, and that regardless of path
- length, if the adversary has compromised the first and last
- hops of your path, you can assume they have compromised your
- identity for that connection.
-
- In [1], it is demonstrated that for all but the slowest, lossiest
- networks, error rates for false positives and false negatives were
- very near zero. Only for constant streams of traffic over slow and
- (more importantly) extremely lossy network links did the error rate
- hit 20%. For loss rates typical to the Internet, even the error rate
- for slow nodes with constant traffic streams was 13%.
-
- When you take into account that most Tor streams are not constant,
- but probably much more like their "HomeIP" dataset, which consists
- mostly of web traffic that exists over finite intervals at specific
- times, error rates drop to fractions of 1%, even for the "worst"
- network nodes.
-
- Therefore, the user has little benefit from the extra hop, assuming
- the adversary does timing correlation on their nodes. The real
- protection is the probability of getting both the first and last hop,
- and this is constant whether the client chooses 2 hops, 3 hops, or 42.
-
- Partitioning attacks form another concern. Since Tor uses telescoping
- to build circuits, it is possible to tell a user is constructing only
- two hop paths at the entry node. It is questionable if this data is
- actually worth anything though, especially if the majority of users
- have easy access to this option, and do actually choose their path
- lengths semi-randomly.
-
- Nick has postulated that exits may also be able to tell that you are
- using only 2 hops by the amount of time between sending their
- RELAY_CONNECTED cell and the first bit of RELAY_DATA traffic they
- see from the OP. I doubt that they will be able to make much use
- of this timing pattern, since it will likely vary widely depending
- upon the type of node selected for that first hop, and the user's
- connection rate to that first hop. It is also questionable if this
- data is worth anything, especially if many users are using this
- option (and I imagine many will).
-
- Perhaps most seriously, two hop paths do allow malicious guards
- to easily fail circuits if they do not extend to their colluding peers
- for the exit hop. Since guards can detect the number of hops in a
- path, they could always fail the 3 hop circuits and focus on
- selectively failing the two hop ones until a peer was chosen.
-
- I believe currently guards are rotated if circuits fail, which does
- provide some protection, but this could be changed so that an entry
- guard is completely abandoned after a certain ratio of extend or
- general circuit failures with respect to non-failed circuits. This
- could possibly be gamed to increase guard turnover, but such a game
- would be much more noticeable than an individual guard failing
- circuits, since it would affect all clients, not just those who chose
- a particular guard.
-
-
-Why not fix Pathlen=2?:
-
- The main reason I am not advocating that we always use 2 hops is that
- in some situations, timing correlation evidence by itself may not be
- considered as solid and convincing as an actual, uninterrupted, fully
- traced path. Are these timing attacks as effective on a real network
- as they are in simulation? Would an extralegal adversary or authoritarian
- government even care? In the face of these situation-dependent unknowns,
- it should be up to the user to decide if this is a concern for them or not.
-
- It should probably also be noted that even a false positive
- rate of 1% for a 200k concurrent-user network could mean that for a
- given node, a given stream could be confused with something like 10
- users, assuming ~200 nodes carry most of the traffic (ie 1000 users
- each). Though of course to really know for sure, someone needs to do
- an attack on a real network, unfortunately.
-
-
-Implementation:
-
- new_route_len() can be modified directly with a check of the
- PathlenCoinWeight option (converted to percent) and a call to
- crypto_rand_int(0,100) for the weighted coin.
-
- The entry_guard_t structure could have num_circ_failed and
- num_circ_succeeded members such that if it exceeds N% circuit
- extend failure rate to a second hop, it is removed from the entry list.
- N should be sufficiently high to avoid churn from normal Tor circuit
- failure as determined by TorFlow scans.
-
- The Vidalia option should be presented as a boolean, to minimize confusion
- for the user. Something like a radiobutton with:
-
- * "I use Tor for Censorship Resistance, not Anonymity. Speed is more
- important to me than Anonymity."
- * "I use Tor for Anonymity. I need extra protection at the cost of speed."
-
- and then some explanation in the help for exactly what this means, and
- the risks involved with eliminating the adversary's need for timing attacks
- with respect to false positives, etc.
-
-Migration:
-
- Phase one: Experiment with the proper ratio of circuit failures
- used to expire garbage or malicious guards via TorFlow.
-
- Phase two: Re-enable config and modify new_route_len() to add an
- extra hop if coin comes up "heads".
-
- Phase three: Make radiobutton in Vidalia, along with help entry
- that explains in layman's terms the risks involved.
-
-
-[1] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
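The weighted coin and the guard failure counters described under
"Implementation" in proposal 112 above can be sketched as follows. This is a
minimal Python illustration, not Tor's C code: choose_route_len, GuardStats,
should_drop, and the min_samples threshold are hypothetical names and
parameters introduced here, and the failure cutoff N is left to the caller
because the proposal does not fix a value.

```python
import random
from dataclasses import dataclass


def choose_route_len(coin_weight: float, rng=random) -> int:
    """Return 2 or 3 hops; coin_weight is the probability P of a third hop."""
    if not 0.0 <= coin_weight <= 1.0:
        raise ValueError("coin_weight must be between 0 and 1")
    # rng.random() lies in [0, 1), so a weight of 0 always yields 2 hops
    # and a weight of 1 always yields 3 hops.
    return 3 if rng.random() < coin_weight else 2


@dataclass
class GuardStats:
    """Per-guard counters, mirroring the proposed num_circ_failed and
    num_circ_succeeded members of entry_guard_t."""
    num_circ_failed: int = 0
    num_circ_succeeded: int = 0

    def should_drop(self, max_failure_pct: float, min_samples: int = 20) -> bool:
        """True if the failure rate exceeds N% (max_failure_pct).

        min_samples is an assumption added here so that a guard is not
        dropped after only a handful of circuits.
        """
        total = self.num_circ_failed + self.num_circ_succeeded
        if total < min_samples:
            return False
        return 100.0 * self.num_circ_failed / total > max_failure_pct
```

With a weight of P, the expected path length under this sketch is 2 + P hops.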
diff --git a/doc/spec/proposals/113-fast-authority-interface.txt b/doc/spec/proposals/113-fast-authority-interface.txt
deleted file mode 100644
index 8912b5322..000000000
--- a/doc/spec/proposals/113-fast-authority-interface.txt
+++ /dev/null
@@ -1,85 +0,0 @@
-Filename: 113-fast-authority-interface.txt
-Title: Simplifying directory authority administration
-Author: Nick Mathewson
-Created:
-Status: Superseded
-
-Overview
-
-The problem:
-
- Administering a directory authority is a pain: you need to go through
- emails and manually add new nodes as "named". When bad things come up,
- you need to mark nodes (or whole regions) as invalid, badexit, etc.
-
- This means that mostly, authority admins don't: only 2/4 current authority
- admins actually bind names or list bad exits, and those two have often
- complained about how annoying it is to do so.
-
- Worse, name binding is a common path, but it's a pain in the neck: nobody
- has done it for a couple of months.
-
-Digression: who knows what?
-
- It's trivial for Tor to automatically keep track of all of the
- following information about a server:
- name, fingerprint, IP, last-seen time, first-seen time, declared
- contact.
-
- All we need to have the administrator set is:
- - Is this name/fingerprint pair bound?
- - Is this fingerprint/IP a bad exit?
- - Is this fingerprint/IP an invalid node?
- - Is this fingerprint/IP to be rejected?
-
- The workflow for authority admins has two parts:
- - Periodically, go through tor-ops and add new names. This doesn't
- need to be done urgently.
- - Less often, mark badly behaved servers as badly behaved. This is more
- urgent.
-
-Possible solution #1: Web-interface for name binding.
-
- Deprecate use of the tor-ops mailing list; instead, have operators go to a
- webform and enter their server info. This would put the information in a
- standardized format, thus allowing quick, nearly-automated approval and
- reply.
-
-Possible solution #2: Self-binding names.
-
- Peter Palfrader has proposed that names be assigned automatically to nodes
- that have been up and running and valid for a while.
-
-Possible solution #3: Self-maintaining approved-routers file
-
- Mixminion alpha has a neat feature where whenever a new server is seen,
- a stub line gets added to a configuration file. For Tor, it could look
- something like this:
-
- ## First seen with this key on 2007-04-21 13:13:14
- ## Stayed up for at least 12 hours on IP 192.168.10.10
- #RouterName AAAABBBBCCCCDDDDEFEF
-
- (Note that the implementation needs to parse commented lines to make sure
- that it doesn't add duplicates, but that's not so hard.)
-
- To add a router as named, administrators would only need to uncomment the
- entry. This automatically maintained file could be kept separately from a
- manually maintained one.
-
- This could be combined with solution #2, such that Tor would do the hard
- work of uncommenting entries for routers that should get Named, but
- operators could override its decisions.
-
-Possible solution #4: A separate mailing list for authority operators.
-
- Right now, the tor-ops list is very high volume. There should be another
- list that's only for dealing with problems that need prompt action, like
- marking a router as !badexit.
-
-Resolution:
-
- Solution #2 is described in "Proposal 123: Naming authorities
- automatically create bindings", and that approach is implemented.
- There are remaining issues in the problem statement above that need
- their own solutions.
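The self-maintaining approved-routers file from "Possible solution #3" in
proposal 113 above can be sketched as follows. This is a minimal Python
illustration under stated assumptions: known_fingerprints and add_stub are
hypothetical helper names, and the file is modelled as a list of lines in
which "##" starts a pure comment and "#Name FINGERPRINT" is a commented-out
entry, as in the example given in the proposal.

```python
from datetime import datetime


def known_fingerprints(lines):
    """Collect fingerprints from both active and commented-out entries."""
    seen = set()
    for line in lines:
        text = line.strip()
        if not text or text.startswith("##"):
            continue  # blank line or explanatory comment
        parts = text.lstrip("#").split()
        if len(parts) == 2:  # "Name FINGERPRINT"
            seen.add(parts[1].upper())
    return seen


def add_stub(lines, nickname, fingerprint, first_seen, ip, uptime_hours):
    """Return lines plus a commented-out stub, unless the key is already listed."""
    if fingerprint.upper() in known_fingerprints(lines):
        return list(lines)
    stamp = first_seen.strftime("%Y-%m-%d %H:%M:%S")
    return list(lines) + [
        f"## First seen with this key on {stamp}",
        f"## Stayed up for at least {uptime_hours} hours on IP {ip}",
        f"#{nickname} {fingerprint.upper()}",
    ]
```

Parsing the commented-out entries as well as the active ones is what keeps
the file free of duplicates when the same key is seen again.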
diff --git a/doc/spec/proposals/114-distributed-storage.txt b/doc/spec/proposals/114-distributed-storage.txt
deleted file mode 100644
index 91a787d30..000000000
--- a/doc/spec/proposals/114-distributed-storage.txt
+++ /dev/null
@@ -1,439 +0,0 @@
-Filename: 114-distributed-storage.txt
-Title: Distributed Storage for Tor Hidden Service Descriptors
-Author: Karsten Loesing
-Created: 13-May-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Change history:
-
- 13-May-2007 Initial proposal
- 14-May-2007 Added changes suggested by Lasse Øverlier
- 30-May-2007 Changed descriptor format, key length discussion, typos
- 09-Jul-2007 Incorporated suggestions by Roger, added status of specification
- and implementation for upcoming GSoC mid-term evaluation
- 11-Aug-2007 Updated implementation statuses, included non-consecutive
- replication to descriptor format
- 20-Aug-2007 Renamed config option HSDir as HidServDirectoryV2
- 02-Dec-2007 Closed proposal
-
-Overview:
-
- The basic idea of this proposal is to distribute the tasks of storing and
- serving hidden service descriptors from currently three authoritative
- directory nodes among a large subset of all onion routers. The three
- reasons to do this are better robustness (availability), better
- scalability, and improved security properties. Further,
- this proposal suggests changes to the hidden service descriptor format to
- prevent new security threats coming from decentralization and to gain even
- better security properties.
-
-Status:
-
- As of December 2007, the new hidden service descriptor format is implemented
- and usable. However, servers and clients do not yet make use of descriptor
- cookies, because there are open usability issues of this feature that might
- be resolved in proposal 121. Further, hidden service directories do not
- perform replication by themselves, because (unauthorized) replica fetch
- requests would allow any attacker to fetch all hidden service descriptors in
- the system. As neither issue is critical to the functioning of v2
- descriptors and their distribution, this proposal is considered as Closed.
-
-Motivation:
-
- The current design of hidden services exhibits the following performance and
- security problems:
-
- First, the three hidden service authoritative directories constitute a
- performance bottleneck in the system. The directory nodes are responsible for
- storing and serving all hidden service descriptors. As of May 2007 there are
- about 1000 descriptors at a time, but this number is assumed to increase in
- the future. Further, there is no replication protocol for descriptors between
- the three directory nodes, so that hidden services must ensure the
- availability of their descriptors by manually publishing them on all
- directory nodes. Whenever a fourth or fifth hidden service authoritative
- directory is added, hidden services will need to maintain an equally
- increasing number of replicas. These scalability issues have an impact on the
- current usage of hidden services and put an even higher burden on the
- development of new kinds of applications for hidden services that might
- require storing even more descriptors.
-
- Second, besides posing a limitation to scalability, storing all hidden
- service descriptors on three directory nodes also constitutes a security
- risk. The directory node operators could easily analyze the publish and fetch
- requests to derive information on service activity and usage and read the
- descriptor contents to determine which onion routers work as introduction
- points for a given hidden service and need to be attacked or threatened to
- shut it down. Furthermore, the contents of a hidden service descriptor offer
- only minimal security properties to the hidden service. Whoever becomes aware of
- the service ID can easily find out whether the service is active at the
- moment and which introduction points it has. This applies to (former)
- clients, (former) introduction points, and of course to the directory nodes.
- It only requires requesting the descriptor for the given service ID, which
- can be done by anyone anonymously.
-
- This proposal suggests two major changes to approach the described
- performance and security problems:
-
- The first change affects the storage location for hidden service descriptors.
- Descriptors are distributed among a large subset of all onion routers instead
- of three fixed directory nodes. Each storing node is responsible for a subset
- of descriptors for a limited time only. It is not able to choose which
- descriptors it stores at a certain time, because this is determined by its
- onion ID which is hard to change frequently and in time (only routers which
- are stable for a given time are accepted as storing nodes). In order to
- resist single node failures and untrustworthy nodes, descriptors are
- replicated among a certain number of storing nodes. A first replication
- protocol makes sure that descriptors don't get lost when the node population
- changes; therefore, a storing node periodically requests the descriptors from
- its siblings. A second replication protocol distributes descriptors among
- non-consecutive nodes of the ID ring to prevent a group of adversaries from
- generating new onion keys until they have consecutive IDs to create a 'black
- hole' in the ring and make random services unavailable. Connections to
- storing nodes are established by extending existing circuits by one hop to
- the storing node. This also ensures that contents are encrypted. The effect
- of this first change is that the probability that a single node operator
- learns about a certain hidden service is very small and that it is very hard
- to track a service over time, even when it collaborates with other node
- operators.
-
- The second change concerns the content of hidden service descriptors.
- Obviously, security problems cannot be solved only by decentralizing storage;
- in fact, they could also get worse if done without caution. First, a
- descriptor ID needs to change periodically in order to be stored on changing
- nodes over time. Next, the descriptor ID needs to be computable only for the
- service's clients, but should be unpredictable for all other nodes. Further,
- the storing node needs to be able to verify that the hidden service is the
- true originator of the descriptor with the given ID even though it is not a
- client. Finally, a storing node should learn as little information as
- necessary by storing a descriptor, because it might not be as trustworthy as
- a directory node; for example it does not need to know the list of
- introduction points. Therefore, a second key is applied that is only known to
- the hidden service provider and its clients and that is not included in the
- descriptor. It is used to calculate descriptor IDs and to encrypt the
- introduction points. This second key can either be given to all clients
- together with the hidden service ID, or to a group or a single client as
- an authentication token. In the future this second key could be the result of
- some key agreement protocol between the hidden service and one or more
- clients. A new text-based format is proposed for descriptors instead of an
- extension of the existing binary format for reasons of future extensibility.
-
-Design:
-
- The proposed design is described by the required changes to the current
- design. These requirements are grouped by content, rather than by affected
- specification documents or code files, and numbered for reference below.
-
- Hidden service clients, servers, and directories:
-
- /1/ Create routing list
-
- All participants can filter the consensus status document received from the
- directory authorities to one routing list containing only those servers
- that store and serve hidden service descriptors and that have been running
- for at least 24 hours. A participant only trusts its own routing list and never
- learns about routing information from other parties.
-
- /2/ Determine responsible hidden service directory
-
- All participants can determine the hidden service directory that is
- responsible for storing and serving a given ID, as well as the hidden
- service directories that replicate its content. Every hidden service
- directory is responsible for the descriptor IDs in the interval from
- its predecessor, exclusive, to its own ID, inclusive. Further, a hidden
- service directory holds replicas for its n predecessors, where n denotes
- the number of consecutive replicas. (requires /1/)
-
- [/3/ and /4/ were requirements to use BEGIN_DIR cells for directory
- requests. They have been fulfilled, though not in the course of
- implementing this proposal, but elsewhere.]
-
- Hidden service directory nodes:
-
- /5/ Advertise hidden service directory functionality
-
- Every onion router that has its directory port open can decide whether it
- wants to store and serve hidden service descriptors by setting a new config
- option "HidServDirectoryV2" 0|1 to 1. An onion router with this config
- option being set includes the flag "hidden-service-dir" in its router
- descriptors that it sends to directory authorities.
-
- /6/ Accept v2 publish requests, parse and store v2 descriptors
-
- Hidden service directory nodes accept publish requests for hidden service
- descriptors and store them to their local memory. (It is not necessary to
- make descriptors persistent, because after disconnecting, the onion router
- would not be accepted as storing node anyway, because it has not been
- running for at least 24 hours.) All requests and replies are formatted as
- HTTP messages. Requests are directed to the router's directory port and are
- contained within BEGIN_DIR cells. A hidden service directory node stores a
- descriptor only when it thinks that it is responsible for storing that
- descriptor based on its own routing table. Every hidden service directory
- node is responsible for the descriptor IDs in the interval of its n-th
- predecessor in the ID circle up to its own ID (n denotes the number of
- consecutive replicas). (requires /1/)
-
- /7/ Accept v2 fetch requests
-
- Same as /6/, but with fetch requests for hidden service descriptors.
- (requires /2/)
-
- /8/ Replicate descriptors with neighbors
-
- A hidden service directory node replicates descriptors from its two
- predecessors by downloading them once an hour. Further, it checks its
- routing table periodically for changes. Whenever it realizes that a
- predecessor has left the network, it establishes a connection to the new
- n-th predecessor and requests its stored descriptors in the interval of its
- (n+1)-th predecessor and the requested n-th predecessor. Whenever it
- realizes that a new onion router has joined with an ID higher than its
- former n-th predecessor, it adds it to its predecessors and discards all
- descriptors in the interval of its (n+1)-th and its n-th predecessor.
- (requires /1/)
-
- [Dec 02: This function has not been implemented, because arbitrary nodes
- would have been able to download the entire set of v2 descriptors. An
- authorized replication request would be necessary. For the moment, the
- system runs without any directory-side replication. -KL]
-
- Authoritative directory nodes:
-
- /9/ Confirm a router's hidden service directory functionality
-
- Directory nodes include a new flag "HSDir" for routers that decided to
- provide storage for hidden service descriptors and that have been running for at
- least 24 hours. The last requirement prevents a node from frequently
- changing its onion key to become responsible for an identifier it wants to
- target.
-
- Hidden service provider:
-
- /10/ Configure v2 hidden service
-
- Each hidden service provider that has set the config option
- "PublishV2HidServDescriptors" 0|1 to 1 is configured to publish v2
- descriptors and conform to the v2 connection establishment protocol. When
- configuring a hidden service, a hidden service provider checks if it has
- already created a random secret_cookie and a hostname2 file; if not, it
- creates both of them. (requires /2/)
-
- /11/ Establish introduction points with fresh key
-
- If configured to publish only v2 descriptors and no v0/v1 descriptors any
- more, a hidden service provider that is setting up the hidden service at
- introduction points does not pass its own public key, but the public key
- of a freshly generated key pair. It also includes these fresh public keys
- in the hidden service descriptor together with the other introduction point
- information. The reason is that the introduction point does not need to and
- therefore should not know for which hidden service it works, so as to
- prevent it from tracking the hidden service's activity. (If a hidden
- service provider supports both v0/v1 and v2 descriptors, v0/v1 clients
- rely on the fact that all introduction points accept the same public key,
- so that this new feature cannot be used.)
-
- /12/ Encode v2 descriptors and send v2 publish requests
-
- If configured to publish v2 descriptors, a hidden service provider
- publishes a new descriptor whenever its content changes or a new
- publication period starts for this descriptor. If the current publication
- period would only last for less than 60 minutes (= 2 x 30 minutes to allow
- the server to be 30 minutes behind and the client 30 minutes ahead), the
- hidden service provider publishes both a current descriptor and one for
- the next period. Publication is performed by sending the descriptor to all
- hidden service directories that are responsible for keeping replicas for
- the descriptor ID. This includes two non-consecutive replicas that are
- stored at 3 consecutive nodes each. (requires /1/ and /2/)
-
- Hidden service client:
-
- /13/ Send v2 fetch requests
-
- A hidden service client that has set the config option
- "FetchV2HidServDescriptors" 0|1 to 1 handles SOCKS requests for v2 onion
- addresses by requesting a v2 descriptor from a randomly chosen hidden
- service directory that is responsible for keeping a replica for the
- descriptor ID. In total there are six replicas of which the first and the
- last three are stored on consecutive nodes. The probability of picking one
- of the three consecutive replicas is 1/6, 2/6, and 3/6 to incorporate the
- fact that the availability will be the highest on the node with next higher
- ID. A hidden service client relies on the hidden service provider to store
- two sets of descriptors to compensate for clock skew between service and
- client. (requires /1/ and /2/)
-
- /14/ Process v2 fetch reply and parse v2 descriptors
-
- A hidden service client that has sent a request for a v2 descriptor can
- parse it and store it to the local cache of rendezvous service descriptors.
-
- /15/ Establish connection to v2 hidden service
-
- A hidden service client can establish a connection to a hidden service
- using a v2 descriptor. This includes using the secret cookie for decrypting
- the introduction points contained in the descriptor. When contacting an
- introduction point, the client does not use the public key of the hidden
- service provider, but the freshly-generated public key that is included in
- the hidden service descriptor. Whether or not a fresh key is used instead
- of the key of the hidden service depends on the available protocol versions
- that are included in the descriptor; by this, connection establishment is
- to a certain extent decoupled from fetching the descriptor.
-
- Hidden service descriptor:
-
- (Requirements concerning the descriptor format are contained in /6/ and /7/.)
-
- The new v2 hidden service descriptor format looks like this:
-
- onion-address = h(public-key) + cookie
- descriptor-id = h(h(public-key) + h(time-period + cookie + replica))
- descriptor-content = {
- descriptor-id,
- version,
- public-key,
- h(time-period + cookie + replica),
- timestamp,
- protocol-versions,
- { introduction-points } encrypted with cookie
- } signed with private-key
-
- The "descriptor-id" needs to change periodically in order for the
- descriptor to be stored on changing nodes over time. It may only be
- computable by a hidden service provider and all of his clients to prevent
- unauthorized nodes from tracking the service activity by periodically
- checking whether there is a descriptor for this service. Finally, the
- hidden service directory needs to be able to verify that the hidden service
- provider is the true originator of the descriptor with the given ID.
-
- Therefore, "descriptor-id" is derived from the "public-key" of the hidden
- service provider, the current "time-period" which changes every 24 hours,
- a secret "cookie" shared between hidden service provider and clients, and
- a "replica" denoting the number of this non-consecutive replica. (The
- "time-period" is constructed in a way that time periods do not change at
- the same moment for all descriptors by deriving a value between 0:00 and
- 23:59 hours from h(public-key) and making the descriptors of this hidden
- service provider expire at that time of the day.) The "descriptor-id" is
- defined to be 160 bits long. [extending the "descriptor-id" length
- suggested by LØ]
-
- Only the hidden service provider and the clients are able to generate
- future "descriptor-id"s. Hence, the "onion-address", which so far has been
- the hash value of "public-key", is extended by the secret "cookie". The
- hash of "public-key" is 80 bits long, whereas the "cookie" is dimensioned
- to be 120 bits long. This makes a total of 200 bits or 40 base32 chars,
- which is quite a lot to handle for a human, but necessary to provide
- sufficient protection against an adversary generating a key pair with the
- same "public-key" hash or guessing the "cookie".
-
- A hidden service directory can verify that a descriptor was created by the
- hidden service provider by checking if the "descriptor-id" corresponds to
- the "public-key" and if the signature can be verified with the
- "public-key".
-
- The "introduction-points" that are included in the descriptor are encrypted
- using the same "cookie" that is shared between hidden service provider and
- clients. [correction to use another key than h(time-period + cookie) as
- encryption key for introduction points made by LØ]
-
- A new text-based format is proposed for descriptors instead of an extension
- of the existing binary format for reasons of future extensibility.
-
-Security implications:
-
- The security implications of the proposed changes are grouped by the roles of
- nodes that could perform attacks or on which attacks could be performed.
-
- Attacks by authoritative directory nodes
-
- Authoritative directory nodes are no longer the single places in the
- network that know about a hidden service's activity and introduction
- points. Thus, they cannot perform attacks using this information, e.g.
- track a hidden service's activity or usage pattern or attack its
- introduction points. Formerly, it would only require a single corrupted
- authoritative directory operator to perform such an attack.
-
- Attacks by hidden service directory nodes
-
- A hidden service directory node could misuse a stored descriptor to track a
- hidden service's activity and usage pattern by clients. Though there is no
- countermeasure against this kind of attack, it is very expensive to track a
- certain hidden service over time. An attacker would need to run a large
- number of stable onion routers that work as hidden service directory nodes
- to have a good probability to become responsible for its changing
- descriptor IDs. For each period, the probability is:
-
- 1-(N-c choose r)/(N choose r) for N-c>=r, and 1 otherwise,
- with N as the total number of hidden service directories,
- c as the number of compromised nodes, and
- r as the number of replicas
-
- The hidden service directory nodes could try to make a certain hidden
- service unavailable to its clients. Therefore, they could discard all
- stored descriptors for that hidden service and reply to clients that there
- is no descriptor for the given ID or return an old or false descriptor
- content. The client would detect a false descriptor, because it could not
- contain a correct signature. But old content or an empty reply could
- confuse the client. Therefore, the countermeasure is to replicate
- descriptors among a small number of hidden service directories, e.g. 5.
- The probability that a group of collaborating nodes makes a hidden service
- completely unavailable is, in each period:
-
- (c choose r)/(N choose r) for c>=r and N>=r, and 0 otherwise,
- with N as the total number of hidden service directories,
- c as the number of compromised nodes, and
- r as the number of replicas
-
- A hidden service directory could try to find out which introduction points
- are working on behalf of a hidden service. In contrast to the previous
- design, this is not possible anymore, because this information is encrypted
- to the clients of a hidden service.
-
- Attacks on hidden service directory nodes
-
- An anonymous attacker could try to swamp a hidden service directory with
- false descriptors for a given descriptor ID. This is prevented by requiring
- that descriptors are signed.
-
- Anonymous attackers could swamp a hidden service directory with correct
- descriptors for non-existing hidden services. There is no countermeasure
- against this attack. However, the creation of valid descriptors is more
- expensive than verification and storage in local memory. This should make
- this kind of attack unattractive.
-
- Attacks by introduction points
-
- Current or former introduction points could try to gain information on the
- hidden service they serve. But due to the fresh key pair that is used by
- the hidden service, this attack is not possible anymore.
-
- Attacks by clients
-
- Current or former clients could track a hidden service's activity, attack
- its introduction points, or determine the responsible hidden service
- directory nodes and attack them. There is nothing that could prevent them
- from doing so, because honest clients need the full descriptor content to
- establish a connection to the hidden service. At the moment, the only
- countermeasure against dishonest clients is to change the secret cookie and
- pass it only to the honest clients.
-
-Compatibility:
-
- The proposed design is meant to replace the current design for hidden service
- descriptors and their storage in the long run.
-
- There should be a first transition phase in which both the current design
- and the proposed design are served in parallel. Onion routers should start
- serving as hidden service directories, and hidden service providers and
- clients should make use of the new design if both sides support it. Hidden
- service providers should be allowed to publish descriptors of the current
- format in parallel, and authoritative directories should continue storing and
- serving these descriptors.
-
- After the first transition phase, hidden service providers should stop
- publishing descriptors on authoritative directories, and hidden service
- clients should not try to fetch descriptors from the authoritative
- directories. However, the authoritative directories should continue serving
- hidden service descriptors for a second transition phase. As of this point,
- all v2 config options should be set to a default value of 1.
-
- After the second transition phase, the authoritative directories should stop
- serving hidden service descriptors.
-
diff --git a/doc/spec/proposals/115-two-hop-paths.txt b/doc/spec/proposals/115-two-hop-paths.txt
deleted file mode 100644
index 9854c9ad5..000000000
--- a/doc/spec/proposals/115-two-hop-paths.txt
+++ /dev/null
@@ -1,385 +0,0 @@
-Filename: 115-two-hop-paths.txt
-Title: Two Hop Paths
-Author: Mike Perry
-Created:
-Status: Dead
-Supersedes: 112
-
-
-Overview:
-
- The idea is that users should be able to choose if they would like
- to have either two or three hop paths through the tor network.
-
- Let us be clear: the users who would choose this option should be
- those that are concerned with IP obfuscation only: i.e., they would not be
- targets of a resource-intensive multi-node attack. It is sometimes said
- that these users should find some other network to use other than Tor.
- This is a foolish suggestion: more users improve the security of everyone,
- and the current small userbase size is a critical hindrance to
- anonymity, as is discussed below and in [1].
-
- This value should be modifiable from the controller, and should be
- available from Vidalia.
-
-
-Motivation:
-
- The Tor network is slow and overloaded. Increasingly often I hear
- stories about friends and friends of friends who are behind firewalls,
- annoying censorware, or under surveillance that interferes with their
- productivity and Internet usage, or chills their speech. These people
- know about Tor, but they choose to put up with the censorship because
- Tor is too slow to be usable for them. In fact, to download a fresh,
- complete copy of levine-timing.pdf for the Theoretical Argument
- section of this proposal over Tor took me 3 tries.
-
- Furthermore, the biggest current problem with Tor's anonymity for
- those who really need it is not someone attacking the network to
- discover who they are. It's instead the extreme danger that so few
- people use Tor because it's so slow, that those who do use it have
- essentially no confusion set.
-
- The recent case where the professor and the rogue Tor user were the
- only Tor users on campus, and thus suspected in an incident involving
- Tor and that University underscores this point: "That was why the police
- had come to see me. They told me that only two people on our campus were
- using Tor: me and someone they suspected of engaging in an online scam.
- The detectives wanted to know whether the other user was a former
- student of mine, and why I was using Tor"[1].
-
- Not only does Tor provide no anonymity if you use it to be anonymous
- but are obviously from a certain institution, location or circumstance,
- it is also dangerous to use Tor for risk of being accused of having
- something significant enough to hide to be willing to put up with
- the horrible performance as opposed to using some weaker alternative.
-
- There are many ways to improve the speed problem, and of course we
- should and will implement as many as we can. Johannes's GSoC project
- and my reputation system are longer term, higher-effort things that
- will still provide benefit independent of this proposal.
-
- However, reducing the path length to 2 for those who do not need the
- extra anonymity 3 hops provide not only improves their Tor experience
- but also reduces their load on the Tor network by 33%, and should
- increase adoption of Tor by a good deal. That's not just Win-Win, it's
- Win-Win-Win.
-
-
-Who will enable this option?
-
- This is the crux of the proposal. Admittedly, there is some anonymity
- loss and some degree of decreased investment required on the part of
- the adversary to attack 2 hop users versus 3 hop users, even if it is
- minimal and limited mostly to up-front costs and false positives.
-
- The key questions are:
-
- 1. Are these users in a class such that their risk is significantly
- less than the amount of this anonymity loss?
-
- 2. Are these users able to identify themselves?
-
- Many many users of Tor are not at risk for an adversary capturing c/n
- nodes of the network just to see what they do. These users use Tor to
- circumvent aggressive content filters, or simply to keep their IP out of
- marketing and search engine databases. Most content filters have no
- interest in running Tor nodes to catch violators, and marketers
- certainly would never consider such a thing, both on a cost basis and a
- legal one.
-
- In a sense, this represents an alternate threat model against these
- users who are not at risk for Tor's normal threat model.
-
- It should be evident to these users that they fall into this class. All
- that should be needed is a radio button
-
- * "I use Tor for local content filter circumvention and/or IP obfuscation,
- not anonymity. Speed is more important to me than high anonymity.
- No one will make considerable efforts to determine my real IP."
- * "I use Tor for anonymity and/or national-level, legally enforced
- censorship. It is possible effort will be taken to identify
- me, including but not limited to network surveillance. I need more
- protection."
-
- and then some explanation in the help for exactly what this means, and
- the risks involved with eliminating the adversary's need for timing
- attacks with respect to false positives. Ultimately, the decision is a
- simple one that can be made without this information, however. The user
- does not need Paul Syverson to instruct them on the deep magic of Onion
- Routing to make this decision. They just need to know why they use Tor.
- If they use it just to stay out of marketing databases and/or bypass a
- local content filter, two hops is plenty. This is likely the vast
- majority of Tor users, and many non-users we would like to bring on
- board.
-
- So, having established this class of users, let us now go on to
- examine theoretical and practical risks we place them at, and determine
- if these risks violate the users' needs, or introduce additional risk
- to node operators who may be subject to requests from law enforcement
- to track users who need 3 hops, but use 2 because they enjoy the
- thrill of Russian roulette.
-
-
-Theoretical Argument:
-
- It has long been established that timing attacks against mix
- and onion networks are extremely effective, and that regardless
- of path length, if the adversary has compromised your first and
- last hop of your path, you can assume they have compromised your
- identity for that connection.
-
- In fact, it was demonstrated that for all but the slowest, lossiest
- networks, error rates for false positives and false negatives were
- very near zero[2]. Only for constant streams of traffic over slow and
- (more importantly) extremely lossy network links did the error rate
- hit 20%. For loss rates typical to the Internet, even the error rate
- for slow nodes with constant traffic streams was 13%.
-
- When you take into account that most Tor streams are not constant,
- but probably much more like their "HomeIP" dataset, which consists
- mostly of web traffic that exists over finite intervals at specific
- times, error rates drop to fractions of 1%, even for the "worst"
- network nodes.
-
- Therefore, the user has little benefit from the extra hop, assuming
- the adversary does timing correlation on their nodes. Since timing
- correlation is simply an implementation issue and is most likely
- a single up-front cost (and one that is likely quite a bit cheaper
- than the cost of the machines purchased to host the nodes to mount
- an attack), the real protection is the low probability of getting
- both the first and last hop of a client's stream.
-
-
-Practical Issues:
-
- Theoretical issues aside, there are several practical issues with the
- implementation of Tor that need to be addressed to ensure that
- identity information is not leaked by the implementation.
-
- Exit policy issues:
-
- If a client chooses an exit with a very restrictive exit policy
- (such as an IP or IP range), the first hop then knows a good deal
- about the destination. For this reason, clients should not select
- exits that match their destination IP with anything other than "*".
-
- Partitioning:
-
- Partitioning attacks form another concern. Since Tor uses telescoping
- to build circuits, it is possible to tell a user is constructing only
- two hop paths at the entry node and on the local network. An external
- adversary can potentially differentiate 2 and 3 hop users, and decide
- that all IP addresses connecting to Tor and using 3 hops have something
- to hide, and should be scrutinized more closely or outright apprehended.
-
- One solution to this is to use the "leaky-circuit" method of attaching
- streams: The user always creates 3-hop circuits, but if the option
- is enabled, they always exit from their 2nd hop. The ideal solution
- would be to create a RELAY_SHISHKABOB cell which contains onion
- skins for every host along the path, but this requires protocol
- changes at the nodes to support.
-
- Guard nodes:
-
- Since guard nodes can rotate due to client relocation, network
- failure, node upgrades and other issues, if you amortize the risk a
- mobile, dialup, or otherwise intermittently connected user is exposed to
- over any reasonable duration of Tor usage (on the order of a year), it
- is the same with or without guard nodes. Assuming an adversary has
- c%/n% of network bandwidth, and guards rotate on average with period R,
- statistically speaking, it's merely a question of if the user wishes
- their risk to be concentrated with probability c/n over an expected
- period of R*c, and probability 0 over an expected period of R*(n-c),
- versus a continuous risk of (c/n)^2. So statistically speaking, guards
- only create a time-tradeoff of risk over the long run for normal Tor
- usage. Rotating guards do not reduce risk for normal client usage long
- term.[3]
-
- On the other hand, assuming a more stable method of guard selection
- and preservation is devised, or a more stable client side network than
- my own is typical (which rotates guards frequently due to network issues
- and moving about), guard nodes provide a tradeoff in the form of c/n% of
- the users being "sacrificial users" who are exposed to high risk O(c/n)
- of identification, while the rest of the network is exposed to zero
- risk.
-
- The nature of Tor makes it likely an adversary will take a "shock and
- awe" approach to suppressing Tor by rounding up a few users whose
- browsing activity has been observed to be made into examples, in an
- attempt to prove that Tor is not perfect.
-
- Since this "shock and awe" attack can be applied with or without guard
- nodes, stable guard nodes do offer a measure of accountability of sorts.
- If a user was using a small set of guard nodes and knows them well, and
- then is suddenly apprehended as a result of Tor usage, having a fixed
- set of entry points to suspect is a lot better than suspecting the whole
- network. Conversely, it can also give non-apprehended users comfort
- that they are likely to remain safe indefinitely with their set of (now
- presumably trusted) guards. This is probably the most beneficial
- property of reliable guards: they deter the adversary from mounting
- "shock and awe" attacks because the surviving users will not be
- intimidated, but instead made more confident. Of course, guards need to
- be made much more stable and users need to be encouraged to know their
- guards for this property to really take effect.
-
- This beneficial property of client vigilance also carries over to an
- active adversary, except in this case instead of relying on the user
- to remember their guard nodes and somehow communicate them after
- apprehension, the code can alert them to the presence of an active
- adversary before they are apprehended. But only if they use guard nodes.
-
- So let's consider the active adversary: Two hop paths allow malicious
- guards to get considerably more benefit from failing circuits if they do
- not extend to their colluding peers for the exit hop. Since guards can
- detect the number of hops in a path via either timing or by statistical
- analysis of the exit policy of the 2nd hop, they can perform this attack
- predominantly against 2 hop users.
-
- This can be addressed by completely abandoning an entry guard after a
- certain ratio of extend or general circuit failures with respect to
- non-failed circuits. The proper value for this ratio can be determined
- experimentally with TorFlow. There is the possibility that the local
- network can abuse this feature to cause certain guards to be dropped,
- but they can do that anyways with the current Tor by just making guards
- they don't like unreachable. With this mechanism, Tor will complain
- loudly if any guard failure rate exceeds the expected in any failure
- case, local or remote.
-
- Eliminating guards entirely would actually not address this issue due
- to the time-tradeoff nature of risk. In fact, it would just make it
- worse. Without guard nodes, it becomes much more difficult for clients
- to become alerted to Tor entry points that are failing circuits to make
- sure that they only devote bandwidth to carry traffic for streams for which
- they observe both ends. Yet the rogue entry points are still able to
- significantly increase their success rates by failing circuits.
-
- For this reason, guard nodes should remain enabled for 2 hop users,
- at least until an IP-independent, undetectable guard scanner can
- be created. TorFlow can scan for failing guards, but after a while,
- its unique behavior gives away the fact that its IP is a scanner and
- it can be given selective service.
-
- Consideration of risks for node operators:
-
- There is a serious risk for two hop users in the form of guard
- profiling. If an adversary running an exit node notices that a
- particular site is always visited from a fixed previous hop, it is
- likely that this is a two hop user using a certain guard, which could be
- monitored to determine their identity. Thus, for the protection of both
- 2 hop users and node operators, 2 hop users should limit their guard
- duration to a sufficient number of days to verify reliability of a node,
- but not much more. This duration can be determined experimentally by
- TorFlow.
-
- Considering a Tor client builds on average 144 circuits/day (10
- minutes per circuit), if the adversary owns c/n% of exits on the
- network, they can expect to see 144*c/n circuits from this user, or
- about 14 minutes of usage per day per percentage of network penetration.
- Since it will take several occurrences of user-linkable exit content
- from the same predecessor hop for the adversary to have any confidence
- this is a 2 hop user, it is very unlikely that any sort of demands made
- upon the predecessor node would be guaranteed to be effective (i.e., it
- actually was a guard), let alone be executed in time to apprehend the
- user before they rotated guards.
-
- The reverse risk also warrants consideration. If a malicious guard has
- orders to surveil Mike Perry, it can determine Mike Perry is using two
- hops by observing his tendency to choose a 2nd hop with a viable exit
- policy. This can be done relatively quickly, unfortunately, and
- indicates Mike Perry should spend some of his time building real 3 hop
- circuits through the same guards, to require them to at least wait for
- him to actually use Tor to determine his style of operation, rather than
- collect this information from his passive building patterns.
-
- However, to actively determine where Mike Perry is going, the guard
- will need to require logging ahead of time at multiple exit nodes that
- he may use over the course of the few days while he is at that guard,
- and correlate the usage times of the exit node with Mike Perry's
- activity at that guard for the few days he uses it. At this point, the
- adversary is mounting a scale and method of attack (widespread logging,
- timing attacks) that works pretty much just as effectively against 3
- hops, so exit node operators are exposed to no additional danger than
- they otherwise normally are.
-
-
-Why not fix Pathlen=2?:
-
- The main reason I am not advocating that we always use 2 hops is that
- in some situations, timing correlation evidence by itself may not be
- considered as solid and convincing as an actual, uninterrupted, fully
- traced path. Are these timing attacks as effective on a real network as
- they are in simulation? Maybe the circuit multiplexing of Tor can serve
- to frustrate them to a degree? Would an extralegal adversary or
- authoritarian government even care? In the face of these
- situation-dependent unknowns, it should be up to the user to decide if this is
- a concern for them or not.
-
- It should probably also be noted that even a false positive
- rate of 1% for a 200k concurrent-user network could mean that for a
- given node, a given stream could be confused with something like 10
- users, assuming ~200 nodes carry most of the traffic (i.e., 1000 users
- each). Though of course to really know for sure, someone needs to do
- an attack on a real network, unfortunately.
-
- Additionally, at some point cover traffic schemes may be implemented to
- frustrate timing attacks on the first hop. It is possible some expert
- users may do this ad-hoc already, and may wish to continue using 3 hops
- for this reason.
-
-
-Implementation:
-
- new_route_len() can be modified directly with a check of the
- Pathlen option. However, circuit construction logic should be
- altered so that both 2 hop and 3 hop users build the same types of
- circuits, and the option should ultimately govern circuit selection,
- not construction. This improves coverage against guard nodes being
- able to passively profile users who aren't even using Tor.
- PathlenCoinWeight, anyone? :)
-
- The exit policy hack is a bit more tricky. compare_addr_to_addr_policy
- needs to return an alternate ADDR_POLICY_ACCEPTED_WILDCARD or
- ADDR_POLICY_ACCEPTED_SPECIFIC return value for use in
- circuit_is_acceptable.
-
- The leaky exit is trickier still. handle_control_attachstream
- does allow paths to exit at a given hop. Presumably something similar
- can be done in connection_ap_handshake_process_socks, and elsewhere?
- Circuit construction would also have to be performed such that the
- 2nd hop's exit policy is what is considered, not the 3rd's.
-
- The entry_guard_t structure could have num_circ_failed and
- num_circ_succeeded members such that if it exceeds F% circuit
- extend failure rate to a second hop, it is removed from the entry list.
-
- F should be sufficiently high to avoid churn from normal Tor circuit
- failure as determined by TorFlow scans.
-
- The Vidalia option should be presented as a radio button.
-
-
-Migration:
-
- Phase 1: Adjust exit policy checks if Pathlen is set, implement leaky
- circuit ability, and 2-3 hop circuit selection logic governed by
- Pathlen.
-
- Phase 2: Experiment to determine the proper ratio of circuit
- failures used to expire garbage or malicious guards via TorFlow
- (pending Bug #440 backport+adoption).
-
- Phase 3: Implement guard expiration code to kick off failure-prone
- guards and warn the user. Cap 2 hop guard duration to a proper number
- of days determined sufficient to establish guard reliability (to be
- determined by TorFlow).
-
- Phase 4: Make radio button in Vidalia, along with help entry
- that explains in layman's terms the risks involved.
-
- Phase 5: Allow user to specify path length by HTTP URL suffix.
-
-
-[1] http://p2pnet.net/story/11279
-[2] http://www.cs.umass.edu/~mwright/papers/levine-timing.pdf
-[3] Proof available upon request ;)
diff --git a/doc/spec/proposals/116-two-hop-paths-from-guard.txt b/doc/spec/proposals/116-two-hop-paths-from-guard.txt
deleted file mode 100644
index f45625350..000000000
--- a/doc/spec/proposals/116-two-hop-paths-from-guard.txt
+++ /dev/null
@@ -1,118 +0,0 @@
-Filename: 116-two-hop-paths-from-guard.txt
-Title: Two hop paths from entry guards
-Author: Michael Lieberman
-Created: 26-Jun-2007
-Status: Dead
-
-This proposal is related to (but different from) Mike Perry's proposal 115
-"Two Hop Paths."
-
-Overview:
-
-Volunteers who run entry guards should have the option of using only 2
-additional tor nodes when constructing their own tor circuits.
-
-While the option of two hop paths should perhaps be extended to every client
-(as discussed in Mike Perry's thread), I believe the anonymity properties of
-two hop paths are particularly well-suited to client computers that are also
-serving as entry guards.
-
-First I will describe the details of the strategy, as well as possible
-avenues of attack. Then I will list advantages and disadvantages. Then, I
-will discuss some possibly safer variations of the strategy, and finally
-some implementation issues.
-
-Details:
-
-Suppose Alice is an entry guard, and wants to construct a two hop circuit.
-Alice chooses a middle node at random (not using the entry guard strategy),
-and gains anonymity by having her traffic look just like traffic from
-someone else using her as an entry guard.
-
-Can Alice's middle node figure out that she is the initiator of the traffic? I
-can think of four possible approaches for distinguishing traffic from Alice
-from traffic through Alice:
-
-1) Notice that communication from Alice comes too fast: Experimentation is
-needed to determine if traffic from Alice can be distinguished from traffic
-from a computer with a decent link to Alice.
-
-2) Monitor Alice's network traffic to discover the lack of incoming packets
-at the appropriate times. If an adversary has this ability, then Alice
-already has problems in the current system, because the adversary can run a
-standard timing attack on Alice's traffic.
-
-3) Notice that traffic from Alice is unique in some way such that if Alice
-was just one of 3 entry guards for this traffic, then the traffic should be
-coming from two other entry guards as well. An example of "unique traffic"
-could be always sending 117 packets every 3 minutes to an exit node that
-exits to port 4661. However, if such patterns existed with sufficient
-precision, then it seems to me that Tor already has a problem. (This "unique
-traffic" may not be a problem if clients often end up choosing a single
-entry guard because their other two are down. Does anyone know if this is
-the case?)
-
-4) First, control the middle node *and* some other part of the traffic,
-using standard attacks on a two hop circuit without entry nodes (my recent
-paper on Browser-Based Attacks would work well for this
-http://petworkshop.org/2007/papers/PET2007_preproc_Browser_based.pdf). With
-control of the circuit, we can now cause "unique traffic" as in 3).
-Alternatively, if we know something about Alice independently, and we can
-see what websites are being visited, we might be able to guess that she is
-the kind of person that would visit those websites.
-
-Anonymity Advantages:
-
--Alice never has the problem of choosing a malicious entry guard. In some
-sense, Alice acts as her own entry guard.
-
-Anonymity Disadvantages:
-
--If Alice's traffic is identified as originating from herself (see above for
-how hard that might be), then she has the anonymity of a 2 hop circuit
-without entry guards.
-
-Additional advantages:
-
--A discussion of the latency advantages of two hop circuits is going on in
-Mike Perry's thread already.
--Also, we can advertise this change as "Run an entry guard and decrease your
-own Tor latency." This incentive has the potential to add nodes to the
-network, improving the network as a whole.
-
-Safer variations:
-
-To solve the "unique traffic" problem, Alice could use two hop paths only
-1/3 of the time, and choose 2 other entry guards for the other 2/3 of the
-time. All the advantages are now 1/3 as useful (possibly more, if the other
-2 entry guards are not always up).
-
-To solve the problem that Alice's responses are too fast, Alice could delay
-her responses (ideally based on some real data of response time when Alice
-is used as an entry guard). This loses most of the speed advantages of the two
-hop path, but if Alice is a fast entry guard, it doesn't lose everything. It
-also still has the (arguable) anonymity advantage that Alice doesn't have to
-worry about having a malicious entry guard.
-
-Implementation details:
-For Alice to remain anonymous using this strategy, she has to actually be
-acting as an entry guard for other nodes. This means the two hop option can
-only be available to whatever high-performance threshold is currently set on
-entry guards. Alice may need to somehow check her own current status as an
-entry guard before choosing this two hop strategy.
-
-Another thing to consider: suppose Alice is also an exit node. If the
-fraction of exit nodes in existence is too small, she may rarely or never be
-chosen as an entry guard. It would be sad if we offered an incentive to run
-an entry guard that didn't extend to exit nodes. I suppose clients of Exit
-nodes could pull the same trick, and bypass using Tor altogether (zero hop
-paths), though that has additional issues.*
-
-Mike Lieberman
-MIT
-
-*Why we shouldn't recommend Exit nodes pull the same trick:
-1) Exit nodes would suffer heavily from the problem of "unique traffic"
-mentioned above.
-2) It would give governments an incentive to confiscate exit nodes to see if
-they are pulling this trick.
diff --git a/doc/spec/proposals/117-ipv6-exits.txt b/doc/spec/proposals/117-ipv6-exits.txt
deleted file mode 100644
index 00cd7cef1..000000000
--- a/doc/spec/proposals/117-ipv6-exits.txt
+++ /dev/null
@@ -1,410 +0,0 @@
-Filename: 117-ipv6-exits.txt
-Title: IPv6 exits
-Author: coderman
-Created: 10-Jul-2007
-Status: Accepted
-Target: 0.2.1.x
-
-Overview
-
- Extend Tor for TCP exit via IPv6 transport and DNS resolution of IPv6
- addresses. This proposal does not imply any IPv6 support for OR
- traffic, only exit and name resolution.
-
-
-Contents
-
-0. Motivation
-
- As the IPv4 address space becomes more scarce there is increasing
- effort to provide Internet services via the IPv6 protocol. Many
- hosts are available at IPv6 endpoints which are currently
- inaccessible for Tor users.
-
- Extending Tor to support IPv6 exit streams and IPv6 DNS name
- resolution will allow users of the Tor network to access these hosts.
- This capability would be present for those who do not currently have
- IPv6 access, thus increasing the utility of Tor and furthering
- adoption of IPv6.
-
-
-1. Design
-
-1.1. General design overview
-
- There are three main components to this proposal. The first is a
- method for routers to advertise their ability to exit IPv6 traffic.
- The second is the manner in which routers resolve names to IPv6
- addresses. Last but not least is the method in which clients
- communicate with Tor to resolve and connect to IPv6 endpoints
- anonymously.
-
-1.2. Router IPv6 exit support
-
- In order to specify exit policies and IPv6 capability new directives
- in the Tor configuration will be needed. If a router advertises IPv6
- exit policies in its descriptor this will signal the ability to
- provide IPv6 exit. There are a number of additional default deny
- rules associated with this new address space which are detailed in
- the addendum.
-
- When Tor is started on a host it should check for the presence of a
- global unicast IPv6 address and if present include the default IPv6
- exit policies and any user specified IPv6 exit policies.
-
- If a user provides IPv6 exit policies but no global unicast IPv6
- address is available Tor should generate a warning and not publish the
- IPv6 policies in the router descriptor.
-
- It should be noted that IPv4 mapped IPv6 addresses are not valid exit
- destinations. This mechanism is mainly used to interoperate with
- both IPv4 and IPv6 clients on the same socket. Any attempts to use
- an IPv4 mapped IPv6 address, perhaps to circumvent exit policy for
- IPv4, must be refused.
-
-1.3. DNS name resolution of IPv6 addresses (AAAA records)
-
- In addition to exit support for IPv6 TCP connections, a method to
- resolve domain names to their respective IPv6 addresses is also
- needed. This is accomplished in the existing DNS system via AAAA
- records. Routers will perform both A and AAAA requests when
- resolving a name so that the client can utilize an IPv6 endpoint when
- available or preferred.
-
- To avoid potential problems with caching DNS servers that behave
- poorly all NXDOMAIN responses to AAAA requests should be ignored if a
- successful response is received for an A request. This implies that
- both AAAA and A requests will always be performed for each name
- resolution.
-
- For reverse lookups on IPv6 addresses, like that used for
- RESOLVE_PTR, Tor will perform the necessary PTR requests via
- IP6.ARPA.
-
- All routers which perform DNS resolution on behalf of clients
- (RELAY_RESOLVE) should perform and respond with both A and AAAA
- resources.
-
- [NOTE: In a future version, when we extend the behavior of RESOLVE to
- encapsulate more of real DNS, it will make sense to allow more
- flexibility here. -nickm]
-
-1.4. Client interaction with IPv6 exit capability
-
-1.4.1. Usability goals
-
- There are a number of behaviors which Tor can provide when
- interacting with clients that will improve the usability of IPv6 exit
- capability. These behaviors are designed to make it simple for
- clients to express a preference for IPv6 transport and utilize IPv6
- host services.
-
-1.4.2. SOCKSv5 IPv6 client behavior
-
- The SOCKS version 5 protocol supports IPv6 connections. When using
- SOCKSv5 with hostnames it is difficult to determine if a client
- wishes to use an IPv4 or IPv6 address to connect to the desired host
- if it resolves to both address types.
-
- In order to make this more intuitive the SOCKSv5 protocol can be
- supported on a local IPv6 endpoint, [::1] port 9050 for example.
- When a client requests a connection to the desired host via an IPv6
- SOCKS connection Tor will prefer IPv6 addresses when resolving the
- host name and connecting to the host.
-
- Likewise, RESOLVE and RESOLVE_PTR requests from an IPv6 SOCKS
- connection will return IPv6 addresses when available, and fall back
- to IPv4 addresses if not.
-
- [NOTE: This means that SocksListenAddress and DNSListenAddress should
- support IPv6 addresses. Perhaps there should also be a general option
- to have listeners that default to 127.0.0.1 and 0.0.0.0 listen
- additionally or instead on ::1 and :: -nickm]
-
-1.4.3. MAPADDRESS behavior
-
- The MAPADDRESS capability supports clients that may not be able to
- use the SOCKSv4a or SOCKSv5 hostname support to resolve names via
- Tor. This ability should be extended to IPv6 addresses in SOCKSv5 as
- well.
-
- When a client requests an address mapping from the wildcard IPv6
- address, [::0], the server will respond with a unique local IPv6
- address on success. It is important to note that there may be two
- mappings for the same name if both an IPv4 and IPv6 address are
- associated with the host. In this case a CONNECT to a mapped IPv6
- address should prefer IPv6 for the connection to the host, if
- available, while CONNECT to a mapped IPv4 address will prefer IPv4.
-
- It should be noted that IPv6 does not provide the concept of a host
- local subnet, like 127.0.0.0/8 in IPv4. For this reason integration
- of Tor with IPv6 clients should consider a firewall or filter rule to
- drop unique local addresses to or from the network when possible.
- These packets should not be routed; even so, keeping them off the
- subnet entirely is worthwhile.
-
-1.4.3.1. Generating unique local IPv6 addresses
-
- The usual manner of generating a unique local IPv6 address is to
- select a Global ID part randomly, along with a Subnet ID, and sharing
- this prefix among the communicating parties who each have their own
- distinct Interface ID. In this style a given Tor instance might
- select a random Global and Subnet ID and provide MAPADDRESS
- assignments with a random Interface ID as needed. This has the
- potential to associate unique Global/Subnet identifiers with a given
- Tor instance and may expose attacks against the anonymity of Tor
- users.
-
- To avoid this potential problem entirely, MAPADDRESS must always
- generate the Global, Subnet, and Interface IDs randomly for each
- request. It is also highly suggested that explicitly specifying an
- IPv6 source address instead of the wildcard address not be supported
- to ensure that a good random address is used.
-
-1.4.4. DNSProxy IPv6 client behavior
-
- A new capability in recent Tor versions is the transparent DNS proxy.
- This feature will need to return both A and AAAA resource records
- when responding to client name resolution requests.
-
- The transparent DNS proxy should also support reverse lookups for
- IPv6 addresses. It is suggested that any such requests to the
- deprecated IP6.INT domain should be translated to IP6.ARPA instead.
- This translation is not likely to be used and is of low priority.
-
- It would be nice to support DNS over IPv6 transport as well, however,
- this is not likely to be used and is of low priority.
-
-1.4.5. TransPort IPv6 client behavior
-
- Tor also provides transparent TCP proxy support via the Trans*
- directives in the configuration. The TransListenAddress directive
- should accept an IPv6 address in addition to IPv4 so that IPv6 TCP
- connections can be transparently proxied.
-
-1.5. Additional changes
-
- The RedirectExit option should be deprecated rather than extending
- this feature to IPv6.
-
-
-2. Spec changes
-
-2.1. Tor specification
-
- In '6.2. Opening streams and transferring data' the following should
- be changed to indicate IPv6 exit capability:
-
- "No version of Tor currently generates the IPv6 format."
-
- In '6.4. Remote hostname lookup' the following should be updated to
- reflect use of ip6.arpa in addition to in-addr.arpa.
-
- "For a reverse lookup, the OP sends a RELAY_RESOLVE cell containing an
- in-addr.arpa address."
-
- In 'A.1. Differences between spec and implementation' the following
- should be updated to indicate IPv6 exit capability:
-
- "The current codebase has no IPv6 support at all."
-
- [NOTE: the EXITPOLICY end-cell reason says that it can hold an ipv4 or an
- ipv6 address, but doesn't say how. We may want a separate EXITPOLICY2
- type that can hold an ipv6 address, since the way we encode ipv6
- addresses elsewhere ("0.0.0.0 indicates that the next 16 bytes are ipv6")
- is a bit dumb. -nickm]
- [Actually, the length field lets us distinguish EXITPOLICY. -nickm]
-
-2.2. Directory specification
-
- In '2.1. Router descriptor format' a new set of directives is needed
- for IPv6 exit policy. The existing accept/reject directives should
- be clarified to indicate IPv4 or wildcard address relevance. The new
- IPv6 directives will be in the form of:
-
- "accept6" exitpattern NL
- "reject6" exitpattern NL
-
- The section describing accept6/reject6 should explain that the
- presence of accept6 or reject6 exit policies in a router descriptor
- signals the ability of that router to exit IPv6 traffic (according to
- IPv6 exit policies).
-
- The "[::]/0" notation is used to represent "all IPv6 addresses".
- "[::0]/0" may also be used for this representation.
-
- If a user specifies a 'reject6 [::]/0:*' policy in the Tor
- configuration this will be interpreted as forcing no IPv6 exit
- support and no accept6/reject6 policies will be included in the
- published descriptor. This will prevent IPv6 exit even if the router
- host has a global unicast IPv6 address present.
-
- It is important to note that a wildcard address in an accept or
- reject policy applies to both IPv4 and IPv6 addresses.
-
-2.3. Control specification
-
- In '3.8. MAPADDRESS' the potential to have two addresses for a given
- name should be explained. The method for generating unique local
- addresses for IPv6 mappings needs explanation as described above.
-
- When IPv6 addresses are used in this document they should include the
- brackets for consistency. For example, the null IPv6 address should
- be written as "[::0]" and not "::0". The control commands will
- expect the same syntax as well.
-
- In '3.9. GETINFO' the "address" command should return both public
- IPv4 and IPv6 addresses if present. These addresses should be
- separated via \r\n.
-
-
-2.4. Tor SOCKS extensions
-
- In '2. Name lookup' a description of IPv6 address resolution is
- needed for SOCKSv5 as described above. IPv6 addresses should be
- supported in both the RESOLVE and RESOLVE_PTR extensions.
-
- A new section describing the ability to accept SOCKSv5 clients on a
- local IPv6 address to indicate a preference for IPv6 transport as
- described above is also needed. The behavior of Tor SOCKSv5 proxy
- with an IPv6 preference should be explained, for example, preferring
- IPv6 transport to a named host with both IPv4 and IPv6 addresses
- available (A and AAAA records).
-
-
-3. Questions and concerns
-
-3.1. DNS A6 records
-
- A6 is explicitly avoided in this document. There are potential
- reasons for implementing it; however, the inherent complexity of
- the protocol and its resolvers makes this unappealing. Is there a
- compelling reason to consider A6 as part of IPv6 exit support?
-
- [IMO not till anybody needs it. -nickm]
-
-3.2. IPv4 and IPv6 preference
-
- The design above tries to infer a preference for IPv4 or IPv6
- transport based on client interactions with Tor. It might be useful
- to provide more explicit control over this preference. For example,
- an IPv4 SOCKSv5 client may want to use IPv6 transport to named hosts
- in CONNECT requests while the current implementation would assume an
- IPv4 preference. Should more explicit control be available, through
- either configuration directives or control commands?
-
- Many applications support an inet6-only or prefer-family type option
- that provides the user manual control over address preference. This
- could be provided as a Tor configuration option.
-
- An explicit preference is still possible by resolving names and then
- CONNECTing to an IPv4 or IPv6 address as desired, however, not all
- client applications may have this option available.
-
-3.3. Support for IPv6 only transparent proxy clients
-
- It may be useful to support IPv6 only transparent proxy clients using
- IPv4 mapped IPv6 like addresses. This would require transparent DNS
- proxy using IPv6 transport and the ability to map A record responses
- into IPv4 mapped IPv6 like addresses in the manner described in the
- "NAT-PT" RFC for a traditional Basic-NAT-PT with DNS-ALG. The
- transparent TCP proxy would thus need to detect these mapped addresses
- and connect to the desired IPv4 host.
-
- The IPv6 prefix used for this purpose must not be the actual IPv4
- mapped IPv6 address prefix, though the manner in which IPv4 addresses
- are embedded in IPv6 addresses would be the same.
-
- The lack of any IPv6 only hosts which would use this transparent proxy
- method makes this a lot of work for very little gain. Is there a
- compelling reason to support this NAT-PT like capability?
-
-3.4. IPv6 DNS and older Tor routers
-
- It is expected that many routers will continue to run with older
- versions of Tor when the IPv6 exit capability is released. Clients
- who wish to use IPv6 will need to route RELAY_RESOLVE requests to the
- newer routers which will respond with both A and AAAA resource
- records when possible.
-
- One way to do this is to route RELAY_RESOLVE requests to routers with
- IPv6 exit policies published, however, this would not utilize current
- routers that can resolve IPv6 addresses even if they can't exit such
- traffic.
-
- There was also concern expressed about the ability of existing clients
- to cope with new RELAY_RESOLVE responses that contain IPv6 addresses.
- If this breaks backward compatibility, a new request type may be
- necessary, like RELAY_RESOLVE6, or some other mechanism of indicating
- the ability to parse IPv6 responses when making the request.
-
-3.5. IPv4 and IPv6 bindings in MAPADDRESS
-
- It may be troublesome to try and support two distinct address mappings
- for the same name in the existing MAPADDRESS implementation. If this
- cannot be accommodated then the behavior should replace existing
- mappings with the new address regardless of family. A warning when
- this occurs would be useful to assist clients who encounter problems
- when both an IPv4 and IPv6 application are using MAPADDRESS for the
- same names concurrently, causing lost connections for one of them.
-
-4. Addendum
-
-4.1. Sample IPv6 default exit policy
-
- reject 0.0.0.0/8
- reject 169.254.0.0/16
- reject 127.0.0.0/8
- reject 192.168.0.0/16
- reject 10.0.0.0/8
- reject 172.16.0.0/12
- reject6 [0000::]/8
- reject6 [0100::]/8
- reject6 [0200::]/7
- reject6 [0400::]/6
- reject6 [0800::]/5
- reject6 [1000::]/4
- reject6 [4000::]/3
- reject6 [6000::]/3
- reject6 [8000::]/3
- reject6 [A000::]/3
- reject6 [C000::]/3
- reject6 [E000::]/4
- reject6 [F000::]/5
- reject6 [F800::]/6
- reject6 [FC00::]/7
- reject6 [FE00::]/9
- reject6 [FE80::]/10
- reject6 [FEC0::]/10
- reject6 [FF00::]/8
- reject *:25
- reject *:119
- reject *:135-139
- reject *:445
- reject *:1214
- reject *:4661-4666
- reject *:6346-6429
- reject *:6699
- reject *:6881-6999
- accept *:*
- # accept6 [2000::]/3:* is implied
-
-4.2. Additional resources
-
- 'DNS Extensions to Support IP Version 6'
- http://www.ietf.org/rfc/rfc3596.txt
-
- 'DNS Extensions to Support IPv6 Address Aggregation and Renumbering'
- http://www.ietf.org/rfc/rfc2874.txt
-
- 'SOCKS Protocol Version 5'
- http://www.ietf.org/rfc/rfc1928.txt
-
- 'Unique Local IPv6 Unicast Addresses'
- http://www.ietf.org/rfc/rfc4193.txt
-
- 'INTERNET PROTOCOL VERSION 6 ADDRESS SPACE'
- http://www.iana.org/assignments/ipv6-address-space
-
- 'Network Address Translation - Protocol Translation (NAT-PT)'
- http://www.ietf.org/rfc/rfc2766.txt
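Section 1.4.3.1 of proposal 117 above requires MAPADDRESS to draw the Global, Subnet, and Interface IDs at random for every request. The following is a minimal Python sketch of that rule, not Tor's implementation. It assumes the locally assigned RFC 4193 prefix fd00::/8 (fc00::/7 with the L bit set); the function name random_unique_local_address is hypothetical.

```python
import ipaddress
import secrets


def random_unique_local_address() -> ipaddress.IPv6Address:
    """Return a fresh random address inside fd00::/8.

    Assumption: locally assigned RFC 4193 addresses are used, so the
    first 8 bits are fixed to 0xfd. The remaining 120 bits cover the
    40-bit Global ID, the 16-bit Subnet ID and the 64-bit Interface ID,
    and all of them are redrawn on every call, so no part of the prefix
    is shared between two mappings.
    """
    random_bits = secrets.randbits(120)
    return ipaddress.IPv6Address((0xFD << 120) | random_bits)


def bracketed(address: ipaddress.IPv6Address) -> str:
    """Format the address with brackets, as the proposal asks for."""
    return f"[{address}]"
```

Drawing from a cryptographically secure source (here the secrets module) for every mapping is what keeps a stable Global/Subnet pair from identifying a particular Tor instance.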
diff --git a/doc/spec/proposals/118-multiple-orports.txt b/doc/spec/proposals/118-multiple-orports.txt
deleted file mode 100644
index 2381ec7ca..000000000
--- a/doc/spec/proposals/118-multiple-orports.txt
+++ /dev/null
@@ -1,84 +0,0 @@
-Filename: 118-multiple-orports.txt
-Title: Advertising multiple ORPorts at once
-Author: Nick Mathewson
-Created: 09-Jul-2007
-Status: Accepted
-Target: 0.2.1.x
-
-Overview:
-
- This document is a proposal for servers to advertise multiple
- address/port combinations for their ORPort.
-
-Motivation:
-
- Sometimes servers want to support multiple ports for incoming
- connections, either in order to support multiple address families, to
- better use multiple interfaces, or to support a variety of
- FascistFirewallPorts settings. This is easy to set up now, but
- there's no way to advertise it to clients.
-
-New descriptor syntax:
-
- We add a new line in the router descriptor, "or-address". This line
- can occur zero, one, or multiple times. Its format is:
-
- or-address SP ADDRESS ":" PORTLIST NL
-
- ADDRESS = IPV6ADDR / IPV4ADDR
- IPV6ADDR = an ipv6 address, surrounded by square brackets.
- IPV4ADDR = an ipv4 address, represented as a dotted quad.
- PORTLIST = PORTSPEC | PORTSPEC "," PORTLIST
- PORTSPEC = PORT | PORT "-" PORT
-
- [This is the regular format for specifying sets of addresses and
- ports in Tor.]
-
-New OR behavior:
-
- We add two more options to supplement ORListenAddress:
- ORPublishedListenAddress, and ORPublishAddressSet. The former
- listens on an address-port combination and publishes it in addition
- to the regular address. The latter advertises a set of address-port
- combinations, but does not listen on them. [To use this option, the
- server operator should set up port forwarding to the regular ORPort,
- as for example with firewall rules.]
-
- Servers should extend their testing to include advertised addresses
- and ports. No address or port should be advertised until it's been
- tested. [This might get expensive in practice.]
-
-New authority behavior:
-
- Authorities should spot-test descriptors, and reject any where a
- substantial part of the addresses can't be reached.
-
-New client behavior:
-
- When connecting to another server, clients SHOULD pick an
- address-port combination at random as supported by their
- ReachableAddresses. If a client has a connection to a server at one
- address, it SHOULD use that address for any simultaneous connections
- to that server. Clients SHOULD use the canonical address for any
- server when generating extend cells.
-
-Not addressed here:
-
- * There's no reason to listen on multiple dirports; current Tors
- mostly don't connect directly to the dirport anyway.
-
- * It could be advantageous to list something about extra addresses in
- the network-status document. This would, however, eat space there.
- More analysis is needed, particularly in light of proposal 141
- ("Download server descriptors on demand")
-
-Dependencies:
-
- Testing for canonical connections needs to be implemented before it's
- safe to use this proposal.
-
-
-Notes 3 July:
- - Write up the simple version of this. No ranges needed yet. No
- networkstatus changes yet.
-
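The "or-address" grammar in proposal 118 above can be illustrated with a small Python parser. This is a sketch under stated assumptions, not code from Tor: the function name parse_or_address and the choice to return port ranges as (low, high) tuples are hypothetical, and the sketch accepts ranges even though the closing note in the proposal says they are not needed yet.

```python
import ipaddress


def parse_or_address(line: str):
    """Parse one 'or-address SP ADDRESS ":" PORTLIST' descriptor line.

    Returns (address, ranges) where ranges is a list of inclusive
    (low, high) port tuples. Raises ValueError on malformed input.
    """
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "or-address" or not rest:
        raise ValueError("not an or-address line")

    # Split on the last colon so the colons inside a bracketed IPv6
    # address are not mistaken for the address/port separator.
    addr_text, sep, portlist = rest.rpartition(":")
    if not sep:
        raise ValueError("missing ':' before the port list")

    if addr_text.startswith("[") and addr_text.endswith("]"):
        address = ipaddress.IPv6Address(addr_text[1:-1])
    else:
        address = ipaddress.IPv4Address(addr_text)

    ranges = []
    for spec in portlist.split(","):
        low, dash, high = spec.partition("-")
        low_port = int(low)
        high_port = int(high) if dash else low_port
        if not 1 <= low_port <= high_port <= 65535:
            raise ValueError(f"bad port spec: {spec!r}")
        ranges.append((low_port, high_port))
    return address, ranges
```

For instance, parse_or_address("or-address 192.0.2.7:443") yields an IPv4Address together with the single range (443, 443).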
diff --git a/doc/spec/proposals/119-controlport-auth.txt b/doc/spec/proposals/119-controlport-auth.txt
deleted file mode 100644
index 9ed1cc1cb..000000000
--- a/doc/spec/proposals/119-controlport-auth.txt
+++ /dev/null
@@ -1,140 +0,0 @@
-Filename: 119-controlport-auth.txt
-Title: New PROTOCOLINFO command for controllers
-Author: Roger Dingledine
-Created: 14-Aug-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- Here we describe how to help controllers locate the cookie
- authentication file when authenticating to Tor, so we can a) require
- authentication by default for Tor controllers and b) still keep
- things usable. Also, we propose an extensible, general-purpose mechanism
- for controllers to learn about a Tor instance's protocol and
- authentication requirements before authenticating.
-
-The Problem:
-
- When we first added the controller protocol, we wanted to make it
- easy for people to play with it, so by default we didn't require any
- authentication from controller programs. We allowed requests only from
- localhost as a stopgap measure for security.
-
- Due to an increasing number of vulnerabilities based on this approach,
- it's time to add authentication in default configurations.
-
- We have a number of goals:
- - We want the default Vidalia bundles to transparently work. That
- means we don't want the users to have to type in or know a password.
- - We want to allow multiple controller applications to connect to the
- control port. So if Vidalia is launching Tor, it can't just keep the
- secrets to itself.
-
- Right now there are three authentication approaches supported
- by the control protocol: NULL, CookieAuthentication, and
- HashedControlPassword. See Sec 5.1 in control-spec.txt for details.
-
- There are a couple of challenges here. The first is: if the controller
- launches Tor, how should we teach Tor what authentication approach
- it should require, and the secret that goes along with it? Next is:
- how should this work when the controller attaches to an existing Tor,
- rather than launching Tor itself?
-
- Cookie authentication seems most amenable to letting multiple controller
- applications interact with Tor. But that brings in yet another question:
- how does the controller guess where to look for the cookie file,
- without first knowing what DataDirectory Tor is using?
-
-Design:
-
- We should add a new controller command PROTOCOLINFO that can be sent
- as a valid first command (the others being AUTHENTICATE and QUIT). If
- PROTOCOLINFO is sent as the first command, the second command must be
- either a successful AUTHENTICATE or a QUIT.
-
- If the initial command sequence is not valid, Tor closes the connection.
-
-
-Spec:
-
- C: "PROTOCOLINFO" *(SP PIVERSION) CRLF
- S: "250+PROTOCOLINFO" SP PIVERSION CRLF *InfoLine "250 OK" CRLF
-
- InfoLine = AuthLine / VersionLine / OtherLine
-
- AuthLine = "250-AUTH" SP "METHODS=" AuthMethod *("," AuthMethod)
- *(SP "COOKIEFILE=" AuthCookieFile) CRLF
- VersionLine = "250-VERSION" SP "Tor=" TorVersion [SP Arguments] CRLF
-
- AuthMethod =
- "NULL" / ; No authentication is required
- "HASHEDPASSWORD" / ; A controller must supply the original password
- "COOKIE" ; A controller must supply the contents of a cookie
-
- AuthCookieFile = QuotedString
- TorVersion = QuotedString
-
- OtherLine = "250-" Keyword [SP Arguments] CRLF
-
- For example:
-
- C: PROTOCOLINFO CRLF
- S: "250+PROTOCOLINFO 1" CRLF
- S: "250-AUTH METHODS=HASHEDPASSWORD,COOKIE COOKIEFILE="/tor/cookie"" CRLF
- S: "250-VERSION Tor="0.2.0.5-alpha"" CRLF
- S: "250 OK" CRLF
-
- Tor MAY give its InfoLines in any order; controllers MUST ignore InfoLines
- with keywords they do not recognize. Controllers MUST ignore extraneous
- data on any InfoLine.
-
- PIVERSION is there in case we drastically change the syntax one day. For
- now it should always be "1", for the controller protocol. Controllers MAY
- provide a list of the protocol versions they support; Tor MAY select a
- version that the controller does not support.
-
- Right now only two "topics" (AUTH and VERSION) are included, but more
- may be included in the future. Controllers must accept lines with
- unexpected topics.
-
- AuthMethod is used to specify one or more control authentication
- methods that Tor currently accepts.
-
- AuthCookieFile specifies the absolute path and filename of the
- authentication cookie that Tor is expecting and is provided iff
- the METHODS field contains the method "COOKIE". Controllers MUST handle
- escape sequences inside this string.
-
- The VERSION line contains the Tor version.
-
- [What else might we want to include that could be useful? -RD]
-
-Compatibility:
-
- Tor 0.1.2.16 and 0.2.0.4-alpha hang up after the first failed
- command. Earlier Tors don't know about this command but don't hang
- up. That means controllers will need a mechanism for distinguishing
- whether they're talking to a Tor that speaks PROTOCOLINFO or not.
-
- I suggest that the controllers attempt a PROTOCOLINFO. Then:
- - If it works, great. Authenticate as required.
- - If they get hung up on, reconnect and do a NULL AUTHENTICATE.
- - If it's unrecognized but they're not hung up on, do a NULL
- AUTHENTICATE.
-
-Unsolved problems:
-
- If Torbutton wants to be a Tor controller one day... talking TCP is
- bad enough, but reading from the filesystem is even harder. Is there
- a way to let simple programs work with the controller port without
- needing all the auth infrastructure?
-
- Once we put this approach in place, the next vulnerability we see will
- involve an attacker somehow getting read access to the victim's files
- --- and then we're back where we started. This means we still need
- to think about how to demand password-based authentication without
- bothering the user about it.
-
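The PROTOCOLINFO reply format in proposal 119 above can be read with a few lines of Python. This is a hedged sketch of a controller-side parser, not code from any real controller: the names parse_protocolinfo, _KV and _unquote are hypothetical, the reply is assumed to be already split into lines without CRLF, and escape handling is simplified to "backslash followed by any character means that character", which does not cover every escape a QuotedString may carry.

```python
import re

# KEY=value pairs, where value is either a quoted string (with
# backslash escapes) or a run of non-space characters.
_KV = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def _unquote(value: str) -> str:
    """Strip surrounding quotes and undo simple backslash escapes."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return re.sub(r'\\(.)', r'\1', value[1:-1])
    return value


def parse_protocolinfo(lines):
    """Extract auth methods, cookie file and Tor version from a reply."""
    info = {"methods": [], "cookiefile": None, "tor_version": None}
    for line in lines:
        fields = {k.upper(): _unquote(v) for k, v in _KV.findall(line)}
        if line.startswith("250-AUTH "):
            methods = fields.get("METHODS", "")
            info["methods"] = [m for m in methods.split(",") if m]
            info["cookiefile"] = fields.get("COOKIEFILE")
        elif line.startswith("250-VERSION "):
            info["tor_version"] = fields.get("TOR")
        # Any other InfoLine is skipped, since controllers MUST ignore
        # keywords they do not recognize.
    return info
```

Keys are compared case-insensitively and unknown lines fall through silently, which follows the requirement that controllers tolerate unexpected topics and extraneous data.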
diff --git a/doc/spec/proposals/120-shutdown-descriptors.txt b/doc/spec/proposals/120-shutdown-descriptors.txt
deleted file mode 100644
index 5cfe2b5bc..000000000
--- a/doc/spec/proposals/120-shutdown-descriptors.txt
+++ /dev/null
@@ -1,83 +0,0 @@
-Filename: 120-shutdown-descriptors.txt
-Title: Shutdown descriptors when Tor servers stop
-Author: Roger Dingledine
-Created: 15-Aug-2007
-Status: Dead
-
-[Proposal dead as of 11 Jul 2008. The point of this proposal was to give
-routers a good way to get out of the networkstatus early, but proposal
-138 (already implemented) has achieved this.]
-
-Overview:
-
- Tor servers should publish a last descriptor whenever they shut down,
- to let others know that they are no longer offering service.
-
-The Problem:
-
- The main reason for this is in reaction to Internet services that want
- to treat connections from the Tor network differently. Right now,
- if a user experiments with turning on the "relay" functionality, he
- is punished by being locked out of some websites, some IRC networks,
- etc --- and this lockout persists for several days even after he turns
- the server off.
-
-Design:
-
- During the "slow shutdown" period if exiting, or shortly after the
- user sets his ORPort back to 0 if not exiting, Tor should publish a
- final descriptor with the following characteristics:
-
- 1) Exit policy is listed as "reject *:*"
- 2) It includes a new entry called "opt shutdown 1"
-
- The first step is so current blacklists will no longer list this node
- as exiting to whatever the service is.
-
- The second step is so directory authorities can avoid wasting time
- doing reachability testing. Authorities should automatically not list
- as Running any router whose latest descriptor says it shut down.
-
- [I originally had in mind a third step --- Advertised bandwidth capacity
- is listed as "0" --- so current Tor clients will skip over this node
- when building most circuits. But since clients won't fetch descriptors
- from nodes not listed as Running, this step seems pointless. -RD]
-
-Spec:
-
- TBD but should be pretty straightforward.
-
-Security issues:
-
- Now external people can learn exactly when a node stopped offering
- relay service. How bad is this? I can see a few minor attacks based
- on this knowledge, but on the other hand as it is we don't really take
- any steps to keep this information secret.
-
-Overhead issues:
-
- We are creating more descriptors that want to be remembered. However,
- since the router won't be marked as Running, ordinary clients won't
- fetch the shutdown descriptors. Caches will, though. I hope this is ok.
-
-Implementation:
-
- To make things easy, we should publish the shutdown descriptor only
- on controlled shutdown (SIGINT as opposed to SIGTERM). That would
- leave enough time for publishing that we probably wouldn't need any
- extra synchronization code.
-
- If that turns out to be too unintuitive for users, I could imagine doing
- it on SIGTERMs too, and just delaying exit until we had successfully
- published to at least one authority, at which point we'd hope that it
- propagated from there.
-
-Acknowledgements:
-
- tup suggested this idea.
-
-Comments:
-
- 2) Maybe add a rule "Don't do this for hibernation if we expect to wake
- up before the next consensus is published"?
- - NM 9 Oct 2007
diff --git a/doc/spec/proposals/121-hidden-service-authentication.txt b/doc/spec/proposals/121-hidden-service-authentication.txt
deleted file mode 100644
index 0d92b53a8..000000000
--- a/doc/spec/proposals/121-hidden-service-authentication.txt
+++ /dev/null
@@ -1,776 +0,0 @@
-Filename: 121-hidden-service-authentication.txt
-Title: Hidden Service Authentication
-Author: Tobias Kamm, Thomas Lauterbach, Karsten Loesing, Ferdinand Rieger,
- Christoph Weingarten
-Created: 10-Sep-2007
-Status: Finished
-Implemented-In: 0.2.1.x
-
-Change history:
-
- 26-Sep-2007 Initial proposal for or-dev
- 08-Dec-2007 Incorporated comments by Nick posted to or-dev on 10-Oct-2007
- 15-Dec-2007 Rewrote complete proposal for better readability, modified
- authentication protocol, merged in personal notes
- 24-Dec-2007 Replaced misleading term "authentication" by "authorization"
- and added some clarifications (comments by Sven Kaffille)
- 28-Apr-2008 Updated most parts of the concrete authorization protocol
- 04-Jul-2008 Add a simple algorithm to delay descriptor publication for
- different clients of a hidden service
- 19-Jul-2008 Added INTRODUCE1V cell type (1.2), improved replay
- protection for INTRODUCE2 cells (1.3), described limitations
- for auth protocols (1.6), improved hidden service protocol
- without client authorization (2.1), added second, more
- scalable authorization protocol (2.2), rewrote existing
- authorization protocol (2.3); changes based on discussion
- with Nick
- 31-Jul-2008 Limit maximum descriptor size to 20 kilobytes to prevent
- abuse.
- 01-Aug-2008 Use first part of Diffie-Hellman handshake for replay
- protection instead of rendezvous cookie.
- 01-Aug-2008 Remove improved hidden service protocol without client
- authorization (2.1). It might get implemented in proposal
- 142.
-
-Overview:
-
- This proposal deals with a general infrastructure for performing
- authorization (not necessarily implying authentication) of requests to
- hidden services at three points: (1) when downloading and decrypting
- parts of the hidden service descriptor, (2) at the introduction point,
- and (3) at Bob's Tor client before contacting the rendezvous point. A
- service provider will be able to restrict access to his service at these
- three points to authorized clients only. Further, the proposal contains
- specific authorization protocols as instances that implement the
- presented authorization infrastructure.
-
- This proposal is based on v2 hidden service descriptors as described in
- proposal 114 and introduced in version 0.2.0.10-alpha.
-
- The proposal is structured as follows: The next section motivates the
- integration of authorization mechanisms in the hidden service protocol.
- Then we describe a general infrastructure for authorization in hidden
- services, followed by specific authorization protocols for this
- infrastructure. At the end we discuss a number of attacks and non-attacks
- as well as compatibility issues.
-
-Motivation:
-
- The majority of hidden services do not require client authorization
- now and won't do so in the future. To the contrary, many clients would
- not want to be (pseudonymously) identifiable by the service (though this
- is unavoidable to some extent), but rather use the service
- anonymously. These services are not addressed by this proposal.
-
- However, there may be certain services which are intended to be accessed
- by a limited set of clients only. A possible application might be a
- wiki or forum that should only be accessible for a closed user group.
- Another, less intuitive example might be a real-time communication
- service, where someone provides a presence and messaging service only to
- his buddies. Finally, a possible application would be a personal home
- server that should be remotely accessed by its owner.
-
- Performing authorization for a hidden service within the Tor network, as
- proposed here, offers a range of advantages compared to allowing all
- client connections in the first instance and deferring authorization to
- the transported protocol:
-
- (1) Reduced traffic: Unauthorized requests would be rejected as early as
- possible, thereby reducing the overall traffic in the network generated
- by establishing circuits and sending cells.
-
- (2) Better protection of service location: Unauthorized clients could not
- force Bob to create circuits to their rendezvous points, thus preventing
- the attack described by Øverlier and Syverson in their paper "Locating
- Hidden Servers" even without the need for guards.
-
- (3) Hiding activity: Apart from performing the actual authorization, a
- service provider could also hide the mere presence of his service from
- unauthorized clients when not providing hidden service descriptors to
- them, rejecting unauthorized requests already at the introduction
- point (ideally without leaking presence information at any of these
- points), or not answering unauthorized introduction requests.
-
- (4) Better protection of introduction points: When providing hidden
- service descriptors to authorized clients only and encrypting the
- introduction points as described in proposal 114, the introduction points
- would be unknown to unauthorized clients and thereby protected from DoS
- attacks.
-
- (5) Protocol independence: Authorization could be performed for all
- transported protocols, regardless of their own capabilities to do so.
-
- (6) Ease of administration: A service provider running multiple hidden
- services would be able to configure access at a single place uniformly
- instead of doing so for all services separately.
-
- (7) Optional QoS support: Bob could adapt his node selection algorithm
- for building the circuit to Alice's rendezvous point depending on a
- previously guaranteed QoS level, thus providing better latency or
- bandwidth for selected clients.
-
- A disadvantage of performing authorization within the Tor network is
- that a hidden service cannot make use of authorization data in
- the transported protocol. Tor hidden services were designed to be
- independent of the transported protocol. Therefore it's only possible to
- either grant or deny access to the whole service, but not to specific
- resources of the service.
-
- Authorization often implies authentication, i.e. proving one's identity.
- However, when performing authorization within the Tor network, untrusted
- points should not gain any useful information about the identities of
- communicating parties, neither server nor client. A crucial challenge is
- to remain anonymous towards directory servers and introduction points.
- However, trying to hide identity from the hidden service is a futile
- task, because a client would never know if he is the only authorized
- client and therefore perfectly identifiable. Therefore, hiding client
- identity from the hidden service is not an aim of this proposal.
-
- The current implementation of hidden services does not provide any kind
- of authorization. The hidden service descriptor version 2, introduced by
- proposal 114, was designed to use a descriptor cookie for downloading and
- decrypting parts of the descriptor content, but this feature is not yet
- in use. Further, most relevant cell formats specified in rend-spec
- contain fields for authorization data, but those fields are neither
- implemented nor do they suffice entirely.
-
-Details:
-
- 1. General infrastructure for authorization to hidden services
-
- We spotted three possible authorization points in the hidden service
- protocol:
-
- (1) when downloading and decrypting parts of the hidden service
- descriptor,
- (2) at the introduction point, and
- (3) at Bob's Tor client before contacting the rendezvous point.
-
- The general idea of this proposal is to allow service providers to
- restrict access to some or all of these points to authorized clients
- only.
-
- 1.1. Client authorization at directory
-
- Since the implementation of proposal 114 it is possible to combine a
- hidden service descriptor with a so-called descriptor cookie. If so,
- the descriptor cookie becomes part of the descriptor ID, thus having an
- effect on the storage location of the descriptor. Someone who has learned
- about a service, but is not aware of the descriptor cookie, won't be able
- to determine the descriptor ID and download the current hidden service
- descriptor; he won't even know whether the service has uploaded a
- descriptor recently. Descriptor IDs are calculated as follows (see
- section 1.2 of rend-spec for the complete specification of v2 hidden
- service descriptors):
-
- descriptor-id =
- H(service-id | H(time-period | descriptor-cookie | replica))
-
- Currently, service-id is equivalent to permanent-id which is calculated
- as in the following formula. But in principle it could be any public
- key.
-
- permanent-id = H(permanent-key)[:10]
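The two formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: H is taken to be SHA-1, "|" is taken to be plain byte concatenation, and the byte encodings of time-period and replica are left to the caller (rend-spec section 1.2 is the authority for all of these). The function names are hypothetical.

```python
import hashlib


def H(data: bytes) -> bytes:
    # Assumption: H is SHA-1, producing a 20-octet digest.
    return hashlib.sha1(data).digest()


def permanent_id(permanent_key: bytes) -> bytes:
    # permanent-id = H(permanent-key)[:10]
    return H(permanent_key)[:10]


def descriptor_id(service_id: bytes, time_period: bytes,
                  descriptor_cookie: bytes, replica: bytes) -> bytes:
    # descriptor-id =
    #   H(service-id | H(time-period | descriptor-cookie | replica))
    # Assumption: "|" means byte concatenation of already-encoded fields.
    inner = H(time_period + descriptor_cookie + replica)
    return H(service_id + inner)
```

Because the cookie feeds the inner hash, two parties that agree on the service ID but hold different cookies compute unrelated descriptor IDs, which is the property this section relies on.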
-
- The second purpose of the descriptor cookie is to encrypt the list of
- introduction points, including optional authorization data. Hence, the
- hidden service directories won't learn any introduction information from
- storing a hidden service descriptor. This feature is implemented but
- unused at the moment. So this proposal will harness the advantages
- of proposal 114.
-
- The descriptor cookie can be used for authorization by keeping it secret
- from everyone but authorized clients. A service could then decide whether
- to publish hidden service descriptors using that descriptor cookie later
- on. An authorized client being aware of the descriptor cookie would be
- able to download and decrypt the hidden service descriptor.
-
- The number of concurrently used descriptor cookies for one hidden service
- is not restricted. A service could use a single descriptor cookie for all
- users, a distinct cookie per user, or something in between, like one
- cookie per group of users. It is up to the specific protocol and how it
- is applied by a service provider.
-
- Two or more hidden service descriptors for different groups or users
- should not be uploaded at the same time. A directory node could conclude
- easily that the descriptors were issued by the same hidden service, thus
- being able to link the two groups or users. Therefore, descriptors for
- different users or clients that ought to be stored on the same directory
- are delayed, so that only one descriptor is uploaded to a directory at a
- time. The remaining descriptors are uploaded with a delay of up to
- 30 seconds.
- Further, descriptors for different groups or users that are to be stored
- on different directories are delayed for a random time of up to 30
- seconds to hide relations from colluding directories. Certainly, this
- does not prevent linking entirely, but it makes it somewhat harder.
- There is a conflict between hiding links between clients and making a
- service available in a timely manner.
-
- Although this part of the proposal is meant to describe a general
- infrastructure for authorization, changing the way of using the
- descriptor cookie to look up hidden service descriptors, e.g. applying
- some sort of asymmetric crypto system, would require in-depth changes
- that would be incompatible with v2 hidden service descriptors. In
- contrast, using another key for en-/decrypting the introduction point
- part of a hidden service descriptor, e.g. a different symmetric key or
- asymmetric encryption, would be easy to implement and compatible with v2
- hidden service descriptors as understood by hidden service directories
- (clients and services would have to be upgraded anyway for using the new
- features).
-
- An adversary could try to abuse the fact that introduction points can be
- encrypted by storing arbitrary, unrelated data in the hidden service
- directory. This abuse can be limited by setting a hard descriptor size
- limit, forcing the adversary to split data into multiple chunks. There
- are some limitations that make splitting data across multiple descriptors
- unattractive: 1) The adversary would not be able to choose descriptor IDs
- freely and would therefore have to implement his own indexing
- structure. 2) Validity of descriptors is limited to at most 24 hours
- after which descriptors need to be republished.
-
- The regular descriptor size in bytes is 745 + num_ipos * 837 + auth_data.
- A large descriptor with 7 introduction points and 5 kilobytes of
- authorization data would be 11724 bytes in size. The upper size limit of
- descriptors should be set to 20 kilobytes, which limits the effect of
- abuse while retaining enough flexibility in designing authorization
- protocols.
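The size estimate above can be checked with a short calculation. This sketch assumes that one kilobyte means 1024 bytes, which is the reading under which the 11724-byte figure comes out; the function and constant names are hypothetical.

```python
def descriptor_size(num_ipos: int, auth_data: int) -> int:
    # Regular descriptor size in bytes, per the formula in the text:
    # 745 + num_ipos * 837 + auth_data
    return 745 + num_ipos * 837 + auth_data


# Assumption: "20 kilobytes" is read as 20 * 1024 bytes.
SIZE_LIMIT = 20 * 1024

large = descriptor_size(num_ipos=7, auth_data=5 * 1024)
print(large, large <= SIZE_LIMIT)
```

With 7 introduction points and 5 kilobytes of authorization data the result stays well under the proposed limit.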
-
- 1.2. Client authorization at introduction point
-
- The next possible authorization point after downloading and decrypting
- a hidden service descriptor is the introduction point. It may be important
- for authorization, because it bears the last chance of hiding presence
- of a hidden service from unauthorized clients. Further, performing
- authorization at the introduction point might reduce traffic in the
- network, because unauthorized requests would not be passed to the
- hidden service. This applies to those clients who are aware of a
- descriptor cookie and thereby of the hidden service descriptor, but do
- not have authorization data to pass the introduction point or access the
- service (such a situation might occur when authorization data for
- authorization at the directory is not issued on a per-user basis, but
- authorization data for authorization at the introduction point is).
-
- It is important to note that the introduction point must be considered
- untrustworthy, and therefore cannot replace authorization at the hidden
- service itself. Nor should the introduction point learn any sensitive
- identifiable information from either the service or the client.
-
- In order to perform authorization at the introduction point, three
- message formats need to be modified: (1) v2 hidden service descriptors,
- (2) ESTABLISH_INTRO cells, and (3) INTRODUCE1 cells.
-
- A v2 hidden service descriptor needs to contain authorization data that
- is introduction-point-specific and sometimes also authorization data
- that is introduction-point-independent. Therefore, v2 hidden service
- descriptors as specified in section 1.2 of rend-spec already contain two
- reserved fields "intro-authorization" and "service-authorization"
- (originally, the names of these fields were "...-authentication")
- containing an authorization type number and arbitrary authorization
- data. We propose that authorization data consists of base64 encoded
- objects of arbitrary length, surrounded by "-----BEGIN MESSAGE-----" and
- "-----END MESSAGE-----". This will increase the size of hidden service
- descriptors, but this is allowed since there is no strict upper limit.
-
- The current ESTABLISH_INTRO cells as described in section 1.3 of
- rend-spec do not contain either authorization data or version
- information. Therefore, we propose a new version 1 of the ESTABLISH_INTRO
- cell that adds both, as follows:
-
- V Format byte: set to 255 [1 octet]
- V Version byte: set to 1 [1 octet]
- KL Key length [2 octets]
- PK Bob's public key [KL octets]
- HS Hash of session info [20 octets]
- AUTHT The auth type that is supported [1 octet]
- AUTHL Length of auth data [2 octets]
- AUTHD Auth data [variable]
- SIG Signature of above information [variable]
-
- From the format it is possible to determine the maximum allowed size for
- authorization data: given the fact that cells are 512 octets long, of
- which 498 octets are usable (see section 6.1 of tor-spec), and assuming
- 1024 bit = 128 octet long keys, there are 215 octets left for
- authorization data. Hence, authorization protocols are bound to use no
- more than these 215 octets, regardless of the number of clients that
- shall be authenticated at the introduction point. Otherwise, one would
- need to send multiple ESTABLISH_INTRO cells or split them up, which we do
- not specify here.
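The 215-octet figure can be reproduced from the field list above. The sketch assumes that the signature is as long as the key (128 octets for a 1024-bit key); the text does not state the signature length explicitly, so treat that as an assumption. Names are hypothetical.

```python
USABLE_CELL_OCTETS = 498  # usable payload per cell, per the text


def establish_intro_auth_budget(key_len: int = 128,
                                sig_len: int = 128) -> int:
    # Fixed-size fields of the proposed v1 ESTABLISH_INTRO cell:
    # format byte, version byte, KL, HS, AUTHT, AUTHL
    fixed = 1 + 1 + 2 + 20 + 1 + 2
    # Assumption: SIG is as long as the key (sig_len == key_len).
    return USABLE_CELL_OCTETS - fixed - key_len - sig_len


print(establish_intro_auth_budget())
```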
-
- In order to understand a v1 ESTABLISH_INTRO cell, the implementation of
- a relay must have a certain Tor version. Hidden services need to be able
- to identify relays that understand the new v1 cell formats and can
- perform authorization. We propose to use the version number
- that is contained in networkstatus documents to find capable
- introduction points.
-
- The current INTRODUCE1 cell as described in section 1.8 of rend-spec is
- not designed to carry authorization data and has no version number either.
- Unfortunately, unversioned INTRODUCE1 cells consist only of a fixed-size,
- seemingly random PK_ID, followed by the encrypted INTRODUCE2 cell. This
- makes it impossible to distinguish unversioned INTRODUCE1 cells from any
- later format. In particular, it is not possible to introduce some kind of
- format and version byte for newer versions of this cell. That's probably
- where the comment "[XXX011 want to put intro-level auth info here, but no
- version. crap. -RD]" that was part of rend-spec some time ago comes from.
-
- We propose that new versioned INTRODUCE1 cells use the new cell type 41
- RELAY_INTRODUCE1V (where V stands for versioned):
-
- Cleartext
- V Version byte: set to 1 [1 octet]
- PK_ID Identifier for Bob's PK [20 octets]
- AUTHT The auth type that is included [1 octet]
- AUTHL Length of auth data [2 octets]
- AUTHD Auth data [variable]
- Encrypted to Bob's PK:
- (RELAY_INTRODUCE2 cell)
-
- The maximum length of contained authorization data depends on the length
- of the contained INTRODUCE2 cell. A calculation follows below when
- describing the INTRODUCE2 cell format we propose to use.
-
- 1.3. Client authorization at hidden service
-
- The time when a hidden service receives an INTRODUCE2 cell constitutes
- the last possible authorization point during the hidden service
- protocol. Performing authorization here is easier than at the other two
- authorization points, because there are no possibly untrusted entities
- involved.
-
- In general, a client that is successfully authorized at the introduction
- point should be granted access at the hidden service, too. Otherwise, the
- client would receive a positive INTRODUCE_ACK cell from the introduction
- point and conclude that it may connect to the service, but the request
- will be dropped without notice. This would appear as a failure to
- clients. Therefore, the number of cases in which a client successfully
- passes the introduction point but fails at the hidden service should be
- zero. However, this does not lead to the conclusion that the
- authorization data used at the introduction point and the hidden service
- must be the same, but only that both authorization data should lead to
- the same authorization result.
-
- Authorization data is transmitted from client to server via an
- INTRODUCE2 cell that is forwarded by the introduction point. There are
- versions 0 to 2 specified in section 1.8 of rend-spec, but none of these
- contain fields for carrying authorization data. We propose a slightly
- modified version of v3 INTRODUCE2 cells that is specified in section
- 1.8.1 and which is not implemented as of December 2007. In contrast to
- the specified v3 we avoid specifying (and implementing) IPv6 capabilities,
- because Tor relays will be required to support IPv4 addresses for a long
- time in the future, so that this seems unnecessary at the moment. The
- proposed format of v3 INTRODUCE2 cells is as follows:
-
- VER Version byte: set to 3. [1 octet]
- AUTHT The auth type that is used [1 octet]
- AUTHL Length of auth data [2 octets]
- AUTHD Auth data [variable]
- TS Timestamp (seconds since 1-1-1970) [4 octets]
- IP Rendezvous point's address [4 octets]
- PORT Rendezvous point's OR port [2 octets]
- ID Rendezvous point identity ID [20 octets]
- KLEN Length of onion key [2 octets]
- KEY Rendezvous point onion key [KLEN octets]
- RC Rendezvous cookie [20 octets]
- g^x Diffie-Hellman data, part 1 [128 octets]
-
- The maximum possible length of authorization data is related to the
- enclosing INTRODUCE1V cell. A v3 INTRODUCE2 cell with
- 1024 bit = 128 octets long public key without any authorization data
- occupies 306 octets (AUTHL is only used when AUTHT has a value != 0),
- plus 58 octets for hybrid public key encryption (see
- section 5.1 of tor-spec on hybrid encryption of CREATE cells). The
- surrounding INTRODUCE1V cell requires 24 octets. This leaves only 110
- of the 498 available octets free, which must be shared between
- authorization data to the introduction point _and_ to the hidden
- service.
-
- When receiving a v3 INTRODUCE2 cell, Bob checks whether a client has
- provided valid authorization data to him. He also requires that the
- timestamp is no more than 30 minutes in the past or future and that the
- first part of the Diffie-Hellman handshake has not been used in the past
- 60 minutes to prevent replay attacks by rogue introduction points. (The
- reason for not using the rendezvous cookie to detect replays---even
- though it is only sent once in the current design---is that it might be
- desirable to re-use rendezvous cookies for multiple introduction requests
- in the future.) If all checks pass, Bob builds a circuit to the provided
- rendezvous point. Otherwise he drops the cell.
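The checks Bob performs on a v3 INTRODUCE2 cell might be organised as follows. This is a minimal sketch, not the Tor implementation: the class name, the in-memory dictionary, and the boolean auth_ok (standing in for whatever authorization check the chosen protocol defines) are all assumptions made for illustration.

```python
import time

MAX_TIMESTAMP_SKEW = 30 * 60  # seconds, past or future
REPLAY_WINDOW = 60 * 60       # seconds a DH value is remembered


class IntroduceFilter:
    """Hypothetical helper that decides whether to accept a cell."""

    def __init__(self):
        # Maps the first DH handshake part (g^x) to the time it was seen.
        self._seen = {}

    def accept(self, timestamp, dh_part, auth_ok, now=None):
        now = time.time() if now is None else now
        # Forget DH values that have left the replay window.
        self._seen = {k: t for k, t in self._seen.items()
                      if now - t < REPLAY_WINDOW}
        if not auth_ok:
            return False  # invalid authorization data
        if abs(now - timestamp) > MAX_TIMESTAMP_SKEW:
            return False  # timestamp too far in the past or future
        if dh_part in self._seen:
            return False  # replayed handshake
        self._seen[dh_part] = now
        return True  # Bob would now build a circuit to the rendezvous point
```

Keying the replay cache on g^x rather than on the rendezvous cookie matches the reasoning in the text: cookies might be reused in a future design, while a fresh DH value is expected per request.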
-
- 1.4. Summary of authorization data fields
-
- In summary, the proposed descriptor format and cell formats provide the
- following fields for carrying authorization data:
-
- (1) The v2 hidden service descriptor contains:
- - a descriptor cookie that is used for the lookup process, and
- - an arbitrary encryption schema to ensure authorization to access
- introduction information (currently symmetric encryption with the
- descriptor cookie).
-
- (2) For performing authorization at the introduction point we can use:
- - the fields intro-authorization and service-authorization in
- hidden service descriptors,
- - a maximum of 215 octets in the ESTABLISH_INTRO cell, and
- - one part of 110 octets in the INTRODUCE1V cell.
-
- (3) For performing authorization at the hidden service we can use:
- - the fields intro-authorization and service-authorization in
- hidden service descriptors,
- - the other part of 110 octets in the INTRODUCE2 cell.
-
- It will also still be possible to access a hidden service without any
- authorization or only use a part of the authorization infrastructure.
- However, doing so requires considering all parts of the infrastructure. For
- example, authorization at the introduction point relying on confidential
- intro-authorization data transported in the hidden service descriptor
- cannot be performed without using an encryption schema for introduction
- information.
-
- 1.5. Managing authorization data at servers and clients
-
- In order to provide authorization data at the hidden service and the
- authenticated clients, we propose to use files---either the Tor
- configuration file or separate files. The exact format of these special
- files depends on the authorization protocol used.
-
- Currently, rend-spec contains the proposition to encode client-side
- authorization data in the URL, like in x.y.z.onion. This was never used
- and is also a bad idea, because in case of HTTP the requested URL may be
- contained in the Host and Referer fields.
-
- 1.6. Limitations for authorization protocols
-
- There are two limitations of the current hidden service protocol for
- authorization protocols that shall be identified here.
-
- 1. The three cell types ESTABLISH_INTRO, INTRODUCE1V, and INTRODUCE2
- restrict the amount of data that can be used for authorization.
- This forces authorization protocols that require per-user
- authorization data at the introduction point to restrict the number
- of authorized clients artificially. A possible solution could be to
- split contents among multiple cells and reassemble them at the
- introduction points.
-
- 2. The current hidden service protocol does not specify cell types to
- perform interactive authorization between client and introduction
- point or hidden service. If there should be an authorization
- protocol that requires interaction, new cell types would have to be
- defined and integrated into the hidden service protocol.
-
-
- 2. Specific authorization protocol instances
-
- In the following we present two specific authorization protocols that
- make use of (parts of) the new authorization infrastructure:
-
- 1. The first protocol allows a service provider to restrict access
- to clients with a previously received secret key only, but does not
- attempt to hide service activity from others.
-
- 2. The second protocol, albeit being feasible for a limited set of about
- 16 clients, performs client authorization and hides service activity
- from everyone but the authorized clients.
-
- These two protocol instances extend the existing hidden service protocol
- version 2. Hidden services that perform client authorization may run in
- parallel to other services running versions 0, 2, or both.
-
- 2.1. Service with large-scale client authorization
-
- The first client authorization protocol aims at performing access control
- while consuming as few additional resources as possible. A service
- provider should be able to permit access to a large number of clients
- while denying access for everyone else. However, the price for
- scalability is that the service won't be able to hide its activity from
- unauthorized or formerly authorized clients.
-
- The main idea of this protocol is to encrypt the introduction-point part
- in hidden service descriptors to authorized clients using symmetric keys.
- This ensures that nobody else but authorized clients can learn which
- introduction points a service currently uses, nor can someone send a
- valid INTRODUCE1 message without knowing the introduction key. Therefore,
- a subsequent authorization at the introduction point is not required.
-
- A service provider generates symmetric "descriptor cookies" for his
- clients and distributes them outside of Tor. The suggested key size is
- 128 bits, so that descriptor cookies can be encoded in 22 base64 chars
- (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the
- authorization type (here: "0") and allow a client to distinguish this
- authorization protocol from others like the one proposed below).
- Typically, the contact information for a hidden service using this
- authorization protocol looks like this:
-
- v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz
-
- When generating a hidden service descriptor, the service encrypts the
- introduction-point part with a single randomly generated symmetric
- 128-bit session key using AES-CTR as described for v2 hidden service
- descriptors in rend-spec. Afterwards, the service encrypts the session
- key to all descriptor cookies using AES. An authorized client should be
- able to efficiently find the session key that is encrypted for him/her,
- so 4-octet client IDs are generated from the descriptor cookie and the
- initialization vector. Descriptors always contain a number of
- encrypted session keys that is a multiple of 16 by adding fake entries.
- Encrypted session keys are ordered by client IDs in order to conceal
- addition or removal of authorized clients by the service provider.
-
- ATYPE Authorization type: set to 1. [1 octet]
- ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet]
- for each symmetric descriptor cookie:
- ID Client ID: H(descriptor cookie | IV)[:4] [4 octets]
- SKEY Session key encrypted with descriptor cookie [16 octets]
- (end of client-specific part)
- RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets]
- IV AES initialization vector [16 octets]
- IPOS Intro points, encrypted with session key [remaining octets]
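The padding rule in the format above can be made concrete with a small helper. It only restates the ALEN and RND formulas from the table; the function name is hypothetical, and the 20-octet entry length is the sum of the ID (4 octets) and SKEY (16 octets) fields.

```python
ENTRY_LEN = 4 + 16  # ID (4 octets) + SKEY (16 octets)


def client_list_layout(clients: int):
    """Return (ALEN, length of RND in octets) for a number of real clients."""
    if clients < 1:
        raise ValueError("at least one client is required")
    # ALEN := 1 + ((clients - 1) div 16)
    alen = 1 + (clients - 1) // 16
    # RND has (15 - ((clients - 1) mod 16)) * 20 octets
    rnd_len = (15 - (clients - 1) % 16) * ENTRY_LEN
    return alen, rnd_len


for n in (1, 16, 17):
    print(n, client_list_layout(n))
```

Real entries plus random padding always fill a whole number of 16-entry blocks, so an observer of the descriptor learns only the block count and not the exact number of authorized clients.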
-
- An authorized client needs to configure Tor to use the descriptor cookie
- when accessing the hidden service. Therefore, a user adds the contact
- information that she received from the service provider to her torrc
- file. Upon downloading a hidden service descriptor, Tor finds the
- encrypted introduction-point part and attempts to decrypt it using the
- configured descriptor cookie. (In the rare event of two or more client
- IDs being equal a client tries to decrypt all of them.)
-
- Upon sending the introduction, the client includes her descriptor cookie
- as auth type "1" in the INTRODUCE2 cell that she sends to the service.
- The hidden service checks whether the included descriptor cookie is
- authorized to access the service and either responds to the introduction
- request, or not.
-
- 2.2. Authorization for limited number of clients
-
- A second, more sophisticated client authorization protocol goes the extra
- mile of hiding service activity from unauthorized clients. With all else
- being equal to the preceding authorization protocol, the second protocol
- publishes hidden service descriptors for each user separately and gets
- along with encrypting the introduction-point part of descriptors to a
- single client. This allows the service to stop publishing descriptors for
- removed clients. As long as a removed client cannot link descriptors
- issued for other clients to the service, it cannot derive service
- activity any more. The downside of this approach is limited scalability.
- Even though the distributed storage of descriptors (cf. proposal 114)
- tackles the problem of limited scalability to a certain extent, this
- protocol should not be used for services with more than 16 clients. (In
- fact, Tor should refuse to advertise services for more than this number
- of clients.)
-
- A hidden service generates an asymmetric "client key" and a symmetric
- "descriptor cookie" for each client. The client key is used as
- replacement for the service's permanent key, so that the service uses a
- different identity for each of his clients. The descriptor cookie is used
- to store descriptors at changing directory nodes that are unpredictable
- for anyone but service and client, to encrypt the introduction-point
- part, and to be included in INTRODUCE2 cells. Once the service has
- created client key and descriptor cookie, he tells them to the client
- outside of Tor. The contact information string looks similar to the one
- used by the preceding authorization protocol (with the only difference
- that it has "1" encoded as auth-type in the remaining 4 of 132 bits
- instead of "0" as before).
-
- When creating a hidden service descriptor for an authorized client, the
- hidden service uses the client key and descriptor cookie to compute
- secret ID part and descriptor ID:
-
- secret-id-part = H(time-period | descriptor-cookie | replica)
-
- descriptor-id = H(client-key[:10] | secret-id-part)
-
- The hidden service also replaces permanent-key in the descriptor with
- client-key and encrypts introduction-points with the descriptor cookie.
-
- ATYPE Authorization type: set to 2. [1 octet]
- IV AES initialization vector [16 octets]
- IPOS Intro points, encr. with descriptor cookie [remaining octets]
-
- When uploading descriptors, the hidden service needs to make sure that
- descriptors for different clients are not uploaded at the same time (cf.
- Section 1.1) which is also a limiting factor for the number of clients.
-
- When a client is requested to establish a connection to a hidden service
- it looks up whether it has any authorization data configured for that
- service. If the user has configured authorization data for authorization
- protocol "2", the descriptor ID is determined as described in the last
- paragraph. Upon receiving a descriptor, the client decrypts the
- introduction-point part using its descriptor cookie. Further, the client
- includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that
- it sends to the service.
-
- 2.3. Hidden service configuration
-
- A hidden service that is meant to perform client authorization adds a
- new option HiddenServiceAuthorizeClient to its hidden service
- configuration. This option contains the authorization type which is
- either "1" for the protocol described in 2.1 or "2" for the protocol in
- 2.2 and a comma-separated list of human-readable client names, so that
- Tor can create authorization data for these clients:
-
- HiddenServiceAuthorizeClient auth-type client-name,client-name,...
-
- If this option is configured, HiddenServiceVersion is automatically
- reconfigured to contain only version numbers of 2 or higher.
-
- Tor stores all generated authorization data for the authorization
- protocols described in Sections 2.1 and 2.2 in a new file using the
- following file format:
-
- "client-name" human-readable client identifier NL
- "descriptor-cookie" 128-bit key ^= 22 base64 chars NL
-
- If the authorization protocol of Section 2.2 is used, Tor also generates
- and stores the following data:
-
- "client-key" NL a public key in PEM format
-
- 2.4. Client configuration
-
- Clients need to make their authorization data known to Tor using another
- configuration option that contains a service name (mainly for the sake of
- convenience), the service address, and the descriptor cookie that is
- required to access a hidden service (the authorization protocol number is
- encoded in the descriptor cookie):
-
- HidServAuth service-name service-address descriptor-cookie
-
-Security implications:
-
- In the following we want to discuss possible attacks by dishonest
- entities in the presented infrastructure and specific protocol. These
- security implications would have to be verified once more when adding
- another protocol. The dishonest entities (theoretically) include the
- hidden service itself, the authenticated clients, hidden service directory
- nodes, introduction points, and rendezvous points. The relays that are
- part of circuits used during protocol execution, but never learn about
- the exchanged descriptors or cells by design, are not considered.
- Obviously, this list makes no claim to be complete. The discussed attacks
- are sorted by the difficulty to perform them, in ascending order,
- starting with roles that everyone could attempt to take and ending with
- partially trusted entities abusing the trust put in them.
-
- (1) A hidden service directory could attempt to conclude presence of a
- service from the existence of a locally stored hidden service descriptor:
- This passive attack is possible only for a single client-service
- relation, because descriptors need to contain a publicly visible
- signature of the service using the client key.
- A possible protection would be to increase the number of hidden service
- directories in the network.
-
- (2) A hidden service directory could try to break the descriptor cookies
- of locally stored descriptors: This attack can be performed offline. The
- only useful countermeasure against it might be using safe passwords that
- are generated by Tor.
-
-[passwords? where did those come in? -RD]
-
- (3) An introduction point could try to identify the pseudonym of the
- hidden service on behalf of which it operates: This is impossible by
- design, because the service uses a fresh public key for every
- establishment of an introduction point (see proposal 114) and the
- introduction point receives a fresh introduction cookie, so that there is
- no identifiable information about the service that the introduction point
- could learn. The introduction point cannot even tell if client accesses
- belong to the same client or not, nor can it know the total number of
- authorized clients. The only information might be the pattern of
- anonymous client accesses, but that is hardly enough to reliably identify
- a specific service.
-
- (4) An introduction point could want to learn the identities of accessing
- clients: This is also impossible by design, because all clients use the
- same introduction cookie for authorization at the introduction point.
-
- (5) An introduction point could try to replay a correct INTRODUCE1 cell
- to other introduction points of the same service, e.g. in order to force
- the service to create a huge number of useless circuits: This attack is
- not possible by design, because INTRODUCE1 cells are encrypted using a
- freshly created introduction key that is only known to authorized
- clients.
-
- (6) An introduction point could attempt to replay a correct INTRODUCE2
- cell to the hidden service, e.g. for the same reason as in the last
- attack: This attack is stopped by the fact that a service will drop
- INTRODUCE2 cells containing a DH handshake they have seen recently.
-
- (7) An introduction point could block client requests by sending either
- positive or negative INTRODUCE_ACK cells back to the client, but without
- forwarding INTRODUCE2 cells to the server: This attack is an annoyance
- for clients, because they might wait for a timeout to elapse until trying
- another introduction point. However, this attack is not introduced by
- performing authorization and it cannot be targeted towards a specific
- client. A countermeasure might be for the server to periodically perform
- introduction requests to his own service to see if introduction points
- are working correctly.
-
- (8) The rendezvous point could attempt to identify either server or
- client: This remains impossible as it was before, because the
- rendezvous cookie does not contain any identifiable information.
-
- (9) An authenticated client could swamp the server with valid INTRODUCE1
- and INTRODUCE2 cells, e.g. in order to force the service to create
- useless circuits to rendezvous points; as opposed to an introduction
- point replaying the same INTRODUCE2 cell, a client could include a new
- rendezvous cookie for every request: The countermeasure for this attack
- is the restriction to 10 connection establishments per client per hour.
-
-Compatibility:
-
- An implementation of this proposal would require changes to hidden
- services and clients to process authorization data and encode and
- understand the new formats. However, both services and clients would
- remain compatible with regular hidden services without authorization.
-
-Implementation:
-
- The implementation of this proposal can be divided into a number of
- changes to hidden service and client side. There are no
- changes necessary on directory, introduction, or rendezvous nodes. All
- changes are marked with either [service] or [client] to denote on which
- side they need to be made.
-
- /1/ Configure client authorization [service]
-
- - Parse configuration option HiddenServiceAuthorizeClient containing
- authorized client names.
- - Load previously created client keys and descriptor cookies.
- - Generate missing client keys and descriptor cookies, add them to
- client_keys file.
- - Rewrite the hostname file.
- - Keep client keys and descriptor cookies of authorized clients in
- memory.
- [- In case of reconfiguration, mark which client authorizations were
- added and whether any were removed. This can be used later when
- deciding whether to rebuild introduction points and publish new
- hidden service descriptors. Not implemented yet.]
-
- /2/ Publish hidden service descriptors [service]
-
- - Create and upload hidden service descriptors for all authorized
- clients.
- [- See /1/ for the case of reconfiguration.]
-
- /3/ Configure permission for hidden services [client]
-
- - Parse configuration option HidServAuth containing service
- authorization, store authorization data in memory.
-
- /5/ Fetch hidden service descriptors [client]
-
- - Look up client authorization upon receiving a hidden service request.
- - Request hidden service descriptor ID including client key and
- descriptor cookie. Only request v2 descriptors, no v0.
-
- /6/ Process hidden service descriptor [client]
-
- - Decrypt introduction points with descriptor cookie.
-
- /7/ Create introduction request [client]
-
- - Include descriptor cookie in INTRODUCE2 cell to introduction point.
- - Pass descriptor cookie around between involved connections and
- circuits.
-
- /8/ Process introduction request [service]
-
- - Read descriptor cookie from INTRODUCE2 cell.
- - Check whether descriptor cookie is authorized for access, including
- checking access counters.
- - Log access for accountability.
-
diff --git a/doc/spec/proposals/122-unnamed-flag.txt b/doc/spec/proposals/122-unnamed-flag.txt
deleted file mode 100644
index 2ce7bb22b..000000000
--- a/doc/spec/proposals/122-unnamed-flag.txt
+++ /dev/null
@@ -1,136 +0,0 @@
-Filename: 122-unnamed-flag.txt
-Title: Network status entries need a new Unnamed flag
-Author: Roger Dingledine
-Created: 04-Oct-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-1. Overview:
-
- Tor's directory authorities can give certain servers a "Named" flag
- in the network-status entry, when they want to bind that nickname to
- that identity key. This allows clients to specify a nickname rather
- than an identity fingerprint and still be certain they're getting the
- "right" server. As dir-spec.txt describes it,
-
- Name X is bound to identity Y if at least one binding directory lists
- it, and no directory binds X to some other Y'.
-
- In practice, clients can refer to servers by nickname whether they are
- Named or not; if they refer to nicknames that aren't Named, a complaint
- shows up in the log asking them to use the identity key in the future
- --- but it still works.
-
- The problem? Imagine a Tor server with nickname Bob. Bob and his
- identity fingerprint are registered in tor26's approved-routers
- file, but none of the other authorities registered him. Imagine
- there are several other unregistered servers also with nickname Bob
- ("the imposters").
-
- While Bob is online, all is well: a) tor26 gives a Named flag to
- the real one, and refuses to list the other ones; and b) the other
- authorities list the imposters but don't give them a Named flag. Clients
- who have all the network-statuses can compute which one is the real Bob.
-
- But when the real Bob disappears and his descriptor expires? tor26
- continues to refuse to list any of the imposters, and the other
- authorities continue to list the imposters. Clients don't have any
- idea that there exists a Named Bob, so they can ask for server Bob and
- get one of the imposters. (A warning will also appear in their log,
- but so what.)
-
-2. The stopgap solution:
-
- tor26 should start accepting and listing the imposters, but it should
- assign them a new flag: "Unnamed".
-
- This would produce three cases in terms of assigning flags in the consensus
- networkstatus:
-
- i) a router gets the Named flag in the v3 networkstatus if
- a) it's the only router with that nickname that has the Named flag
- out of all the votes, and
- b) no vote lists it as Unnamed
- else,
- ii) a router gets the Unnamed flag if
- a) some vote lists a different router with that nickname as Named, or
- b) at least one vote lists it as Unnamed, or
- c) there are other routers with the same nickname that are Unnamed
- else,
- iii) the router gets neither a Named nor an Unnamed flag.
-
- (This whole proposal is meant only for v3 dir flags; we shouldn't try
- to backport it to the v2 dir world.)
-
- Then client behavior is:
-
- a) If there's a Bob with a Named flag, pick that one.
- else b) If the Bobs don't have the Unnamed flag (notice that they should
- either all have it, or none), pick one of them and warn.
- else c) They all have the Unnamed flag -- no router found.
-
-3. Problems not solved by this stopgap:
-
- 3.1. Naming authorities can go offline.
-
- If tor26 is the only authority that provides a binding for Bob, when
- tor26 goes offline we're back in our previous situation -- the imposters
- can be referenced with a mere ignorable warning in the client's log.
-
- If some other authority Names a different Bob, and tor26 goes offline,
- then that other Bob becomes the unique Named Bob.
-
- So be it. We should try to solve these one day, but there's no clear way
- to do it that doesn't destroy usability in other ways, and if we want
- to get the Unnamed flag into v3 network statuses we should add it soon.
-
- 3.2. V3 dir spec magnifies brief discrepancies.
-
- Another point to notice is if tor26 names Bob(1), doesn't know about
- Bob(2), but moria lists Bob(2). Then Bob(2) doesn't get an Unnamed flag
- even if it should (and Bob(1) is not around).
-
- Right now, in v2 dirs, the case where an authority doesn't know about
- a server but the other authorities do know is rare. That's because
- authorities periodically ask for other networkstatuses and then fetch
- descriptors that are missing.
-
- With v3, if that window occurs at the wrong time, it is extended for the
- entire period. We could solve this by making the voting more complex,
- but that doesn't seem worth it.
-
- [3.3. Tor26 is only one tor26.
-
- We need more naming authorities, possibly with some kind of auto-naming
- feature. This is out-of-scope for this proposal -NM]
-
-4. Changes to the v2 directory
-
- Previously, v2 authorities that had a binding for a server named Bob did
- not list any other server named Bob. This will change too:
-
- Version 2 authorities will start listing all routers they know about,
- whether they conflict with a name-binding or not: Servers for which
- this authority has a binding will continue to be marked Named,
- additionally all other servers of that nickname will be listed without the
- Named flag (i.e. there will be no Unnamed flag in v2 status documents).
-
- Clients already should handle having a named Bob alongside unnamed
- Bobs correctly, and having the unnamed Bobs in the status file even
- without the named server is no worse than the current status quo where
- clients learn about those servers from other authorities.
-
- The benefit of this is that an authority's opinion on a server like
- Guard, Stable, Fast etc. can now be learned by clients even if that
- specific authority has reserved that server's name for somebody else.
-
-5. Other benefits:
-
- This new flag will allow people to operate servers that happen to have
- the same nickname as somebody who registered their server two years ago
- and left soon after. Right now there are dozens of nicknames that are
- registered on all three binding directory authorities, yet haven't been
- running for years. While it's bad that these nicknames are effectively
- blacklisted from the network, the really bad part is that this logic
- is really unintuitive to prospective new server operators.
-
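The flag-assignment rules and the client behavior from Section 2 of proposal 122 can be sketched as follows. This is only an illustrative sketch, not Tor's implementation: the vote representation, the function names `consensus_name_flag` and `pick_by_nickname`, and the reading of rule ii(c) as "some vote marks another router with this nickname as Unnamed" are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical vote model: identity fingerprint -> (nickname, flags in that vote).
Vote = Dict[str, Tuple[str, Set[str]]]


def consensus_name_flag(identity: str, nickname: str,
                        votes: List[Vote]) -> Optional[str]:
    """Return "Named", "Unnamed", or None for one router (rules i to iii)."""
    named_ids: Set[str] = set()
    unnamed_ids: Set[str] = set()
    for vote in votes:
        for ident, (nick, flags) in vote.items():
            if nick != nickname:
                continue
            if "Named" in flags:
                named_ids.add(ident)
            if "Unnamed" in flags:
                unnamed_ids.add(ident)

    # Rule i: the only Named router for this nickname, and never voted Unnamed.
    if named_ids == {identity} and identity not in unnamed_ids:
        return "Named"
    # Rule ii: (a) a different router is Named, (b) this router is voted
    # Unnamed, or (c) another router with the nickname is Unnamed (assumed
    # here to mean "voted Unnamed by some authority").
    if (named_ids - {identity}) or identity in unnamed_ids \
            or (unnamed_ids - {identity}):
        return "Unnamed"
    # Rule iii: neither flag.
    return None


def pick_by_nickname(nickname: str,
                     consensus: Dict[str, Tuple[str, Optional[str]]]
                     ) -> Tuple[Optional[str], bool]:
    """Client side: return (chosen identity or None, whether to log a warning)."""
    candidates = [(ident, flag) for ident, (nick, flag) in consensus.items()
                  if nick == nickname]
    for ident, flag in candidates:
        if flag == "Named":
            return ident, False            # case a: the bound router
    usable = sorted(ident for ident, flag in candidates if flag != "Unnamed")
    if usable:
        return usable[0], True             # case b: pick one and warn
    return None, False                     # case c: all Unnamed, or no match
```

With the "Bob" scenario from Section 1, the real Bob ends up Named while the imposter ends up Unnamed, and once the real Bob disappears a lookup for "Bob" fails instead of silently returning an imposter.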
diff --git a/doc/spec/proposals/123-autonaming.txt b/doc/spec/proposals/123-autonaming.txt
deleted file mode 100644
index 74c486985..000000000
--- a/doc/spec/proposals/123-autonaming.txt
+++ /dev/null
@@ -1,54 +0,0 @@
-Filename: 123-autonaming.txt
-Title: Naming authorities automatically create bindings
-Author: Peter Palfrader
-Created: 2007-10-11
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- Tor's directory authorities can give certain servers a "Named" flag
- in the network-status entry, when they want to bind that nickname to
- that identity key. This allows clients to specify a nickname rather
- than an identity fingerprint and still be certain they're getting the
- "right" server.
-
- Authority operators name a server by adding its nickname and
- identity fingerprint to the 'approved-routers' file. Historically
- being listed in the file was required for a router, at first for being
- listed in the directory at all, and later in order to be used by
- clients as a first or last hop of a circuit.
-
- Adding identities to the list of named routers so far has been a
- manual, time-consuming, and boring job. Given that, and the fact that
- the Tor network works just fine without named routers, the last
- authority to keep a current binding list stopped updating it well over
- half a year ago.
-
- Naming, if it were done, would serve a useful purpose however in that
- users can have a reasonable expectation that the exit server Bob they
- are using in their http://www.google.com.bob.exit/ URL is the same
- Bob every time.
-
-Proposal:
- I propose that identity<->name binding be completely automated:
-
- New bindings should be added after the router has been around for a
- bit and its name has not been used by other routers. Similarly, names
- that have not appeared on the network for a long time should be freed
- in case a new router wants to use them.
-
- The following rules are suggested:
- i) If a named router has not been online for half a year, the
- identity<->name binding for that name is removed. The nickname
- is free to be taken by other routers now.
- ii) If a router claims a certain nickname and
- a) has been on the network for at least two weeks, and
- b) that nickname is not yet linked to a different router, and
- c) no other router has wanted that nickname in the last month,
- a new binding should be created for this router and its desired
- nickname.
-
- This automaton does not necessarily need to live in the Tor code, it
- can do its job just as well when it's an external tool.
-
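The two suggested rules in proposal 123 could be checked by an external tool along these lines. This is a hedged sketch: the function names, the data layout, and the concrete durations chosen for "half a year" (182 days) and "a month" (30 days) are assumptions for illustration, not values fixed by the proposal.

```python
from datetime import datetime, timedelta
from typing import Dict

HALF_YEAR = timedelta(days=182)   # assumed length of "half a year"
TWO_WEEKS = timedelta(weeks=2)
ONE_MONTH = timedelta(days=30)    # assumed length of "a month"


def binding_expired(last_seen: datetime, now: datetime) -> bool:
    """Rule i: drop a binding once the named router has been offline too long."""
    return now - last_seen >= HALF_YEAR


def may_bind(identity: str, nickname: str, first_seen: datetime,
             bindings: Dict[str, str],
             last_claims: Dict[str, Dict[str, datetime]],
             now: datetime) -> bool:
    """Rule ii: decide whether `identity` may be bound to `nickname`.

    bindings:    nickname -> identity currently bound to it
    last_claims: nickname -> {identity: last time it claimed the nickname}
    """
    # ii(a): the router has been on the network for at least two weeks.
    if now - first_seen < TWO_WEEKS:
        return False
    # ii(b): the nickname is not yet linked to a different router.
    bound = bindings.get(nickname)
    if bound is not None and bound != identity:
        return False
    # ii(c): no other router has wanted the nickname in the last month.
    for other, when in last_claims.get(nickname, {}).items():
        if other != identity and now - when < ONE_MONTH:
            return False
    return True
```

Keeping the decision in pure functions over timestamps fits the proposal's remark that the automaton need not live in the Tor code: an external script could feed it data gathered from descriptors or network statuses.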
diff --git a/doc/spec/proposals/124-tls-certificates.txt b/doc/spec/proposals/124-tls-certificates.txt
deleted file mode 100644
index 9472d14af..000000000
--- a/doc/spec/proposals/124-tls-certificates.txt
+++ /dev/null
@@ -1,313 +0,0 @@
-Filename: 124-tls-certificates.txt
-Title: Blocking resistant TLS certificate usage
-Author: Steven J. Murdoch
-Created: 2007-10-25
-Status: Superseded
-
-Overview:
-
- To be less distinguishable from HTTPS web browsing, only Tor servers should
- present TLS certificates. This should be done whilst maintaining backwards
- compatibility with Tor nodes which present and expect client certificates, and
- while preserving existing security properties. This specification describes
- the negotiation protocol, what certificates should be presented during the TLS
- negotiation, and how to move the client authentication within the encrypted
- tunnel.
-
-Motivation:
-
- In Tor's current TLS [1] handshake, both client and server present a
- two-certificate chain. Since TLS performs authentication prior to establishing
- the encrypted tunnel, the contents of these certificates are visible to an
- eavesdropper. In contrast, during normal HTTPS web browsing, the server
- presents a single certificate, signed by a root CA and the client presents no
- certificate. Hence it is possible to distinguish Tor from HTTPS by identifying
- this pattern.
-
- To resist blocking based on traffic identification, Tor should behave as close
- to HTTPS as possible, i.e. servers should offer a single certificate and not
- request a client certificate; clients should present no certificate. This
- presents two difficulties: clients are no longer authenticated and servers are
- authenticated by the connection key, rather than identity key. The link
- protocol must thus be modified to preserve the old security semantics.
-
- Finally, in order to maintain backwards compatibility, servers must correctly
- identify whether the client supports the modified certificate handling. This
- is achieved by modifying the cipher suites that clients advertise support
- for. These cipher suites are selected to be similar to those chosen by web
- browsers, in order to resist blocking based on client hello.
-
-Terminology:
-
- Initiator: OP or OR which initiates a TLS connection ("client" in TLS
- terminology)
-
- Responder: OR which receives an incoming TLS connection ("server" in TLS
- terminology)
-
-Version negotiation and cipher suite selection:
-
- In the modified TLS handshake, the responder does not request a certificate
- from the initiator. This request would normally occur immediately after the
- responder receives the client hello (the first message in a TLS handshake) and
- so the responder must decide whether to request a certificate based only on
- the information in the client hello. This is achieved by examining the cipher
- suites in the client hello.
-
- List 1: cipher suite lists offered by version 0/1 Tor
-
- From src/common/tortls.c, revision 12086:
- TLS1_TXT_DHE_RSA_WITH_AES_128_SHA
- TLS1_TXT_DHE_RSA_WITH_AES_128_SHA : SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
- SSL3_TXT_EDH_RSA_DES_192_CBC3_SHA
-
- Client hello sent by initiator:
-
- Initiators supporting version 2 of the Tor connection protocol MUST
- offer a different cipher suite list from those sent by pre-version 2
- Tors, contained in List 1. To maintain compatibility with older Tor
- versions and common browsers, the cipher suite list MUST include
- support for:
-
- TLS_DHE_RSA_WITH_AES_256_CBC_SHA
- TLS_DHE_RSA_WITH_AES_128_CBC_SHA
- SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
- SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
-
- Client hello received by responder/server hello sent by responder:
-
- Responders supporting version 2 of the Tor connection protocol should compare
- the cipher suite list in the client hello with those in List 1. If it matches
- any in the list then the responder should assume that the initiator supports
- version 1, and thus should maintain the version 1 behavior, i.e. send a
- two-certificate chain, request a client certificate, and not send or expect
- a VERSIONS cell [2].
-
- Otherwise, the responder should assume version 2 behavior and select a cipher
- suite following TLS [1] behavior, i.e. select the first entry from the client
- hello cipher list which is acceptable. Responders MUST NOT select any suite
- that lacks ephemeral keys, or whose symmetric keys are less than KEY_LEN bits,
- or whose digests are less than HASH_LEN bits. Implementations SHOULD NOT
- allow other SSLv3 ciphersuites.
-
- Should no mutually acceptable cipher suite be found, the connection MUST be
- closed.
-
- If the responder is implementing version 2 of the connection protocol it
- SHOULD send a server certificate with random contents. The organizationName
- field MUST NOT be "Tor", "TOR" or "t o r".
-
- Server certificate received by initiator:
-
- If the server certificate has an organizationName of "Tor", "TOR" or "t o r",
- the initiator should assume that the responder does not support version 2 of
- the connection protocol, in which case the initiator should respond following
- version 1, i.e. send a two-certificate client chain and not send or expect
- a VERSIONS cell.
-
- [SJM: We could also use the fact that a client certificate request was sent]
-
- If the server hello contains a ciphersuite which does not comply with the key
- length requirements above, even if it was one offered in the client hello, the
- connection MUST be closed. This will only occur if the responder is not a Tor
- server.
-
- Backward compatibility:
-
- v1 Initiator, v1 Responder: No change
- v1 Initiator, v2 Responder: Responder detects v1 initiator by client hello
- v2 Initiator, v1 Responder: Responder accepts v2 client hello. Initiator
- detects v1 server certificate and continues with v1 protocol
- v2 Initiator, v2 Responder: Responder accepts v2 client hello. Initiator
- detects v2 server certificate and continues with v2 protocol.
-
- Additional link authentication process:
-
- Following VERSION and NETINFO negotiation, both responder and
- initiator MUST send a certification chain in a CERT cell. If one
- party does not have a certificate, the CERT cell MUST still be sent,
- but with a length of zero.
-
- A CERT cell is a variable length cell, of the format
- CircID [2 bytes]
- Command [1 byte]
- Length [2 bytes]
- Payload [<length> bytes]
-
- CircID MUST be set to 0x0000
- Command is [SJM: TODO]
- Length is the length of the payload
- Payload contains 0 or more certificates, each is of the format:
- Cert_Length [2 bytes]
- Certificate [<cert_length> bytes]
-
- Each certificate MUST sign the one preceding it. The initiator MUST
- place its connection certificate first; the responder, having
- already sent its connection certificate as part of the TLS handshake
- MUST place its identity certificate first.
-
- Initiators who send a CERT cell MUST follow that with a LINK_AUTH
- cell to prove that they possess the corresponding private key.
-
- A LINK_AUTH cell is fixed-length, of the format:
- CircID [2 bytes]
- Command [1 byte]
- Length [2 bytes]
- Payload (padded with 0 bytes) [PAYLOAD_LEN - 2 bytes]
-
- CircID MUST be set to 0x0000
- Command is [SJM: TODO]
- Length is the valid portion of the payload
- Payload is of the format:
- Signature version [1 byte]
- Signature [<length> - 1 bytes]
- Padding [PAYLOAD_LEN - <length> - 2 bytes]
-
- Signature version: Identifies the type of signature, currently 0x00
- Signature: Digital signature under the initiator's connection key of the
- following item, in PKCS #1 block type 1 [3] format:
-
- HMAC-SHA1, using the TLS master secret as key, of the
- following elements concatenated:
- - The signature version (0x00)
- - The NUL terminated ASCII string: "Tor initiator certificate verification"
- - client_random, as sent in the Client Hello
- - server_random, as sent in the Server Hello
- - SHA-1 hash of the initiator connection certificate
- - SHA-1 hash of the responder connection certificate
-
- Security checks:
-
- - Before sending a LINK_AUTH cell, a node MUST ensure that the TLS
- connection is authenticated by the responder key.
- - For the handshake to have succeeded, the initiator MUST confirm:
- - That the TLS handshake was authenticated by the
- responder connection key
- - That the responder connection key was signed by the first
- certificate in the CERT cell
- - That each certificate in the CERT cell was signed by the
- following certificate, with the exception of the last
- - That the last certificate in the CERT cell is the expected
- identity certificate for the node being connected to
- - For the handshake to have succeeded, the responder MUST confirm
- either:
- A) - A zero length CERT cell was sent and no LINK_AUTH cell was
- sent
- In which case the responder shall treat the identity of the
- initiator as unknown
- or
- B) - That the LINK_AUTH cell contains a signature under the key of the first
- certificate in the CERT cell
- - That the MAC signed matches the expected value
- - That each certificate in the CERT cell was signed by the
- following certificate, with the exception of the last
- In which case the responder shall treat the identity of the
- initiator as that of the last certificate in the CERT cell
-
- Protocol summary:
-
- 1. I(nitiator) <-> R(esponder): TLS handshake, including responder
- authentication under connection certificate R_c
- 2. I <-> R: VERSION and NETINFO negotiation
- 3. R -> I: CERT (Responder identity certificate R_i (which signs R_c))
- 4. I -> R: CERT (Initiator connection certificate I_c,
- Initiator identity certificate I_i (which signs I_c))
- 5. I -> R: LINK_AUTH (Signature, under I_c of HMAC-SHA1(master_secret,
- "Tor initiator certificate verification" ||
- client_random || server_random ||
- I_c hash || R_c hash))
-
- Notes: I -> R doesn't need to wait for R_i before sending its own
- messages (reduces round-trips).
- Certificate hash is calculated like identity hash in CREATE cells.
- Initiator signature is calculated in a similar way to Certificate
- Verify messages in TLS 1.1 (RFC4346, Sections 7.4.8 and 4.7).
- If I is an OP, a zero length certificate chain may be sent in step 4,
- in which case step 5 is not performed.
-
- Rationale:
-
- - Version and netinfo negotiation before authentication: The version cell needs
- to come before the rest of the protocol, since we may choose to alter
- the rest at some later point, e.g. switch to a different MAC/signature scheme.
- It is useful to keep the NETINFO and VERSION cells close to each other, since
- the time between them is used to check if there is a delay-attack. Still, a
- server might want to not act on NETINFO data from an initiator until the
- authentication is complete.
-
-Appendix A: Cipher suite choices
-
- This specification intentionally does not put any constraints on the
- TLS ciphersuite lists presented by clients, other than a minimum
- required for compatibility. However, to maximize blocking
- resistance, ciphersuite lists should be carefully selected.
-
- Recommended client ciphersuite list
-
- Source: http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslproto.h
-
- 0xc00a: TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
- 0xc014: TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA
- 0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA
- 0x0038: TLS_DHE_DSS_WITH_AES_256_CBC_SHA
- 0xc00f: TLS_ECDH_RSA_WITH_AES_256_CBC_SHA
- 0xc005: TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA
- 0x0035: TLS_RSA_WITH_AES_256_CBC_SHA
- 0xc007: TLS_ECDHE_ECDSA_WITH_RC4_128_SHA
- 0xc009: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
- 0xc011: TLS_ECDHE_RSA_WITH_RC4_128_SHA
- 0xc013: TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
- 0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA
- 0x0032: TLS_DHE_DSS_WITH_AES_128_CBC_SHA
- 0xc00c: TLS_ECDH_RSA_WITH_RC4_128_SHA
- 0xc00e: TLS_ECDH_RSA_WITH_AES_128_CBC_SHA
- 0xc002: TLS_ECDH_ECDSA_WITH_RC4_128_SHA
- 0xc004: TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA
- 0x0004: SSL_RSA_WITH_RC4_128_MD5
- 0x0005: SSL_RSA_WITH_RC4_128_SHA
- 0x002f: TLS_RSA_WITH_AES_128_CBC_SHA
- 0xc008: TLS_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA
- 0xc012: TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA
- 0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
- 0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
- 0xc00d: TLS_ECDH_RSA_WITH_3DES_EDE_CBC_SHA
- 0xc003: TLS_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA
- 0xfeff: SSL_RSA_FIPS_WITH_3DES_EDE_CBC_SHA (168-bit Triple DES with RSA and a SHA1 MAC)
- 0x000a: SSL_RSA_WITH_3DES_EDE_CBC_SHA
-
- Order specified in:
- http://lxr.mozilla.org/security/source/security/nss/lib/ssl/sslenum.c#47
-
- Recommended options:
- 0x0000: Server Name Indication [4]
- 0x000a: Supported Elliptic Curves [5]
- 0x000b: Supported Point Formats [5]
-
- Recommended compression:
- 0x00
-
- Recommended server ciphersuite selection:
-
- The responder should select the first entry in this list which is
- listed in the client hello:
-
- 0x0039: TLS_DHE_RSA_WITH_AES_256_CBC_SHA [ Common Firefox choice ]
- 0x0033: TLS_DHE_RSA_WITH_AES_128_CBC_SHA [ Tor v1 default ]
- 0x0016: SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA [ Tor v1 fallback ]
- 0x0013: SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA [ Valid IE option ]
-
-References:
-
-[1] The Transport Layer Security (TLS) Protocol, Version 1.1, RFC4346, IETF
-
-[2] Version negotiation for the Tor protocol, Tor proposal 105
-
-[3] B. Kaliski, "Public-Key Cryptography Standards (PKCS) #1:
- RSA Cryptography Specifications Version 1.5", RFC 2313,
- March 1998.
-
-[4] TLS Extensions, RFC 3546
-
-[5] Elliptic Curve Cryptography (ECC) Cipher Suites for Transport Layer Security (TLS)
-
-% <!-- Local IspellDict: american -->
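The CERT cell payload layout and the HMAC input for the LINK_AUTH cell described in proposal 124 can be illustrated with a short sketch. Several details are assumptions, because the proposal leaves them open: the two-byte length fields are taken to be big-endian (network order), the "certificate hash" is taken to be SHA-1 over the raw certificate bytes, and the function names are hypothetical. The cell Command values are marked TODO in the proposal, so only the payloads are modelled, and the final RSA signature over the MAC is omitted.

```python
import hashlib
import hmac
import struct
from typing import List

# NUL-terminated ASCII label from the proposal.
LABEL = b"Tor initiator certificate verification\x00"
SIGNATURE_VERSION = b"\x00"


def pack_cert_payload(certs: List[bytes]) -> bytes:
    """Encode certificates as repeated (Cert_Length [2 bytes], Certificate)."""
    out = bytearray()
    for cert in certs:
        if len(cert) > 0xFFFF:
            raise ValueError("certificate too long for a 2-byte length")
        out += struct.pack("!H", len(cert))   # assumed big-endian
        out += cert
    return bytes(out)


def unpack_cert_payload(payload: bytes) -> List[bytes]:
    """Decode a CERT payload; an empty payload yields an empty chain."""
    certs: List[bytes] = []
    pos = 0
    while pos < len(payload):
        if pos + 2 > len(payload):
            raise ValueError("truncated certificate length")
        (length,) = struct.unpack_from("!H", payload, pos)
        pos += 2
        if pos + length > len(payload):
            raise ValueError("truncated certificate")
        certs.append(payload[pos:pos + length])
        pos += length
    return certs


def link_auth_mac(master_secret: bytes, client_random: bytes,
                  server_random: bytes, initiator_cert: bytes,
                  responder_cert: bytes) -> bytes:
    """HMAC-SHA1 over the concatenated elements listed for LINK_AUTH."""
    message = (SIGNATURE_VERSION + LABEL + client_random + server_random
               + hashlib.sha1(initiator_cert).digest()    # assumed hash input
               + hashlib.sha1(responder_cert).digest())
    return hmac.new(master_secret, message, hashlib.sha1).digest()
```

An OP that sends a zero-length chain would call `pack_cert_payload([])`, which produces an empty payload, and would then skip the LINK_AUTH step entirely, matching the note on step 4 of the protocol summary.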
diff --git a/doc/spec/proposals/125-bridges.txt b/doc/spec/proposals/125-bridges.txt
deleted file mode 100644
index 9d95729d4..000000000
--- a/doc/spec/proposals/125-bridges.txt
+++ /dev/null
@@ -1,291 +0,0 @@
-Filename: 125-bridges.txt
-Title: Behavior for bridge users, bridge relays, and bridge authorities
-Author: Roger Dingledine
-Created: 11-Nov-2007
-Status: Closed
-Implemented-In: 0.2.0.x
-
-0. Preface
-
- This document describes the design decisions around support for bridge
- users, bridge relays, and bridge authorities. It acts as an overview
- of the bridge design and deployment for developers, and it also tries
- to point out limitations in the current design and implementation.
-
- For more details on what all of these mean, look at blocking.tex in
- /doc/design-paper/
-
-1. Bridge relays
-
- Bridge relays are just like normal Tor relays except they don't publish
- their server descriptors to the main directory authorities.
-
-1.1. PublishServerDescriptor
-
- To configure your relay to be a bridge relay, just add
- BridgeRelay 1
- PublishServerDescriptor bridge
- to your torrc. This will cause your relay to publish its descriptor
- to the bridge authorities rather than to the default authorities.
-
- Alternatively, you can say
- BridgeRelay 1
- PublishServerDescriptor 0
- which will cause your relay to not publish anywhere. This could be
- useful for private bridges.
-
-1.2. Exit policy
-
- Bridge relays should use an exit policy of "reject *:*". This is
- because they only need to relay traffic between the bridge users
- and the rest of the Tor network, so there's no need to let people
- exit directly from them.
-
-1.3. RelayBandwidthRate / RelayBandwidthBurst
-
- We invented the RelayBandwidth* options for this situation: Tor clients
- who want to allow relaying too. See proposal 111 for details. Relay
- operators should feel free to rate-limit their relayed traffic.
-
-1.4. Helping the user with port forwarding, NAT, etc.
-
- Just as for operating normal relays, our documentation and hints for
- how to make your ORPort reachable are inadequate for normal users.
-
- We need to work harder on this step, perhaps in 0.2.2.x.
-
-1.5. Vidalia integration
-
- Vidalia has turned its "Relay" settings page into a tri-state
- "Don't relay" / "Relay for the Tor network" / "Help censored users".
-
- If you click the third choice, it forces your exit policy to reject *:*.
-
- If all the bridges end up on port 9001, that's not so good. On the
- other hand, putting the bridges on a low-numbered port in the Unix
- world requires jumping through extra hoops. The current compromise is
- that Vidalia makes the ORPort default to 443 on Windows, and 9001 on
- other platforms.
-
- At the bottom of the relay config settings window, Vidalia displays
- the bridge identifier to the operator (see Section 3.1) so he can pass
- it on to bridge users.
-
-1.6. What if the default ORPort is already used?
-
- If the user already has a webserver or some other application
- bound to port 443, then Tor will fail to bind it and complain to the
- user, probably in a cryptic way. Rather than just working on a better
- error message (though we should do this), we should consider an
- "ORPort auto" option that tells Tor to try to find something that's
- bindable and reachable. This would also help us tolerate ISPs that
- filter incoming connections on port 80 and port 443. But this should
- be a different proposal, and can wait until 0.2.2.x.
-
-2. Bridge authorities.
-
- Bridge authorities are like normal directory authorities, except they
- don't create their own network-status documents or votes. So if you
- ask a bridge authority for a network-status document or consensus, it
- behaves like a directory mirror: it gives you one from one of the main
- authorities. But if you ask the bridge authority for the descriptor
- corresponding to a particular identity fingerprint, it will happily
- give you the latest descriptor for that fingerprint.
-
- To become a bridge authority, add these lines to your torrc:
- AuthoritativeDirectory 1
- BridgeAuthoritativeDir 1
-
- Right now there's one bridge authority, running on the Tonga relay.
-
-2.1. Exporting bridge-purpose descriptors
-
- We've added a new purpose for server descriptors: the "bridge"
- purpose. With the new router-descriptors file format that includes
- annotations, it's easy to look through it and find the bridge-purpose
- descriptors.
-
- Currently we export the bridge descriptors from Tonga to the
- BridgeDB server, so it can give them out according to the policies
- in blocking.pdf.
-
-2.2. Reachability/uptime testing
-
- Right now the bridge authorities do active reachability testing of
- bridges, so we know which ones to recommend for users.
-
- But in the design document, we suggested that bridges should publish
- anonymously (i.e. via Tor) to the bridge authority, so somebody watching
- the bridge authority can't just enumerate all the bridges. But if we're
- doing active measurement, the game is up. Perhaps we should back off on
- this goal, or perhaps we should do our active measurement anonymously?
-
- Answering this issue is scheduled for 0.2.1.x.
-
-2.3. Migrating to multiple bridge authorities
-
- Having only one bridge authority is both a trust bottleneck (if you
- break into one place you learn about every single bridge we've got)
- and a robustness bottleneck (when it's down, bridge users become sad).
-
- Right now if we put up a second bridge authority, all the bridges would
- publish to it, and (assuming the code works) bridge users would query
- a random bridge authority. This resolves the robustness bottleneck,
- but makes the trust bottleneck even worse.
-
- In 0.2.2.x and later we should think about better ways to have multiple
- bridge authorities.
-
-3. Bridge users.
-
- Bridge users are like ordinary Tor users except they use encrypted
- directory connections by default, and they use bridge relays as both
- entry guards (their first hop) and directory guards (the source of
- all their directory information).
-
- To become a bridge user, add the following line to your torrc:
-
- UseBridges 1
-
- and then add at least one "Bridge" line to your torrc based on the
- format below.
-
-3.1. Format of the bridge identifier.
-
- The canonical format for a bridge identifier contains an IP address,
- an ORPort, and an identity fingerprint:
- bridge 128.31.0.34:9009 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
-
- However, the identity fingerprint can be left out, in which case the
- bridge user will connect to that relay and use it as a bridge regardless
- of what identity key it presents:
- bridge 128.31.0.34:9009
- This might be useful for cases where only short bridge identifiers
- can be communicated to bridge users.
-
- In a future version we may also support bridge identifiers that are
- only a key fingerprint:
- bridge 4C17 FB53 2E20 B2A8 AC19 9441 ECD2 B017 7B39 E4B1
- and the bridge user can fetch the latest descriptor from the bridge
- authority (see Section 3.4).
-
-3.2. Bridges as entry guards
-
- For now, bridge users add their bridge relays to their list of "entry
- guards" (see path-spec.txt for background on entry guards). They are
- managed by the entry guard algorithms exactly as if they were a normal
- entry guard -- their keys and timing get cached in the "state" file,
- etc. This means that when the Tor user starts up with "UseBridges"
- disabled, he will skip past the bridge entries since they won't be
- listed as up and usable in his networkstatus consensus. But to be clear,
- the "entry_guards" list doesn't currently distinguish guards by purpose.
-
- Internally, each bridge user keeps a smartlist of "bridge_info_t"
- that reflects the "bridge" lines from his torrc along with a download
- schedule (see Section 3.5 below). When he starts Tor, he attempts
- to fetch a descriptor for each configured bridge (see Section 3.4
- below). When he succeeds at getting a descriptor for one of the bridges
- in his list, he adds it directly to the entry guard list using the
- normal add_an_entry_guard() interface. Once a bridge descriptor has
- been added, should_delay_dir_fetches() will stop delaying further
- directory fetches, and the user begins to bootstrap his directory
- information from that bridge (see Section 3.3).
-
- Currently bridge users cache their bridge descriptors to the
- "cached-descriptors" file (annotated with purpose "bridge"), but
- they don't make any attempt to reuse descriptors they find in this
- file. The theory is that either the bridge is available now, in which
- case you can get a fresh descriptor, or it's not, in which case an
- old descriptor won't do you much good.
-
- We could disable writing out the bridge lines to the state file, if
- we think this is a problem.
-
- As an exception, if we get an application request when we have one
- or more bridge descriptors but we believe none of them are running,
- we mark them all as running again. This is similar to the exception
- already in place to help long-idle Tor clients realize they should
- fetch fresh directory information rather than just refuse requests.
-
-3.3. Bridges as directory guards
-
- In addition to using bridges as the first hop in their circuits, bridge
- users also use them to fetch directory updates. Other than initial
- bootstrapping to find a working bridge descriptor (see Section 3.4
- below), all further non-anonymized directory fetches will be redirected
- to the bridge.
-
- This means that bridge relays need to have cached answers for all
- questions the bridge user might ask. This makes the upgrade path
- tricky --- for example, if we migrate to a v4 directory design, the
- bridge user would need to keep using v3 so long as his bridge relays
- only knew how to answer v3 queries.
-
- In a future design, for cases where the user has enough information
- to build circuits yet the chosen bridge doesn't know how to answer a
- given query, we might teach bridge users to make an anonymized request
- to a more suitable directory server.
-
-3.4. How bridge users get their bridge descriptor
-
- Bridge users can fetch bridge descriptors in two ways: by going directly
- to the bridge and asking for "/tor/server/authority", or by going to
- the bridge authority and asking for "/tor/server/fp/ID". By default,
- they will only try the direct queries. If the user sets
- UpdateBridgesFromAuthority 1
- in his config file, then he will try querying the bridge authority
- first for bridges where he knows a digest (if he only knows an IP
- address and ORPort, then his only option is a direct query).
-
- If the user has at least one working bridge, then he will do further
- queries to the bridge authority through a full three-hop Tor circuit.
- But when bootstrapping, he will make a direct begin_dir-style connection
- to the bridge authority.
-
- As of Tor 0.2.0.10-alpha, if the user attempts to fetch a descriptor
- from the bridge authority and it returns a 404 not found, the user
- will automatically fall back to trying a direct query. Therefore it is
- recommended that bridge users always set UpdateBridgesFromAuthority,
- since at worst it will delay their fetches a little bit and notify
- the bridge authority of the identity fingerprint (but not location)
- of their intended bridges.
-
-3.5. Bridge descriptor retry schedule
-
- Bridge users try to fetch a descriptor for each bridge (using the
- steps in Section 3.4 above) on startup. Whenever they receive a
- bridge descriptor, they reschedule a new descriptor download for 1
- hour from then.
-
- If on the other hand it fails, they try again after 15 minutes for the
- first attempt, after 15 minutes for the second attempt, and after 60
- minutes for subsequent attempts.
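
Reviewer note: the schedule in Section 3.5 can be sketched as a small function. This is only an illustration of the timings stated above; the function name and the failure counter are assumptions made for the example and are not taken from Tor's source.

```python
def next_fetch_delay_minutes(succeeded, consecutive_failures=0):
    """Minutes to wait before the next bridge descriptor fetch.

    consecutive_failures is a hypothetical counter: the number of failed
    attempts since the last success, including the one that just failed.
    """
    if succeeded:
        return 60  # descriptor received: schedule a refresh in 1 hour
    if consecutive_failures <= 2:
        return 15  # first and second failed attempts
    return 60      # all subsequent failed attempts
```

For example, `next_fetch_delay_minutes(False, 3)` gives the 60-minute delay used from the third failure onward.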
-
- In 0.2.2.x we should come up with some smarter retry schedules.
-
-3.6. Vidalia integration
-
- Vidalia 0.0.16 has a checkbox in its Network config window called
- "My ISP blocks connections to the Tor network." Users who click that
- box change their configuration to:
- UseBridges 1
- UpdateBridgesFromAuthority 1
- and should specify at least one Bridge identifier.
-
-3.7. Do we need a second layer of entry guards?
-
- If the bridge user uses the bridge as its entry guard, then the
- triangulation attacks from Lasse and Paul's Oakland paper work to
- locate the user's bridge(s).
-
- Worse, this is another way to enumerate bridges: if the bridge users
- keep rotating through second hops, then if you run a few fast servers
- (and avoid getting considered an Exit or a Guard) you'll quickly get
- a list of the bridges in active use.
-
- That's probably the strongest reason why bridge users will need to
- pick second-layer guards. Would this mean bridge users should switch
- to four-hop circuits?
-
- We should figure this out in the 0.2.1.x timeframe.
-
diff --git a/doc/spec/proposals/126-geoip-reporting.txt b/doc/spec/proposals/126-geoip-reporting.txt
deleted file mode 100644
index 9f3b21c67..000000000
--- a/doc/spec/proposals/126-geoip-reporting.txt
+++ /dev/null
@@ -1,410 +0,0 @@
-Filename: 126-geoip-reporting.txt
-Title: Getting GeoIP data and publishing usage summaries
-Author: Roger Dingledine
-Created: 2007-11-24
-Status: Closed
-Implemented-In: 0.2.0.x
-
-0. Status
-
- In 0.2.0.x, this proposal is implemented to the extent needed to
- address its motivations. See notes below with the text "RESOLUTION"
- for details.
-
-1. Background and motivation
-
- Right now we can keep a rough count of Tor users, both total and by
- country, by watching connections to a single directory mirror. Being
- able to get usage estimates is useful both for our funders (to
- demonstrate progress) and for our own development (so we know how
- quickly we're scaling and can design accordingly, and so we know which
- countries and communities to focus on more). This need for information
- is the only reason we haven't deployed "directory guards" (think of
- them like entry guards but for directory information; in practice,
- it would seem that Tor clients should simply use their entry guards
- as their directory guards; see also proposal 125).
-
- With the move toward bridges, we will no longer be able to track Tor
- clients that use bridges, since they use their bridges as directory
- guards. Further, we need to be able to learn which bridges stop seeing
- use from certain countries (and are thus likely blocked), so we can
- avoid giving them out to other users in those countries.
-
- Right now we already do GeoIP lookups in Vidalia: Vidalia draws relays
- and circuits on its 'network map', and it performs anonymized GeoIP
- lookups to its central servers to know where to put the dots. Vidalia
- caches answers it gets -- to reduce delay, to reduce overhead on
- the network, and to reduce anonymity issues where users reveal their
- knowledge about the network through which IP addresses they ask about.
-
- But with the advent of bridges, Tor clients are asking about IP
- addresses that aren't in the main directory. In particular, bridge
- users inform the central Vidalia servers about each bridge as they
- discover it and their Vidalia tries to map it.
-
- Also, we wouldn't mind letting Vidalia do a GeoIP lookup on the client's
- own IP address, so it can provide a more useful map.
-
- Finally, Vidalia's central servers leave users open to partitioning
- attacks, even if they can't target specific users. Further, as we
- start using GeoIP results for more operational or security-relevant
- goals, such as avoiding or including particular countries in circuits,
- it becomes more important that users can't be singled out in terms of
- their IP-to-country mapping beliefs.
-
-2. The available GeoIP databases
-
- There are at least two classes of GeoIP database out there: "IP to
- country", which tells us the country code for the IP address but
- no more details, and "IP to city", which tells us the country code,
- the name of the city, and some basic latitude/longitude guesses.
-
- A recent ip-to-country.csv is 3421362 bytes. Compressed, it is 564252
- bytes. A typical line is:
- "205500992","208605279","US","USA","UNITED STATES"
- http://ip-to-country.webhosting.info/node/view/5
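
Reviewer note: a minimal sketch of how a line in the ip-to-country.csv format shown above could be used for a local lookup. It assumes the first two fields are the inclusive start and end of an IPv4 range written as integers and the third is the two-letter country code; the helper names are hypothetical, and the linear scan is for illustration only.

```python
import csv
import ipaddress

# The sample line quoted in the proposal.
SAMPLE = '"205500992","208605279","US","USA","UNITED STATES"\n'

def load_ranges(text):
    """Parse ip-to-country CSV text into sorted (start, end, code) tuples."""
    rows = csv.reader(text.splitlines())
    return sorted((int(r[0]), int(r[1]), r[2]) for r in rows if r)

def country_for(ip, ranges):
    """Return the two-letter code for a dotted-quad IPv4 address, or None."""
    n = int(ipaddress.IPv4Address(ip))
    for start, end, code in ranges:  # linear scan; a real db would bisect
        if start <= n <= end:
            return code
    return None

ranges = load_ranges(SAMPLE)
```

Addresses that fall outside every range return `None`, which corresponds to the failed lookups discussed in Section 7.6.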
-
- Similarly, the maxmind GeoLite Country database is also about 500KB
- compressed.
- http://www.maxmind.com/app/geolitecountry
-
- The maxmind GeoLite City database gives more fine-grained detail like
- geo coordinates and city name. Vidalia currently makes use of this
- information. On the other hand it's 16MB compressed. A typical line is:
- 206.124.149.146,Bellevue,WA,US,47.6051,-122.1134
- http://www.maxmind.com/app/geolitecity
-
- There are other databases out there, like
- http://www.hostip.info/faq.html
- http://www.webconfs.com/ip-to-city.php
- that want more attention, but for now let's assume that all the db's
- are around this size.
-
-3. What we'd like to solve
-
- Goal #1a: Tor relays collect IP-to-country user stats and publish
- sanitized versions.
- Goal #1b: Tor bridges collect IP-to-country user stats and publish
- sanitized versions.
-
- Goal #2a: Vidalia learns IP-to-city stats for Tor relays, for better
- mapping.
- Goal #2b: Vidalia learns IP-to-country stats for Tor relays, so the user
- can pick countries for her paths.
-
- Goal #3: Vidalia doesn't do external lookups on bridge relay addresses.
-
- Goal #4: Vidalia resolves the Tor client's IP-to-country or IP-to-city
- for better mapping.
-
- Goal #5: Reduce partitioning opportunities where Vidalia central
- servers can give different (distinguishing) responses.
-
-4. Solution overview
-
- Our goal is to allow Tor relays, bridges, and clients to learn enough
- GeoIP information so they can do local private queries.
-
-4.1. The IP-to-country db
-
- Directory authorities should publish a "geoip" file that contains
- IP-to-country mappings. Directory caches will mirror it, and Tor clients
- and relays (including bridge relays) will fetch it. Thus we can solve
- goals 1a and 1b (publish sanitized usage info). Controllers could also
- use this to solve goal 2b (choosing path by country attributes). It
- also solves goal 4 (learning the Tor client's country), though for
- huge countries like the US we'd still need to decide where the "middle"
- should be when we're mapping that address.
-
- The IP-to-country details are described further in Sections 5 and
- 6 below.
-
- [RESOLUTION: The geoip file in 0.2.0.x is not distributed through
- Tor. Instead, it is shipped with the bundle.]
-
-4.2. The IP-to-city db
-
- In an ideal world, the IP-to-city db would be small enough that we
- could distribute it in the above manner too. But for now, it is too
- large. Here's where the design choice forks.
-
- Option A: Vidalia should continue doing its anonymized IP-to-city
- queries. Thus we can achieve goals 2a and 2b. We would solve goal
- 3 by only doing lookups on descriptors that are purpose "general"
- (see Section 4.2.1 for how). We would leave goal 5 unsolved.
-
- Option B: Each directory authority should keep an IP-to-city db,
- look up the value for each router it lists, and include that line in
- the router's network-status entry. The network-status consensus would
- then use the line that appears in the majority of votes. This approach
- also solves goals 2a and 2b, goal 3 (Vidalia doesn't do any lookups
- at all now), and goal 5 (reduced partitioning risks).
-
- Option B has the advantage that Vidalia can simplify its operation,
- and the advantage that this consensus IP-to-city data is available to
- other controllers besides just Vidalia. But it has the disadvantage
- that the networkstatus consensus becomes larger, even though most of
- the GeoIP information won't change from one consensus to the next. Is
- there another reasonable location for it that can provide similar
- consensus security properties?
-
- [RESOLUTION: IP-to-city is not supported.]
-
-4.2.1. Controllers can query for router annotations
-
- Vidalia needs to stop doing queries on bridge relay IP addresses.
- It could do that by only doing lookups on descriptors that are in
- the networkstatus consensus, but that precludes designs like Blossom
- that might want to map its relay locations. The best answer is that it
- should learn the router annotations, with a new controller 'getinfo'
- command:
- "GETINFO desc-annotations/id/<OR identity>"
- which would respond with something like
- @downloaded-at 2007-11-29 08:06:38
- @source "128.31.0.34"
- @purpose bridge
-
- [We could also make the answer include the digest for the router in
- question, which would enable us to ask GETINFO router-annotations/all.
- Is this worth it? -RD]
-
- Then Vidalia can avoid doing lookups on descriptors with purpose
- "bridge". Even better would be to add a new annotation "@private true"
- so Vidalia can know how to handle new purposes that we haven't created
- yet. Vidalia could special-case "bridge" for now, for compatibility
- with the current 0.2.0.x-alphas.
-
-4.3. Recommendation
-
- My overall recommendation is that we should implement 4.1 soon
- (e.g. early in 0.2.1.x), and we can go with 4.2 option A for now,
- with the hope that later we discover a better way to distribute the
- IP-to-city info and can switch to 4.2 option B.
-
- Below we discuss more how to go about achieving 4.1.
-
-5. Publishing and caching the GeoIP (IP-to-country) database
-
- Each v3 directory authority should put a copy of the "geoip" file in
- its datadirectory. Then its network-status votes should include a hash
- of this file (Recommended-geoip-hash: %s), and the resulting consensus
- directory should specify the consensus hash.
-
- There should be a new URL for fetching this geoip db (by "current.z"
- for testing purposes, and by hash.z for typical downloads). Authorities
- should fetch and serve the one listed in the consensus, even when they
- vote for their own. This would argue for storing the cached version
- in a better filename than "geoip".
-
- Directory mirrors should keep a copy of this file available via the
- same URLs.
-
- We assume that the file would change at most a few times a month. Should
- Tor ship with a bootstrap geoip file? An out-of-date geoip file may
- open you up to partitioning attacks, but for the most part it won't
- be that different.
-
- There should be a config option to disable updating the geoip file,
- in case users want to use their own file (e.g. they have a proprietary
- GeoIP file they prefer to use). In that case we leave it up to the
- user to update his geoip file out-of-band.
-
- [XXX Should consider forward/backward compatibility, e.g. if we want
- to move to a new geoip file format. -RD]
-
- [RESOLUTION: Not done over Tor.]
-
-6. Controllers use the IP-to-country db for mapping and for path building
-
- Down the road, Vidalia could use the IP-to-country mappings for placing
- on its map:
- - The location of the client
- - The location of the bridges, or other relays not in the
- networkstatus, on the map.
- - Any relays that it doesn't yet have an IP-to-city answer for.
-
- Other controllers can also use it to set EntryNodes, ExitNodes, etc
- in a per-country way.
-
- To support these features, we need to export the IP-to-country data
- via the Tor controller protocol.
-
- Is it sufficient just to add a new GETINFO command?
- GETINFO ip-to-country/128.31.0.34
- 250+ip-to-country/128.31.0.34="US","USA","UNITED STATES"
-
- [RESOLUTION: Not done now, except for the getinfo command.]
-
-6.1. Other interfaces
-
- Robert Hogan has also suggested a
-
- GETINFO relays-by-country/cn
-
- as well as torrc options for ExitCountryCodes, EntryCountryCodes,
- ExcludeCountryCodes, etc.
-
- [RESOLUTION: Not implemented in 0.2.0.x. Fodder for a future proposal.]
-
-7. Relays and bridges use the IP-to-country db for usage summaries
-
- Once bridges have a GeoIP database locally, they can start to publish
- sanitized summaries of client usage -- how many users they see and from
- what countries. This might also be a more useful way for ordinary Tor
- relays to convey the level of usage they see, which would allow us to
- switch to using directory guards for all users by default.
-
- But how to safely summarize this information without opening too many
- anonymity leaks?
-
-7.1. Attacks to think about
-
- First, note that we need to have a large enough time window that we're
- not aiding correlation attacks much. I hope 24 hours is enough. So
- that means no publishing stats until you've been up at least 24 hours.
- And you can't publish follow-up stats more often than every 24 hours,
- or people could look at the differential.
-
- Second, note that we need to be sufficiently vague about the IP
- addresses we're reporting. We are hoping that just specifying the
- country will be vague enough. But a) what about active attacks where
- we convince a bridge to use a GeoIP db that labels each suspect IP
- address as a unique country? We have to assume that the consensus GeoIP
- db won't be malicious in this way. And b) could such singling-out
- attacks occur naturally, for example because of countries that have
- a very small IP space? We should investigate that.
-
-7.2. Granularity of users
-
- Do we only want to report countries that have a sufficient anonymity set
- (that is, number of users) for the day? For example, we might avoid
- listing any countries that have seen fewer than five addresses over
- the 24 hour period. This approach would be helpful in reducing the
- singling-out opportunities -- in the extreme case, we could imagine a
- situation where one blogger from the Sudan used Tor on a given day, and
- we can discover which entry guard she used.
-
- But I fear that especially for bridges, seeing only one hit from a
- given country in a given day may be quite common.
-
- As a compromise, we should start out with an "Other" category in
- the reported stats, which is the sum of unlisted countries; if that
- category is consistently interesting, we can think harder about how
- to get the right data from it safely.
-
- But note that bridge summaries will not be made public individually,
- since doing so would help people enumerate bridges. Whereas summaries
- from normal relays will be public. So perhaps that means we can afford
- to be more specific in bridge summaries? In particular, I'm thinking the
- "other" category should be used by public relays but not for bridges
- (or if it is, used with a lower threshold).
-
- Even for countries that have many Tor users, we might not want to be
- too specific about how many users we've seen. For example, we might
- round down the number of users we report to the nearest multiple of 5.
- My instinct for now is that this won't be that useful.
-
-7.3. Other issues
-
- Another note: we'll likely be overreporting in the case of users with
- dynamic IP addresses: if they rotate to a new address over the course
- of the day, we'll count them twice. So be it.
-
-7.4. Where to publish the summaries?
-
- We designed extrainfo documents for information like this. So they
- should just be more entries in the extrainfo doc.
-
- But if we want to publish summaries every 24 hours (no more often,
- no less often), aren't we tied to the router descriptor publishing
- schedule? That is, if we publish a new router descriptor at the 18
- hour mark, and nothing much has changed at the 24 hour mark, won't
- the new descriptor get dropped as being "cosmetically similar", and
- then nobody will know to ask about the new extrainfo document?
-
- One solution would be to make and remember the 24 hour summary at the
- 24 hour mark, but not actually publish it anywhere until we happen to
- publish a new descriptor for other reasons. If we happen to go down
- before publishing a new descriptor, then so be it, at least we tried.
-
-7.5. What if the relay is unreachable or goes to sleep?
-
- Even if you've been up for 24 hours, if you were hibernating for 18
- of them, then we're not getting as much fuzziness as we'd like. So
- I guess that means that we need a 24-hour period of being "awake"
- before we're willing to publish a summary. A similar attack works if
- you've been awake but unreachable for the first 18 of the 24 hours. As
- another example, a bridge that's on a laptop might be suspended for
- some of each day.
-
- This implies that some relays and bridges will never publish summary
- stats, because they're not ever reliably working for 24 hours in
- a row. If a significant percentage of our reporters end up being in
- this boat, we should investigate whether we can accumulate 24 hours of
- "usefulness", even if there are holes in the middle, and publish based
- on that.
-
- What other issues are like this? It seems that just moving to a new
- IP address shouldn't be a reason to cancel stats publishing, assuming
- we were usable at each address.
-
-7.6. IP addresses that aren't in the geoip db
-
- Some IP addresses aren't in the public geoip databases. In particular,
- I've found that a lot of African countries are missing, but there
- are also some common ones in the US that are missing, like parts of
- Comcast. We could just lump unknown IP addresses into the "other"
- category, but it might be useful to gather a general sense of how many
- lookups are failing entirely, by adding a separate "Unknown" category.
-
- We could also contribute back to the geoip db, by letting bridges set
- a config option to report the actual IP addresses that failed their
- lookup. Then the bridge authority operators can manually make sure
- the correct answer will be in later geoip files. This config option
- should be disabled by default.
-
-7.7. Bringing it all together
-
- So here's the plan:
-
- 24 hours after starting up (modulo Section 7.5 above), bridges and
- relays should construct a daily summary of client countries they've
- seen, including the above "Unknown" category (Section 7.6) as well.
-
- Non-bridge relays lump all countries with fewer than K (e.g. K=5) users
- into the "Other" category (see Sec 7.2 above), whereas bridge relays are
- willing to list a country even when it has only one user for the day.
-
- Whenever we have a daily summary on record, we include it in our
- extrainfo document whenever we publish one. The daily summary we
- remember locally gets replaced with a newer one when another 24
- hours pass.
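
Reviewer note: the plan in Section 7.7 can be sketched roughly as follows. The function name, its arguments, and the choice to always report "Unknown" as its own category (never folding it into "Other") are assumptions for the example; the proposal does not specify them.

```python
from collections import Counter

def daily_summary(seen_ips, lookup, is_bridge, k=5):
    """Count distinct client addresses per country for one 24-hour period.

    seen_ips: set of address strings seen today (already de-duplicated).
    lookup:   callable mapping an address to a country code, or None.
    """
    counts = Counter()
    for ip in seen_ips:
        counts[lookup(ip) or "Unknown"] += 1
    if is_bridge:
        return dict(counts)  # bridges list even single-user countries
    summary = Counter()
    for country, n in counts.items():
        # Assumption: "Unknown" is kept separate regardless of its size.
        if country != "Unknown" and n < k:
            summary["Other"] += n  # fewer than K users: lump together
        else:
            summary[country] += n
    return dict(summary)
```

With six addresses mapped to one country, one to a second country, and one failed lookup, a non-bridge relay would report the second country under "Other", while a bridge would list it by name.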
-
-7.8. Some forward secrecy
-
- How should we remember addresses locally? If we convert them into
- country-codes immediately, we will count them again if we see them
- again. On the other hand, we don't really want to keep a list hanging
- around of all IP addresses we've seen in the past 24 hours.
-
- Step one is that we should never write this stuff to disk. Keeping it
- only in ram will make things somewhat better. Step two is to avoid
- keeping any timestamps associated with it: rather than a rolling
- 24-hour window, which would require us to remember the various times
- we've seen that address, we can instead just throw out the whole list
- every 24 hours and start over.
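
Reviewer note: a minimal sketch of the two steps above, assuming a hypothetical `SeenAddresses` class. Addresses live only in an in-memory set with no per-address timestamps, and the whole set is discarded once 24 hours have elapsed. The injectable clock exists only to make the example testable.

```python
import time

class SeenAddresses:
    """In-memory set of client addresses, discarded wholesale every 24 hours."""

    PERIOD = 24 * 60 * 60  # seconds

    def __init__(self, clock=time.time):
        self._clock = clock
        self._started = clock()
        self._addrs = set()  # never written to disk; no per-address times

    def note(self, addr):
        """Record an address, first resetting the set if the period is over."""
        if self._clock() - self._started >= self.PERIOD:
            self._addrs = set()            # throw out the whole list
            self._started = self._clock()
        self._addrs.add(addr)

    def count(self):
        return len(self._addrs)
```

A real implementation would presumably build the daily summary from the set just before discarding it; that step is left out here.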
-
- We could hash the addresses, and then compare hashes when deciding if
- we've seen a given address before. We could even do keyed hashes. Or
- Bloom filters. But if our goal is to defend against an adversary
- who steals a copy of our ram while we're running and then does
- guess-and-check on whatever blob we're keeping, we're in bad shape.
-
- We could drop the last octet of the IP address as soon as we see
- it. That would cause us to undercount some users from cablemodem and
- DSL networks that have a high density of Tor users. And it wouldn't
- really help that much -- indeed, the extent to which it does help is
- exactly the extent to which it makes our stats less useful.
-
- Other ideas?
-
diff --git a/doc/spec/proposals/127-dirport-mirrors-downloads.txt b/doc/spec/proposals/127-dirport-mirrors-downloads.txt
deleted file mode 100644
index 72d6c0cb9..000000000
--- a/doc/spec/proposals/127-dirport-mirrors-downloads.txt
+++ /dev/null
@@ -1,155 +0,0 @@
-Filename: 127-dirport-mirrors-downloads.txt
-Title: Relaying dirport requests to Tor download site / website
-Author: Roger Dingledine
-Created: 2007-12-02
-Status: Draft
-
-1. Overview
-
- Some countries and networks block connections to the Tor website. As
- time goes by, this will remain a problem and it may even become worse.
-
- We have a big pile of mirrors (google for "Tor mirrors"), but few of
- our users think to try a search like that. Also, many of these mirrors
- might be automatically blocked since their pages contain words that
- might cause them to get banned. And lastly, we can imagine a future
- where the blockers are aware of the mirror list too.
-
- Here we describe a new set of URLs for Tor's DirPort that will relay
- connections from users to the official Tor download site. Rather than
- trying to cache a bunch of new Tor packages (which is a hassle in terms
- of keeping them up to date, and a hassle in terms of drive space used),
- we instead just proxy the requests directly to Tor's /dist page.
-
- Specifically, we should support
-
- GET /tor/dist/$1
-
- and
-
- GET /tor/website/$1
-
-2. Direct connections, one-hop circuits, or three-hop circuits?
-
- We could relay the connections directly to the download site -- but
- this produces recognizable outgoing traffic on the bridge or cache's
- network, which will probably surprise our nice volunteers. (Is this
- a good enough reason to discard the direct connection idea?)
-
- Even if we don't do direct connections, should we do a one-hop
- begindir-style connection to the mirror site (make a one-hop circuit
- to it, then send a 'begindir' cell down the circuit), or should we do
- a normal three-hop anonymized connection?
-
- If these mirrors are mainly bridges, doing either a direct or a one-hop
- connection creates another way to enumerate bridges. That would argue
- for three-hop. On the other hand, downloading a 10+ megabyte installer
- through a normal Tor circuit can't be fun. But if you're already getting
- throttled a lot because you're in the "relayed traffic" bucket, you're
- going to have to accept a slow transfer anyway. So three-hop it is.
-
- Speaking of which, we would want to label this connection
- as "relay" traffic for the purposes of rate limiting; see
- connection_counts_as_relayed_traffic() and or_conn->client_used. This
- will be a bit tricky though, because these connections will use the
- bridge's guards.
-
-3. Scanning resistance
-
- One other goal we'd like to achieve, or at least not hinder, is making
- it hard to scan large swaths of the Internet to look for responses
- that indicate a bridge.
-
- In general this is a really hard problem, so we shouldn't demand to
- solve it here. But we can note that some bridges should open their
- DirPort (and offer this functionality), and others shouldn't. Then
- some bridges provide a download mirror while others can remain
- scanning-resistant.
-
-4. Integrity checking
-
- If we serve this stuff in plaintext from the bridge, anybody in between
- the user and the bridge can intercept and modify it. The bridge can too.
-
- If we do an anonymized three-hop connection, the exit node can also
- intercept and modify the exe it sends back.
-
- Are we setting ourselves up for rogue exit relays, or rogue bridges,
- that trojan our users?
-
- Answer #1: Users need to do pgp signature checking. Not a very good
- answer, a) because it's complex, and b) because they don't know the
- right signing keys in the first place.
-
- Answer #2: The mirrors could exit from a specific Tor relay, using the
- '.exit' notation. This would make connections a bit more brittle, but
- would resolve the rogue exit relay issue. We could even round-robin
- among several, and the list could be dynamic -- for example, all the
- relays with an Authority flag that allow exits to the Tor website.
-
- Answer #3: The mirrors should connect to the main distribution site
- via SSL. That way the exit relay can't influence anything.
-
- Answer #4: We could suggest that users only use trusted bridges for
- fetching a copy of Tor. Hopefully they heard about the bridge from a
- trusted source rather than from the adversary.
-
- Answer #5: What if the adversary is trawling for Tor downloads by
- network signature -- either by looking for known bytes in the binary,
- or by looking for "GET /tor/dist/"? It would be nice to encrypt the
- connection from the bridge user to the bridge. And we can! The bridge
- already supports TLS. Rather than initiating a TLS renegotiation after
- connecting to the ORPort, the user should actually request a URL. Then
- the ORPort can either pass the connection off as a linked conn to the
- dirport, or renegotiate and become a Tor connection, depending on how
- the client behaves.
-
-5. Linked connections: at what level should we proxy?
-
- Check out the connection_ap_make_link() function, as called from
- directory.c. Tor clients use this to create a "fake" socks connection
- back to themselves, and then they attach a directory request to it,
- so they can launch directory fetches via Tor. We can piggyback on
- this feature.
-
- We need to decide if we're going to be passing the bytes back and
- forth between the web browser and the main distribution site, or if
- we're going to be actually acting like a proxy (parsing out the file
- they want, fetching that file, and serving it back).
-
- Advantages of proxying without looking inside:
- - We don't need to build any sort of http support (including
- continues, partial fetches, etc etc).
- Disadvantages:
- - If the browser thinks it's speaking http, are there easy ways
- to pass the bytes to an https server and have everything work
- correctly? At the least, it would seem that the browser would
- complain about the cert. More generally, ssl wants to be negotiated
- before the URL and headers are sent, yet we need to read the URL
- and headers to know that this is a mirror request; so we have an
- ordering problem here.
- - Makes it harder to do caching later on, if we don't look at what
- we're relaying. (It might be useful down the road to cache the
- answers to popular requests, so we don't have to keep getting
- them again.)
-
-6. Outstanding problems
-
- 1) HTTP proxies already exist. Why waste our time cloning one
- badly? When we clone existing stuff, we usually regret it.
-
- 2) It's overbroad. We only seem to need a secure get-a-tor feature,
- and instead we're contemplating building a locked-down HTTP proxy.
-
- 3) It's going to add a fair bit of complexity to our code. We do
- not currently implement HTTPS. We'd need to refactor lots of the
- low-level connection stuff so that "SSL" and "Cell-based" were no
- longer synonymous.
-
- 4) It's still unclear how effective this proposal would be in
- practice. You need to know that this feature exists, which means
- somebody needs to tell you about a bridge (mirror) address and tell
- you how to use it. And if they're doing that, they could (e.g.) tell
- you about a gmail autoresponder address just as easily, and then you'd
- get better authentication of the Tor program to boot.
-
diff --git a/doc/spec/proposals/128-bridge-families.txt b/doc/spec/proposals/128-bridge-families.txt
deleted file mode 100644
index e5bdcf95c..000000000
--- a/doc/spec/proposals/128-bridge-families.txt
+++ /dev/null
@@ -1,64 +0,0 @@
-Filename: 128-bridge-families.txt
-Title: Families of private bridges
-Author: Roger Dingledine
-Created: 2007-12-xx
-Status: Dead
-
-1. Overview
-
- Proposal 125 introduced the basic notion of how bridge authorities,
- bridge relays, and bridge users should behave. But it doesn't get into
- the various mechanisms of how to distribute bridge relay addresses to
- bridge users.
-
- One of the mechanisms we have in mind is called 'families of bridges'.
- If a bridge user knows about only one private bridge, and that bridge
- shuts off for the night or gets a new dynamic IP address, the bridge
- user is out of luck and needs to re-bootstrap manually or wait and
- hope it comes back. On the other hand, if the bridge user knows about
- a family of bridges, then as long as one of those bridges is still
- reachable his Tor client can automatically learn about where the
- other bridges have gone.
-
- So in this design, a single volunteer could run multiple coordinated
- bridges, or a group of volunteers could each run a bridge. We abstract
- out the details of how these volunteers find each other and decide to
- set up a family.
-
-2. Other notes.
-
- somebody needs to run a bridge authority
-
- it needs to have a torrc option to publish networkstatuses of its bridges
-
- it should also do reachability testing just of those bridges
-
- people ask for the bridge networkstatus by asking for a url that
- contains a password. (it's safe to do this because of begin_dir.)
-
- so the bridge users need to know a) a password, and b) a bridge
- authority line.
-
- the bridge users need to know the bridge authority line.
-
- the bridge authority needs to know the password.
-
-3. Current state
-
- I implemented a BridgePassword config option. Bridge authorities
- should set it, and users who want to use those bridge authorities
- should set it.
-
- Now there is a new directory URL "/tor/networkstatus-bridges" that
- directory mirrors serve if BridgeAuthoritativeDir is set and it's a
- begin_dir connection. It looks for the header
- Authorization: Basic %s
- where %s is the base-64 bridge password.
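
Reviewer note: a sketch of how a client might build the header described above, taking "the base-64 bridge password" literally (the password alone is encoded, with no "user:" prefix). The helper name is hypothetical, and the proposal itself notes that clients were never taught to send this header.

```python
import base64

def bridge_auth_header(password):
    """Build the Authorization header line (hypothetical helper)."""
    token = base64.b64encode(password.encode("utf-8")).decode("ascii")
    return "Authorization: Basic " + token
```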
-
- I never got around to teaching clients how to set the header though,
- so it may or may not work, and may or may not do what we ultimately want.
-
- I've marked this proposal dead; it really never should have left the
- ideas/ directory. Somebody should pick it up sometime and finish the
- design and implementation.
-
diff --git a/doc/spec/proposals/129-reject-plaintext-ports.txt b/doc/spec/proposals/129-reject-plaintext-ports.txt
deleted file mode 100644
index 8080ff5b7..000000000
--- a/doc/spec/proposals/129-reject-plaintext-ports.txt
+++ /dev/null
@@ -1,114 +0,0 @@
-Filename: 129-reject-plaintext-ports.txt
-Title: Block Insecure Protocols by Default
-Author: Kevin Bauer & Damon McCoy
-Created: 2008-01-15
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- Below is a proposal to mitigate insecure protocol use over Tor.
-
- This document 1) demonstrates the extent to which insecure protocols are
- currently used within the Tor network, and 2) proposes a simple solution
- to prevent users from unknowingly using these insecure protocols. By
- insecure, we consider protocols that explicitly leak sensitive user names
- and/or passwords, such as POP, IMAP, Telnet, and FTP.
-
-Motivation:
-
- As part of a general study of Tor use in 2006/2007 [1], we attempted to
- understand what types of protocols are used over Tor. While we observed an
- enormous volume of Web and Peer-to-peer traffic, we were surprised by the
- number of insecure protocols that were used over Tor. For example, over an
- 8 day observation period, we observed the following number of connections
- over insecure protocols:
-
- POP and IMAP: 10,326 connections
- Telnet: 8,401 connections
- FTP: 3,788 connections
-
- Each of the above listed protocols exchanges user name and password
- information in plain-text. As an upper bound, we could have observed
- 22,515 user names and passwords. This observation echoes the reports of
- a Tor router logging and posting e-mail passwords in August 2007 [2]. The
- response from the Tor community has been to further educate users
- about the dangers of using insecure protocols over Tor. However, we
- recently repeated our Tor usage study from last year and noticed that the
- trend in insecure protocol use has not declined. Therefore, we propose that
- additional steps be taken to protect naive Tor users from inadvertently
- exposing their identities (and even passwords) over Tor.
-
-Security Implications:
-
- This proposal is intended to improve Tor's security by limiting the
- use of insecure protocols.
-
- Roger added: By adding these warnings for only some of the risky
- behavior, users may do other risky behavior, not get a warning, and
- believe that it is therefore safe. But overall, I think it's better
- to warn for some of it than to warn for none of it.
-
-Specification:
-
- As an initial step towards mitigating the use of the above-mentioned
- insecure protocols, we propose that the default ports for each respective
- insecure service be blocked at the Tor client's socks proxy. These default
- ports include:
-
- 23 - Telnet
- 109 - POP2
- 110 - POP3
- 143 - IMAP
-
- Notice that FTP is not included in the proposed list of ports to block. This
- is because FTP is often used anonymously, i.e., without any identifying
- user name or password.
-
- This blocking scheme can be implemented as a set of flags in the client's
- torrc configuration file:
-
- BlockInsecureProtocols 0|1
- WarnInsecureProtocols 0|1
-
- When the warning flag is activated, a message should be displayed to
- the user similar to the message given when Tor's socks proxy is given an IP
- address rather than resolving a host name.
-
- We recommend that the default torrc configuration file block insecure
- protocols and provide a warning to the user to explain the behavior.
-
- Finally, there are many popular web pages that do not offer secure
- login features, such as MySpace, and it would be prudent to provide
- additional rules to Privoxy to attempt to protect users from unknowingly
- submitting their login credentials in plain-text.
-
-Compatibility:
-
- None, as the proposed changes are to be implemented in the client.
-
-References:
-
- [1] Shining Light in Dark Places: A Study of Anonymous Network Usage.
- University of Colorado Technical Report CU-CS-1032-07. August 2007.
-
- [2] Rogue Nodes Turn Tor Anonymizer Into Eavesdropper's Paradise.
- http://www.wired.com/politics/security/news/2007/09/embassy_hacks.
- Wired. September 10, 2007.
-
-Implementation:
-
- Roger added this feature in
- http://archives.seul.org/or/cvs/Jan-2008/msg00182.html
- He also added a status event for Vidalia to recognize attempts to use
- vulnerable-plaintext ports, so it can help the user understand what's
- going on and how to fix it.
-
-Next steps:
-
- a) Vidalia should learn to recognize this controller status event,
- so we don't leave users out in the cold when we enable this feature.
-
- b) We should decide which ports to reject by default. The current
- consensus is 23,109,110,143 -- the same set that we warn for now.
-
diff --git a/doc/spec/proposals/130-v2-conn-protocol.txt b/doc/spec/proposals/130-v2-conn-protocol.txt
deleted file mode 100644
index 60e742a62..000000000
--- a/doc/spec/proposals/130-v2-conn-protocol.txt
+++ /dev/null
@@ -1,184 +0,0 @@
-Filename: 130-v2-conn-protocol.txt
-Title: Version 2 Tor connection protocol
-Author: Nick Mathewson
-Created: 2007-10-25
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This proposal describes the significant changes to be made in the v2
- Tor connection protocol.
-
- This proposal relates to other proposals as follows:
-
- It refers to and supersedes:
- Proposal 124: Blocking resistant TLS certificate usage
- It refers to aspects of:
- Proposal 105: Version negotiation for the Tor protocol
-
-
- In summary, the Tor connection protocol has been in need of a redesign
- for a while. This proposal describes how we can add to the Tor
- protocol:
-
- - A new TLS handshake (to achieve blocking resistance without
- breaking backward compatibility)
- - Version negotiation (so that future connection protocol changes
- can happen without breaking compatibility)
- - The actual changes in the v2 Tor connection protocol.
-
-Motivation:
-
- For motivation, see proposal 124.
-
-Proposal:
-
-0. Terminology
-
- The version of the Tor connection protocol implemented up to now is
- "version 1". This proposal describes "version 2".
-
- "Old" or "Older" versions of Tor are ones not aware that version 2
- of this protocol exists;
- "New" or "Newer" versions are ones that are.
-
- The connection initiator is referred to below as the Client; the
- connection responder is referred to below as the Server.
-
-1. The revised TLS handshake.
-
- For motivation, see proposal 124. This is a simplified version of the
- handshake that uses TLS's renegotiation capability in order to avoid
- some of the extraneous steps in proposal 124.
-
- The Client connects to the Server and, as in ordinary TLS, sends a
- list of ciphers. Older versions of Tor will send only ciphers from
- the list:
- TLS_DHE_RSA_WITH_AES_256_CBC_SHA
- TLS_DHE_RSA_WITH_AES_128_CBC_SHA
- SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
- SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
- Clients that support the revised handshake will send the recommended
- list of ciphers from proposal 124, in order to emulate the behavior of
- a web browser.
-
- If the server notices that the list of ciphers contains only ciphers
- from this list, it proceeds with Tor's version 1 TLS handshake as
- documented in tor-spec.txt.
-
- (The server may also notice cipher lists used by other implementations
- of the Tor protocol (in particular, the BouncyCastle default cipher
- list as used by some Java-based implementations), and whitelist them.)
-
- On the other hand, if the server sees a list of ciphers that could not
- have been sent from an older implementation (because it includes other
- ciphers, and does not match any known-old list), the server sends a
- reply containing a single connection certificate, constructed as for
- the link certificate in the v1 Tor protocol. The subject names in
- this certificate SHOULD NOT have any strings to identify them as
- coming from a Tor server. The server does not ask the client for
- certificates.
-
- Old Servers will (mostly) ignore the cipher list and respond as in the v1
- protocol, sending back a two-certificate chain.
-
- After the Client gets a response from the server, it checks for the
- number of certificates it received. If there are two certificates,
- the client assumes a V1 connection and proceeds as in tor-spec.txt.
- But if there is only one certificate, the client assumes a V2 or later
- protocol and continues.
-
- At this point, the client has established a TLS connection with the
- server, but the parties have not been authenticated: the server hasn't
- sent its identity certificate, and the client hasn't sent any
- certificates at all. To fix this, the client begins a TLS session
- renegotiation. This time, the server continues with two certificates
- as usual, and asks for certificates so that the client will send
- certificates of its own. Because the TLS connection has been
- established, all of this is encrypted. (The certificate sent by the
- server in the renegotiated connection need not be the same as
- that sent in the original connection.)
-
- The server MUST NOT write any data until the client has renegotiated.
-
- Once the renegotiation is finished, the server and client check one
- another's certificates as in V1. Now they are mutually authenticated.
-
-1.1. Revised TLS handshake: implementation notes.
-
- It isn't so easy to adjust server behavior based on the client's
- ciphersuite list. Here's how we can do it using OpenSSL. This is a
- bit of an abuse of the OpenSSL APIs, but it's the best we can do, and
- we won't have to do it forever.
-
- We can use OpenSSL's SSL_set_info_callback() to register a function to
- be called when the state changes. The type/state tuple of
- SSL_CB_ACCEPT_LOOP/SSL3_ST_SW_SRVR_HELLO_A
- happens when we have completely parsed the client hello, and are about
- to send a response. From this callback, we can check the cipherlist
- and act accordingly:
-
- * If the ciphersuite list indicates a v1 protocol, we set the
- verify mode to SSL_VERIFY_PEER with a callback (so we get
- certificates).
-
- * If the ciphersuite list indicates a v2 protocol, we set the
- verify mode to SSL_VERIFY_NONE with no callback (so we get
- no certificates) and set the SSL_MODE_NO_AUTO_CHAIN flag (so that
- we send only 1 certificate in the response).
-
- Once the handshake is done, the server clears the
- SSL_MODE_NO_AUTO_CHAIN flag and sets the callback as for the V1
- protocol. It then starts reading.
-
- The other problem to take care of is missing ciphers and OpenSSL's
- cipher sorting algorithms. The two main issues are a) OpenSSL doesn't
- support some of the default ciphers that Firefox advertises, and b)
- OpenSSL sorts the list of ciphers it offers in a different way than
- Firefox sorts them, so unless we fix that Tor will still look different
- than Firefox.
- [XXXX more on this.]
-
-
-1.2. Compatibility for clients using libraries less hackable than OpenSSL.
-
- As discussed in proposal 105, servers advertise which protocol
- versions they support in their router descriptors. Clients can simply
- behave as v1 clients when connecting to servers that do not support
- link version 2 or higher, and as v2 clients when connecting to servers
- that do support link version 2 or higher.
-
- (Servers can't use this strategy because we do not assume that servers
- know one another's capabilities when connecting.)
-
-2. Version negotiation.
-
- Version negotiation proceeds as described in proposal 105, except as
- follows:
-
- * Version negotiation only happens if the TLS handshake as described
- above completes.
-
- * The TLS renegotiation must be finished before the client sends a
- VERSIONS cell; the server sends its VERSIONS cell in response.
-
- * The VERSIONS cell uses the following variable-width format:
- Circuit [2 octets; set to 0]
- Command [1 octet; set to 7 for VERSIONS]
- Length [2 octets; big-endian]
- Data [Length bytes]
-
- The Data in the cell is a series of big-endian two-byte integers.
-
- * It is not allowed to negotiate V1 connections once the v2 protocol
- has been used. If this happens, Tor instances should close the
- connection.
-
-3. The rest of the "v2" protocol
-
- Once a v2 protocol has been negotiated, NETINFO cells are exchanged
- as in proposal 105, and communications begin as per tor-spec.txt.
- Until NETINFO cells have been exchanged, the connection is not open.
-
-
diff --git a/doc/spec/proposals/131-verify-tor-usage.txt b/doc/spec/proposals/131-verify-tor-usage.txt
deleted file mode 100644
index d3c6efe75..000000000
--- a/doc/spec/proposals/131-verify-tor-usage.txt
+++ /dev/null
@@ -1,148 +0,0 @@
-Filename: 131-verify-tor-usage.txt
-Title: Help users to verify they are using Tor
-Author: Steven J. Murdoch
-Created: 2008-01-25
-Status: Needs-Revision
-
-Overview:
-
- Websites for checking whether a user is accessing them via Tor are a
- very helpful aid to configuring web browsers correctly. Existing
- solutions have both false positives and false negatives when
- checking if Tor is being used. This proposal will discuss how to
- modify Tor so as to make testing more reliable.
-
-Motivation:
-
- Currently deployed websites for detecting Tor use work by comparing
- the client IP address for a request with a list of known Tor nodes.
- This approach is generally effective, but suffers from both false
- positives and false negatives.
-
- If a user has a Tor exit node installed, or just happens to have
- been allocated an IP address previously used by a Tor exit node, any
- web requests will be incorrectly flagged as coming from Tor. If any
- customer of an ISP which implements a transparent proxy runs an exit
- node, all other users of the ISP will be flagged as Tor users.
-
- Conversely, if the exit node chosen by a Tor user has not yet been
- recorded by the Tor checking website, requests will be incorrectly
- flagged as not coming via Tor.
-
- The only reliable way to tell whether Tor is being used or not is for
- the Tor client to flag this to the browser.
-
-Proposal:
-
- A DNS name should be registered and point to an IP address
- controlled by the Tor project and likely to remain so for the
- useful lifetime of a Tor client. A web server should be placed
- at this IP address.
-
- Tor should be modified to treat requests to port 80 at the
- specified DNS name or IP address specially. Instead of opening a
- circuit, it should respond to an HTTP request with a helpful web
- page:
-
- - If the request to open a connection was to the domain name, the web
- page should state that Tor is working properly.
- - If the request was to the IP address, the web page should state
- that there is a DNS-leakage vulnerability.
-
- If the request goes through to the real web server, the page
- should state that Tor has not been set up properly.
-
-Extensions:
-
- Identifying proxy server:
-
- If needed, other applications between the web browser and Tor (e.g.
- Polipo and Privoxy) could piggyback on the same mechanism to flag
- whether they are in use. All three possible web pages should include
- a machine-readable placeholder, into which another program could
- insert their own message.
-
- For example, the webpage returned by Tor to indicate a successful
- configuration could include the following HTML:
- <h2>Connection chain</h2>
- <ul>
- <li>Tor 0.1.2.14-alpha</li>
- <!-- Tor Connectivity Check: success -->
- </ul>
-
- When the proxy server observes this string, in response to a request
- for the Tor connectivity check web page, it would prepend its own
- message, resulting in the following being returned to the web
- browser:
- <h2>Connection chain</h2>
- <ul>
- <li>Tor 0.1.2.14-alpha</li>
- <li>Polipo version 1.0.4</li>
- <!-- Tor Connectivity Check: success -->
- </ul>
-
- Checking external connectivity:
-
- If Tor intercepts a request, and returns a response itself, the user
- will not actually confirm whether Tor is able to build a successful
- circuit. It may then be advantageous to include an image in the web
- page which is loaded from a different domain. If this is able to be
- loaded then the user will know that external connectivity through
- Tor works.
-
- Automatic Firefox Notification:
-
- All forms of the website should return valid XHTML and have a
- hidden link with an id attribute "TorCheckResult" and a target
- property that can be queried to determine the result. For example,
- a hidden link would convey success like this:
-
- <a id="TorCheckResult" target="success" href="/"></a>
-
- failure like this:
-
- <a id="TorCheckResult" target="failure" href="/"></a>
-
- and DNS leaks like this:
-
- <a id="TorCheckResult" target="dnsleak" href="/"></a>
-
- Firefox extensions such as Torbutton would then be able to
- issue an XMLHttpRequest for the page and query the result
- with resultXML.getElementById("TorCheckResult").target
- to automatically report the Tor status to the user when
- they first attempt to enable Tor activity, or whenever
- they request a check from the extension preferences window.
-
- If the check website is to be themed with heavy graphics and/or
- extensive documentation, the check result itself should be
- contained in a separate lightweight iframe that extensions can
- request via an alternate url.
-
-Security and resiliency implications:
-
- What attacks are possible?
-
- If the IP address used for this feature moves there will be two
- consequences:
- - A new website at this IP address will remain inaccessible over
- Tor
- - Tor users who are leaking DNS will be informed that Tor is not
- working, rather than that it is active but leaking DNS
- We should thus attempt to find an IP address which we reasonably
- believe can remain static.
-
-Open issues:
-
- If a Tor version which does not support this extra feature is used,
- the webpage returned will indicate that Tor is not being used. Can
- this be safely fixed?
-
-Related work:
-
- The proposed mechanism is very similar to config.privoxy.org. The
- most significant difference is that if the web browser is
- misconfigured, Tor will only get an IP address. Even in this case,
- Tor should be able to respond with a webpage to notify the user of how
- to fix the problem. This also implies that Tor must be told of the
- special IP address, and so that address must be effectively permanent.
diff --git a/doc/spec/proposals/132-browser-check-tor-service.txt b/doc/spec/proposals/132-browser-check-tor-service.txt
deleted file mode 100644
index 6132e5d06..000000000
--- a/doc/spec/proposals/132-browser-check-tor-service.txt
+++ /dev/null
@@ -1,145 +0,0 @@
-Filename: 132-browser-check-tor-service.txt
-Title: A Tor Web Service For Verifying Correct Browser Configuration
-Author: Robert Hogan
-Created: 2008-03-08
-Status: Draft
-
-Overview:
-
- Tor should operate a primitive web service on the loopback network device
- that tests the operation of the user's browser, privacy proxy and Tor client.
- The tests are performed by serving unique, randomly generated elements in
- image URLs embedded in static HTML. The images are only displayed if the DNS
- and HTTP requests for them are routed through Tor, otherwise the 'alt' text
- may be displayed. The proposal assumes that 'alt' text is not displayed on
- all browsers so suggests that text and links should accompany each image
- advising the user on next steps in case the test fails.
-
- The service is primarily for the use of controllers, since presumably users
- aren't going to want to edit text files and then type something exotic like
- 127.0.0.1:9999 into their address bar. In the main use case the controller
- will have configured the actual port for the webservice so will know where
- to direct the request. It would also be the responsibility of the controller
- to ensure the webservice is available, and tor is running, before allowing
- the user to access the page through their browser.
-
-Motivation:
-
- This is a complementary approach to proposal 131. It overcomes some of the
- limitations of the approach described in proposal 131: reliance
- on a permanent, real IP address and compatibility with older versions of
- Tor. Unlike 131, it is not as useful to Tor users who are not running a
- controller.
-
-Objective:
-
- Provide a reliable means of helping users to determine if their Tor
- installation, privacy proxy and browser are properly configured for
- anonymous browsing.
-
-Proposal:
-
- When configured to do so, Tor should run a basic web service available
- on a configured port on 127.0.0.1. The purpose of this web service is to
- serve a number of basic test images that will allow the user to determine
- if their browser is properly configured and that Tor is working normally.
-
- The service can consist of a single web page with two columns. The left
- column contains images, the right column contains advice on what the
- display/non-display of the column means.
-
- The rest of this proposal assumes that the service is running on port
- 9999. The port should be configurable, and configuring the port enables the
- service. The service must run on 127.0.0.1.
-
- In all the examples below [uniquesessionid] refers to a random, base64
- encoded string that is unique to the URL it is contained in. Tor only ever
- stores the most recently generated [uniquesessionid] for each URL, storing 3
- in total. Tor should generate a [uniquesessionid] for each of the test URLs
- below every time an HTTP GET is received at 127.0.0.1:9999 for index.htm.
-
- The most suitable image for each test case is an implementation decision.
- Tor will need to store and serve images for the first and second test
- images, and possibly the third (see 'Open Issues').
-
- 1. DNS Request Test Image
-
- This is an HTML element embedded in the page served by Tor at
- http://127.0.0.1:9999:
-
- <IMG src="http://[uniquesessionid]:9999/torlogo.jpg" alt="If you can see
- this text, your browser's DNS requests are not being routed through Tor."
- width="200" height="200" align="middle" border="2">
-
- If the browser's DNS request for [uniquesessionid] is routed through Tor,
- Tor will intercept the request and return 127.0.0.1 as the resolved IP
- address. This will shortly be followed by an HTTP request from the browser
- for http://127.0.0.1:9999/torlogo.jpg. This request should be served with
- the appropriate image.
-
- If the browser's DNS request for [uniquesessionid] is not routed through Tor
- the browser may display the 'alt' text specified in the html element. The
- HTML served by Tor should also contain text accompanying the image to advise
- users what it means if they do not see an image. It should also provide a
- link to click that provides information on how to remedy the problem. This
- behaviour also applies to the images described in 2. and 3. below, so should
- be assumed there as well.
-
-
- 2. Proxy Configuration Test Image
-
- This is an HTML element embedded in the page served by Tor at
- http://127.0.0.1:9999:
-
- <IMG src="http://torproject.org/[uniquesessionid].jpg" alt="If you can see
- this text, your browser is not configured to work with Tor." width="200"
- height="200" align="middle" border="2">
-
- If the HTTP request for the resource [uniquesessionid].jpg is received by
- Tor it will serve the appropriate image in response. It should serve this
- image itself, without attempting to retrieve anything from the Internet.
-
- If Tor can identify the name of the proxy application requesting the
- resource then it could store and serve an image identifying the proxy to the
- user.
-
- 3. Tor Connectivity Test Image
-
- This is an HTML element embedded in the page served by Tor at
- http://127.0.0.1:9999:
-
- <IMG src="http://torproject.org/[uniquesessionid]-torlogo.jpg" alt="If you
- can see this text, your Tor installation cannot connect to the Internet."
- width="200" height="200" align="middle" border="2">
-
- The referenced image should actually exist on the Tor project website. If
- Tor receives the request for the above resource it should remove the random
- base64 encoded digest from the request (i.e. [uniquesessionid]-) and attempt
- to retrieve the real image.
-
- Even on a fully operational Tor client this test may not always succeed. The
- user should be advised that one or more attempts to retrieve this image may
- be necessary to confirm a genuine problem.
-
-Open Issues:
-
- The final connectivity test relies on an externally maintained resource; if
- this resource becomes unavailable the connectivity test will always fail.
- Either the text accompanying the test should advise of this possibility or
- Tor clients should be advised of the location of the test resource in the
- main network directory listings.
-
- Any number of misconfigurations may make the web service unreachable; it is
- the responsibility of the user's controller to recognize these and assist
- the user in eliminating them. Tor can mitigate the specific
- misconfiguration of routing HTTP traffic to 127.0.0.1 to Tor itself by
- serving such requests through the SOCKS port as well as the configured web
- service port.
-
- Now Tor is inspecting the URLs requested on its SOCKS port and 'dropping'
- them. It already inspects for raw IP addresses (to warn of DNS leaks) but
- maybe the behaviour proposed here is qualitatively different. Maybe this is
- an unwelcome precedent that can be used to beat the project over the head in
- future. Or maybe it's not such a bad thing: Tor is merely attempting to make
- normally invalid resource requests valid for a given purpose.
-
diff --git a/doc/spec/proposals/133-unreachable-ors.txt b/doc/spec/proposals/133-unreachable-ors.txt
deleted file mode 100644
index a1c2dd854..000000000
--- a/doc/spec/proposals/133-unreachable-ors.txt
+++ /dev/null
@@ -1,128 +0,0 @@
-Filename: 133-unreachable-ors.txt
-Title: Incorporate Unreachable ORs into the Tor Network
-Author: Robert Hogan
-Created: 2008-03-08
-Status: Draft
-
-Overview:
-
- Propose a scheme for harnessing the bandwidth of ORs who cannot currently
- participate in the Tor network because they can only make outbound
- TCP connections.
-
-Motivation:
-
- Restrictive local and remote firewalls are preventing many willing
- candidates from becoming ORs on the Tor network. These
- ORs have a casual interest in joining the network but their operators are not
- sufficiently motivated or adept to complete the necessary router or firewall
- configuration. The Tor network is losing out on their bandwidth. At the
- moment we don't even know how many such 'candidate' ORs there are.
-
-
-Objective:
-
- 1. Establish how many ORs are unable to qualify for publication because
- they cannot establish that their ORPort is reachable.
-
- 2. Devise a method for making such ORs available to clients for circuit
- building without prejudicing their anonymity.
-
-Proposal:
-
- ORs whose ORPort reachability testing fails a specified number of
- consecutive times should:
- 1. Enlist themselves with the authorities setting a 'Fallback' flag. This
- flag indicates that the OR is up and running but cannot connect to
- itself.
- 2. Open an orconn with all ORs whose fingerprint begins with the same
- byte as their own. The management of this orconn will be transferred
- entirely to the OR at the other end.
- 3. The fallback OR should update its router status to contain the
- 'Running' flag if it has managed to open an orconn with 3/4 of the ORs
- with an FP beginning with the same byte as its own.
-
- Tor ORs who are contacted by fallback ORs requesting an orconn should:
- 1. Accept the orconn until they have reached a defined limit of orconn
- connections with fallback ORs.
- 2. Only accept such orconn requests from listed fallback ORs that
- have an FP beginning with the same byte as their own.
-
- Tor clients can include fallback ORs in the network by doing the
- following:
- 1. When building a circuit, observe the fingerprint of each node they
- wish to connect to.
- 2. When randomly selecting a node from the set of all eligible nodes,
- add all published, running fallback nodes to the set where the first
- byte of the fingerprint matches the previous node in the circuit.
-
-Anonymity Implications:
-
- At least some, and possibly all, nodes on the network will have a set
- of nodes that only they and a few others can build circuits on.
-
- 1. This means that fallback ORs might be unsuitable for use as middlemen
- nodes, because if the exit node is the attacker it knows that the
- number of nodes that could be the entry guard in the circuit is
- reduced to roughly 1/256th of the network, or worse 1/256th of all
- nodes listed as Guards. For the same reason, fallback nodes would
- appear to be unsuitable for two-hop circuits.
-
- 2. This is not a problem if fallback ORs are always exit nodes. If
- the fallback OR is an attacker it will not be able to reduce the
- set of possible nodes for the entry guard any further than a normal,
- published OR.
-
-Possible Attacks/Open Issues:
-
- 1. Gaming Node Selection
- Does running a fallback OR customized for a specific set of published ORs
- improve an attacker's chances of seeing traffic from that set of published
- ORs? Would such a strategy be any more effective than running published
- ORs with other 'attractive' properties?
-
- 2. DOS Attack
- An attacker could prevent all other legitimate fallback ORs with a
- given byte-1 in their FP from functioning by running 20 or 30 fallback ORs
- and monopolizing all available fallback slots on the published ORs.
- This same attacker would then be in a position to monopolize all the
- traffic of the fallback ORs on that byte-1 network segment. I'm not sure
- what this would allow such an attacker to do.
-
- 3. Circuit-Sniffing
- An observer watching exit traffic from a fallback server will know that the
- previous node in the circuit is one of a very small, identifiable
- subset of the total ORs in the network. To establish the full path of the
- circuit they would only have to watch the exit traffic from the fallback
- OR and all the traffic from the 20 or 30 ORs it is likely to be connected
- to. This means it is substantially easier to establish all members of a
- circuit which has a fallback OR as an exit (sniff and analyse 10-50 (i.e.
- 1/256 varying) + 1 ORs) rather than a normal published OR (sniff all 2560
- or so ORs on the network). The same mechanism that allows the client to
- expect a specific fallback OR to be available from a specific published OR
- allows an attacker to prepare his ground.
-
- Mitigant:
- In terms of the resources and access required to monitor 2000 to 3000
- nodes, the effort of the adversary is not significantly diminished when he
- is only interested in 20 or 30. It is hard to see how an adversary who can
- obtain access to a randomly selected portion of the Tor network would face
- any new or qualitatively different obstacles in attempting to access much
- of the rest of it.
-
-
-Implementation Issues:
-
- The number of ORs this proposal would add to the Tor network is not known.
- This is because there is no mechanism at present for recording unsuccessful
- attempts to become an OR. If the proposal is considered promising it may be
- worthwhile to issue an alpha series release where candidate ORs post a
- primitive fallback descriptor to the authority directories. This fallback
- descriptor would not contain any other flag that would make it eligible for
- selection by clients. It would act solely as a means of sizing the number of
- Tor instances that try and fail to become ORs.
-
- The upper limit on the number of orconns from fallback ORs a normal,
- published OR should be willing to accept is an open question. Is one
- hundred, mostly idle, such orconns too onerous?
-
diff --git a/doc/spec/proposals/134-robust-voting.txt b/doc/spec/proposals/134-robust-voting.txt
deleted file mode 100644
index c5dfb3b47..000000000
--- a/doc/spec/proposals/134-robust-voting.txt
+++ /dev/null
@@ -1,123 +0,0 @@
-Filename: 134-robust-voting.txt
-Title: More robust consensus voting with diverse authority sets
-Author: Peter Palfrader
-Created: 2008-04-01
-Status: Rejected
-
-History:
- 2009 May 27: Added note on rejecting this proposal -- Nick
-
-Overview:
-
- A means to arrive at a valid directory consensus even when voters
- disagree on who is an authority.
-
-
-Motivation:
-
- Right now there are about five authoritative directory servers in the
- Tor network, though this number is expected to rise to about 15 eventually.
-
- Adding a new authority requires synchronized action from all operators of
- directory authorities so that at any time during the update at least half of
- all authorities are running and agree on who is an authority. The latter
- requirement is there so that the authorities can arrive at a common
- consensus: Each authority builds the consensus based on the votes from
- all authorities it recognizes, and so a different set of recognized
- authorities will lead to a different consensus document.
-
-
-Objective:
-
- The modified voting procedure outlined in this proposal obsoletes the
- requirement for most authorities to exactly agree on the list of
- authorities.
-
-
-Proposal:
-
- The vote document each authority generates contains a list of
- authorities recognized by the generating authority. This will be
- a list of authority identity fingerprints.
-
- Authorities will accept votes from and serve/mirror votes also for
- authorities they do not recognize. (Votes contain the signing key, the
- authority key, and the certificate linking them so they can be
- verified even without knowing the authority beforehand.)
-
- Before building the consensus we will check which votes to use for
- building:
-
- 1) We build a directed graph of which authority/vote recognizes
- whom.
- 2) (Parts of the graph that aren't reachable, directly or
- indirectly, from any authorities we recognize can be discarded
- immediately.)
- 3) We find the largest fully connected subgraph.
- (Should there be more than one subgraph of the same size there
- needs to be some arbitrary ordering so we always pick the same.
- E.g. pick the one who has the smaller (XOR of all votes' digests)
- or something.)
- 4) If we are part of that subgraph, great. This is the list of
- votes we build our consensus with.
- 5) If we are not part of that subgraph, remove all the nodes that
- are part of it and go to 3.
-
- Using this procedure authorities that are updated to recognize a
- new authority will continue voting with the old group until a
- sufficient number has been updated to arrive at a consensus with
- the recently added authority.
-
- In fact, the old set of authorities will probably be voting among
- themselves until all but one has been updated to recognize the
- new authority. Then which set of votes is used for consensus
- building depends on which of the two equally large sets gets
- ordered before the other in step (3) above.
-
- It is necessary to continue with the process in (5) even if we
- are not in the largest subgraph. Otherwise one rogue authority
- could create a number of extra votes (by new authorities) so that
- everybody stops at 5 and no consensus is built, even though it would
- be trusted by all clients.
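The selection steps (1) to (5) above can be sketched as follows. This is only an illustration under stated assumptions: `graph` is a hypothetical mapping from each authority that sent a vote to the set of authorities that vote recognizes, "fully connected" is read as mutual recognition, and the tie-break (lexicographically smallest member list) stands in for the "XOR of all votes' digests" ordering the text leaves open. The brute-force clique search is exponential and only meant for a handful of authorities.

```python
from itertools import combinations


def reachable(graph, roots):
    """Return every voting authority reachable from roots via recognition edges."""
    seen, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node in seen or node not in graph:
            continue  # already visited, or no vote received from this node
        seen.add(node)
        stack.extend(graph[node])
    return seen


def largest_clique(graph, nodes):
    """Brute-force the largest set whose members all recognize each other.

    combinations() over a sorted list yields groups in lexicographic order,
    so equally large cliques are always resolved the same way.
    """
    ordered = sorted(nodes)
    for size in range(len(ordered), 0, -1):
        for group in combinations(ordered, size):
            if all(b in graph[a] for a in group for b in group if a != b):
                return set(group)
    return set()


def votes_to_use(graph, me):
    """Pick the set of votes that authority `me` builds its consensus with."""
    candidates = reachable(graph, graph[me] | {me})  # steps 1 and 2
    while candidates:
        clique = largest_clique(graph, candidates)   # step 3
        if me in clique:                             # step 4
            return clique
        candidates -= clique                         # step 5, then back to 3
    return set()
```

With three old authorities A, B, C, a new authority D, and only A updated to recognize D, both A and the not-yet-updated B and C keep voting in the old group, while D ends up alone until more operators update.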
-
-
-Anonymity Implications:
-
- The author does not believe this proposal to have anonymity
- implications.
-
-
-Possible Attacks/Open Issues/Some thinking required:
-
- Q: Can a number (less than or exactly half) of the authorities cause an honest
- authority to vote for "their" consensus rather than the one that would
- result were all authorities taken into account?
-
-
- Q: Can a set of votes from external authorities, i.e. of whom we trust either
- none or at least not all, cause us to change the set of consensus makers we
- pick?
- A: Yes, if other authorities decide they would rather build a consensus with
- them, then they'll be thrown out in step 3. But that's ok since those other
- authorities will never vote with us anyway.
- If we trust none of them then we throw them out even sooner, so no harm done.
-
- Q: Can this ever force us to build a consensus with authorities we do not
- recognize?
- A: No, we can never build a fully connected set with them in step 3.
-
-------------------------------
-
-I'm rejecting this proposal as insecure.
-
-Suppose that we have a clique of size N, and M hostile members in the
-clique. If these hostile members stop declaring trust for up to M-1
-good members of the clique, the clique with the hostile members
-in it will be larger than the one without them.
-
-The M hostile members will constitute a majority of this new clique
-when M > (N-(M-1)) / 2, or when M > (N + 1) / 3. This breaks our
-requirement that an adversary must compromise a majority of authorities
-in order to control the consensus.
-
--- Nick
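The inequality in the rejection note can be checked numerically. The sketch below only restates the arithmetic from the note (M > (N - (M - 1)) / 2 is equivalent to 3M > N + 1); the function names are illustrative.

```python
def smallest_hostile_majority(n):
    """Smallest M satisfying M > (N - (M - 1)) / 2, i.e. 3M > N + 1."""
    m = 1
    while 3 * m <= n + 1:
        m += 1
    return m


def clique_sizes(n, m):
    """Return (size of the clique kept by the hostile members,
    size of the honest-only clique) when M hostile members drop
    trust for M - 1 good members of an N-member clique."""
    return n - (m - 1), n - m
```

For N = 15, for example, this gives M = 6: the hostile clique has 10 members against 9 in the honest-only one, and the 6 hostile members are a majority of those 10 although they are well under half of all 15 authorities.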
diff --git a/doc/spec/proposals/135-private-tor-networks.txt b/doc/spec/proposals/135-private-tor-networks.txt
deleted file mode 100644
index 19ef68b7b..000000000
--- a/doc/spec/proposals/135-private-tor-networks.txt
+++ /dev/null
@@ -1,281 +0,0 @@
-Filename: 135-private-tor-networks.txt
-Title: Simplify Configuration of Private Tor Networks
-Author: Karsten Loesing
-Created: 29-Apr-2008
-Status: Closed
-Target: 0.2.1.x
-Implemented-In: 0.2.1.2-alpha
-
-Change history:
-
- 29-Apr-2008 Initial proposal for or-dev
- 19-May-2008 Included changes based on comments by Nick to or-dev and
- added a section for test cases.
- 18-Jun-2008 Changed testing-network-only configuration option names.
-
-Overview:
-
- Configuring a private Tor network has become a time-consuming and
- error-prone task with the introduction of the v3 directory protocol. In
- addition to that, operators of private Tor networks need to set an
- increasing number of non-trivial configuration options, and it is hard
- to keep FAQ entries describing this task up-to-date. In this proposal we
- (1) suggest (optionally) accelerating the timing of the v3 directory voting
- process and (2) introduce an umbrella config option specifically aimed at
- creating private Tor networks.
-
-Design:
-
- 1. Accelerate Timing of v3 Directory Voting Process
-
- Tor has reasonable defaults for setting up a large, Internet-scale
- network with comparably high latencies and possibly wrong server clocks.
- However, those defaults are bad when it comes to quickly setting up a
- private Tor network for testing, either on a single node or LAN (things
- might be different when creating a test network on PlanetLab or
- something). Some time constraints should be made configurable for private
- networks. The general idea is to accelerate everything that has to do
- with propagation of directory information, but nothing else, so that a
- private network is available as soon as possible. (As a possible
- safeguard, changing these configuration values could be made dependent on
- the umbrella configuration option introduced in 2.)
-
- 1.1. Initial Voting Schedule
-
- When a v3 directory does not know any consensus, it assumes an initial,
- hard-coded VotingInterval of 30 minutes, VoteDelay of 5 minutes, and
- DistDelay of 5 minutes. This is important for multiple, simultaneously
- restarted directory authorities to meet at a common time and create an
- initial consensus. Unfortunately, this means that it may take up to half
- an hour (or even more) for a private Tor network to bootstrap.
-
- We propose to make these three time constants configurable (note that
- V3AuthVotingInterval, V3AuthVoteDelay, and V3AuthDistDelay do not have an
- effect on the _initial_ voting schedule, but only on the schedule that a
- directory authority votes for). This can be achieved by introducing three
- new configuration options: TestingV3AuthInitialVotingInterval,
- TestingV3AuthInitialVoteDelay, and TestingV3AuthInitialDistDelay.
-
- As a first safeguard, Tor should only accept configuration values for
- TestingV3AuthInitialVotingInterval that divide evenly into the default
- value of 30 minutes. The effect is that even if people misconfigured
- their directory authorities, they would meet at the default values at the
- latest. The second safeguard is to allow configuration only when the
- umbrella configuration option TestingTorNetwork is set.
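The two safeguards for TestingV3AuthInitialVotingInterval could be checked along these lines. This is a sketch, not Tor's validation code: the function name and the returned messages are hypothetical, and the interval is assumed to be given in seconds.

```python
DEFAULT_INITIAL_VOTING_INTERVAL = 30 * 60  # hard-coded default, in seconds


def check_initial_voting_interval(seconds, testing_tor_network):
    """Return None if the value is acceptable, else a reason string."""
    # Second safeguard: only configurable under the umbrella option.
    if not testing_tor_network:
        return "may only be changed in testing Tor networks"
    if seconds <= 0:
        return "must be positive"
    # First safeguard: the value must divide evenly into 30 minutes, so
    # misconfigured authorities still meet at the default boundaries.
    if DEFAULT_INITIAL_VOTING_INTERVAL % seconds != 0:
        return "must divide evenly into 30 minutes"
    return None
```

An interval of 5 minutes passes because 1800 is a multiple of 300, whereas 7 minutes is refused because it leaves a remainder.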
-
- 1.2. Immediately Provide Reachability Information (Running flag)
-
- The default behavior of a directory authority is to provide the Running
- flag only after the authority has been available for at least 30 minutes. The
- rationale is that before that time, an authority simply cannot deliver
- useful information about other running nodes. But for private Tor
- networks this may be different. This is currently implemented in the code
- as:
-
- /** If we've been around for less than this amount of time, our
- * reachability information is not accurate. */
- #define DIRSERV_TIME_TO_GET_REACHABILITY_INFO (30*60)
-
- There should be another configuration option
- TestingAuthDirTimeToLearnReachability with a default value of 30 minutes
- that can be changed when running testing Tor networks, e.g. to 0 minutes.
- The configuration value would simply replace the quoted constant. Again,
- changing this option could be safeguarded by requiring the umbrella
- configuration option TestingTorNetwork to be set.
-
- 1.3. Reduce Estimated Descriptor Propagation Time
-
- Tor currently assumes that it takes up to 10 minutes until router
- descriptors are propagated from the authorities to directory caches.
- This is not very useful for private Tor networks, and we want to be able
- to reduce this time, so that clients can download router descriptors in a
- timely manner.
-
- /** Clients don't download any descriptor this recent, since it will
- * probably not have propagated to enough caches. */
- #define ESTIMATED_PROPAGATION_TIME (10*60)
-
- We suggest introducing a new config option
- TestingEstimatedDescriptorPropagationTime which defaults to 10 minutes,
- but that can be set to any lower non-negative value, e.g. 0 minutes. The
- same safeguards as in 1.2 could be used here, too.
-
- 2. Umbrella Option for Setting Up Private Tor Networks
-
- Setting up a private Tor network requires a number of specific settings
- that are not required or useful when running Tor in the public Tor
- network. Instead of writing down these options in a FAQ entry, there
- should be a single configuration option, e.g. TestingTorNetwork, that
- changes all required settings at once. Newer Tor versions would keep the
- set of configuration options up-to-date. It should still remain possible
- to manually override the settings that the umbrella configuration option
- affects.
-
- The following configuration options are set by TestingTorNetwork:
-
- - ServerDNSAllowBrokenResolvConf 1
- Ignore the situation that private relays are not aware of any name
- servers.
-
- - DirAllowPrivateAddresses 1
- Allow router descriptors containing private IP addresses.
-
- - EnforceDistinctSubnets 0
- Permit building circuits with relays in the same subnet.
-
- - AssumeReachable 1
- Omit self-testing for reachability.
-
- - AuthDirMaxServersPerAddr 0
- - AuthDirMaxServersPerAuthAddr 0
- Permit an unlimited number of nodes on the same IP address.
-
- - ClientDNSRejectInternalAddresses 0
- Believe in DNS responses resolving to private IP addresses.
-
- - ExitPolicyRejectPrivate 0
- Allow exiting to private IP addresses. (This one is a matter of
- taste---it might be dangerous to make this a default in a private
- network, although people setting up private Tor networks should know
- what they are doing.)
-
- - V3AuthVotingInterval 5 minutes
- - V3AuthVoteDelay 20 seconds
- - V3AuthDistDelay 20 seconds
- Accelerate voting schedule after first consensus has been reached.
-
- - TestingV3AuthInitialVotingInterval 5 minutes
- - TestingV3AuthInitialVoteDelay 20 seconds
- - TestingV3AuthInitialDistDelay 20 seconds
- Accelerate initial voting schedule until first consensus is reached.
-
- - TestingAuthDirTimeToLearnReachability 0 minutes
- Consider routers as Running from the start of running an authority.
-
- - TestingEstimatedDescriptorPropagationTime 0 minutes
- Clients try downloading router descriptors from directory caches,
- even when they are not 10 minutes old.
-
- In addition to changing the defaults for these configuration options,
- TestingTorNetwork can only be set when a user has manually configured
- DirServer lines.
-
-Test:
-
- The implementation of this proposal must pass the following tests:
-
- 1. Set TestingTorNetwork and see if dependent configuration options are
- correctly changed.
-
- tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
- telnet 127.0.0.1 9051
- AUTHENTICATE
- GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
- 250-TestingTorNetwork=1
- 250 TestingAuthDirTimeToLearnReachability=0
- QUIT
-
- 2. Set TestingTorNetwork and a dependent configuration value to see if
- the provided value is used for the dependent option.
-
- tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
- TestingAuthDirTimeToLearnReachability 5
- telnet 127.0.0.1 9051
- AUTHENTICATE
- GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
- 250-TestingTorNetwork=1
- 250 TestingAuthDirTimeToLearnReachability=5
- QUIT
-
- 3. Start with TestingTorNetwork set and change a dependent configuration
- option later on.
-
- tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
- telnet 127.0.0.1 9051
- AUTHENTICATE
- SETCONF TestingAuthDirTimeToLearnReachability=5
- GETCONF TestingAuthDirTimeToLearnReachability
- 250 TestingAuthDirTimeToLearnReachability=5
- QUIT
-
- 4. Start with TestingTorNetwork set and a dependent configuration value,
- and reset that dependent configuration value. The result should be
- the testing-network specific default value.
-
- tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
- TestingAuthDirTimeToLearnReachability 5
- telnet 127.0.0.1 9051
- AUTHENTICATE
- GETCONF TestingAuthDirTimeToLearnReachability
- 250 TestingAuthDirTimeToLearnReachability=5
- RESETCONF TestingAuthDirTimeToLearnReachability
- GETCONF TestingAuthDirTimeToLearnReachability
- 250 TestingAuthDirTimeToLearnReachability=0
- QUIT
-
- 5. Leave TestingTorNetwork unset and check if dependent configuration
- options are left unchanged.
-
- tor DataDirectory . ControlPort 9051 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
- telnet 127.0.0.1 9051
- AUTHENTICATE
- GETCONF TestingTorNetwork TestingAuthDirTimeToLearnReachability
- 250-TestingTorNetwork=0
- 250 TestingAuthDirTimeToLearnReachability=1800
- QUIT
-
- 6. Leave TestingTorNetwork unset, but set dependent configuration option
- which should fail.
-
- tor DataDirectory . ControlPort 9051 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000" \
- TestingAuthDirTimeToLearnReachability 0
- [warn] Failed to parse/validate config:
- TestingAuthDirTimeToLearnReachability may only be changed in testing
- Tor networks!
-
- 7. Start with TestingTorNetwork unset and change dependent configuration
- option later on which should fail.
-
- tor DataDirectory . ControlPort 9051 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
- telnet 127.0.0.1 9051
- AUTHENTICATE
- SETCONF TestingAuthDirTimeToLearnReachability=0
- 513 Unacceptable option value: TestingAuthDirTimeToLearnReachability
- may only be changed in testing Tor networks!
-
- 8. Start with TestingTorNetwork unset and set it later on which should
- fail.
-
- tor DataDirectory . ControlPort 9051 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
- telnet 127.0.0.1 9051
- AUTHENTICATE
- SETCONF TestingTorNetwork=1
- 553 Transition not allowed: While Tor is running, changing
- TestingTorNetwork is not allowed.
-
- 9. Start with TestingTorNetwork set and unset it later on which should
- fail.
-
- tor DataDirectory . ControlPort 9051 TestingTorNetwork 1 DirServer \
- "mydir 127.0.0.1:1234 0000000000000000000000000000000000000000"
- telnet 127.0.0.1 9051
- AUTHENTICATE
- RESETCONF TestingTorNetwork
- 513 Unacceptable option value: TestingV3AuthInitialVotingInterval may
- only be changed in testing Tor networks!
-
- 10. Set TestingTorNetwork, but do not provide an alternate DirServer
- which should fail.
-
- tor DataDirectory . ControlPort 9051 TestingTorNetwork 1
- [warn] Failed to parse/validate config: TestingTorNetwork may only be
- configured in combination with a non-default set of DirServers.
-
diff --git a/doc/spec/proposals/136-legacy-keys.txt b/doc/spec/proposals/136-legacy-keys.txt
deleted file mode 100644
index f2b1b5c7f..000000000
--- a/doc/spec/proposals/136-legacy-keys.txt
+++ /dev/null
@@ -1,100 +0,0 @@
-Filename: 136-legacy-keys.txt
-Title: Mass authority migration with legacy keys
-Author: Nick Mathewson
-Created: 13-May-2008
-Status: Closed
-Implemented-In: 0.2.0.x
-
-Overview:
-
- This document describes a mechanism to change the keys of more than
- half of the directory servers at once without breaking old clients
- and caches immediately.
-
-Motivation:
-
- If a single authority's identity key is believed to be compromised,
- the solution is obvious: remove that authority from the list,
- generate a new certificate, and treat the new cert as belonging to a
- new authority. This approach works fine so long as less than 1/2 of
- the authority identity keys are bad.
-
- Unfortunately, the mass-compromise case is possible if there is a
- sufficiently bad bug in Tor or in any OS used by a majority of v3
- authorities. Let's be prepared for it!
-
- We could simply stop using the old keys and start using new ones,
- and tell all clients running insecure versions to upgrade.
- Unfortunately, this breaks our caching system pretty badly, since
- caches won't cache a consensus that they don't believe in. It would
- be nice to have everybody become secure the moment they upgrade to a
- version listing the new authority keys, _without_ breaking upgraded
- clients until the caches upgrade.
-
- So, let's come up with a way to provide a time window where the
- consensuses are signed with the new keys and with the old.
-
-Design:
-
- We allow directory authorities to list a single "legacy key"
- fingerprint in their votes. Each authority may add a single legacy
- key. The format for this line is:
-
- legacy-dir-key FINGERPRINT
-
- We describe a new consensus method for generating directory
- consensuses. This method is consensus method "3".
-
- When the authorities decide to use method "3" (as described in 3.4.1
- of dir-spec.txt), for every included vote with a legacy-dir-key line,
- the consensus includes an extra dir-source line. The fingerprint in
- this extra line is as in the legacy-dir-key line. The ports and
- addresses are in the dir-source line. The nickname is as in the
- dir-source line, with the string "-legacy" appended.
-
- [We need to include this new dir-source line because the code
- won't accept or preserve signatures from authorities not listed
- as contributing to the consensus.]
-
- Authorities using legacy dir keys include two signatures on their
- consensuses: one generated with a signing key signed with their real
- signing key, and another generated with a signing key signed with
- another signing key attested to by their identity key. These
- signing keys MUST be different. Authorities MUST serve both
- certificates if asked.
-
-Process:
-
- In the event of a mass key failure, we'll use the following
- (ugly) procedure:
- - All affected authorities generate new certificates and identity
- keys, and circulate their new dirserver lines. They copy their old
- certificates and old broken keys, but put them in new "legacy
- key files".
- - At the earliest time that can be arranged, the authorities
- replace their signing keys, identity keys, and certificates
- with the new uncompromised versions, and update to the new list
- of dirserver lines.
- - They add a "V3DirAdvertiseLegacyKey 1" option to their torrc.
- - Now, new consensuses will be generated using the new keys, but
- the results will also be signed with the old keys.
- - Clients and caches are told they need to upgrade, and given a
- time window to do so.
- - At the end of the time window, authorities remove the
- V3DirAdvertiseLegacyKey option.
-
-Notes:
-
- It might be good to get caches to cache consensuses that they do not
- believe in. I'm not sure of the best way to do this.
-
- It's a superficially neat idea to have new signing keys and have
- them signed by the new and by the old authority identity keys. This
- breaks some code, though, and doesn't actually gain us anything,
- since we'd still need to include each signature twice.
-
- It's also a superficially neat idea, if identity keys and signing
- keys are compromised, to at least replace all the signing keys.
- I don't think this gains us anything either, though.
-
-
diff --git a/doc/spec/proposals/137-bootstrap-phases.txt b/doc/spec/proposals/137-bootstrap-phases.txt
deleted file mode 100644
index ebe044c70..000000000
--- a/doc/spec/proposals/137-bootstrap-phases.txt
+++ /dev/null
@@ -1,235 +0,0 @@
-Filename: 137-bootstrap-phases.txt
-Title: Keep controllers informed as Tor bootstraps
-Author: Roger Dingledine
-Created: 07-Jun-2008
-Status: Closed
-Implemented-In: 0.2.1.x
-
-1. Overview.
-
- Tor has many steps to bootstrapping directory information and
- initial circuits, but from the controller's perspective we just have
- a coarse-grained "CIRCUIT_ESTABLISHED" status event. Tor users with
- slow connections or with connectivity problems can wait a long time
- staring at the yellow onion, wondering if it will ever change color.
-
- This proposal describes a new client status event so Tor can give
- more details to the controller. Section 2 describes the changes to the
- controller protocol; Section 3 describes Tor's internal bootstrapping
- phases when everything is going correctly; Section 4 describes when
- Tor detects a problem and issues a bootstrap warning; Section 5 covers
- suggestions for how controllers should display the results.
-
-2. Controller event syntax.
-
- The generic status event is:
-
- "650" SP StatusType SP StatusSeverity SP StatusAction
- [SP StatusArguments] CRLF
-
- So in this case we send
- 650 STATUS_CLIENT NOTICE/WARN BOOTSTRAP \
- PROGRESS=num TAG=Keyword SUMMARY=String \
- [WARNING=String REASON=Keyword COUNT=num RECOMMENDATION=Keyword]
-
- The arguments MAY appear in any order. Controllers MUST accept unrecognized
- arguments.
-
- "Progress" gives a number between 0 and 100 for how far through
- the bootstrapping process we are. "Summary" is a string that can be
- displayed to the user to describe the *next* task that Tor will tackle,
- i.e., the task it is working on after sending the status event. "Tag"
- is an optional string that controllers can use to recognize bootstrap
- phases from Section 3, if they want to do something smarter than just
- blindly displaying the summary string.
-
- The severity describes whether this is a normal bootstrap phase
- (severity notice) or an indication of a bootstrapping problem
- (severity warn). If severity warn, it should also include a "warning"
- argument string with any hints Tor has to offer about why it's having
- troubles bootstrapping, a "reason" string that lists one of the reasons
- allowed in the ORConn event, a "count" number that tells how many
- bootstrap problems there have been so far at this phase, and a
- "recommendation" keyword to indicate how the controller ought to react.
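A controller might parse the event syntax above roughly as follows. This is a minimal sketch: the function name and the returned dictionary layout are assumptions, and quoting is handled with Python's `shlex`, which is assumed to be close enough to the control protocol's quoted strings for illustration. Arguments are accepted in any order and unrecognized ones are kept rather than rejected, as the text requires.

```python
import shlex


def parse_bootstrap_event(line):
    """Parse one BOOTSTRAP status event line into a dict, or return None."""
    parts = shlex.split(line)
    if len(parts) < 4 or parts[0] != "650" or parts[3] != "BOOTSTRAP":
        return None
    event = {"type": parts[1], "severity": parts[2]}
    for token in parts[4:]:
        key, sep, value = token.partition("=")
        if sep:  # skip anything that is not KEY=value
            event[key.lower()] = value
    if "progress" in event:
        event["progress"] = int(event["progress"])
    return event
```

A caller would typically branch on `event["severity"]` and fall back to displaying `event["summary"]` when it does not recognize `event["tag"]`.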
-
-3. The bootstrap phases.
-
- This section describes the various phases currently reported by
- Tor. Controllers should not assume that the percentages and tags listed
- here will continue to match up, or even that the tags will stay in
- the same order. Some phases might also be skipped (not reported) if the
- associated bootstrap step is already complete, or if the phase no longer
- is necessary. Only "starting" and "done" are guaranteed to exist in all
- future versions.
-
- Current Tor versions enter these phases in order, monotonically;
- future Tors MAY revisit earlier stages.
-
- Phase 0:
- tag=starting summary="starting"
-
- Tor starts out in this phase.
-
- Phase 5:
- tag=conn_dir summary="Connecting to directory mirror"
-
- Tor sends this event as soon as Tor has chosen a directory mirror ---
- one of the authorities if bootstrapping for the first time or after
- a long downtime, or one of the relays listed in its cached directory
- information otherwise.
-
- Tor will stay at this phase until it has successfully established
- a TCP connection with some directory mirror. Problems in this phase
- generally happen because Tor doesn't have a network connection, or
- because the local firewall is dropping SYN packets.
-
- Phase 10:
- tag=handshake_dir summary="Finishing handshake with directory mirror"
-
- This event occurs when Tor establishes a TCP connection with a relay used
- as a directory mirror (or its https proxy if it's using one). Tor remains
- in this phase until the TLS handshake with the relay is finished.
-
- Problems in this phase generally happen because Tor's firewall is
- doing more sophisticated MITM attacks on it, or doing packet-level
- keyword recognition of Tor's handshake.
-
- Phase 15:
- tag=onehop_create summary="Establishing one-hop circuit for dir info"
-
- Once TLS is finished with a relay, Tor will send a CREATE_FAST cell
- to establish a one-hop circuit for retrieving directory information.
- It will remain in this phase until it receives the CREATED_FAST cell
- back, indicating that the circuit is ready.
-
- Phase 20:
- tag=requesting_status summary="Asking for networkstatus consensus"
-
- Once we've finished our one-hop circuit, we will start a new stream
- for fetching the networkstatus consensus. We'll stay in this phase
- until we get the 'connected' relay cell back, indicating that we've
- established a directory connection.
-
- Phase 25:
- tag=loading_status summary="Loading networkstatus consensus"
-
- Once we've established a directory connection, we will start fetching
- the networkstatus consensus document. This could take a while; this
- phase is a good opportunity for using the "progress" keyword to indicate
- partial progress.
-
- This phase could stall if the directory mirror we picked doesn't
- have a copy of the networkstatus consensus so we have to ask another,
- or it does give us a copy but we don't find it valid.
-
- Phase 40:
- tag=loading_keys summary="Loading authority key certs"
-
- Sometimes when we've finished loading the networkstatus consensus,
- we find that we don't have all the authority key certificates for the
- keys that signed the consensus. At that point we put the consensus we
- fetched on hold and fetch the keys so we can verify the signatures.
-
- Phase 45:
- tag=requesting_descriptors summary="Asking for relay descriptors"
-
- Once we have a valid networkstatus consensus and we've checked all
- its signatures, we start asking for relay descriptors. We stay in this
- phase until we have received a 'connected' relay cell in response to
- a request for descriptors.
-
- Phase 50:
- tag=loading_descriptors summary="Loading relay descriptors"
-
- We will ask for relay descriptors from several different locations,
- so this step will probably make up the bulk of the bootstrapping,
- especially for users with slow connections. We stay in this phase until
- we have descriptors for at least 1/4 of the usable relays listed in
- the networkstatus consensus. This phase is also a good opportunity to
- use the "progress" keyword to indicate partial steps.
-
- Phase 80:
- tag=conn_or summary="Connecting to entry guard"
-
- Once we have a valid consensus and enough relay descriptors, we choose
- some entry guards and start trying to build some circuits. This step
- is similar to the "conn_dir" phase above; the only difference is
- the context.
-
- If a Tor starts with enough recent cached directory information,
- its first bootstrap status event will be for the conn_or phase.
-
- Phase 85:
- tag=handshake_or summary="Finishing handshake with entry guard"
-
- This phase is similar to the "handshake_dir" phase, but it gets reached
- if we finish a TCP connection to a Tor relay and we have already reached
- the "conn_or" phase. We'll stay in this phase until we complete a TLS
- handshake with a Tor relay.
-
- Phase 90:
- tag=circuit_create summary="Establishing circuits"
-
- Once we've finished our TLS handshake with an entry guard, we will
- set about trying to make some 3-hop circuits in case we need them soon.
-
- Phase 100:
- tag=done summary="Done"
-
- A full 3-hop circuit has been established. Tor is ready to handle
- application connections now.
-
-4. Bootstrap problem events.
-
- When an OR Conn fails, we send a "bootstrap problem" status event, which
- is like the standard bootstrap status event except with severity warn.
- We include the same progress, tag, and summary values as we would for
- a normal bootstrap event, but we also include "warning", "reason",
- "count", and "recommendation" key/value combos.
-
- The "reason" values are long-term-stable controller-facing tags to
- identify particular issues in a bootstrapping step. The warning
- strings, on the other hand, are human-readable. Controllers SHOULD
- NOT rely on the format of any warning string. Currently the possible
- values for "recommendation" are either "ignore" or "warn" -- if ignore,
- the controller can accumulate the string in a pile of problems to show
- the user if the user asks; if warn, the controller should alert the
- user that Tor is pretty sure there's a bootstrapping problem.
-
- Currently Tor uses recommendation=ignore for the first nine bootstrap
- problem reports for a given phase, and then uses recommendation=warn
- for subsequent problems at that phase. Hopefully this is a good
- balance between tolerating occasional errors and reporting serious
- problems quickly.
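The ignore/warn policy described above amounts to a per-phase counter. The sketch below illustrates that rule only; the class and constant names are hypothetical and do not come from Tor's source.

```python
from collections import Counter

IGNORED_REPORTS_PER_PHASE = 9  # "the first nine" reports in the text above


class BootstrapProblemTracker:
    """Counts bootstrap problems per phase tag and picks a recommendation."""

    def __init__(self):
        self.counts = Counter()

    def report(self, tag):
        """Record one problem at phase `tag`; return (count, recommendation)."""
        self.counts[tag] += 1
        count = self.counts[tag]
        if count <= IGNORED_REPORTS_PER_PHASE:
            return count, "ignore"
        return count, "warn"
```

Counting per tag means that a run of failures in one phase does not make the first failure in a later phase escalate immediately.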
-
-5. Suggested controller behavior.
-
- Controllers should start out with a yellow onion or the equivalent
- ("starting"), and then watch for either a bootstrap status event
- (meaning the Tor they're using is sufficiently new to produce them,
- and they should load up the progress bar or whatever they plan to use
- to indicate progress) or a circuit_established status event (meaning
- bootstrapping is finished).
-
- In addition to a progress bar in the display, controllers should also
- have some way to indicate progress even when no controller window is
- open. For example, folks using Tor Browser Bundle in hostile Internet
- cafes don't want a big splashy screen up. One way to let the user keep
- informed of progress in a more subtle way is to change the task tray
- icon and/or tooltip string as more bootstrap events come in.
-
- Controllers should also have some mechanism to alert their user when
- bootstrapping problems are reported. Perhaps we should gather a set of
- help texts and the controller can send the user to the right anchor in a
- "bootstrapping problems" page in the controller's help subsystem?
-
-6. Getting up to speed when the controller connects.
-
- There's a new "GETINFO /status/bootstrap-phase" option, which returns
- the most recent bootstrap phase status event sent. Specifically,
- it returns a string starting with either "NOTICE BOOTSTRAP ..." or
- "WARN BOOTSTRAP ...".
-
- Controllers should use this getinfo when they connect or attach to
- Tor to learn its current state.
-
diff --git a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt b/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
deleted file mode 100644
index 776911b5c..000000000
--- a/doc/spec/proposals/138-remove-down-routers-from-consensus.txt
+++ /dev/null
@@ -1,49 +0,0 @@
-Filename: 138-remove-down-routers-from-consensus.txt
-Title: Remove routers that are not Running from consensus documents
-Author: Peter Palfrader
-Created: 11-Jun-2008
-Status: Closed
-Implemented-In: 0.2.1.2-alpha
-
-1. Overview.
-
- Tor directory authorities hourly vote and agree on a consensus document
- which lists all the routers on the network together with some of their
- basic properties, like if a router is an exit node, whether it is
- stable or whether it is a version 2 directory mirror.
-
- One of the properties given with each router is the 'Running' flag.
- Clients do not use routers that are not listed as running.
-
- This proposal suggests that routers without the Running flag are not
- listed at all.
-
-2. Current status
-
- At a typical bootstrap a client downloads a 140KB consensus, about
- 10KB of certificates to verify that consensus, and about 1.6MB of
- server descriptors, about 1/4 of which it requires before it will
- start building circuits.
-
- Another proposal deals with how to get that huge 1.6MB fraction to
- effectively zero (by downloading only individual descriptors, on
- demand). Should that get successfully implemented that will leave the
- 140KB compressed consensus as a large fraction of what a client needs
- to get in order to work.
-
- About one third of the routers listed in a consensus are not running
- and will therefore never be used by clients who use this consensus.
- Not listing those routers will save about 30% to 40% in size.
-
-3. Proposed change
-
- Authority directory servers produce vote documents that include all
- the servers they know about, running or not, like they currently
- do. In addition these vote documents also state that the authority
- supports a new consensus forming method (method number 4).
-
- If more than two thirds of votes that an authority has received claim
- they support method 4 then this new method will be used: The
- consensus document is formed like before but a new last step removes
- all routers from the listing that are not marked as Running.
-
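The method switch and the final filtering step described in section 3 of proposal 138 can be sketched as follows. This is only an illustration under stated assumptions: the function names, the representation of a vote as a set of supported method numbers, the (nickname, flags) router tuples, and the fallback method number 3 are all hypothetical and not taken from the Tor source.

```python
def choose_consensus_method(votes_supported_methods, new_method=4, old_method=3):
    """Pick the consensus method given one set of supported methods per vote.

    old_method=3 is an assumed fallback, used here only for illustration.
    """
    total = len(votes_supported_methods)
    supporting = sum(1 for methods in votes_supported_methods
                     if new_method in methods)
    # "More than two thirds" is a strict inequality; integer arithmetic
    # avoids floating-point rounding at the boundary.
    return new_method if 3 * supporting > 2 * total else old_method


def finalize_router_list(routers, method):
    """Apply the extra last step: drop routers not marked Running.

    routers is a list of (nickname, flags) pairs, flags being a set of strings.
    """
    if method >= 4:
        return [(name, flags) for name, flags in routers if "Running" in flags]
    return list(routers)
```

With nine votes, seven supporters are enough to switch methods, while six (exactly two thirds) are not.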
diff --git a/doc/spec/proposals/139-conditional-consensus-download.txt b/doc/spec/proposals/139-conditional-consensus-download.txt
deleted file mode 100644
index 941f5ad6b..000000000
--- a/doc/spec/proposals/139-conditional-consensus-download.txt
+++ /dev/null
@@ -1,94 +0,0 @@
-Filename: 139-conditional-consensus-download.txt
-Title: Download consensus documents only when it will be trusted
-Author: Peter Palfrader
-Created: 2008-04-13
-Status: Closed
-Implemented-In: 0.2.1.x
-
-Overview:
-
- Servers provide a consensus document to a client only when it is known
- that the client will trust it.
-
-Motivation:
-
- When clients[1] want a new network status consensus they request it
- from a Tor server using the URL path /tor/status-vote/current/consensus.
- Then after downloading the client checks if this consensus can be
- trusted. Whether the client trusts the consensus depends on the
- authorities that the client trusts and how many of those
- authorities signed the consensus document.
-
- If the client cannot trust the consensus document it is disregarded
- and a new download is tried at a later time. Several hundred
- kilobytes of server bandwidth were wasted by this single client's
- request.
-
- With hundreds of thousands of clients this will have undesirable
- consequences when the list of authorities has changed so much that a
- large number of established clients no longer can trust any consensus
- document formed.
-
-Objective:
-
- The objective of this proposal is to make clients not download
- consensuses they will not trust.
-
-Proposal:
-
- The list of authorities that are trusted by a client is encoded in
- the URL they send to the directory server when requesting a consensus
- document.
-
- The directory server then only sends back the consensus when more than
- half of the authorities listed in the request have signed the
- consensus. If it is known that the consensus will not be trusted
- a 404 error code is sent back to the client.
-
- This proposal does not require directory caches to keep more than one
- consensus document. This proposal also does not require authorities
- to verify the signature on the consensus document of authorities they
- do not recognize.
-
- The new URL scheme to download a consensus is
- /tor/status-vote/current/consensus/<F> where F is a list of
- fingerprints, sorted in ascending order, and concatenated using a +
- sign.
-
- Fingerprints are uppercase hexadecimal encodings of the authority
- identity key's digest. Servers should also accept requests that
- use lower case or mixed case hexadecimal encodings.
-
- A .z URL for compressed versions of the consensus will be provided
- similarly to existing resources and is the URL that usually should
- be used by clients.
-
-Migration:
-
- The old location of the consensus should continue to work
- indefinitely. Not only is it used by old clients, but it is a useful
- resource for automated tools that do not particularly care which
- authorities have signed the consensus.
-
- Authorities that are known to the client a priori by being shipped
- with the Tor code are assumed to handle this format.
-
- When downloading a consensus document from caches that do not support this
- new format, clients fall back to the old download location.
-
- Caches support the new format starting with Tor version 0.2.1.1-alpha.
-
-Anonymity Implications:
-
- By supplying the list of authorities a client trusts to the directory
- server we leak information (like likely version of Tor client) to the
- directory server. In the current system we also leak that we are
- very old - by re-downloading the consensus over and over again, but
- only when we are so old that we no longer can trust the consensus.
-
-
-
-Footnotes:
- 1. For the purpose of this proposal a client can be any Tor instance
- that downloads a consensus document. This includes relays,
- directory caches as well as end users.
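The URL scheme and the "more than half" rule from the Proposal section of proposal 139 can be sketched like this. It is a minimal illustration, not the Tor implementation: the function names are hypothetical, fingerprints are assumed to be full hex strings, and matching is done on exact (case-insensitive) fingerprints only.

```python
def consensus_path(trusted_fingerprints, compressed=True):
    """Build the request path for a conditional consensus download."""
    # Fingerprints are sent as uppercase hex, sorted ascending, joined by "+".
    fprs = sorted(fp.upper() for fp in trusted_fingerprints)
    path = "/tor/status-vote/current/consensus/" + "+".join(fprs)
    return path + ".z" if compressed else path


def should_serve(requested_fingerprints, signer_fingerprints):
    """Return True if more than half of the requested authorities signed.

    A False result corresponds to the 404 response described above.
    """
    # Servers accept lower or mixed case, so normalise before comparing.
    requested = {fp.upper() for fp in requested_fingerprints}
    signers = {fp.upper() for fp in signer_fingerprints}
    return 2 * len(requested & signers) > len(requested)
```

The strict inequality means that a request naming four authorities needs at least three of them among the signers.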
diff --git a/doc/spec/proposals/140-consensus-diffs.txt b/doc/spec/proposals/140-consensus-diffs.txt
deleted file mode 100644
index 8bc4070bf..000000000
--- a/doc/spec/proposals/140-consensus-diffs.txt
+++ /dev/null
@@ -1,156 +0,0 @@
-Filename: 140-consensus-diffs.txt
-Title: Provide diffs between consensuses
-Author: Peter Palfrader
-Created: 13-Jun-2008
-Status: Accepted
-Target: 0.2.2.x
-
-0. History
-
- 22-May-2009: Restricted the ed format even more strictly for ease of
- implementation. -nickm
-
-1. Overview.
-
- Tor clients and servers need a list of which relays are on the
- network. This list, the consensus, is created by authorities
- hourly and clients fetch a copy of it, with some delay, hourly.
-
- This proposal suggests that clients download diffs of consensuses
- once they have a consensus instead of hourly downloading a full
- consensus.
-
-2. Numbers
-
- After implementing proposal 138 which removes nodes that are not
- running from the list a consensus document is about 92 kilobytes
- in size after compression.
-
- The diff between two consecutive consensuses, in ed format, is on
- average 13 kilobytes compressed.
-
-3. Proposal
-
-3.1 Clients
-
- If a client has a consensus that is recent enough it SHOULD
- try to download a diff to get the latest consensus rather than
- fetching a full one.
-
- [XXX: what is recent enough?
- time delta in hours / size of compressed diff
- 0 20
- 1 9650
- 2 17011
- 3 23150
- 4 29813
- 5 36079
- 6 39455
- 7 43903
- 8 48907
- 9 54549
- 10 60057
- 11 67810
- 12 71171
- 13 73863
- 14 76048
- 15 80031
- 16 84686
- 17 89862
- 18 94760
- 19 94868
- 20 94223
- 21 93921
- 22 92144
- 23 90228
- [ size of gzip compressed "diff -e" between the consensus on
- 2008-06-01-00:00:00 and the following consensuses that day.
- Consensuses have been modified to exclude down routers per
- proposal 138. ]
-
- Data suggests that for the first few hours diffs are very useful,
- saving about 60% for the first three hours, 30% for the first 10,
- and almost nothing once we are past 16 hours.
- ]
-
-3.2 Servers
-
- Directory authorities and servers need to keep up to X [XXX: depends
- on how long clients try to download diffs per above] old consensus
- documents so they can build diffs. They should offer a diff to the
- most recent consensus at the URL
-
- http://tor.noreply.org/tor/status-vote/current/consensus/diff/<HASH>/<FPRLIST>
-
- where HASH is the full digest of the consensus the client currently
- has, and FPRLIST is a list of (abbreviated) fingerprints of
- authorities the client trusts.
-
- Servers will only return a consensus if more than half of the requested
- authorities have signed the document, otherwise a 404 error will be sent
- back. The fingerprints can be shortened to a length of any multiple of
- two, using only the leftmost part of the encoded fingerprint. Tor uses
- 3 bytes (6 hex characters) of the fingerprint. (This is just like the
- conditional consensus downloads that Tor supports starting with
- 0.2.1.1-alpha.)
-
- If a server cannot offer a diff from the consensus identified by the
- hash but has a current consensus it MUST return the full consensus.
-
- [XXX: what should we do when the client already has the latest
- consensus? I can think of the following options:
- - send back 3xx not modified
- - send back 200 ok and an empty diff
- - send back 404 nothing newer here.
-
- I currently lean towards the empty diff.]
-
-4. Diff Format
-
- Diffs start with the token "network-status-diff-version" followed by a
- space and the version number, currently "1".
-
- If a document does not start with network-status-diff-version it is
- assumed to be a full consensus download and would therefore currently start
- with "network-status-version 3".
-
- Following the network-status-diff header line is a diff, or patch, in
- limited ed format. We choose this format because it is easy to create
- and process with standard tools (patch, diff -e, ed). This will help
- us in developing and testing this proposal and it should make future
- debugging easier.
-
- [ If at one point in the future we decide that the space benefits from
- a custom diff format outweigh these benefits we can always
- introduce a new diff format and offer it at for instance
- ../diff2/... ]
-
- We support the following ed commands, each on a line by itself:
- - "<n1>d" Delete line n1
- - "<n1>,<n2>d" Delete lines n1 through n2, inclusive
- - "<n1>c" Replace line n1 with the following block
- - "<n1>,<n2>c" Replace lines n1 through n2, inclusive, with the
- following block.
- - "<n1>a" Append the following block after line n1.
- - "a" Append the following block after the current line.
- - "s/.//" Remove the first character in the current line.
-
- Note that line numbers always apply to the file after all previous
- commands have already been applied.
-
- The commands MUST apply to the file from back to front, such that
- lines are only ever referred to by their position in the original
- file.
-
- The "current line" is either the first line of the file, if this is
- the first command, the last line of a block we added in an append or
- change command, or the line immediately following a set of lines we just
- deleted (or the last line of the file if there are no lines after
- that).
-
- The replace and append commands take blocks. These blocks are simply
- appended to the diff after the line with the command. A line with
- just a period (".") ends the block (and is not part of the lines
- to add). Note that it is impossible to insert a line with just
- a single dot. Recommended procedure is to insert a line with
- two dots, then remove the first character of that line using s/.//.
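The restricted ed format from section 4 of proposal 140 is small enough to apply with a short routine. The sketch below is an assumption-laden illustration rather than Tor's code: the function name is hypothetical, documents are modelled as lists of lines without trailing newlines, and it does not enforce the back-to-front ordering requirement.

```python
import re

# "<n1>d", "<n1>,<n2>d", "<n1>c", "<n1>,<n2>c", "<n1>a"
_ADDRESSED = re.compile(r"^(\d+)(?:,(\d+))?([acd])$")


def apply_ed_diff(original_lines, diff_lines):
    """Apply a version 1 consensus diff and return the new list of lines."""
    if not diff_lines or diff_lines[0] != "network-status-diff-version 1":
        raise ValueError("not a version 1 consensus diff")
    lines = list(original_lines)
    cur = 1   # 1-based "current line"; the first line before any command
    pos = 1   # index of the next unread diff line

    def read_block():
        nonlocal pos
        block = []
        while pos < len(diff_lines):
            line = diff_lines[pos]
            pos += 1
            if line == ".":          # a lone period ends the block
                return block
            block.append(line)
        raise ValueError("unterminated block")

    while pos < len(diff_lines):
        cmd = diff_lines[pos]
        pos += 1
        if cmd == "s/.//":           # drop first character of current line
            if not 1 <= cur <= len(lines):
                raise ValueError("no current line")
            lines[cur - 1] = lines[cur - 1][1:]
            continue
        if cmd == "a":               # append after the current line
            block = read_block()
            lines[cur:cur] = block
            cur += len(block)
            continue
        m = _ADDRESSED.match(cmd)
        if m is None:
            raise ValueError("unsupported command: %r" % cmd)
        n1 = int(m.group(1))
        n2 = int(m.group(2)) if m.group(2) else n1
        op = m.group(3)
        if op == "a":
            if m.group(2) or n1 > len(lines):
                raise ValueError("bad append address")
            block = read_block()
            lines[n1:n1] = block
            cur = n1 + len(block)
            continue
        if not 1 <= n1 <= n2 <= len(lines):
            raise ValueError("address out of range")
        block = read_block() if op == "c" else []
        lines[n1 - 1:n2] = block
        # After a change: last line of the new block. After a delete (or an
        # empty change): the line following the removed range, or the last line.
        cur = n1 - 1 + len(block) if block else min(n1, len(lines))
    return lines
```

For example, the diff lines `["network-status-diff-version 1", "1a", "..", ".", "s/.//"]` insert a line consisting of a single dot after line 1, using the two-dot procedure recommended above.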
diff --git a/doc/spec/proposals/141-jit-sd-downloads.txt b/doc/spec/proposals/141-jit-sd-downloads.txt
deleted file mode 100644
index 2ac7a086b..000000000
--- a/doc/spec/proposals/141-jit-sd-downloads.txt
+++ /dev/null
@@ -1,323 +0,0 @@
-Filename: 141-jit-sd-downloads.txt
-Title: Download server descriptors on demand
-Author: Peter Palfrader
-Created: 15-Jun-2008
-Status: Draft
-
-1. Overview
-
- Downloading all server descriptors is the most expensive part
- of bootstrapping a Tor client. These server descriptors currently
- amount to about 1.5 Megabytes of data, and this size will grow
- linearly with network size.
-
- Fetching all these server descriptors takes a long while for people
- behind slow network connections. It is also a considerable load on
- our network of directory mirrors.
-
- This document describes proposed changes to the Tor network and
- directory protocol so that clients will no longer need to download
- all server descriptors.
-
- These changes consist of moving load balancing information into
- network status documents, implementing a means to download server
- descriptors on demand in an anonymity-preserving way, and dealing
- with exit node selection.
-
-2. What is in a server descriptor
-
- When a Tor client starts the first thing it will try to get is a
- current network status document: a consensus signed by a majority
- of directory authorities. This document is currently about 100
- Kilobytes in size, though it will grow linearly with network size.
- This document lists all servers currently running on the network.
- The Tor client will then try to get a server descriptor for each
- of the running servers. All server descriptors currently amount
- to about 1.5 Megabytes of downloads.
-
- A Tor client learns several things about a server from its descriptor.
- Some of these it already learned from the network status document
- published by the authorities, but the server descriptor contains it
- again in a single statement signed by the server itself, not just by
- the directory authorities.
-
- Tor clients use the information from server descriptors for
- different purposes, which are considered in the following sections.
-
- #three ways: One, to determine if a server will be able to handle
- #this client's request; two, to actually communicate or use the server;
- #three, for load balancing decisions.
- #
- #These three points are considered in the following subsections.
-
-2.1 Load balancing
-
- The Tor load balancing mechanism is quite complex in its details, but
- it has a simple goal: The more traffic a server can handle the more
- traffic it should get. That means the more traffic a server can
- handle the more likely a client will use it.
-
- For this purpose each server descriptor has bandwidth information
- which tries to convey a server's capacity to clients.
-
- Currently we weigh servers differently for different purposes. There
- is a weight for when we use a server as a guard node (our entry to the
- Tor network), there is one weight we assign servers for exit duties,
- and a third for when we need intermediate (middle) nodes.
-
-2.2 Exit information
-
- When a Tor client wants to exit to some resource on the internet it will
- build a circuit to an exit node that allows access to that resource's
- IP address and TCP Port.
-
- When building that circuit the client can make sure that the circuit
- ends at a server that will be able to fulfill the request because the
- client already learned of all the servers' exit policies from their
- descriptors.
-
-2.3 Capability information
-
- Server descriptors contain information about the specific version of
- the Tor protocol they understand [proposal 105].
-
- Furthermore the server descriptor also contains the exact version of
- the Tor software that the server is running and some decisions are
- made based on the server version number (for instance a Tor client
- will only make conditional consensus requests [proposal 139] when
- talking to Tor servers version 0.2.1.1-alpha or later).
-
-2.4 Contact/key information
-
- A server descriptor lists a server's IP address and TCP ports on which
- it accepts onion and directory connections. Furthermore it contains
- the onion key (a short lived RSA key to which clients encrypt CREATE
- cells).
-
-2.5 Identity information
-
- A Tor client learns the digest of a server's key from the network
- status document. Once it has a server descriptor this descriptor
- contains the full RSA identity key of the server. Clients verify
- that 1) the digest of the identity key matches the expected digest
- it got from the consensus, and 2) that the signature on the descriptor
- from that key is valid.
-
-
-3. No longer require clients to have copies of all SDs
-
-3.1 Load balancing info in consensus documents
-
- One of the reasons why clients download all server descriptors is for
- doing proper load balancing as described in 2.1. In order for
- clients to not require all server descriptors this information will
- have to move into the network status document.
-
- Consensus documents will have a new line per router similar
- to the "r", "s", and "v" lines that already exist. This line
- will convey weight information to clients.
-
- "w Bandwidth=193"
-
- The bandwidth number is the lesser of observed bandwidth and bandwidth
- rate limit from the server descriptor that the "r" line referenced by
- digest (1st and 3rd field of the bandwidth line in the descriptor).
- It is given in kilobytes per second so the byte value in the
- descriptor has to be divided by 1024 (and is then truncated, i.e.
- rounded down).
-
- Authorities will cap the bandwidth number at some arbitrary value,
- currently 10MB/sec. If a router claims a larger bandwidth an
- authority's vote will still only show Bandwidth=10240.
-
- The consensus value for bandwidth is the median of all bandwidth
- numbers given in votes. In case of an even number of votes we use
- the lower median. (Using this procedure allows us to change the
- cap value more easily.)
-
- Clients should believe the bandwidth as presented in the consensus,
- not capping it again.
-
-3.2 Fetching descriptors on demand
-
- As described in 2.4 a descriptor lists IP address, OR- and Dir-Port,
- and the onion key for a server.
-
- A client already knows the IP address and the ports from the consensus
- documents, but without the onion key it will not be able to send
- CREATE/EXTEND cells for that server. Since the client needs the onion
- key it needs the descriptor.
-
- If a client only downloaded a few descriptors in an observable manner
- then that would leak which nodes it was going to use.
-
- This proposal suggests the following:
-
- 1) when connecting to a guard node for which the client does not
- yet have a cached descriptor it requests the descriptor it
- expects by hash. (The consensus document that the client holds
- has a hash for the descriptor of this server. We want exactly
- that descriptor, not a different one.)
-
- It does that by sending a RELAY_REQUEST_SD cell.
-
- A client MAY cache the descriptor of the guard node so that it does
- not need to request it every single time it contacts the guard.
-
- 2) when a client wants to extend a circuit that currently ends in
- server B to a new next server C, the client will send a
- RELAY_REQUEST_SD cell to server B. This cell contains in its
- payload the hash of a server descriptor the client would like
- to obtain (C's server descriptor). The server sends back the
- descriptor and the client can now form a valid EXTEND/CREATE cell
- encrypted to C's onion key.
-
- Clients MUST NOT cache such descriptors. If they did they might
- leak that they already extended to that server at least once
- before.
-
- Replies to RELAY_REQUEST_SD requests need to be padded to some
- constant upper limit in order to conceal a client's destination
- from anybody who might be counting cells/bytes.
-
- RELAY_REQUEST_SD cells contain the following information:
- - hash of the server descriptor requested
- - hash of the identity digest of the server for which we want the SD
- - IP address and OR-port of the server for which we want the SD
- - padding factor - the number of cells we want the answer
- padded to.
- [XXX this just occurred to me and it might be smart. or it might
- be stupid. clients would learn the padding factor they want
- to use from the consensus document. This allows us to grow
- the replies later on should SDs become larger.]
- [XXX: figure out a decent padding size]
-
-3.3 Protocol versions
-
- Server descriptors contain optional information of supported
- link-level and circuit-level protocols in the form of
- "opt protocols Link 1 2 Circuit 1". These are not currently needed
- and will probably eventually move into the "v" (version) line in
- the consensus. This proposal does not deal with them.
-
- Similarly a server descriptor contains the version number of
- a Tor node. This information is already present in the consensus
- and is thus available to all clients immediately.
-
-3.4 Exit selection
-
- Currently finding an appropriate exit node for a user's request is
- easy for a client because it has complete knowledge of all the exit
- policies of all servers on the network.
-
- The consensus document will once again be extended to contain the
- information required by clients. This information will be a summary
- of each node's exit policy. The exit policy summary will only contain
- the list of ports to which a node exits to most destination IP
- addresses.
-
- A summary should claim a router exits to a specific TCP port if,
- ignoring private IP addresses, the exit policy indicates that the
- router would exit to this port to most IP addresses (all but at most
- two /8 netblocks, or one /8 and a couple of /12s, or any other such mix).
- The exact algorithm used is this: Going through all exit policy items
- - ignore any accept that is not for all IP addresses ("*"),
- - ignore rejects for these netblocks (exactly, no subnetting):
- 0.0.0.0/8, 169.254.0.0/16, 127.0.0.0/8, 192.168.0.0/16, 10.0.0.0/8,
- and 172.16.0.0/12,
- - for each reject count the number of IP addresses rejected against
- the affected ports,
- - once we hit an accept for all IP addresses ("*") add the ports in
- that policy item to the list of accepted ports, if they don't have
- more than 2^25 IP addresses (that's two /8 networks) counted
- against them (i.e. if the router exits to a port to everywhere but
- at most two /8 networks).
-
- An exit policy summary will be included in votes and consensus as a
- new line attached to each exit node. The line will have the format
- "p" <space> "accept"|"reject" <portlist>
- where portlist is a comma separated list of single port numbers or
- port ranges (e.g. "22,80-88,1024-6000,6667").
-
- Whether the summary shows the list of accepted ports or the list of
- rejected ports depends on which list is shorter (has a shorter string
- representation). In case of ties we choose the list of accepted
- ports. As an exception to this rule an allow-all policy is
- represented as "accept 1-65535" instead of "reject " and a reject-all
- policy is similarly given as "reject 1-65535".
-
- Summary items are compressed, that is instead of "80-88,89-100" there
- only is a single item of "80-100", similarly instead of "20,21" a
- summary will say "20-21".
-
- Port lists are sorted in ascending order.
-
- The maximum allowed length of a policy summary (including the "accept "
- or "reject ") is 1000 characters. If a summary exceeds that length we
- use an accept-style summary and list as much of the port list as is
- possible within these 1000 bytes.
-
-3.4.1 Consensus selection
-
- When building a consensus, authorities have to agree on a digest of
- the server descriptor to list in the router line for each router.
- This is documented in dir-spec section 3.4.
-
- All authorities that listed that agreed upon descriptor digest in
- their vote should also list the same exit policy summary - or list
- none at all if the authority has not been upgraded to list that
- information in their vote.
-
- If we have votes with matching server descriptor digest of which at
- least one of them has an exit policy then we distinguish between two cases:
- a) all authorities agree (or abstained) on the policy summary, and we
- use the exit policy summary that they all listed in their vote,
- b) something went wrong (or some authority is playing foul) and we
- have different policy summaries. In that case we pick the one
- that is most commonly listed in votes with the matching
- descriptor. We break ties in favour of the lexicographically larger
- vote.
-
- If none of the votes with a matching server descriptor digest has
- an exit policy summary we use the most commonly listed one in all
- votes, breaking ties like in case b above.
-
-3.4.2 Client behaviour
-
- When choosing an exit node for a specific request a Tor client will
- choose from the list of nodes that exit to the requested port as given
- by the consensus document. If a client has additional knowledge (like
- cached full descriptors) that indicates the so chosen exit node will
- reject the request then it MAY use that knowledge (or not include such
- nodes in the selection to begin with). However, clients MUST NOT use
- nodes that do not list the port as accepted in the summary (but for
- which they know that the node would exit to that address from other
- sources, like a cached descriptor).
-
- An exception to this is exit enclave behaviour: A client MAY use the
- node at a specific IP address to exit to any port on the same address
- even if that node is not listed as exiting to the port in the summary.
-
-4. Migration
-
-4.1 Consensus document changes.
-
- The consensus will need to include
- - bandwidth information (see 3.1)
- - exit policy summaries (3.4)
-
- A new consensus method (number TBD) will be chosen for this.
-
-5. Future possibilities
-
- This proposal still requires that all servers have the descriptors of
- every other node in the network in order to answer RELAY_REQUEST_SD
- cells. These cells are sent when a circuit is extended from ending at
- node B to a new node C. In that case B would have to answer a
- RELAY_REQUEST_SD cell that asks for C's server descriptor (by SD digest).
-
- In order to answer that request B obviously needs a copy of C's server
- descriptor. The RELAY_REQUEST_SD cell already has all the info that
- B needs to contact C so it can ask about the descriptor before passing it
- back to the client.
-
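Two of the computations in proposal 141, the "w Bandwidth=" value from section 3.1 and the "p" line port summary from section 3.4, can be sketched as below. This is an illustration under assumptions, not the authority code: all function names are hypothetical, the input to the summary is assumed to be the already-computed set of accepted ports (the netblock counting step is not shown), and the 1000-character length limit is not handled.

```python
def vote_bandwidth(observed_bytes, rate_bytes, cap_kb=10240):
    """Bandwidth number an authority votes for, in kilobytes per second."""
    # Lesser of observed bandwidth and rate limit, divided by 1024 and
    # truncated, then capped (10MB/sec, i.e. 10240, per the text above).
    return min(min(observed_bytes, rate_bytes) // 1024, cap_kb)


def consensus_bandwidth(vote_values):
    """Median of the voted values; the lower median for an even count."""
    ordered = sorted(vote_values)
    return ordered[(len(ordered) - 1) // 2]


def compress_ports(ports):
    """Render ports as a sorted list with adjacent ports merged into ranges."""
    ranges = []
    for port in sorted(set(ports)):
        if ranges and port == ranges[-1][1] + 1:
            ranges[-1][1] = port          # extend the current range
        else:
            ranges.append([port, port])   # start a new range
    return ",".join(str(lo) if lo == hi else "%d-%d" % (lo, hi)
                    for lo, hi in ranges)


def policy_summary(accepted_ports):
    """Choose the shorter of the accept and reject forms; ties go to accept."""
    accepted = set(accepted_ports)
    rejected = set(range(1, 65536)) - accepted
    if not rejected:
        return "accept 1-65535"
    if not accepted:
        return "reject 1-65535"
    acc, rej = compress_ports(accepted), compress_ports(rejected)
    return "accept " + acc if len(acc) <= len(rej) else "reject " + rej
```

A router that exits everywhere except port 25 is summarised as "reject 25", since that string is shorter than the equivalent accept list.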
diff --git a/doc/spec/proposals/142-combine-intro-and-rend-points.txt b/doc/spec/proposals/142-combine-intro-and-rend-points.txt
deleted file mode 100644
index 3abd5c863..000000000
--- a/doc/spec/proposals/142-combine-intro-and-rend-points.txt
+++ /dev/null
@@ -1,277 +0,0 @@
-Filename: 142-combine-intro-and-rend-points.txt
-Title: Combine Introduction and Rendezvous Points
-Author: Karsten Loesing, Christian Wilms
-Created: 27-Jun-2008
-Status: Dead
-
-Change history:
-
- 27-Jun-2008 Initial proposal for or-dev
- 04-Jul-2008 Give first security property the new name "Responsibility"
- and change new cell formats according to rendezvous protocol
- version 3 draft.
- 19-Jul-2008 Added comment by Nick (but no solution, yet) that sharing of
- circuits between multiple clients is not supported by Tor.
-
-Overview:
-
- Establishing a connection to a hidden service currently involves two Tor
- relays, introduction and rendezvous point, and 10 more relays distributed
- over four circuits to connect to them. The introduction point is
- established in the mid-term by a hidden service to transfer introduction
- requests from client to the hidden service. The rendezvous point is set
- up by the client for a single hidden service request and actually
- transfers end-to-end encrypted application data between client and hidden
- service.
-
- There are some reasons for separating the two roles of introduction and
- rendezvous point: (1) Responsibility: A relay shall not be made
- responsible that it relays data for a certain hidden service; in the
- original design as described in [1] an introduction point relays no
- application data, and a rendezvous point neither knows the hidden
- service nor can it decrypt the data. (2) Scalability: The hidden service
- shall not have to maintain a number of open circuits proportional to the
- expected number of client requests. (3) Attack resistance: The effect of
- an attack on the only visible parts of a hidden service, its introduction
- points, shall be as small as possible.
-
- However, elimination of a separate rendezvous connection as proposed by
- Øverlier and Syverson [2] is the most promising approach to improve the
- delay in connection establishment. From all substeps of connection
- establishment extending a circuit by only a single hop is responsible for
- a major part of delay. Reducing on-demand circuit extensions from two to
- one results in a decrease of mean connection establishment times from 39
- to 29 seconds [3]. Particularly, eliminating the delay on hidden-service
- side allows the client to better observe progress of connection
- establishment, thus allowing it to use smaller timeouts. Proposal 114
- introduced new introduction keys for introduction points and provides for
- user authorization data in hidden service descriptors; it will be shown
- in this proposal that introduction keys in combination with new
- introduction cookies provide for the first security property
- responsibility. Further, eliminating the need for a separate introduction
- connection benefits the overall network load by decreasing the number of
- circuit extensions. After all, having only one connection between client
- and hidden service reduces the overall protocol complexity.
-
-Design:
-
- 1. Hidden Service Configuration
-
- Hidden services should be able to choose whether they would like to use
- this protocol. This might be opt-in for 0.2.1.x and opt-out for later
- major releases.
-
- 2. Contact Point Establishment
-
- When preparing a hidden service, a Tor client selects a set of relays to
- act as contact points instead of introduction points. The contact point
- combines both roles of introduction and rendezvous point as proposed in
- [2]. The only requirement for a relay to be picked as contact point is
- its capability of performing this role. This can be determined from the
- Tor version number that needs to be equal or higher than the first
- version that implements this proposal.
-
- The easiest way to implement establishment of contact points is to
- introduce v2 ESTABLISH_INTRO cells. By convention, the relay recognizes
- version 2 ESTABLISH_INTRO cells as requests to establish a contact point
- rather than an introduction point.
-
- V Format byte: set to 255 [1 octet]
- V Version byte: set to 2 [1 octet]
- KLEN Key length [2 octets]
- PK Public introduction key [KLEN octets]
- HS Hash of session info [20 octets]
- SIG Signature of above information [variable]
-
- The hidden service does not create a fixed number of contact points, like
- 3 in the current protocol. It uses a minimum of 3 contact points, but
- increases this number depending on the history of client requests within
- the last hour. The hidden service also increases this number depending on
- the frequency of failing contact points in order to defend against
- attacks on its contact points. When client authorization as described in
- proposal 121 is used, a hidden service can also use the number of
- authorized clients as first estimate for the required number of contact
- points.
-
- 3. Hidden Service Descriptor Creation
-
- A hidden service needs to issue a fresh introduction cookie for each
- established introduction point. By requiring clients to use this cookie
- in a later connection establishment, an introduction point cannot access
- the hidden service that it works for. Together with the fresh
- introduction key that was introduced in proposal 114, this reduces
- responsibility of a contact point for a specific hidden service.
-
- The v2 hidden service descriptor format contains an
- "intro-authentication" field that may contain introduction-point specific
- keys. The hidden service creates a random string, comparable to the
- rendezvous cookie, and includes it in the descriptor as introduction
- cookie for auth-type "1". By convention, clients recognize existence of
- auth-type 1 as possibility to connect to a hidden service via a contact
- point rather than an introduction point. Older clients that do not
- understand this new protocol simply ignore that cookie.
-
- 4. Connection Establishment
-
- When establishing a connection to a hidden service a client learns about
- the capability of using the new protocol from the hidden service
- descriptor. It may choose whether to use this new protocol or not,
- whereas older clients cannot understand the new capability and can only
- use the current protocol. Clients using version 0.2.1.x should be able to
- opt in to using the new protocol, which should change to opt-out for
- later major releases.
-
- When using the new capability the client creates a v2 INTRODUCE1 cell
- that extends an unversioned INTRODUCE1 cell by adding the content of an
- ESTABLISH_RENDEZVOUS cell. Further, the client sends this cell using the
- new cell type 41 RELAY_INTRODUCE1_VERSIONED to the introduction point,
- because unversioned and versioned INTRODUCE1 cells are indistinguishable:
-
- Cleartext
- V Version byte: set to 2 [1 octet]
- PK_ID Identifier for Bob's PK [20 octets]
- RC Rendezvous cookie [20 octets]
- Encrypted to introduction key:
- VER Version byte: set to 3. [1 octet]
- AUTHT The auth type that is supported [1 octet]
- AUTHL Length of auth data [2 octets]
- AUTHD Auth data [variable]
- RC Rendezvous cookie [20 octets]
- g^x Diffie-Hellman data, part 1 [128 octets]
-
- The cleartext part contains the rendezvous cookie that the contact point
- remembers just as a rendezvous point would do.
-
- The encrypted part contains the introduction cookie as auth data for the
- auth type 1. The rendezvous cookie is contained as before, but there is
- no further rendezvous point information, as there is no separate
- rendezvous point.
-
- 5. Rendezvous Establishment
-
- The contact point recognizes a v2 INTRODUCE1 cell with auth type 1 as a
- request to be used in the new protocol. It remembers the contained
- rendezvous cookie, replies to the client with an INTRODUCE_ACK cell
- (omitting the RENDEZVOUS_ESTABLISHED cell), and forwards the encrypted
- part of the INTRODUCE1 cell as INTRODUCE2 cell to the hidden service.
-
- 6. Introduction at Hidden Service
-
- The hidden service recognizes an INTRODUCE2 cell containing an
- introduction cookie as authorization data. In this case, it does not
- extend a circuit to a rendezvous point, but sends a RENDEZVOUS1 cell
- directly back to its contact point as usual.
-
- 7. Rendezvous at Contact Point
-
- The contact point processes a RENDEZVOUS1 cell just as a rendezvous point
- does. The only difference is that the hidden-service-side circuit is not
- exclusive for the client connection, but shared among multiple client
- connections.
-
- [Tor does not allow sharing of a single circuit among multiple client
- connections easily. We need to think about a smart and efficient way to
- implement this. Comment by Nick. -KL]
-
-Security Implications:
-
- (1) Responsibility
-
- One of the original reasons for the separation of introduction and
- rendezvous points is that a relay shall not be held responsible for
- relaying data for a certain hidden service. In the current design an
- introduction point relays no application data and a rendezvous point
- neither knows the hidden service nor can it decrypt the data.
-
- This property is also fulfilled in this new design. A contact point only
- learns a fresh introduction key instead of the hidden service key, so
- that it cannot recognize a hidden service. Further, the introduction
- cookie, which is unknown to the contact point, prevents it from accessing
- the hidden service itself. The only way for a contact point to access a
- hidden service is to look up whether it is contained in the descriptors
- of known hidden services. A contact point cannot directly be made
- responsible for which hidden service it is working. In addition to that,
- it cannot learn the data that it transfers, because all communication
- between client and hidden service is end-to-end encrypted.
-
- (2) Scalability
-
- Another goal of the existing hidden service protocol is that a hidden
- service does not have to maintain a number of open circuits proportional
- to the expected number of client requests. The rationale behind this is
- better scalability.
-
- The new protocol eliminates the need for a hidden service to extend
- circuits on demand, which has a positive effect on circuit establishment
- times and overall network load. The solution presented here to establish
- a number of contact points proportional to the history of connection
- requests reduces the number of circuits to a minimum number that fits the
- hidden service's needs.
-
- (3) Attack resistance
-
- The third goal of separating introduction and rendezvous points is to
- limit the effect of an attack on the only visible parts of a hidden
- service which are the contact points in this protocol.
-
- In theory, the new protocol is more vulnerable to this attack. An
- attacker who can take down a contact point does not only eliminate an
- access point to the hidden service, but also breaks current client
- connections to the hidden service using that contact point.
-
- Øverlier and Syverson proposed the concept of valet nodes as additional
- safeguard for introduction/contact points [4]. Unfortunately, this
- increases hidden service protocol complexity conceptually and from an
- implementation point of view. Therefore, it is not included in this
- proposal.
-
- However, in practice attacking a contact point (or introduction point) is
- not as rewarding as it might appear. The cost for a hidden service to set
- up a new contact point and publish a new hidden service descriptor is
- minimal compared to the efforts necessary for an attacker to take a Tor
- relay down. As a countermeasure to further frustrate this attack, the
- hidden service raises the number of contact points as a function of
- previous contact point failures.
-
- Further, the probability of breaking client connections due to attacking
- a contact point is minimal. It can be assumed that the probability of one
- of the other five involved relays in a hidden service connection failing
- or being shut down is higher than that of a successful attack on a
- contact point.
-
- (4) Resistance against Locating Attacks
-
- Clients are no longer able to force a hidden service to create or extend
- circuits. This further reduces an attacker's capabilities of locating a
- hidden server as described by Øverlier and Syverson [5].
-
-Compatibility:
-
- The presented protocol does not raise compatibility issues with current
- Tor versions. New relay versions support both the existing and the
- proposed protocol as introduction/rendezvous/contact points. A contact
- point acts as introduction point simultaneously. Hidden services and
- clients can opt-in to use the new protocol which might change to opt-out
- some time in the future.
-
-References:
-
- [1] Roger Dingledine, Nick Mathewson, and Paul Syverson, Tor: The
- Second-Generation Onion Router. In the Proceedings of the 13th USENIX
- Security Symposium, August 2004.
-
- [2] Lasse Øverlier and Paul Syverson, Improving Efficiency and Simplicity
- of Tor Circuit Establishment and Hidden Services. In the Proceedings of
- the Seventh Workshop on Privacy Enhancing Technologies (PET 2007),
- Ottawa, Canada, June 2007.
-
- [3] Christian Wilms, Improving the Tor Hidden Service Protocol Aiming at
- Better Performance, diploma thesis, June 2008, University of Bamberg.
-
- [4] Lasse Øverlier and Paul Syverson, Valet Services: Improving Hidden
- Servers with a Personal Touch. In the Proceedings of the Sixth Workshop
- on Privacy Enhancing Technologies (PET 2006), Cambridge, UK, June 2006.
-
- [5] Lasse Øverlier and Paul Syverson, Locating Hidden Servers. In the
- Proceedings of the 2006 IEEE Symposium on Security and Privacy, May 2006.
-
diff --git a/doc/spec/proposals/143-distributed-storage-improvements.txt b/doc/spec/proposals/143-distributed-storage-improvements.txt
deleted file mode 100644
index 0f7468f1d..000000000
--- a/doc/spec/proposals/143-distributed-storage-improvements.txt
+++ /dev/null
@@ -1,194 +0,0 @@
-Filename: 143-distributed-storage-improvements.txt
-Title: Improvements of Distributed Storage for Tor Hidden Service Descriptors
-Author: Karsten Loesing
-Created: 28-Jun-2008
-Status: Open
-Target: 0.2.1.x
-
-Change history:
-
- 28-Jun-2008 Initial proposal for or-dev
-
-Overview:
-
- An evaluation of the distributed storage for Tor hidden service
- descriptors and subsequent discussions have brought up a few improvements
- to proposal 114. All improvements are backwards compatible with the
- implementation of proposal 114.
-
-Design:
-
- 1. Report Bad Directory Nodes
-
- Bad hidden service directory nodes could deny existence of previously
- stored descriptors. A bad directory node that does this with all stored
- descriptors causes harm to the distributed storage in general, but
- replication will cope with this problem in most cases. However, an
- adversary that attempts to make a specific hidden service unavailable by
- running relays that become responsible for all of a service's
- descriptors poses a more serious threat. The distributed storage needs to
- defend against this attack by detecting and removing bad directory nodes.
-
- As a countermeasure hidden services try to download their descriptors
- every hour at random times from the hidden service directories that are
- responsible for storing them. If a directory node replies with 404 (Not
- found), the hidden service reports the supposedly bad directory node to
- a random selection of half of the directory authorities (with version
- numbers equal to or higher than the first version that implements this
- proposal). The hidden service posts a complaint message using HTTP 'POST'
- to a URL "/tor/rendezvous/complain" with the following message format:
-
- "hidden-service-directory-complaint" identifier NL
-
- [At start, exactly once]
-
- The identifier of the hidden service directory node to be
- investigated.
-
- "rendezvous-service-descriptor" descriptor NL
-
- [At end, exactly once]
-
- The hidden service descriptor that the supposedly bad directory node
- does not serve.
-
- The directory authority checks if the descriptor is valid and the hidden
- service directory responsible for storing it. It waits for a random time
- of up to 30 minutes before posting the descriptor to the hidden service
- directory. If the publication is acknowledged, the directory authority
- waits another random time of up to 30 minutes before attempting to
- request the descriptor that it has posted. If the directory node replies
- with 404 (Not found), it will be blacklisted for being a hidden service
- directory node for the next 48 hours.
-
- A blacklisted hidden service directory is assigned the new flag BadHSDir
- instead of the HSDir flag in the vote that a directory authority creates.
- In a consensus a relay is only assigned a HSDir flag if the majority of
- votes contains a HSDir flag and no more than one third of votes contains
- a BadHSDir flag. As a result, clients do not have to learn about the
- BadHSDir flag. A blacklisted directory node will simply not be assigned
- the HSDir flag in the consensus.
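
[Editorial sketch, not part of the proposal: the consensus rule above written as a predicate. The function name and the representation of a vote as a set of flag names are hypothetical, and counting both fractions against all votes passed in is an assumption.]

```python
def consensus_assigns_hsdir(votes):
    """votes: one set of flag names per authority vote, for a single relay."""
    total = len(votes)
    if total == 0:
        return False
    hsdir_votes = sum(1 for flags in votes if "HSDir" in flags)
    bad_votes = sum(1 for flags in votes if "BadHSDir" in flags)
    # A majority must vote HSDir, and at most one third may vote BadHSDir.
    # Integer arithmetic avoids rounding issues with the fractions.
    return 2 * hsdir_votes > total and 3 * bad_votes <= total

# Seven authorities: five say HSDir, two say BadHSDir.
print(consensus_assigns_hsdir([{"HSDir"}] * 5 + [{"BadHSDir"}] * 2))
```

With seven votes, two BadHSDir votes are still within one third, whereas three would block the flag even if the remaining four vote HSDir.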
-
- In order to prevent an attacker from setting up new nodes as replacement
- for blacklisted directory nodes, all directory nodes in the same /24
- subnet are blacklisted, too. Furthermore, if two or more directory nodes
- are blacklisted in the same /16 subnet concurrently, all other directory
- nodes in that /16 subnet are blacklisted, too. Blacklisting holds for at
- most 48 hours.
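
[Editorial sketch, not part of the proposal: one way to expand a list of detected bad directory nodes to the /24 and /16 rules above. The function name is hypothetical. Counting only directly detected nodes toward the /16 rule is an assumption, and the 48-hour expiry is not modeled.]

```python
import ipaddress
from collections import Counter

def blacklisted_nodes(directory_ips, bad_ips):
    """Return the directory node addresses affected by the subnet rules."""
    bad = [ipaddress.ip_address(ip) for ip in bad_ips]
    # Every /24 that contains a detected bad node is blacklisted.
    bad_24 = {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in bad}
    # A /16 is blacklisted once it contains two or more detected bad nodes.
    per_16 = Counter(ipaddress.ip_network(f"{ip}/16", strict=False) for ip in bad)
    bad_16 = {net for net, count in per_16.items() if count >= 2}
    blocked = bad_24 | bad_16
    return {text for text in directory_ips
            if any(ipaddress.ip_address(text) in net for net in blocked)}

nodes = ["10.1.2.3", "10.1.2.77", "10.1.9.9", "10.2.0.1"]
print(sorted(blacklisted_nodes(nodes, ["10.1.2.3"])))
print(sorted(blacklisted_nodes(nodes, ["10.1.2.3", "10.1.5.5"])))
```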
-
- 2. Publish Fewer Replicas
-
- The evaluation has shown that the probability of a directory node to
- serve a previously stored descriptor is 85.7% (more precisely, this is
- the 0.001-quantile of the empirical distribution with the rationale that
- it holds for 99.9% of all empirical cases). If descriptors are replicated
- to x directory nodes, the probability of at least one of the replicas to
- be available for clients is 1 - (1 - 85.7%) ^ x. In order to achieve an
- overall availability of 99.9%, x = 3.55 replicas need to be stored. From
- this follows that 4 replicas are sufficient, rather than the currently
- stored 6 replicas.
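
[Editorial sketch, not part of the proposal: the replica count above follows from solving 1 - (1 - p) ^ x >= target for x. The function name is hypothetical; the two input values are the ones quoted in the paragraph.]

```python
import math

def replicas_needed(p_available, target):
    """Solve 1 - (1 - p_available) ** x >= target for x."""
    x = math.log(1 - target) / math.log(1 - p_available)
    return x, math.ceil(x)

x, n = replicas_needed(0.857, 0.999)
print(round(x, 2), n)  # 3.55 4
```

Rounding up to a whole number of replicas gives 4, which yields an availability slightly above the 99.9% target.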
-
- Further, the current design stores 2 sets of descriptors on 3 directory
- nodes with consecutive identities. Originally, this was meant to
- facilitate replication between directory nodes, which has not been and
- will not be implemented (the selection criterion of 24 hours uptime does
- not make it necessary). As a result, storing descriptors on directory
- nodes with consecutive identities is not required. In fact, it should be
- avoided, because it enables an attacker to create "black holes" in the
- identifier ring.
-
- Hidden services should store their descriptors on 4 non-consecutive
- directory nodes, and clients should request descriptors from these
- directory nodes only. For compatibility reasons, hidden services also
- store their descriptors on 2 consecutive directory nodes. Hence, 0.2.0.x
- clients will be able to retrieve 4 out of 6 descriptors, but will fail
- for the remaining 2 descriptors, which is sufficient for reliability. As
- soon as 0.2.0.x is deprecated, hidden services can stop publishing the
- additional 2 replicas.
-
- 3. Change Default Value of Being Hidden Service Directory
-
- The requirements for becoming a hidden service directory node are an open
- directory port and an uptime of at least 24 hours. The evaluation has
- shown that there are 300 hidden service directory candidates in the mean,
- but only 6 of them are configured to act as hidden service directories.
- This is bad, because those 6 nodes need to serve a large share of all
- hidden service descriptors. Optimally, there should be hundreds of hidden
- service directories. Having a large number of 0.2.1.x directory nodes
- also has a positive effect on 0.2.0.x hidden services and clients.
-
- Therefore, the new default of HidServDirectoryV2 should be 1, so that a
- Tor relay that has an open directory port automatically accepts and
- serves v2 hidden service descriptors. A relay operator can still opt-out
- running a hidden service directory by changing HidServDirectoryV2 to 0.
- The additional bandwidth requirements for running a hidden service
- directory node in addition to being a directory cache are negligible.
-
- 4. Make Descriptors Persistent on Directory Nodes
-
- Hidden service directories that are restarted by their operators or after
- a failure will not be selected as hidden service directories within the
- next 24 hours. However, some clients might still think that these nodes
- are responsible for certain descriptors, because they work on the basis
- of network consensuses that are up to three hours old. The directory
- nodes should be able to serve the previously received descriptors to
- these clients. Therefore, directory nodes make all received descriptors
- persistent and load previously received descriptors on startup.
-
- 5. Store and Serve Descriptors Regardless of Responsibility
-
- Currently, directory nodes only accept descriptors for which they think
- they are responsible. This may lead to problems when a directory node
- uses an older or newer network consensus than hidden service or client
- or when a directory node has been restarted recently. In fact, there are
- no security issues in storing or serving descriptors for which a
- directory node thinks it is not responsible. To the contrary, doing so
- may improve reliability in border cases. As a result, a directory node
- does not pay attention to responsibility when receiving a publication or
- fetch request, but stores or serves the requested descriptor. Likewise,
- the directory node does not remove descriptors when it thinks it is not
- responsible for them any more.
-
- 6. Avoid Periodic Descriptor Re-Publication
-
- In the current implementation a hidden service re-publishes its
- descriptor either when its content changes or an hour elapses. However,
- the evaluation has shown that failures of hidden service directory nodes,
- i.e. of nodes that have not failed within the last 24 hours, are very
- rare. Together with making descriptors persistent on directory nodes,
- there is no necessity to re-publish descriptors hourly.
-
- The only two events leading to descriptor re-publication should be a
- change of the descriptor content and a new directory node becoming
- responsible for the descriptor. Hidden services should therefore consider
- re-publication every time they learn about a new network consensus
- instead of hourly.
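
[Editorial sketch, not part of the proposal: the two re-publication triggers above as a check that a hidden service could run whenever it learns about a new consensus. All names are hypothetical, and directory nodes are represented by plain identifiers for illustration.]

```python
def should_republish(old_descriptor, new_descriptor,
                     old_responsible, new_responsible):
    """Return True if either trigger from the paragraph above applies."""
    # Trigger 1: the descriptor content has changed.
    content_changed = new_descriptor != old_descriptor
    # Trigger 2: some directory node has become responsible that was not before.
    newly_responsible = set(new_responsible) - set(old_responsible)
    return content_changed or bool(newly_responsible)

print(should_republish(b"desc", b"desc", ["A", "B"], ["A", "C"]))
```

A directory node that merely stops being responsible does not trigger re-publication in this sketch.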
-
- 7. Discard Expired Descriptors
-
- The current implementation lets directory nodes keep a descriptor for two
- days before discarding it. However, with the v2 design, descriptors are
- only valid for at most one day. Directory nodes should determine the
- validity of stored descriptors and discard them one hour after they have
- expired (to compensate for wrong clocks on clients).
-
- 8. Shorten Client-Side Descriptor Fetch History
-
- When clients try to download a hidden service descriptor, they memorize
- fetch requests to directory nodes for up to 15 minutes. This allows them
- to request all replicas of a descriptor to avoid bad or failing directory
- nodes, but without querying the same directory node twice.
-
- The downside is that a client that has requested a descriptor without
- success, will not be able to find a hidden service that has been started
- during the following 15 minutes after the client's last request.
-
- This can be improved by shortening the fetch history to only 5 minutes.
- This time should be sufficient to complete requests for all replicas of a
- descriptor, but without ending in an infinite request loop.
-
-Compatibility:
-
- All proposed improvements are compatible with the currently implemented
- design as described in proposal 114.
-
diff --git a/doc/spec/proposals/144-enforce-distinct-providers.txt b/doc/spec/proposals/144-enforce-distinct-providers.txt
deleted file mode 100644
index aa460482f..000000000
--- a/doc/spec/proposals/144-enforce-distinct-providers.txt
+++ /dev/null
@@ -1,165 +0,0 @@
-Filename: 144-enforce-distinct-providers.txt
-Title: Increase the diversity of circuits by detecting nodes belonging to the
- same provider
-Author: Mfr
-Created: 2008-06-15
-Status: Draft
-
-Overview:
-
- Increase network security by reducing the ability of relay operators
- or ISPs, whether acting on their own or under requisition, to monitor
- a large part of Tor traffic in an attempt to break circuit privacy.
- This is a way to increase the diversity of circuits without killing
- network performance.
-
-Motivation:
-
- Since Roger and Nick's 2004 publication about diversity [1], very
- fast Tor relays have become concentrated among half a dozen
- providers, each controlling the traffic of some dozens of routers [2].
-
- Likewise, the spread of clonable VMs billed by the hour makes it
- possible to start, within a few minutes and at small cost, a set of
- very high-speed relays that can attract a large amount of traffic
- within a few hours. That traffic can then be analyzed, which
- increases the vulnerability of the network.
-
- Whether they are ISPs or domU providers, these usually own several
- class B IP ranges. The existing EnforceDistinctSubnets restriction,
- which automatically excludes relays in the same class B subnet, is
- therefore only partially effective. By contrast, a restriction at
- the class A level would be too restrictive.
-
- Therefore it seems necessary to consider another approach.
-
-Proposal:
-
- Add a provider check based on the AS number that the router adds to
- its descriptor, that the directory authorities verify, and that is
- used like the declarative family field when building circuits.
-
-Design:
-
-Step 1 :
-
- Add to the router descriptor provider information that the router
- itself obtains by request [4].
-
- "provider" name NL
-
- 'name' is the AS number of the router, formatted like this:
- 'ASxxxxxx', where AS is fixed and xxxxxx is the AS number,
- left aligned (e.g. AS98304, AS4096, AS1). If the AS number
- is missing, the class A network number is used instead:
- 'ANxxx', where AN is fixed and xxx is the first 3 digits of
- the IP (e.g. AN1 for the IP 1.1.1.2). An 'L' value is set
- if it is a local network IP.
-
- If two ORs list one another in their "provider" entries,
- then OPs should treat them as a single OR for the purpose
- of path selection.
-
- For example, if node A's descriptor contains "provider B",
- and node B's descriptor contains "provider A", then node A
- and node B should never be used on the same circuit.
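
[Editorial sketch, not part of the proposal: the "provider" name format described above as a small function. The function name is hypothetical. Treating private and loopback addresses as "local network" IPs, checking for a local address before looking at the AS number, and handling IPv4 only are assumptions. The address and AS number in the usage lines are paired arbitrarily.]

```python
import ipaddress

def provider_name(ip, asn=None):
    """Format the provider entry for an IPv4 address and an optional AS number."""
    addr = ipaddress.IPv4Address(ip)
    # Assumption: "local network IP" means a private or loopback address.
    if addr.is_private or addr.is_loopback:
        return "L"
    if asn is not None:
        return f"AS{asn}"
    # No AS number known: fall back to the class A number (first octet).
    return f"AN{addr.packed[0]}"

print(provider_name("1.1.1.2", 4096))
print(provider_name("1.1.1.2"))
print(provider_name("192.168.1.5"))
```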
-
- Add the corresponding config option to torrc
-
- EnforceDistinctProviders set to 1 by default.
- Permit building circuits with relays in the same provider
- if set to 0.
- With regard to proposal 135, if TestingTorNetwork is set,
- EnforceDistinctProviders needs to be unset.
-
- Verification of the AS numbers by the directory authorities
-
- The directory authorities check the AS number in each newly uploaded
- node descriptor.
-
- If the node runs an old version, this test is
- bypassed.
-
- If the AS number obtained by request differs from the one in
- the descriptor, the router is flagged as non-Valid by the
- testing authority for the voting process.
-
-Step 2 When a 'significant number of nodes' among valid routers are
-generating descriptors with provider information.
-
- Add functionality for the circuit user to get missing provider
-information by DNS request:
-
- While computing a path during circuit building, the OP
- first applies the family check and the
- EnforceDistinctSubnets directive, for performance. Then,
- if provider info is needed but missing from a router
- descriptor, it tries to get the AS provider info by DNS
- request [4]. This information could be DNS-cached. AN
- (class A number) is never generated during this process,
- to prevent DNS block problems. If the DNS request fails,
- the OP ignores the failure and continues building the
- circuit.
-
-Step 3 When the 'whole majority' of valid Tor clients are making
-DNS requests.
-
- Older versions are deprecated and marked as non-Valid.
-
- EnforceDistinctProviders replaces the EnforceDistinctSubnets functionality.
-
- EnforceDistinctSubnets is removed.
-
- Functionalities deployed in step 2 are removed.
-
-Security implications:
-
- This provider measure will increase the number of provider
- addresses that an attacker must use in order to carry out
- traffic analysis.
-
-Compatibility:
-
- The presented protocol does not raise compatibility issues
- with current Tor versions. The compatibility is preserved by
- implementing this functionality in 3 steps, giving time to
- network users to upgrade clients and routers.
-
-Performance and scalability notes:
-
- Changing provider for every router could reduce performance
- a little if the circuit is too long.
-
- During step 2, getting missing provider information could increase
- path building time and should have a timeout.
-
-Possible Attacks/Open Issues/Some thinking required:
-
- This proposal seems to be compatible with proposal 135, Simplify
- Configuration of Private Tor Networks.
-
- This proposal does not resolve traffic monitoring attacks by
- owners of multiple ASes or by top-level providers [5].
-
- Unresolved AS numbers are treated as a class A network. Perhaps
- they should be marked as invalid, but there were only five such
- items at the last check; see [2].
-
- Need to define what's a 'significant number of nodes' and
- 'whole majority' ;-)
-
-References:
-[1] Location Diversity in Anonymity Networks by Nick Feamster and Roger
-Dingledine.
-In the Proceedings of the Workshop on Privacy in the Electronic Society
-(WPES 2004), Washington, DC, USA, October 2004
-http://freehaven.net/anonbib/#feamster:wpes2004
-[2] http://as4jtw5gc6efb267.onion/IPListbyAS.txt
-[3] see Goodell Tor Exit Page
-http://cassandra.eecs.harvard.edu/cgi-bin/exit.py
-[4] see the great IP to ASN DNS Tool
-http://www.team-cymru.org/Services/ip-to-asn.html
-[5] Sampled Traffic Analysis by Internet-Exchange-Level Adversaries by
-Steven J. Murdoch and Piotr Zielinski.
-In the Proceedings of the Seventh Workshop on Privacy Enhancing Technologies
-(PET 2007), Ottawa, Canada, June 2007.
-http://freehaven.net/anonbib/#murdoch-pet2007
-[6] http://bugs.noreply.org/flyspray/index.php?do=details&id=690
diff --git a/doc/spec/proposals/145-newguard-flag.txt b/doc/spec/proposals/145-newguard-flag.txt
deleted file mode 100644
index 9e61e30be..000000000
--- a/doc/spec/proposals/145-newguard-flag.txt
+++ /dev/null
@@ -1,39 +0,0 @@
-Filename: 145-newguard-flag.txt
-Title: Separate "suitable as a guard" from "suitable as a new guard"
-Author: Nick Mathewson
-Created: 1-Jul-2008
-Status: Open
-Target: 0.2.1.x
-
-[This could be obsoleted by proposal 141, which could replace NewGuard
-with a Guard weight.]
-
-Overview
-
- Right now, Tor has one flag that clients use both to tell which
- nodes should be kept as guards, and which nodes should be picked
- when choosing new guards. This proposal separates this flag into
- two.
-
-Motivation
-
- Balancing clients among guards is not done well by our current
- algorithm. When a new guard appears, it is chosen by clients
- looking for a new guard with the same probability as all existing
- guards... but new guards are likelier to be under capacity, whereas
- old guards are likelier to be under more use.
-
-Implementation
-
- We add a new flag, NewGuard. Clients will change so that when they
- are choosing new guards, they only consider nodes with the NewGuard
- flag set.
-
- For now, authorities will always set NewGuard if they are setting
- the Guard flag. Later, it will be easy to migrate authorities to
- set NewGuard for underused guards.
-
-Alternatives
-
- We might instead have authorities list weights with which nodes
- should be picked as guards.
diff --git a/doc/spec/proposals/146-long-term-stability.txt b/doc/spec/proposals/146-long-term-stability.txt
deleted file mode 100644
index 9af001744..000000000
--- a/doc/spec/proposals/146-long-term-stability.txt
+++ /dev/null
@@ -1,84 +0,0 @@
-Filename: 146-long-term-stability.txt
-Title: Add new flag to reflect long-term stability
-Author: Nick Mathewson
-Created: 19-Jun-2008
-Status: Open
-Target: 0.2.1.x
-
-Overview
-
- This document proposes a new flag to indicate that a router has
- existed at the same address for a long time, describes how to
- implement it, and explains what it's good for.
-
-Motivation
-
- Tor has had three notions of "stability" for servers. Older
- directory protocols based a server's stability on its
- (self-reported) uptime: a server that had been running for a day was
- more stable than a server that had been running for five minutes,
- regardless of their past history. Current directory protocols track
- weighted mean time between failure (WMTBF) and weighted fractional
- uptime (WFU). WFU is computed as the fraction of time for which the
- server is running, with measurements weighted to exponentially
- decay such that old days count less. WMTBF is computed as the
- average length of intervals for which the server runs between
- downtime, with old intervals weighted to count less.
-
- WMTBF is useful in answering the question: "If a server is running
- now, how long is it likely to stay running?" This makes it a good
- choice for picking servers for streams that need to be long-lived.
- WFU is useful in answering the question: "If I try connecting to
- this server at an arbitrary time, is it likely to be running?" This
- makes it an important factor for picking guard nodes, since we want
- guard nodes to be usually-up.
-
- There are other questions that clients want to answer, however, for
- which the current flags aren't very useful. The one that this
- proposal addresses is,
-
- "If I found this server in an old consensus, is it likely to
- still be running at the same address?"
-
- This one is useful when we're trying to find directory mirrors in a
- fallback-consensus file. This property is equivalent to,
-
- "If I find this server in a current consensus, how long is it
- likely to exist on the network?"
-
- This one is useful if we're trying to pick introduction points or
- something and care more about churn rate than about whether every IP
- will be up all the time.
-
-Implementation:
-
- I propose we add a new flag, called "Longterm." Authorities should
- set this flag for routers if their Longevity is in the upper
- quartile of all routers. A router's Longevity is computed as the
- total number of days in the last year or so[*] for which the router has
- been Running at least once at its current IP:orport pair.
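
[Editorial sketch, not part of the proposal: one way an authority could pick the routers in the upper Longevity quartile. The function name and the router names are hypothetical, and the quantile method (threshold at the element three quarters of the way through the sorted values) is an assumption; the proposal leaves such details open.]

```python
def longterm_routers(longevity_days):
    """longevity_days: dict mapping router name to its Longevity in days."""
    if not longevity_days:
        return set()
    values = sorted(longevity_days.values())
    # Threshold: the value at the start of the top quarter of the sorted list.
    threshold = values[(3 * len(values)) // 4]
    return {router for router, days in longevity_days.items()
            if days >= threshold}

days = {"a": 10, "b": 50, "c": 120, "d": 300,
        "e": 5, "f": 200, "g": 30, "h": 90}
print(sorted(longterm_routers(days)))  # ['d', 'f']
```

Routers tied at the threshold all receive the flag in this sketch, so slightly more than a quarter may qualify.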
-
- Clients should use directory servers from a fallback-consensus only
- if they have the Longterm flag set.
-
- Authority ops should be able to mark particular routers as not
- Longterm, regardless of history. (For instance, it makes sense to
- remove the Longterm flag from a router whose op says that it will
- need to shut down in a month.)
-
- [*] This is deliberately vague, to permit efficient implementations.
-
-Compatibility and migration issues:
-
- The voting protocol already acts gracefully when new flags are
- added, so no change to the voting protocol is needed.
-
- Tor won't have collected this data, however. It might be desirable
- to bootstrap it from historical consensuses. Alternatively, we can
- just let the algorithm run for a month or two.
-
-Issues and future possibilities:
-
- Longterm is a really awkward name.
-
-
diff --git a/doc/spec/proposals/147-prevoting-opinions.txt b/doc/spec/proposals/147-prevoting-opinions.txt
deleted file mode 100644
index 3d9659c98..000000000
--- a/doc/spec/proposals/147-prevoting-opinions.txt
+++ /dev/null
@@ -1,58 +0,0 @@
-Filename: 147-prevoting-opinions.txt
-Title: Eliminate the need for v2 directories in generating v3 directories
-Author: Nick Mathewson
-Created: 2-Jul-2008
-Status: Accepted
-Target: 0.2.1.x
-
-Overview
-
- We propose a new v3 vote document type to replace the role of v2
- networkstatus information in generating v3 consensuses.
-
-Motivation
-
- When authorities vote on which descriptors are to be listed in the
- next consensus, it helps if they all know about the same descriptors
- as one another. But a hostile, confused, or out-of-date server may
- upload a descriptor to only some authorities. In the current v3
- directory design, the authorities don't have a good way to tell one
- another about the new descriptor until they exchange votes... but by
- the time this happens, they are already committed to their votes,
- and they can't add anybody they learn about from other authorities
- until the next voting cycle. That's no good!
-
- The current Tor implementation avoids this problem by having
- authorities also look at v2 networkstatus documents, but we'd like
- in the long term to eliminate these, once 0.1.2.x is obsolete.
-
-Design:
-
- We add a new value for vote-status in v3 consensus documents in
- addition to "consensus" and "vote": "opinion". Authorities generate
- and sign an opinion document as if they were generating a vote,
- except that they generate opinions earlier than they generate votes.
-
- Authorities don't need to generate more than one opinion document
- per voting interval, but may. They should send it to the other
- authorities they know about, at the regular vote upload URL, before
- the authorities begin voting, so that enough time remains for the
- authorities to fetch new descriptors.
-
- Additionally, authorities make their opinions available at
- http://<hostname>/tor/status-vote/next/opinion.z
- and download opinions from authorities they haven't heard from in a
- while.
-
- Authorities MAY generate opinions on demand.
-
- Upon receiving an opinion document, authorities scan it for any
- descriptors that:
- - They might accept.
- - Are for routers they don't know about, or are published more
- recently than any descriptor they have for that router.
- Authorities then begin downloading such descriptors from authorities
- that claim to have them.
-
- Authorities MAY cache opinion documents, but don't need to.
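-
- The scan described above can be sketched as follows. This is only an
- illustration, not the Tor implementation: DescriptorRef, its field
- names, the `known` map, and the `might_accept` callback are all
- hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DescriptorRef:
    """One descriptor listed in an opinion (hypothetical structure)."""
    router_id: str   # router identity, e.g. a fingerprint
    digest: str      # descriptor digest
    published: int   # publication time, seconds since the epoch


def descriptors_to_fetch(opinion_entries, known, might_accept):
    """Return the opinion entries worth downloading.

    known maps router_id -> publication time of the newest descriptor
    we already hold for that router. might_accept is a caller-supplied
    predicate standing in for the authority's acceptance policy.
    """
    wanted = []
    for entry in opinion_entries:
        if not might_accept(entry):
            continue
        newest_held = known.get(entry.router_id)
        # Unknown router, or strictly newer than anything we have.
        if newest_held is None or entry.published > newest_held:
            wanted.append(entry)
    return wanted
```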
-
diff --git a/doc/spec/proposals/148-uniform-client-end-reason.txt b/doc/spec/proposals/148-uniform-client-end-reason.txt
deleted file mode 100644
index 1db3b3e59..000000000
--- a/doc/spec/proposals/148-uniform-client-end-reason.txt
+++ /dev/null
@@ -1,57 +0,0 @@
-Filename: 148-uniform-client-end-reason.txt
-Title: Stream end reasons from the client side should be uniform
-Author: Roger Dingledine
-Created: 2-Jul-2008
-Status: Closed
-Implemented-In: 0.2.1.9-alpha
-
-Overview
-
- When a stream closes before it's finished, the end relay cell that's
- sent includes an "end stream reason" to tell the other end why it
- closed. It's useful for the exit relay to send a reason to the client,
- so the client can choose a different circuit, inform the user, etc. But
- there's no reason to include it from the client to the exit relay,
- and in some cases it can even harm anonymity.
-
- We should pick a single reason for the client-to-exit-relay direction
- and always just send that.
-
-Motivation
-
- Back when I first deployed the Tor network, it was useful to have
- the Tor relays learn why a stream closed, so I could debug both ends
- of the stream at once. Now that streams have worked for many years,
- there's no need to continue telling the exit relay whether the client
- gave up on a stream because of "timeout" or "misc" or what.
-
- Then in Tor 0.2.0.28-rc, I fixed this bug:
- - Fix a bug where, when we were choosing the 'end stream reason' to
- put in our relay end cell that we send to the exit relay, Tor
- clients on Windows were sometimes sending the wrong 'reason'. The
- anonymity problem is that exit relays may be able to guess whether
- the client is running Windows, thus helping partition the anonymity
- set. Down the road we should stop sending reasons to exit relays,
- or otherwise prevent future versions of this bug.
-
- It turned out that non-Windows clients were choosing their reason
- correctly, whereas Windows clients were potentially looking at errno
- wrong and so always choosing 'misc'.
-
- I fixed that particular bug, but I think we should prevent future
- versions of the bug too.
-
- (We already fixed it so *circuit* end reasons don't get sent from
- the client to the exit relay. But we appear to have skipped over
- stream end reasons thus far.)
-
-Design:
-
- One option would be to no longer include any 'reason' field in end
- relay cells. But that would introduce a partitioning attack ("users
- running the old version" vs "users running the new version").
-
- Instead I suggest that clients all switch to sending the "misc" reason,
- like most of the Windows clients currently do and like the non-Windows
- clients already do sometimes.
-
diff --git a/doc/spec/proposals/149-using-netinfo-data.txt b/doc/spec/proposals/149-using-netinfo-data.txt
deleted file mode 100644
index 8bf8375d5..000000000
--- a/doc/spec/proposals/149-using-netinfo-data.txt
+++ /dev/null
@@ -1,42 +0,0 @@
-Filename: 149-using-netinfo-data.txt
-Title: Using data from NETINFO cells
-Author: Nick Mathewson
-Created: 2-Jul-2008
-Status: Open
-Target: 0.2.1.x
-
-Overview
-
- Current Tor versions send signed IP and timestamp information in
- NETINFO cells, but don't use them to their fullest. This proposal
- describes how they should start using this info in 0.2.1.x.
-
-Motivation
-
- Our directory system relies on clients and routers having
- reasonably accurate clocks to detect replayed directory info, and
- to set accurate timestamps on directory info they publish
- themselves. NETINFO cells contain timestamps.
-
- Also, the directory system relies on routers having a reasonable
- idea of their own IP addresses, so they can publish correct
- descriptors. This is also in NETINFO cells.
-
-Learning the time and IP address
-
- We need to think about attackers here. Just because a router tells
- us that we have a given IP or a given clock skew doesn't mean that
- it's true. We believe this information only if we've heard it from
- a majority of the routers we've connected to recently, including at
- least 3 routers. Routers only believe this information if the
- majority includes at least one authority.
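-
- One possible reading of the rule above, as a sketch. It assumes that
- "including at least 3 routers" means the agreeing majority itself
- must contain at least 3 routers; the function name and the
- (value, from_authority) report format are hypothetical.

```python
from collections import Counter

MIN_AGREEING_ROUTERS = 3  # taken from the text above


def believed_value(reports, we_are_router=False):
    """Return the IP or clock skew to believe, or None.

    reports is a list of (value, from_authority) pairs, one per router
    we have connected to recently.
    """
    if not reports:
        return None
    counts = Counter(value for value, _ in reports)
    value, votes = counts.most_common(1)[0]
    # Require a strict majority of recent connections...
    if votes * 2 <= len(reports):
        return None
    # ...made up of at least three routers.
    if votes < MIN_AGREEING_ROUTERS:
        return None
    # Routers additionally need an authority inside the majority.
    if we_are_router and not any(
        auth for v, auth in reports if v == value
    ):
        return None
    return value
```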
-
-Avoiding MITM attacks
-
- Current Tors use the IP addresses published in the other router's
- NETINFO cells to see whether the connection is "canonical". Right
- now, we prefer to extend circuits over "canonical" connections. In
- 0.2.1.x, we should refuse to extend circuits over non-canonical
- connections without first trying to build a canonical one.
-
-
diff --git a/doc/spec/proposals/150-exclude-exit-nodes.txt b/doc/spec/proposals/150-exclude-exit-nodes.txt
deleted file mode 100644
index b497ae62c..000000000
--- a/doc/spec/proposals/150-exclude-exit-nodes.txt
+++ /dev/null
@@ -1,47 +0,0 @@
-Filename: 150-exclude-exit-nodes.txt
-Title: Exclude Exit Nodes from a circuit
-Author: Mfr
-Created: 2008-06-15
-Status: Closed
-Implemented-In: 0.2.1.3-alpha
-
-Overview
-
- Right now, Tor users can manually exclude a node from all positions
- in their circuits created using the directive ExcludeNodes.
- This proposal makes this exclusion less restrictive, allowing users to
- exclude a node only from the exit part of a circuit.
-
-Motivation
-
- This feature would help tools such as Vidalia (tor exit branch)
- integrate features to exclude a country for exit without reducing
- the set of possible circuits or privacy. This feature could help
- people from a country where many sites are blocked to exclude this
- country for browsing, giving them a more stable navigation. It could
- also add the possibility for the user to exclude a currently used
- exit node.
-
-Implementation
-
- ExcludeExitNodes is similar to ExcludeNodes, except that the listed
- nodes are excluded only from the exit position when building a circuit.
-
- Tor doesn't warn if a node from this list is not an exit node.
-
-Security implications:
-
- This also opens the possibility of future user reporting of bad exits.
-
-Risks:
-
- Use of this option can make users partitionable under certain attack
- assumptions. However, ExitNodes already creates this possibility,
- so there isn't much increased risk in ExcludeExitNodes.
-
- We should still encourage people who exclude an exit node because
- of bad behavior to report it instead of just adding it to their
- ExcludeExitNodes list. It would be unfortunate if we didn't find out
- about broken exits because of this option. This issue can probably
- be addressed sufficiently with documentation.
-
diff --git a/doc/spec/proposals/151-path-selection-improvements.txt b/doc/spec/proposals/151-path-selection-improvements.txt
deleted file mode 100644
index af89f2119..000000000
--- a/doc/spec/proposals/151-path-selection-improvements.txt
+++ /dev/null
@@ -1,148 +0,0 @@
-Filename: 151-path-selection-improvements.txt
-Title: Improving Tor Path Selection
-Author: Fallon Chen, Mike Perry
-Created: 5-Jul-2008
-Status: Finished
-In-Spec: path-spec.txt
-
-Overview
-
- The performance of paths selected can be improved by adjusting the
- CircuitBuildTimeout and avoiding failing guard nodes. This proposal
- describes a method of tracking buildtime statistics at the client, and
- using those statistics to adjust the CircuitBuildTimeout.
-
-Motivation
-
- Tor's performance can be improved by excluding those circuits that
- have long buildtimes (and by extension, high latency). For those Tor
- users who require better performance and have lower requirements for
- anonymity, this would be a very useful option to have.
-
-Implementation
-
- Gathering Build Times
-
- Circuit build times are stored in the circular array
- 'circuit_build_times' consisting of uint32_t elements as milliseconds.
- The total size of this array is based on the number of circuits
- it takes to converge on a good fit of the long term distribution of
- the circuit builds for a fixed link. We do not want this value to be
- too large, because it will make it difficult for clients to adapt to
- moving between different links.
-
- From our observations, the minimum value for a reasonable fit appears
- to be on the order of 500 (MIN_CIRCUITS_TO_OBSERVE). However, to keep
- a good fit over the long term, we store 5000 most recent circuits in
- the array (NCIRCUITS_TO_OBSERVE).
-
- The Tor client will build test circuits at a rate of one per
- minute (BUILD_TIMES_TEST_FREQUENCY) up to the point of
- MIN_CIRCUITS_TO_OBSERVE. This allows a fresh Tor to have
- a CircuitBuildTimeout estimated about 8 hours (500 minutes) after
- install, upgrade, or network change (see below).
-
- Long Term Storage
-
- The long-term storage representation is implemented by storing a
- histogram with BUILDTIME_BIN_WIDTH millisecond buckets (default 50) when
- writing out the statistics to disk. The format this takes in the
- state file is 'CircuitBuildTimeBin <bin-ms> <count>', with the total
- specified as 'TotalBuildTimes <total>'.
- Example:
-
- TotalBuildTimes 100
- CircuitBuildTimeBin 25 50
- CircuitBuildTimeBin 75 25
- CircuitBuildTimeBin 125 13
- ...
-
- Reading the histogram in will entail inserting <count> values
- into the circuit_build_times array each with the value of
- <bin-ms> milliseconds. In order to evenly distribute the values
- in the circular array, the Fisher-Yates shuffle will be performed
- after reading values from the bins.
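-
- A minimal sketch of the round trip described above. The function
- names are hypothetical, and labelling each bin by its midpoint is an
- assumption inferred from the example values (25, 75, 125 with a
- 50 ms bin width).

```python
import random
from collections import Counter

BUILDTIME_BIN_WIDTH = 50  # milliseconds, default from the text


def to_histogram(build_times_ms, width=BUILDTIME_BIN_WIDTH):
    """Bin build times; each bin is labelled by its midpoint."""
    bins = Counter()
    for t in build_times_ms:
        bins[(t // width) * width + width // 2] += 1
    return dict(bins)


def from_histogram(bins, rng=random):
    """Expand (bin_ms, count) pairs and shuffle them in place."""
    times = []
    for bin_ms, count in bins:
        times.extend([bin_ms] * count)
    # Fisher-Yates shuffle: walk from the end, swapping each slot
    # with a uniformly chosen slot at or before it.
    for i in range(len(times) - 1, 0, -1):
        j = rng.randint(0, i)  # inclusive on both ends
        times[i], times[j] = times[j], times[i]
    return times
```

- Shuffling matters because the array is circular: without it, the
- oldest slots to be overwritten would all come from the same bins.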
-
- Learning the CircuitBuildTimeout
-
- Based on studies of build times, we found that the distribution of
- circuit buildtimes appears to be a Frechet distribution. However,
- estimators and quantile functions of the Frechet distribution are
- difficult to work with and slow to converge. So instead, since we
- are only interested in the accuracy of the tail, we approximate
- the tail of the distribution with a Pareto curve starting at
- the mode of the circuit build time sample set.
-
- We will calculate the parameters for a Pareto distribution
- fitting the data using the estimators at
- http://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation.
-
- The timeout itself is calculated by using the quantile function (the
- inverse CDF) to give us the value on the CDF such that
- BUILDTIME_PERCENT_CUTOFF (80%) of the mass of the distribution is
- below the timeout value.
-
- Thus, we expect that the Tor client will accept the fastest 80% of
- the total number of paths on the network.
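-
- The calculation can be sketched as below, using the standard Pareto
- maximum-likelihood shape estimate and inverse CDF. The function
- names are hypothetical, and fitting only the samples at or above Xm
- is an assumption of this sketch, not something the proposal states.
- Xm is expected to be the mode of the build time sample set.

```python
import math

BUILDTIME_PERCENT_CUTOFF = 0.80  # from the text


def pareto_alpha(samples, xm):
    """Maximum-likelihood shape estimate from samples >= xm."""
    tail = [x for x in samples if x >= xm]
    log_sum = sum(math.log(x / xm) for x in tail)
    if log_sum <= 0:
        raise ValueError("need at least one sample strictly above xm")
    return len(tail) / log_sum


def pareto_quantile(xm, alpha, q):
    """Inverse CDF: the x for which 1 - (xm / x) ** alpha == q."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    return xm / (1.0 - q) ** (1.0 / alpha)


def build_timeout_ms(samples, xm, cutoff=BUILDTIME_PERCENT_CUTOFF):
    """Timeout below which `cutoff` of the fitted mass lies."""
    return pareto_quantile(xm, pareto_alpha(samples, xm), cutoff)
```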
-
- Detecting Changing Network Conditions
-
- We attempt to detect both network connectivity loss and drastic
- changes in the timeout characteristics.
-
- We assume that we've had network connectivity loss if 3 circuits
- time out and we've received no cells or TLS handshakes since those
- circuits began. We then set the timeout to 60 seconds and stop
- counting timeouts.
-
- If 3 more circuits time out and the network still has not been
- live within this new 60 second timeout window, we then discard
- the previous timeouts during this period from our history.
-
- To detect changing network conditions, we keep a history of
- the timeout or non-timeout status of the past RECENT_CIRCUITS (20)
- that successfully completed at least one hop. If more than 75%
- of these circuits time out, we discard all buildtimes history,
- reset the timeout to 60 seconds, and then begin recomputing the timeout.
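-
- A sketch of the RECENT_CIRCUITS check. The class and method names
- are hypothetical, and evaluating the 75% threshold only once the
- window holds a full RECENT_CIRCUITS entries is an assumption of this
- sketch.

```python
from collections import deque

RECENT_CIRCUITS = 20           # from the text
RESET_FRACTION = 0.75          # "more than 75%"
DEFAULT_TIMEOUT_SECONDS = 60   # value the timeout is reset to


class TimeoutHistory:
    def __init__(self):
        # True means the circuit timed out, False means it did not.
        self.recent = deque(maxlen=RECENT_CIRCUITS)
        self.build_times_ms = []
        self.timeout_seconds = DEFAULT_TIMEOUT_SECONDS

    def record(self, timed_out, build_time_ms=None):
        """Record a circuit that completed at least one hop.

        Returns True if the history was discarded as a result.
        """
        self.recent.append(timed_out)
        if build_time_ms is not None:
            self.build_times_ms.append(build_time_ms)
        window_full = len(self.recent) == RECENT_CIRCUITS
        too_many = sum(self.recent) > RESET_FRACTION * RECENT_CIRCUITS
        if window_full and too_many:
            self.recent.clear()
            self.build_times_ms.clear()
            self.timeout_seconds = DEFAULT_TIMEOUT_SECONDS
            return True
        return False
```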
-
- Testing
-
- After circuit build times, storage, and learning are implemented,
- the resulting histogram should be checked for consistency by
- verifying it persists across successive Tor invocations where
- no circuits are built. In addition, we can also use the existing
- buildtime scripts to record build times, and verify that the histogram
- the Python scripts produce matches that which is output to the state file in Tor,
- and verify that the Pareto parameters and cutoff points also match.
-
- We will also verify that there are no unexpected large deviations from
- node selection, such as nodes from distant geographical locations being
- completely excluded.
-
- Dealing with Timeouts
-
- Timeouts should be counted as the expectation of the region
- of the Pareto distribution beyond the cutoff. This is done by
- generating a random sample for each timeout at points on the
- curve beyond the current timeout cutoff.
-
- Future Work
-
- At some point, it may be desirable to change the cutoff from a
- single hard cutoff that destroys the circuit to a soft cutoff and
- a hard cutoff, where the soft cutoff merely triggers the building
- of a new circuit, and the hard cutoff triggers destruction of the
- circuit.
-
- It may also be beneficial to learn separate timeouts for each
- guard node, as they will have slightly different distributions.
- This will take longer to generate initial values though.
-
-Issues
-
- Impact on anonymity
-
- Since this follows a Pareto distribution, large reductions on the
- timeout can be achieved without cutting off a great number of the
- total paths. This will eliminate a great deal of the performance
- variation of Tor usage.
diff --git a/doc/spec/proposals/152-single-hop-circuits.txt b/doc/spec/proposals/152-single-hop-circuits.txt
deleted file mode 100644
index d0b28b1c7..000000000
--- a/doc/spec/proposals/152-single-hop-circuits.txt
+++ /dev/null
@@ -1,62 +0,0 @@
-Filename: 152-single-hop-circuits.txt
-Title: Optionally allow exit from single-hop circuits
-Author: Geoff Goodell
-Created: 13-Jul-2008
-Status: Closed
-Implemented-In: 0.2.1.6-alpha
-
-Overview
-
- Provide a special configuration option that adds a line to descriptors
- indicating that a router can be used as an exit for one-hop circuits,
- and allow clients to attach streams to one-hop circuits provided
- that the descriptor for the router in the circuit includes this
- configuration option.
-
-Motivation
-
- At some point, code was added to restrict the attachment of streams
- to one-hop circuits.
-
- The idea seems to be that we can use the cost of forking and
- maintaining a patch as a lever to prevent people from writing
- controllers that jeopardize the operational security of routers
- and the anonymity properties of the Tor network by creating and
- using one-hop circuits rather than the standard three-hop circuits.
- It may be, for example, that some users do not actually seek true
- anonymity but simply reachability through network perspectives
- afforded by the Tor network, and since anonymity is stronger in
- numbers, forcing users to contribute to anonymity and decrease the
- risk to server operators by using full-length paths may be reasonable.
-
- As presently implemented, the sweeping restriction of one-hop circuits
- for all routers limits the usefulness of Tor as a general-purpose
- technology for building circuits. In particular, we should allow
- for controllers, such as Blossom, that create and use single-hop
- circuits involving routers that are not part of the Tor network.
-
-Design
-
- Introduce a configuration option for Tor servers that, when set,
- indicates that a router is willing to provide exit from one-hop
- circuits. Routers with this policy will not require that a circuit
- has at least two hops when it is used as an exit.
-
- In addition, routers for which this configuration option
- has been set will have a line in their descriptors, "opt
- exit-from-single-hop-circuits". Clients will keep track of which
- routers have this option and allow streams to be attached to
- single-hop circuits that include such routers.
-
-Security Considerations
-
- This approach seems to eliminate the worry about operational router
- security, since server operators will not set the configuration
- option unless they are willing to take on such risk.
-
- To reduce the impact on anonymity of the network resulting
- from including such "risky" routers in regular Tor path
- selection, clients may systematically exclude routers with "opt
- exit-from-single-hop-circuits" when choosing random paths through
- the Tor network.
-
diff --git a/doc/spec/proposals/153-automatic-software-update-protocol.txt b/doc/spec/proposals/153-automatic-software-update-protocol.txt
deleted file mode 100644
index c2979bb69..000000000
--- a/doc/spec/proposals/153-automatic-software-update-protocol.txt
+++ /dev/null
@@ -1,175 +0,0 @@
-Filename: 153-automatic-software-update-protocol.txt
-Title: Automatic software update protocol
-Author: Jacob Appelbaum
-Created: 14-July-2008
-Status: Superseded
-
-[Superseded by thandy-spec.txt]
-
-
- Automatic Software Update Protocol Proposal
-
-0.0 Introduction
-
-The Tor project and its users require a robust method to update shipped
-software bundles. The software bundles often includes Vidalia, Privoxy, Polipo,
-Torbutton and of course Tor itself. It is not inconcievable that an update
-could include all of the Tor Browser Bundle. It seems reasonable to make this
-a standalone program that can be called in shell scripts, cronjobs or by
-various Tor controllers.
-
-0.1 Minimal Tasks To Implement Automatic Updating
-
-At the most minimal, an update must be able to do the following:
-
- 0 - Detect the current Tor version, note the working status of Tor.
- 1 - Detect the latest Tor version.
- 2 - Fetch the latest version in the form of a platform specific package(s).
- 3 - Verify the integrity of the downloaded package(s).
- 4 - Install the verified package(s).
- 5 - Test that the new package(s) works properly.
-
-0.2 Specific Enumeration Of Minimal Tasks
-
-To implement requirement 0, we need to detect the current Tor version of both
-the updater and the current running Tor. The update program itself should be
-versioned internally. This requirement should also test connecting through Tor
-itself and note if such connections are possible.
-
-To implement requirement 1, we need to learn the consensus from the directory
-authorities or fall back to a known good URL with cryptographically signed
-content.
-
-To implement requirement 2, we need to download Tor - hopefully over Tor.
-
-To implement requirement 3, we need to verify the package signature.
-
-To implement requirement 4, we need to use a platform specific method of
-installation. The Tor controller performing the update performs these platform
-specific methods.
-
-To implement requirement 5, we need to be able to extend circuits and reach
-the internet through Tor.
-
-0.x Implementation Goals
-
-The update system will be cross platform and rely on as little external code
-as possible. If the update system uses external code, that code must be updated
-by the update system itself. It will consist only of free software and will not rely on any
-non-free components until the actual installation phase. If a package manager
-is in use, it will be platform specific and thus only invoked by the update
-system implementing the update protocol.
-
-The update system itself will attempt to perform update related network
-activity over Tor. Possibly it will attempt to use a hidden service first.
-It will attempt to use novel and not so novel caching
-when possible. It will always verify cryptographic signatures before any
-remotely fetched code is executed. In the event of an unusable Tor system,
-it will be able to attempt to fetch updates without Tor. This should be user
-configurable: some users will be unwilling to update without the protection of
-using Tor, while others will simply be unable to because the main Tor
-website is blocked.
-
-The update system will track current version numbers of Tor and supporting
-software. The update system will also track known working versions to assist
-with automatic The update system itself will be a standalone library. It will be
-strongly versioned internally to match the Tor bundle it was shipped with. The
-update system will keep track of the given platform, cpu architecture, lsb_release,
-package management functionality and any other platform specific metadata.
-
-We have referenced two popular automatic update systems. Though neither fits
-our needs, both are useful as an idea of what others are doing in the same
-area.
-
-The first is sparkle[0] but it is sadly only available for Cocoa
-environments and is written in Objective C. This doesn't meet our requirements
-because it is directly tied into the private Apple framework.
-
-The second is the Mozilla Automatic Update System[1]. It is possibly useful
-as an idea of how other free software projects automatically update. It is
-however not useful in its currently documented form.
-
-
- [0] http://sparkle.andymatuschak.org/documentation/
- [1] http://wiki.mozilla.org/AUS:Manual
-
-0.x Previous methods of Tor and related software update
-
-Previously, Tor users updated their Tor related software by hand. There has
-been no fully automatic method for any user to update. In addition, there
-hasn't been any specific way to find out the most current stable version of Tor
-or related software as voted on by the directory authority consensus.
-
-0.x Changes to the directory specification
-
-We will want to supplement client-versions and server-versions in the
-consensus voting with another version identifier known as
-'auto-update-versions'. This will keep track of the current consensus of
-specific versions that are best per platform and per architecture. It should
-be noted that while the Mac OS X universal binary may be the best for x86
-processors with Tiger, it may not be the best for PPC users on Panther. This
-goes for all of the package updates. We want to prevent updates that cause Tor
-to break even if the updating program can recover gracefully.
-
-x.x Assumptions About Operating System Package Management
-
-It is assumed that users will use their package manager unless they are on
-Microsoft Windows (any version) or Mac OS X (any version). Microsoft Windows
-users will have integration with the normal "add/remove program" functionality
-that said users would expect.
-
-x.x Package Update System Failure Modes
-
-The package update will try to ensure that a user always has a working Tor at
-the very least. It will keep state to remember versions of Tor that were able
-to bootstrap properly and reach the rest of the Tor network. It will also keep
-note of which versions broke. It will select the best Tor that works for the
-user. It will also allow for anonymized bug reporting on the packages
-available and tested by the auto-update system.
-
-x.x Package Signature Verification
-
-The update system will be aware of replay attacks against the update signature
-system itself. It will not allow package update signatures that are radically
-out of date. It will be a multi-key system to prevent any single party from
-forging an update. The key will be updated regularly. This is similar to how
-authority keys are used (see proposal 103).
-
-x.x Package Caching
-
-The update system will iterate over different update methods. Whichever method
-is picked will have caching functionality. Each Tor server itself should be
-able to serve cached update files. This will be an option that friendly server
-administrators can turn on should they wish to support caching. In addition,
-it is possible to cache the full contents of a package in an
-authoritative DNS zone. Users can then query the DNS zone for their package.
-If we wish to further distribute the update load, we can also offer packages
-with encrypted bittorrent. Clients who wish to share the updates but do not
-wish to be a server can help distribute Tor updates. This can be tied together
-with the DNS caching[2][3] if needed.
-
- [2] http://www.netrogenic.com/dnstorrent/
- [3] http://www.doxpara.com/ozymandns_src_0.1.tgz
-
-x.x Helping Our Users Spread Tor
-
-There should be a way for a user to participate in the package caching as
-described in section x.x. This option should be presented by the Tor
-controller.
-
-x.x Simple HTTP Proxy To The Tor Project Website
-
-It has been suggested that we should provide a simple proxy that allows a user
-to visit the main Tor website to download packages. This was part of a
-previous proposal and has not been closely examined.
-
-x.x Package Installation
-
-Platform specific methods for proper package installation will be left to the
-controller that is calling for an update. Each platform is different; the
-installation options and user interface will be specific to the controller in
-question.
-
-x.x Other Things
-
-Other things should be added to this proposal. What are they?
diff --git a/doc/spec/proposals/154-automatic-updates.txt b/doc/spec/proposals/154-automatic-updates.txt
deleted file mode 100644
index 4c2c6d389..000000000
--- a/doc/spec/proposals/154-automatic-updates.txt
+++ /dev/null
@@ -1,377 +0,0 @@
-Filename: 154-automatic-updates.txt
-Title: Automatic Software Update Protocol
-Author: Matt Edman
-Created: 30-July-2008
-Status: Superseded
-Target: 0.2.1.x
-
-Superseded by thandy-spec.txt
-
-Scope
-
- This proposal specifies the method by which an automatic update client can
- determine the most recent recommended Tor installation package for the
- user's platform, download the package, and then verify that the package was
- downloaded successfully. While this proposal focuses on only the Tor
- software, the protocol defined is sufficiently extensible such that other
- components of the Tor bundles, like Vidalia, Polipo, and Torbutton, can be
- managed and updated by the automatic update client as well.
-
- The initial target platform for the automatic update framework is Windows,
- given that's the platform used by a majority of our users and that it lacks
- a sane package management system that many Linux distributions already have.
- Our second target platform will be Mac OS X, and so the protocol will be
- designed with this near-future direction in mind.
-
- Other client-side aspects of the automatic update process, such as user
- interaction, the interface presented, and actual package installation
- procedure, are outside the scope of this proposal.
-
-
-Motivation
-
- Tor releases new versions frequently, often with important security,
- anonymity, and stability fixes. Thus, it is important for users to be able
- to promptly recognize when new versions are available and to easily
- download, authenticate, and install updated Tor and Tor-related software
- packages.
-
- Tor's control protocol [2] provides a method by which controllers can
- identify when the user's Tor software is obsolete or otherwise no longer
- recommended. Currently, however, no mechanism exists for clients to
- automatically download and install updated Tor and Tor-related software for
- the user.
-
-
-Design Overview
-
- The core of the automatic update framework is a well-defined file called a
- "recommended-packages" file. The recommended-packages file is accessible via
- HTTP[S] at one or more well-defined URLs. An example recommended-packages
- URL may be:
-
- https://updates.torproject.org/recommended-packages
-
- The recommended-packages document is formatted according to Section 1.2
- below and specifies the most recent recommended installation package
- versions for Tor or Tor-related software, as well as URLs at which the
- packages and their signatures can be downloaded.
-
- An automatic update client process runs on the Tor user's computer and
- periodically retrieves the recommended-packages file according to the method
- described in Section 2.0. As described further in Section 1.2, the
- recommended-packages file is signed and can be verified by the automatic
- update client with one or more public keys included in the client software.
- Since it is signed, the recommended-packages file can be mirrored by
- multiple hosts (e.g., Tor directory authorities), whose URLs are included in
- the automatic update client's configuration.
-
- After retrieving and verifying the recommended-packages file, the automatic
- update client compares the versions of the recommended software packages
- listed in the file with those currently installed on the end-user's
- computer. If one or more of the installed packages is determined to be out
- of date, an updated package and its signature will be downloaded from one of
- the package URLs listed in the recommended-packages file as described in
- Section 2.2.
-
- The automatic update system uses a multilevel signing key scheme for package
- signatures. There are a small number of entities we call "packaging
- authorities" that each have their own signing key. A packaging authority is
- responsible for signing and publishing the recommended-packages file.
- Additionally, each individual packager responsible for producing an
- installation package for one or more platforms has their own signing key.
- Every packager's signing key must be signed by at least one of the packaging
- authority keys.
-
-
-Specification
-
- 1. recommended-packages Specification
-
- In this section we formally specify the format of the published
- recommended-packages file.
-
- 1.1. Document Meta-format
-
- The recommended-packages document follows the lightweight extensible
- information format defined in Tor's directory protocol specification [1]. In
- the interest of self-containment, we have reproduced the relevant portions
- of that format's specification in this Section. (Credits to Nick Mathewson
- for much of the original format definition language.)
-
- The highest level object is a Document, which consists of one or more
- Items. Every Item begins with a KeywordLine, followed by zero or more
- Objects. A KeywordLine begins with a Keyword, optionally followed by
- whitespace and more non-newline characters, and ends with a newline. A
- Keyword is a sequence of one or more characters in the set [A-Za-z0-9-].
- An Object is a block of encoded data in pseudo-Open-PGP-style
- armor. (cf. RFC 2440)
-
- More formally:
-
- Document ::= (Item | NL)+
- Item ::= KeywordLine Object*
- KeywordLine ::= Keyword NL | Keyword WS ArgumentChar+ NL
- Keyword ::= KeywordChar+
- KeywordChar ::= 'A' ... 'Z' | 'a' ... 'z' | '0' ... '9' | '-'
- ArgumentChar ::= any printing ASCII character except NL.
- WS ::= (SP | TAB)+
- Object ::= BeginLine Base-64-encoded-data EndLine
- BeginLine ::= "-----BEGIN " Keyword "-----" NL
- EndLine ::= "-----END " Keyword "-----" NL
-
- The BeginLine and EndLine of an Object must use the same keyword.
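-
- As an illustration of the grammar above, a minimal parser sketch
- follows. It is not taken from Tor or from any update client; the
- function name and the dictionary layout of a parsed Item are
- hypothetical, and it does not validate that Object bodies are
- well-formed base64.

```python
import re

KEYWORD_LINE = re.compile(r"^([A-Za-z0-9-]+)(?:[ \t]+(.+))?$")
BEGIN_LINE = re.compile(r"^-----BEGIN ([A-Za-z0-9-]+?)-----$")
END_LINE = re.compile(r"^-----END ([A-Za-z0-9-]+?)-----$")


def parse_document(text):
    """Parse a Document into a list of Items.

    Each Item is {"keyword": str, "args": str, "objects": [...]},
    where every object is a (keyword, base64_text) pair.
    """
    items = []
    lines = text.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        if line == "":
            continue  # a bare NL between Items
        begin = BEGIN_LINE.match(line)
        if begin:
            if not items:
                raise ValueError("Object before any KeywordLine")
            body = []
            while True:
                if i >= len(lines):
                    raise ValueError("unterminated Object")
                line = lines[i]
                i += 1
                end = END_LINE.match(line)
                if end:
                    # BeginLine and EndLine must use the same keyword.
                    if end.group(1) != begin.group(1):
                        raise ValueError("mismatched Object keywords")
                    break
                body.append(line)
            items[-1]["objects"].append((begin.group(1), "".join(body)))
            continue
        if END_LINE.match(line):
            raise ValueError("EndLine outside of an Object")
        match = KEYWORD_LINE.match(line)
        if not match:
            raise ValueError("malformed KeywordLine: %r" % line)
        items.append({
            "keyword": match.group(1),
            "args": match.group(2) or "",
            "objects": [],
        })
    return items
```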
-
- In our Document description below, we also tag Items with a multiplicity in
- brackets. Possible tags are:
-
- "At start, exactly once": These items MUST occur in every instance of the
- document type, and MUST appear exactly once, and MUST be the first item in
- their documents.
-
- "Exactly once": These items MUST occur exactly one time in every
- instance of the document type.
-
- "Once or more": These items MUST occur at least once in any instance
- of the document type, and MAY occur more than once.
-
- "At end, exactly once": These items MUST occur in every instance of
- the document type, and MUST appear exactly once, and MUST be the
- last item in their documents.
-
- 1.2. recommended-packages Document Format
-
- When interpreting a recommended-packages Document, software MUST ignore
- any KeywordLine that starts with a keyword it doesn't recognize; future
- implementations MUST NOT require current automatic update clients to
- understand any KeywordLine not currently described.
-
- In lines that take multiple arguments, extra arguments SHOULD be
- accepted and ignored.
-
- The currently defined Items contained in a recommended-packages document
- are:
-
- "recommended-packages-format" SP number NL
-
- [Exactly once]
-
- This Item specifies the version of the recommended-packages format that
- is contained in the subsequent document. The version defined in this
- proposal is version "1". Subsequent iterations of this protocol MUST
- increment this value if they introduce incompatible changes to the
- document format and MAY increment this value if they only introduce
- additional Keywords.
-
- "published" SP YYYY-MM-DD SP HH:MM:SS NL
-
- [Exactly once]
-
- The time, in GMT, when this recommended-packages document was generated.
- Automatic update clients SHOULD ignore Documents over 60 days old.
-
- "tor-stable-win32-version" SP TorVersion NL
-
- [Exactly once]
-
- This keyword specifies the latest recommended release of Tor's "stable"
- branch for the Windows platform that has an installation package
- available. Note that this version does not necessarily correspond to the
- most recently tagged stable Tor version, since that version may not yet
- have an installer package available, or may have known issues on
- Windows.
-
- The TorVersion field is formatted according to Section 2 of Tor's
- version specification [3].
-
- "tor-stable-win32-package" SP Url NL
-
- [Once or more]
-
- This Item specifies the location from which the most recent
- recommended Windows installation package for Tor's stable branch can be
- downloaded.
-
- When this Item appears multiple times within the Document, automatic
- update clients SHOULD select randomly from the available package
- mirrors.
-
- "tor-dev-win32-version" SP TorVersion NL
-
- [Exactly once]
-
- This Item specifies the latest recommended release of Tor's
- "development" branch for the Windows platform that has an installation
- package available. The same caveats from the description of
- "tor-stable-win32-version" also apply to this keyword.
-
- The TorVersion field is formatted according to Section 2 of Tor's
- version specification [3].
-
- "tor-dev-win32-package" SP Url NL
-
- [Once or more]
-
- This Item specifies the location from which the most recent recommended
- Windows installation package and its signature for Tor's development
- branch can be downloaded.
-
- When this Item appears multiple times within the Document, automatic
- update clients SHOULD select randomly from the available package
- mirrors.
-
- "signature" NL SIGNATURE NL
-
- [At end, exactly once]
-
- The "SIGNATURE" Object contains a PGP signature (using a packaging
- authority signing key) of the entire document, taken from the beginning
- of the "recommended-packages-format" keyword, through the newline after
- the "signature" Keyword.
-
-
- 2. Automatic Update Client Behavior
-
- The client-side component of the automatic update framework is an
- application that runs on the end-user's machine. It is responsible for
- fetching and verifying a recommended-packages document, as well as
- downloading, verifying, and subsequently installing any necessary updated
- software packages.
-
- 2.1. Download and verify a recommended-packages document
-
- The first step in the automatic update process is for the client to download
- a copy of the recommended-packages file. The automatic update client
- contains a (hardcoded and/or user-configurable) list of URLs from which it
- will attempt to retrieve a recommended-packages file.
-
- Connections to each of the recommended-packages URLs SHOULD be attempted in
- the following order:
-
- 1) HTTPS over Tor
- 2) HTTP over Tor
- 3) Direct HTTPS
- 4) Direct HTTP
-
- If the client fails to retrieve a recommended-packages document via any of
- the above connection methods from any of the configured URLs, the client
- SHOULD retry its download attempts following an exponential back-off
- algorithm. After the first failed attempt, the client SHOULD delay one hour
- before attempting again, up to a maximum of 24 hours delay between retry
- attempts.
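-
- The retry schedule can be sketched as below. The proposal only says
- "exponential back-off", so the doubling factor and all names here are
- assumptions made for illustration.
-
```c
#include <stdint.h>

#define AU_INITIAL_DELAY_SECS 3600u      /* one hour, from Section 2.1 */
#define AU_MAX_DELAY_SECS (24u * 3600u)  /* 24-hour cap, from Section 2.1 */

/* Return the delay, in seconds, to wait after `failures` consecutive failed
 * download rounds. Assumes the delay doubles after each failure; the factor
 * of two is not specified by the proposal. */
uint32_t
au_retry_delay_secs(unsigned failures)
{
  uint32_t delay = AU_INITIAL_DELAY_SECS;
  if (failures == 0)
    return 0;
  /* Stop doubling once the cap is reached so the value cannot overflow. */
  while (--failures > 0 && delay < AU_MAX_DELAY_SECS)
    delay *= 2;
  return delay < AU_MAX_DELAY_SECS ? delay : AU_MAX_DELAY_SECS;
}
```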
-
- After successfully downloading a recommended-packages file, the automatic
- update client will verify the signature using one of the public keys
- distributed with the client software. If more than one recommended-packages
- file is downloaded and verified, the file with the most recent "published"
- date that is verified will be retained and the rest discarded.
-
- 2.2. Download and verify the updated packages
-
- The automatic update client next compares the latest recommended package
- version from the recommended-packages document with the currently installed
- Tor version. If the user currently has installed a Tor version from Tor's
- "development" branch, then the version specified in "tor-dev-*-version" Item
- is used for comparison. Similarly, if the user currently has installed a Tor
- version from Tor's "stable" branch, then the version specified in the
- "tor-stable-*-version" Item is used for comparison. Version comparisons are
- done according to Tor's version specification [3].
-
- If the automatic update client determines an installation package newer than
- the user's currently installed version is available, it will attempt to
- download a package appropriate for the user's platform and Tor branch from a
- URL specified by a "tor-[branch]-[platform]-package" Item. If more than one
- mirror for the selected package is available, a mirror will be chosen at
- random from all those available.
-
- The automatic update client must also download a ".asc" signature file for
- the retrieved package. The URL for the package signature is the same as that
- for the package itself, except with the extension ".asc" appended to the
- package URL.
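-
- Deriving the signature URL is a plain string append. A minimal sketch,
- with a hypothetical function name and a caller-supplied buffer:
-
```c
#include <stdio.h>
#include <string.h>

#define AU_SIG_SUFFIX ".asc"

/* Write the signature URL for package_url into out. Returns 0 on success,
 * -1 if the result (including the terminating NUL) does not fit in outlen
 * bytes. (Hypothetical helper, not defined by the proposal.) */
int
au_signature_url(const char *package_url, char *out, size_t outlen)
{
  size_t need = strlen(package_url) + strlen(AU_SIG_SUFFIX) + 1;
  if (need > outlen)
    return -1;
  snprintf(out, outlen, "%s%s", package_url, AU_SIG_SUFFIX);
  return 0;
}
```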
-
- Connections to download the updated package and its signature SHOULD be
- attempted in the same order described in Section 2.1.
-
- After completing the steps described in Sections 2.1 and 2.2, the automatic
- update client will have downloaded and verified a copy of the latest Tor
- installation package. It can then take whatever subsequent platform-specific
- steps are necessary to install the downloaded software updates.
-
- 2.3. Periodic checking for updates
-
- The automatic update client SHOULD maintain a local state file in which it
- records (at a minimum) the timestamp at which it last retrieved a
- recommended-packages file and the timestamp at which the client last
- successfully downloaded and installed a software update.
-
- Automatic update clients SHOULD check for an updated recommended-packages
- document at most once per day but at least once every 30 days.
-
-
- 3. Future Extensions
-
- There are several possible areas for future extensions of this framework.
- The extensions below are merely suggestions and should be the subject of
- their own proposal before being implemented.
-
- 3.1. Additional Software Updates
-
- There are several software packages often included in Tor bundles besides
- Tor, such as Vidalia, Privoxy or Polipo, and Torbutton. The versions and
- download locations of updated installation packages for these bundle
- components can be easily added to the recommended-packages document
- specification above.
-
- 3.2. Including ChangeLog Information
-
- It may be useful for automatic update clients to be able to display for
- users a summary of the changes made in the latest Tor or Tor-related
- software release, before the user chooses to install the update. In the
- future, we can add keywords to the specification in Section 1.2 that specify
- the location of a ChangeLog file for the latest recommended package
- versions. It may also be desirable to allow localized ChangeLog information,
- so that the automatic update client can fetch release notes in the
- end-user's preferred language.
-
- 3.3. Weighted Package Mirror Selection
-
- We defined in Section 1.2 a method by which automatic update clients can
- select from multiple available package mirrors. We may want to add a Weight
- argument to the "*-package" Items that allows the recommended-packages file
- to suggest to clients the probability with which a package mirror should be
- chosen. This will allow clients to more appropriately distribute package
- downloads across available mirrors proportional to their approximate
- bandwidth.
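-
- A sketch of what such weighted selection might look like, assuming one
- integer Weight per mirror. Neither the argument format nor the function
- name is defined by this proposal, and the caller is assumed to supply a
- uniformly random value.
-
```c
#include <stddef.h>

/* Return the index of the mirror selected by r, which the caller must draw
 * uniformly from [0, sum of all weights). Returns n if r is out of range,
 * for example when every weight is zero. (Hypothetical helper.) */
size_t
au_pick_weighted_mirror(const unsigned *weights, size_t n, unsigned long r)
{
  size_t i;
  for (i = 0; i < n; ++i) {
    if (r < weights[i])
      return i;
    r -= weights[i];
  }
  return n;
}
```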
-
-
-Implementation
-
- Implementation of this proposal will consist of two separate components.
-
- The first component is a small "au-publish" tool that takes as input a
- configuration file specifying the information described in Section 1.2 and a
- private key. The tool is run by a "packaging authority" (someone responsible
- for publishing updated installation packages), who will be prompted to enter
- the passphrase for the private key used to sign the recommended-packages
- document. The output of the tool is a document formatted according to
- Section 1.2, with a signature appended at the end. The resulting document
- can then be published to any of the update mirrors.
-
- The second component is an "au-client" tool that is run on the end-user's
- machine. It periodically checks for updated installation packages according
- to Section 2 and fetches the packages if necessary. The public keys used
- to sign the recommended-packages file and any of the published packages are
- included in the "au-client" tool.
-
-
-References
-
- [1] Tor directory protocol (version 3),
- https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/dir-spec.txt
-
- [2] Tor control protocol (version 2),
- https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/control-spec.txt
-
- [3] Tor version specification,
- https://tor-svn.freehaven.net/svn/tor/trunk/doc/spec/version-spec.txt
-
diff --git a/doc/spec/proposals/155-four-hidden-service-improvements.txt b/doc/spec/proposals/155-four-hidden-service-improvements.txt
deleted file mode 100644
index e342bf1c3..000000000
--- a/doc/spec/proposals/155-four-hidden-service-improvements.txt
+++ /dev/null
@@ -1,120 +0,0 @@
-Filename: 155-four-hidden-service-improvements.txt
-Title: Four Improvements of Hidden Service Performance
-Author: Karsten Loesing, Christian Wilms
-Created: 25-Sep-2008
-Status: Finished
-Implemented-In: 0.2.1.x
-
-Change history:
-
- 25-Sep-2008 Initial proposal for or-dev
-
-Overview:
-
- A performance analysis of hidden services [1] has brought up a few
- possible design changes to reduce advertisement time of a hidden service
- in the network as well as connection establishment time. Some of these
- design changes have side-effects on anonymity or overall network load
- which had to be weighed up against individual performance gains. A
- discussion of seven possible design changes [2] has led to a selection
- of four changes [3] that are proposed to be implemented here.
-
-Design:
-
- 1. Shorter Circuit Extension Timeout
-
- When establishing a connection to a hidden service a client cannibalizes
- an existing circuit and extends it by one hop to one of the service's
- introduction points. In most cases this can be accomplished within a few
- seconds. Therefore, the current timeout of 60 seconds for extending a
- circuit is far too high.
-
- Assuming that the timeout would be reduced to a lower value, for example
- 30 seconds, a second (or third) attempt to cannibalize and extend would
- be started earlier. With the current timeout of 60 seconds, 93.42% of all
- circuits can be established, whereas with a timeout of 30 seconds this
- fraction would be only 0.87 percentage points smaller, at 92.55%.
-
- For a timeout of 30 seconds the performance gain would be approximately 2
- seconds in the mean as opposed to the current timeout of 60 seconds. At
- the same time a smaller timeout leads to discarding an increasing number
- of circuits that might have been completed within the current timeout of
- 60 seconds.
-
- Measurements with simulated low-bandwidth connectivity have shown that
- there is no significant effect of client connectivity on circuit
- extension times. The reason for this might be that extension messages are
- small and thereby independent of the client bandwidth. Further, the
- connection between client and entry node only constitutes a single hop of
- a circuit, so that its influence on the whole circuit is limited.
-
- The exact value of the new timeout does not necessarily have to be 30
- seconds, but might also depend on the results of circuit build timeout
- measurements as described in proposal 151.
-
- 2. Parallel Connections to Introduction Points
-
- An additional approach to accelerate extension of introduction circuits
- is to extend a second circuit in parallel to a different introduction
- point. Such parallel extension attempts should be started after a short
- delay of, e.g., 15 seconds in order to prevent unnecessary circuit
- extensions and thereby save network resources. Whichever circuit
- extension succeeds first is used for introduction, while the other
- attempt is aborted.
-
- An evaluation has been performed for the more resource-intensive approach
- of starting two parallel circuits immediately instead of waiting for a
- short delay. The result was a reduction of connection establishment times
- from 27.4 seconds in the original protocol to 22.5 seconds.
-
- While the effect of the proposed approach of delayed parallelization on
- mean connection establishment times is expected to be smaller,
- variability of connection attempt times can be reduced significantly.
-
- 3. Increase Count of Internal Circuits
-
- Hidden services need to create or cannibalize and extend a circuit to a
- rendezvous point for every client request. Really popular hidden services
- require more than two internal circuits in the pool to answer multiple
- client requests at the same time. This scenario was not yet analyzed, but
- will probably exhibit worse performance than measured in the previous
- analysis. The number of preemptively built internal circuits should be a
- function of connection requests in the past to adapt to changing needs.
- Furthermore, an increased number of internal circuits on client side
- would allow clients to establish connections to more than one hidden
- service at a time.
-
- Under the assumption that a popular hidden service cannot make use of
- cannibalization for connecting to rendezvous points, the circuit creation
- time needs to be added to the current results. In the mean, the
- connection establishment time to a popular hidden service would increase
- by 4.7 seconds.
-
- 4. Build More Introduction Circuits
-
- When establishing introduction points, a hidden service should launch 5
- instead of 3 introduction circuits at the same time and use only the
- first 3 that could be established. The remaining two circuits could still
- be used for other purposes afterwards.
-
- This effect has also been simulated using previously measured data. To
- this end, circuit establishment times were derived from log files and
- written to an array. Afterwards, a simulation with 10,000 runs was
- performed picking 5 (4, 6) random values and using the 3 lowest values in
- contrast to picking only 3 values at random. The result is that the mean
- time of the 3-out-of-3 approach is 8.1 seconds, while the mean time of
- the 3-out-of-5 approach is 4.4 seconds.
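-
- The selection step of one simulation run can be sketched as follows. The
- helper name is hypothetical, and the sketch covers only the choice of the
- fastest circuits among those launched; how the original simulation
- aggregated the selected values is not restated here.
-
```c
#include <stddef.h>

/* Move the k smallest of the n values in times[] to the front of the array,
 * in ascending order (partial selection sort). Returns 0 on success, -1 if
 * k is larger than n. (Hypothetical helper.) */
int
keep_fastest(double *times, size_t n, size_t k)
{
  size_t i, j;
  double tmp;
  if (k > n)
    return -1;
  for (i = 0; i < k; ++i) {
    size_t min = i;
    for (j = i + 1; j < n; ++j)
      if (times[j] < times[min])
        min = j;
    tmp = times[i];
    times[i] = times[min];
    times[min] = tmp;
  }
  return 0;
}
```
-
- For the 3-out-of-5 case, a run would fill times[] with 5 randomly drawn
- establishment times and call keep_fastest(times, 5, 3).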
-
- The effect on network load is minimal, because the hidden service can
- reuse the slower internal circuits for other purposes, e.g., rendezvous
- circuits. The only change is that a hidden service starts establishing
- more circuits at once instead of establishing them one after another.
-
-References:
-
- [1] http://freehaven.net/~karsten/hidserv/perfanalysis-2008-06-15.pdf
-
- [2] http://freehaven.net/~karsten/hidserv/discussion-2008-07-15.pdf
-
- [3] http://freehaven.net/~karsten/hidserv/design-2008-08-15.pdf
-
diff --git a/doc/spec/proposals/156-tracking-blocked-ports.txt b/doc/spec/proposals/156-tracking-blocked-ports.txt
deleted file mode 100644
index 419de7e74..000000000
--- a/doc/spec/proposals/156-tracking-blocked-ports.txt
+++ /dev/null
@@ -1,527 +0,0 @@
-Filename: 156-tracking-blocked-ports.txt
-Title: Tracking blocked ports on the client side
-Author: Robert Hogan
-Created: 14-Oct-2008
-Status: Open
-Target: 0.2.?
-
-Motivation:
-Tor clients that are behind extremely restrictive firewalls can end up
-waiting a while for their first successful OR connection to a node on the
-network. Worse, the more restrictive their firewall the more susceptible
-they are to an attacker guessing their entry nodes. Tor routers that
-are behind extremely restrictive firewalls can only offer a limited,
-'partitioned' service to other routers and clients on the network. Exit
-nodes behind extremely restrictive firewalls may advertise ports that they
-are actually not able to connect to, wasting network resources in circuit
-constructions that are doomed to fail at the last hop on first use.
-
-Proposal:
-
-When a client attempts to connect to an entry guard it should avoid
-further attempts on ports that fail once until it has connected to at
-least one entry guard successfully. (Maybe it should wait for more than
-one failure to reduce the skew on the first node selection.) Thereafter
-it should select entry guards regardless of port and warn the user if
-it observes that connections to a given port have failed every multiple
-of 5 times without success or since the last success.
-
-Tor should warn the operators of exit, middleman and entry nodes each time
-the number of failed connections to a given port, counted from startup or
-from the last success, reaches a multiple of 5. If attempts on a port
-fail 20 or more times without or since success, Tor should add the port
-to a 'blocked-ports' entry in its descriptor's extra-info. Some thought
-needs to be given to what the authorities might do with this information.
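-
-The thresholds in the paragraph above can be sketched as a small decision
-function. The names are hypothetical and the numbers follow the text of this
-proposal; the attached patch defines its own constants (POSSIBLY_BLOCKED 5,
-PROBABLY_BLOCKED 10), so this is an illustration rather than the patch's
-logic.
-
```c
#define PORT_ACTION_WARN 1     /* warn the operator */
#define PORT_ACTION_PUBLISH 2  /* list the port under 'blocked-ports' */

#define WARN_EVERY 5           /* warn at each multiple of 5 failures */
#define PUBLISH_THRESHOLD 20   /* publish after 20 or more failures */

/* Return a bitmask of actions to take once `failures` connection attempts
 * to a port have failed since startup or since the last success.
 * (Hypothetical helper based on the proposal text.) */
int
port_failure_actions(unsigned failures)
{
  int actions = 0;
  if (failures > 0 && failures % WARN_EVERY == 0)
    actions |= PORT_ACTION_WARN;
  if (failures >= PUBLISH_THRESHOLD)
    actions |= PORT_ACTION_PUBLISH;
  return actions;
}
```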
-
-Related TODO item:
- "- Automatically determine what ports are reachable and start using
- those, if circuits aren't working and it's a pattern we
- recognize ("port 443 worked once and port 9001 keeps not
- working")."
-
-
-I've had a go at implementing all of this in the attached.
-
-Addendum:
-A note on the patch: storing the digest of each router that uses the port
-is a bit of a memory hog, and its only real purpose is to provide a count of
-routers using that port when warning the user. That could be achieved when
-warning the user by iterating through the routerlist instead.
-
-Index: src/or/connection_or.c
-===================================================================
---- src/or/connection_or.c (revision 17104)
-+++ src/or/connection_or.c (working copy)
-@@ -502,6 +502,9 @@
- connection_or_connect_failed(or_connection_t *conn,
- int reason, const char *msg)
- {
-+ if ((reason == END_OR_CONN_REASON_NO_ROUTE) ||
-+ (reason == END_OR_CONN_REASON_REFUSED))
-+ or_port_hist_failure(conn->identity_digest,TO_CONN(conn)->port);
- control_event_or_conn_status(conn, OR_CONN_EVENT_FAILED, reason);
- if (!authdir_mode_tests_reachability(get_options()))
- control_event_bootstrap_problem(msg, reason);
-@@ -580,6 +583,7 @@
- /* already marked for close */
- return NULL;
- }
-+
- return conn;
- }
-
-@@ -909,6 +913,7 @@
- control_event_or_conn_status(conn, OR_CONN_EVENT_CONNECTED, 0);
-
- if (started_here) {
-+ or_port_hist_success(TO_CONN(conn)->port);
- rep_hist_note_connect_succeeded(conn->identity_digest, now);
- if (entry_guard_register_connect_status(conn->identity_digest,
- 1, now) < 0) {
-Index: src/or/rephist.c
-===================================================================
---- src/or/rephist.c (revision 17104)
-+++ src/or/rephist.c (working copy)
-@@ -18,6 +18,7 @@
- static void bw_arrays_init(void);
- static void predicted_ports_init(void);
- static void hs_usage_init(void);
-+static void or_port_hist_init(void);
-
- /** Total number of bytes currently allocated in fields used by rephist.c. */
- uint64_t rephist_total_alloc=0;
-@@ -89,6 +90,25 @@
- digestmap_t *link_history_map;
- } or_history_t;
-
-+/** or_port_hist_t contains our router/client's knowledge of
-+ all OR ports offered on the network, and how many servers with each port we
-+ have succeeded or failed to connect to. */
-+typedef struct {
-+ /** The port this entry is tracking. */
-+ uint16_t or_port;
-+ /** Have we ever connected to this port on another OR? */
-+ unsigned int success:1;
-+ /** The ORs using this port. */
-+ digestmap_t *ids;
-+ /** The ORs using this port we have failed to connect to. */
-+ digestmap_t *failure_ids;
-+ /** Are we excluding ORs with this port during entry selection?*/
-+ unsigned int excluded;
-+} or_port_hist_t;
-+
-+static unsigned int still_searching = 0;
-+static smartlist_t *or_port_hists;
-+
- /** When did we last multiply all routers' weighted_run_length and
- * total_run_weights by STABILITY_ALPHA? */
- static time_t stability_last_downrated = 0;
-@@ -164,6 +184,16 @@
- tor_free(hist);
- }
-
-+/** Helper: free storage held by a single OR port history entry. */
-+static void
-+or_port_hist_free(or_port_hist_t *p)
-+{
-+ tor_assert(p);
-+ digestmap_free(p->ids,NULL);
-+ digestmap_free(p->failure_ids,NULL);
-+ tor_free(p);
-+}
-+
- /** Update an or_history_t object <b>hist</b> so that its uptime/downtime
- * count is up-to-date as of <b>when</b>.
- */
-@@ -1639,7 +1669,7 @@
- tmp_time = smartlist_get(predicted_ports_times, i);
- if (*tmp_time + PREDICTED_CIRCS_RELEVANCE_TIME < now) {
- tmp_port = smartlist_get(predicted_ports_list, i);
-- log_debug(LD_CIRC, "Expiring predicted port %d", *tmp_port);
-+ log_debug(LD_HIST, "Expiring predicted port %d", *tmp_port);
- smartlist_del(predicted_ports_list, i);
- smartlist_del(predicted_ports_times, i);
- rephist_total_alloc -= sizeof(uint16_t)+sizeof(time_t);
-@@ -1821,6 +1851,12 @@
- tor_free(last_stability_doc);
- built_last_stability_doc_at = 0;
- predicted_ports_free();
-+ if (or_port_hists) {
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, p,
-+ or_port_hist_free(p));
-+ smartlist_free(or_port_hists);
-+ or_port_hists = NULL;
-+ }
- }
-
- /****************** hidden service usage statistics ******************/
-@@ -2356,3 +2392,225 @@
- tor_free(fname);
- }
-
-+/** Create a new entry in the port tracking cache for the or_port in
-+ * <b>ri</b>. */
-+void
-+or_port_hist_new(const routerinfo_t *ri)
-+{
-+ or_port_hist_t *result;
-+ const char *id=ri->cache_info.identity_digest;
-+
-+ if (!or_port_hists)
-+ or_port_hist_init();
-+
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
-+ {
-+ /* Cope with routers that change their advertised OR port or are
-+ dropped from the networkstatus. We don't discard the failures of
-+ dropped routers because they are still valid when counting
-+ consecutive failures on a port.*/
-+ if (digestmap_get(tp->ids, id) && (tp->or_port != ri->or_port)) {
-+ digestmap_remove(tp->ids, id);
-+ }
-+ if (tp->or_port == ri->or_port) {
-+ if (!(digestmap_get(tp->ids, id)))
-+ digestmap_set(tp->ids, id, (void*)1);
-+ return;
-+ }
-+ });
-+
-+ result = tor_malloc_zero(sizeof(or_port_hist_t));
-+ result->or_port=ri->or_port;
-+ result->success=0;
-+ result->ids=digestmap_new();
-+ digestmap_set(result->ids, id, (void*)1);
-+ result->failure_ids=digestmap_new();
-+ result->excluded=0;
-+ smartlist_add(or_port_hists, result);
-+}
-+
-+/** Create the port tracking cache. */
-+/*XXX: need to call this when we rebuild/update our network status */
-+static void
-+or_port_hist_init(void)
-+{
-+ routerlist_t *rl = router_get_routerlist();
-+
-+ if (!or_port_hists)
-+ or_port_hists=smartlist_create();
-+
-+ if (rl && rl->routers) {
-+ SMARTLIST_FOREACH(rl->routers, routerinfo_t *, ri,
-+ {
-+ or_port_hist_new(ri);
-+ });
-+ }
-+}
-+
-+#define NOT_BLOCKED 0
-+#define FAILURES_OBSERVED 1
-+#define POSSIBLY_BLOCKED 5
-+#define PROBABLY_BLOCKED 10
-+/** Return the list of blocked ports for our router's extra-info.*/
-+char *
-+or_port_hist_get_blocked_ports(void)
-+{
-+ char blocked_ports[2048];
-+ char *bp;
-+
-+ tor_snprintf(blocked_ports,sizeof(blocked_ports),"blocked-ports");
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
-+ {
-+ if (digestmap_size(tp->failure_ids) >= PROBABLY_BLOCKED)
-+ tor_snprintf(blocked_ports+strlen(blocked_ports),
-+ sizeof(blocked_ports)-strlen(blocked_ports)," %u,",tp->or_port);
-+ });
-+ if (strlen(blocked_ports) == 13)
-+ return NULL;
-+ bp=tor_strdup(blocked_ports);
-+ bp[strlen(bp)-1]='\n';
-+ bp[strlen(bp)]='\0';
-+ return bp;
-+}
-+
-+/** Warn the user or operator if we have seen too many failures on a port or
-+ * range of ports.*/
-+static void
-+or_port_hist_report_block(unsigned int min_severity)
-+{
-+ or_options_t *options=get_options();
-+ char failures_observed[2048],possibly_blocked[2048],probably_blocked[2048];
-+ char port[1024];
-+
-+ memset(failures_observed,0,sizeof(failures_observed));
-+ memset(possibly_blocked,0,sizeof(possibly_blocked));
-+ memset(probably_blocked,0,sizeof(probably_blocked));
-+
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
-+ {
-+ unsigned int failures = digestmap_size(tp->failure_ids);
-+ if (failures >= min_severity) {
-+ tor_snprintf(port, sizeof(port), " %u (%u failures %s out of %u on the"
-+ " network)",tp->or_port,failures,
-+ (!tp->success)?"and no successes": "since last success",
-+ digestmap_size(tp->ids));
-+ if (failures >= PROBABLY_BLOCKED) {
-+ strlcat(probably_blocked, port, sizeof(probably_blocked));
-+ } else if (failures >= POSSIBLY_BLOCKED)
-+ strlcat(possibly_blocked, port, sizeof(possibly_blocked));
-+ else if (failures >= FAILURES_OBSERVED)
-+ strlcat(failures_observed, port, sizeof(failures_observed));
-+ }
-+ });
-+
-+ log_warn(LD_HIST,"%s%s%s%s%s%s%s%s",
-+ server_mode(options) &&
-+ ((min_severity==FAILURES_OBSERVED) || strlen(probably_blocked))?
-+ "You should consider disabling your Tor server. ":"",
-+ (min_severity==FAILURES_OBSERVED)?
-+ "Tor appears to be blocked from connecting to a range of ports "
-+ "with the result that it cannot connect to one tenth of the Tor "
-+ "network. ":"",
-+ strlen(failures_observed)?
-+ "Tor has observed failures on the following ports: ":"",
-+ failures_observed,
-+ strlen(possibly_blocked)?
-+ "Tor is possibly blocked on the following ports: ":"",
-+ possibly_blocked,
-+ strlen(probably_blocked)?
-+ "Tor is almost certainly blocked on the following ports: ":"",
-+ probably_blocked);
-+
-+}
-+
-+/** Record the success of our connection to an OR listening on
-+ * <b>or_port</b>. */
-+void
-+or_port_hist_success(uint16_t or_port)
-+{
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
-+ {
-+ if (tp->or_port != or_port)
-+ continue;
-+ /*Reset our failure stats so we can notice if this port ever gets
-+ blocked again.*/
-+ tp->success=1;
-+ if (digestmap_size(tp->failure_ids)) {
-+ digestmap_free(tp->failure_ids,NULL);
-+ tp->failure_ids=digestmap_new();
-+ }
-+ if (still_searching) {
-+ still_searching=0;
-+ SMARTLIST_FOREACH(or_port_hists,or_port_hist_t *,t,t->excluded=0;);
-+ }
-+ return;
-+ });
-+}
-+/** Record the failure of our connection to <b>digest</b>'s
-+ * OR port. Warn, exclude the port from future entry guard selection, or
-+ * add port to blocked-ports in our server's extra-info as appropriate. */
-+void
-+or_port_hist_failure(const char *digest, uint16_t or_port)
-+{
-+ int total_failures=0, ports_excluded=0, report_block=0;
-+ int total_routers=smartlist_len(router_get_routerlist()->routers);
-+
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
-+ {
-+ ports_excluded += tp->excluded;
-+ total_failures+=digestmap_size(tp->failure_ids);
-+ if (tp->or_port != or_port)
-+ continue;
-+ /* We're only interested in unique failures */
-+ if (digestmap_get(tp->failure_ids, digest))
-+ return;
-+
-+ total_failures++;
-+ digestmap_set(tp->failure_ids, digest, (void*)1);
-+ if (still_searching && !tp->success) {
-+ tp->excluded=1;
-+ ports_excluded++;
-+ }
-+ if ((digestmap_size(tp->ids) >= POSSIBLY_BLOCKED) &&
-+ !(digestmap_size(tp->failure_ids) % POSSIBLY_BLOCKED))
-+ report_block=POSSIBLY_BLOCKED;
-+ });
-+
-+ if (total_failures >= (int)(total_routers/10))
-+ or_port_hist_report_block(FAILURES_OBSERVED);
-+ else if (report_block)
-+ or_port_hist_report_block(report_block);
-+
-+ if (ports_excluded >= smartlist_len(or_port_hists)) {
-+ log_warn(LD_HIST,"During entry node selection Tor tried every port "
-+ "offered on the network on at least one server "
-+ "and didn't manage a single "
-+ "successful connection. This suggests you are behind an "
-+ "extremely restrictive firewall. Tor will keep trying to find "
-+ "a reachable entry node.");
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp, tp->excluded=0;);
-+ }
-+}
-+
-+/** Add any ports marked as excluded in or_port_hist_t to <b>rt</b> */
-+void
-+or_port_hist_exclude(routerset_t *rt)
-+{
-+ SMARTLIST_FOREACH(or_port_hists, or_port_hist_t *, tp,
-+ {
-+ char portpolicy[9];
-+ if (tp->excluded) {
-+ tor_snprintf(portpolicy,sizeof(portpolicy),"*:%u", tp->or_port);
-+ log_warn(LD_HIST,"Port %u may be blocked, excluding it temporarily "
-+ "from entry guard selection.", tp->or_port);
-+ routerset_parse(rt, portpolicy, "Ports");
-+ }
-+ });
-+}
-+
-+/** Allow the exclusion of ports during our search for an entry node. */
-+void
-+or_port_hist_search_again(void)
-+{
-+ still_searching=1;
-+}
-Index: src/or/or.h
-===================================================================
---- src/or/or.h (revision 17104)
-+++ src/or/or.h (working copy)
-@@ -3864,6 +3864,13 @@
- int any_predicted_circuits(time_t now);
- int rep_hist_circbuilding_dormant(time_t now);
-
-+void or_port_hist_failure(const char *digest, uint16_t or_port);
-+void or_port_hist_success(uint16_t or_port);
-+void or_port_hist_new(const routerinfo_t *ri);
-+void or_port_hist_exclude(routerset_t *rt);
-+void or_port_hist_search_again(void);
-+char *or_port_hist_get_blocked_ports(void);
-+
- /** Possible public/private key operations in Tor: used to keep track of where
- * we're spending our time. */
- typedef enum {
-Index: src/or/routerparse.c
-===================================================================
---- src/or/routerparse.c (revision 17104)
-+++ src/or/routerparse.c (working copy)
-@@ -1401,6 +1401,8 @@
- goto err;
- }
-
-+ or_port_hist_new(router);
-+
- if (!router->platform) {
- router->platform = tor_strdup("<unknown>");
- }
-Index: src/or/router.c
-===================================================================
---- src/or/router.c (revision 17104)
-+++ src/or/router.c (working copy)
-@@ -1818,6 +1818,7 @@
- char published[ISO_TIME_LEN+1];
- char digest[DIGEST_LEN];
- char *bandwidth_usage;
-+ char *blocked_ports;
- int result;
- size_t len;
-
-@@ -1825,7 +1826,6 @@
- extrainfo->cache_info.identity_digest, DIGEST_LEN);
- format_iso_time(published, extrainfo->cache_info.published_on);
- bandwidth_usage = rep_hist_get_bandwidth_lines(1);
--
- result = tor_snprintf(s, maxlen,
- "extra-info %s %s\n"
- "published %s\n%s",
-@@ -1835,6 +1835,16 @@
- if (result<0)
- return -1;
-
-+ blocked_ports = or_port_hist_get_blocked_ports();
-+ if (blocked_ports) {
-+ result = tor_snprintf(s+strlen(s), maxlen-strlen(s),
-+ "%s",
-+ blocked_ports);
-+ tor_free(blocked_ports);
-+ if (result<0)
-+ return -1;
-+ }
-+
- if (should_record_bridge_info(options)) {
- static time_t last_purged_at = 0;
- char *geoip_summary;
-Index: src/or/circuitbuild.c
-===================================================================
---- src/or/circuitbuild.c (revision 17104)
-+++ src/or/circuitbuild.c (working copy)
-@@ -62,6 +62,7 @@
-
- static void entry_guards_changed(void);
- static time_t start_of_month(time_t when);
-+static int num_live_entry_guards(void);
-
- /** Iterate over values of circ_id, starting from conn-\>next_circ_id,
- * and with the high bit specified by conn-\>circ_id_type, until we get
-@@ -1627,12 +1628,14 @@
- smartlist_t *excluded;
- or_options_t *options = get_options();
- router_crn_flags_t flags = 0;
-+ routerset_t *_ExcludeNodes;
-
- if (state && options->UseEntryGuards &&
- (purpose != CIRCUIT_PURPOSE_TESTING || options->BridgeRelay)) {
- return choose_random_entry(state);
- }
-
-+ _ExcludeNodes = routerset_new();
- excluded = smartlist_create();
-
- if (state && (r = build_state_get_exit_router(state))) {
-@@ -1670,12 +1673,18 @@
- if (options->_AllowInvalid & ALLOW_INVALID_ENTRY)
- flags |= CRN_ALLOW_INVALID;
-
-+ if (options->ExcludeNodes)
-+ routerset_union(_ExcludeNodes,options->ExcludeNodes);
-+
-+ or_port_hist_exclude(_ExcludeNodes);
-+
- choice = router_choose_random_node(
- NULL,
- excluded,
-- options->ExcludeNodes,
-+ _ExcludeNodes,
- flags);
- smartlist_free(excluded);
-+ routerset_free(_ExcludeNodes);
- return choice;
- }
-
-@@ -2727,6 +2736,7 @@
- entry_guards_update_state(or_state_t *state)
- {
- config_line_t **next, *line;
-+ unsigned int have_reachable_entry=0;
- if (! entry_guards_dirty)
- return;
-
-@@ -2740,6 +2750,7 @@
- char dbuf[HEX_DIGEST_LEN+1];
- if (!e->made_contact)
- continue; /* don't write this one to disk */
-+ have_reachable_entry=1;
- *next = line = tor_malloc_zero(sizeof(config_line_t));
- line->key = tor_strdup("EntryGuard");
- line->value = tor_malloc(HEX_DIGEST_LEN+MAX_NICKNAME_LEN+2);
-@@ -2785,6 +2796,11 @@
- if (!get_options()->AvoidDiskWrites)
- or_state_mark_dirty(get_or_state(), 0);
- entry_guards_dirty = 0;
-+
-+ /* XXX: Is this the place to decide that we no longer have any reachable
-+ guards? */
-+ if (!have_reachable_entry)
-+ or_port_hist_search_again();
- }
-
- /** If <b>question</b> is the string "entry-guards", then dump
-
diff --git a/doc/spec/proposals/157-specific-cert-download.txt b/doc/spec/proposals/157-specific-cert-download.txt
deleted file mode 100644
index 204b20973..000000000
--- a/doc/spec/proposals/157-specific-cert-download.txt
+++ /dev/null
@@ -1,102 +0,0 @@
-Filename: 157-specific-cert-download.txt
-Title: Make certificate downloads specific
-Author: Nick Mathewson
-Created: 2-Dec-2008
-Status: Accepted
-Target: 0.2.1.x
-
-History:
-
- 2008 Dec 2, 22:34
- Changed name of cross certification field to match the other authority
- certificate fields.
-
-Status:
-
- As of 0.2.1.9-alpha:
- Cross-certification is implemented for new certificates, but not yet
- required. Directories support the tor/keys/fp-sk urls.
-
-Overview:
-
- Tor's directory specification gives two ways to download a certificate:
- by its identity fingerprint, or by the digest of its signing key. Both
- are error-prone. We propose a new download mechanism to make sure that
- clients get the certificates they want.
-
-Motivation:
-
- When a client wants a certificate to verify a consensus, it has two choices
- currently:
- - Download by identity key fingerprint. In this case, the client risks
- getting a certificate for the same authority, but with a different
- signing key than the one used to sign the consensus.
-
- - Download by signing key fingerprint. In this case, the client risks
- getting a forged certificate that contains the right signing key
- signed with the wrong identity key. (Since caches are willing to
- cache certs from authorities they do not themselves recognize, the
- attacker wouldn't need to compromise an authority's key to do this.)
-
-Current solution:
-
- Clients fetch by identity keys, and re-fetch with backoff if they don't get
- certs with the signing key they want.
-
-Proposed solution:
-
- Phase 1: Add a URL type for clients to download certs by identity _and_
- signing key fingerprint. Unless both fields match, the client doesn't
- accept the certificate(s). Clients begin using this method when their
- randomly chosen directory cache supports it.
-
- Phase 1A: Simultaneously, add a cross-certification element to
- certificates.
-
- Phase 2: Once many directory caches support phase 1, clients should prefer
- to fetch certificates using that protocol when available.
-
- Phase 2A: Once all authorities are generating cross-certified certificates
- as in phase 1A, require cross-certification.
-
-Specification additions:
-
- The key certificate whose identity key fingerprint is <F> and whose signing
- key fingerprint is <S> should be available at:
-
- http://<hostname>/tor/keys/fp-sk/<F>-<S>.z
-
- As usual, clients may request multiple certificates using:
-
- http://<hostname>/tor/keys/fp-sk/<F1>-<S1>+<F2>-<S2>.z
-
- Clients SHOULD use this format whenever they know both key fingerprints for
- a desired certificate.
-
-
- Certificates SHOULD contain the following field (at most once):
-
- "dir-key-crosscert" NL CrossSignature NL
-
- where CrossSignature is a signature, made using the certificate's signing
- key, of the digest of the PKCS1-padded hash of the certificate's identity
- key. For backward compatibility with broken versions of the parser, we
- wrap the base64-encoded signature in -----BEGIN ID SIGNATURE----- and
- -----END ID SIGNATURE----- tags. (See bug 880.) Implementations MUST allow
- the "ID " portion to be omitted, however.
-
- When encountering a certificate with a dir-key-crosscert entry,
- implementations MUST verify that the signature is a correct signature of
- the hash of the identity key using the signing key.
-
- (In a future version of this specification, dir-key-crosscert entries will
- be required.)
-
-Why cross-certify too?
-
- Cross-certification protects clients who haven't updated yet, by reducing
- the number of caches that are willing to hold and serve bogus certificates.
-
-References:
-
- This is related to part 2 of bug 854.
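The fp-sk request format and the "both fields must match" rule from proposal 157 can be sketched in Python as follows. This is only an illustration: the function names and the example hostname are hypothetical and not part of Tor, and comparing fingerprints case-insensitively is an assumption the proposal does not state.

```python
def fp_sk_url(hostname, pairs):
    """Build a /tor/keys/fp-sk/ URL for (identity_fp, signing_key_fp) pairs."""
    if not pairs:
        raise ValueError("need at least one (identity, signing-key) pair")
    # Each certificate is named <F>-<S>; several requests are joined with '+'.
    joined = "+".join(f"{ident}-{signing}" for ident, signing in pairs)
    return f"http://{hostname}/tor/keys/fp-sk/{joined}.z"


def accept_certificate(requested_pairs, cert_identity_fp, cert_signing_fp):
    """Accept a certificate only if BOTH fingerprints match a requested pair.

    Assumption: fingerprints are hex strings compared case-insensitively.
    """
    wanted = {(i.upper(), s.upper()) for i, s in requested_pairs}
    return (cert_identity_fp.upper(), cert_signing_fp.upper()) in wanted
```

A client that knows both fingerprints would call fp_sk_url() to form the request and then run accept_certificate() on every certificate in the reply, discarding any that match on only one field.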
diff --git a/doc/spec/proposals/158-microdescriptors.txt b/doc/spec/proposals/158-microdescriptors.txt
deleted file mode 100644
index e6966c0ce..000000000
--- a/doc/spec/proposals/158-microdescriptors.txt
+++ /dev/null
@@ -1,198 +0,0 @@
-Filename: 158-microdescriptors.txt
-Title: Clients download consensus + microdescriptors
-Author: Roger Dingledine
-Created: 17-Jan-2009
-Status: Open
-
-0. History
-
- 15 May 2009: Substantially revised based on discussions on or-dev
- from late January. Removed the notion of voting on how to choose
- microdescriptors; made it just a function of the consensus method.
- (This lets us avoid the possibility of "desynchronization.")
- Added suggestion to use a new consensus flavor. Specified use of
- SHA256 for new hashes. -nickm
-
- 15 June 2009: Cleaned up based on comments from Roger. -nickm
-
-1. Overview
-
- This proposal replaces section 3.2 of proposal 141, which was
- called "Fetching descriptors on demand". Rather than modifying the
- circuit-building protocol to fetch a server descriptor inline at each
- circuit extend, we instead put all of the information that clients need
- either into the consensus itself, or into a new set of data about each
- relay called a microdescriptor.
-
- Descriptor elements that are small and frequently changing should go
- in the consensus itself, and descriptor elements that are small and
- relatively static should go in the microdescriptor. If we ever end up
- with descriptor elements that aren't small yet clients need to know
- them, we'll need to resume considering some design like the one in
- proposal 141.
-
- Note also that any descriptor element which clients need to use to
- decide which servers to fetch info about, or which servers to fetch
- info from, needs to stay in the consensus.
-
-2. Motivation
-
- See
- http://archives.seul.org/or/dev/Nov-2008/msg00000.html and
- http://archives.seul.org/or/dev/Nov-2008/msg00001.html and especially
- http://archives.seul.org/or/dev/Nov-2008/msg00007.html
- for a discussion of the options and why this is currently the best
- approach.
-
-3. Design
-
- There are three pieces to the proposal. First, authorities will list in
- their votes (and thus in the consensus) the expected hash of the
- microdescriptor for each relay. Second, authorities will serve
- microdescriptors, and directory mirrors will cache and serve
- them. Third, clients will ask for them and cache them.
-
-3.1. Consensus changes
-
- If the authorities choose a consensus method of a given version or
- later, a microdescriptor format is implicit in that version.
- A microdescriptor should in every case be a pure function of the
- router descriptor and the consensus method.
-
- In votes, we need to include the hash of each expected microdescriptor
- in the routerstatus section. I suggest a new "m" line for each stanza,
- with the base64 of the SHA256 hash of the router's microdescriptor.
-
- For every consensus method that an authority supports, it includes a
- separate "m" line in each router section of its vote, containing:
- "m" SP methods 1*(SP AlgorithmName "=" digest) NL
- where methods is a comma-separated list of the consensus methods
- that the authority believes will produce "digest".
-
- (As with base64 encoding of SHA1 hashes in consensuses, let's
- omit the trailing =s)
-
- The consensus microdescriptor-elements and "m" lines are then computed
- as described in Section 3.1.2 below.
-
- (This means we need a new consensus-method that knows
- how to compute the microdescriptor-elements and add "m" lines.)
-
- The microdescriptor consensus uses the directory-signature format from
- proposal 162, with the "sha256" algorithm.
-
-
-3.1.1. Descriptor elements to include for now
-
- In the first version, the microdescriptor should contain the
- onion-key element, and the family element from the router descriptor,
- and the exit policy summary as currently specified in dir-spec.txt.
-
-3.1.2. Computing consensus for microdescriptor-elements and "m" lines
-
- When we are generating a consensus, we use whichever m line
- unambiguously corresponds to the descriptor digest that will be
- included in the consensus.
-
- (If different votes have different microdescriptor digests for a
- single <descriptor-digest, consensus-method> pair, then at least one
- of the authorities is broken. If this happens, the consensus should
- contain whichever microdescriptor digest is most common. If there is
- no winner, we break ties in favor of the lexically earliest.
- Either way, we should log a warning: there is definitely a bug.)
-
- The "m" lines in a consensus contain only the digest, not a list of
- consensus methods.
-
-3.1.3. A new flavor of consensus
-
- Rather than inserting "m" lines in the current consensus format,
- they should be included in a new consensus flavor (see proposal
- 162).
-
- This flavor can safely omit descriptor digests.
-
- When we implement this voting method, we can remove the exit policy
- summary from the current "ns" flavor of consensus, since no current
- clients use them, and they take up about 5% of the compressed
- consensus.
-
- This new consensus flavor should be signed with the sha256 signature
- format as documented in proposal 162.
-
-3.2. Directory mirrors fetch, cache, and serve microdescriptors
-
- Directory mirrors should fetch, cache, and serve each microdescriptor
- from the authorities. (They need to continue to serve normal relay
- descriptors too, to handle old clients.)
-
- The microdescriptors with base64 hashes <D1>,<D2>,<D3> should be
- available at:
- http://<hostname>/tor/micro/d/<D1>-<D2>-<D3>.z
- (We use base64 for size and for consistency with the consensus
- format. We use -s instead of +s to separate these items, since
- the + character is used in base64 encoding.)
-
- All the microdescriptors from the current consensus should also be
- available at:
- http://<hostname>/tor/micro/all.z
- so a client that's bootstrapping doesn't need to send a 70KB URL just
- to name every microdescriptor it's looking for.
-
- Microdescriptors have no header or footer.
- The hash of the microdescriptor is simply the hash of the concatenated
- elements.
-
- Directory mirrors should check to make sure that the microdescriptors
- they're about to serve match the right hashes (either the hashes from
- the fetch URL or the hashes from the consensus, respectively).
-
- We will probably want to consider some sort of smart data structure to
- be able to quickly convert microdescriptor hashes into the appropriate
- microdescriptor. Clients will want this anyway when they load their
- microdescriptor cache and want to match it up with the consensus to
- see what's missing.
-
-3.3. Clients fetch them and cache them
-
- When a client gets a new consensus, it looks to see if there are any
- microdescriptors it needs to learn. If it needs to learn more than
- some threshold of the microdescriptors (half?), it requests 'all',
- else it requests only the missing ones. Clients MAY try to
- determine whether the upload bandwidth for listing the
- microdescriptors they want is more or less than the download
- bandwidth for the microdescriptors they do not want.
-
- Clients maintain a cache of microdescriptors along with metadata like
- when each was last referenced by a consensus, and which identity key
- it corresponds to. They keep a microdescriptor
- until it hasn't been mentioned in any consensus for a week. Future
- clients might cache them for longer or shorter times.
-
-3.3.1. Information leaks from clients
-
- If a client asks you for a set of microdescs, then you know she didn't
- have them cached before. How much does that leak? What about when
- we're all using our entry guards as directory guards, and we've seen
- that user make a bunch of circuits already?
-
- Fetching "all" when you need at least half is a good first order fix,
- but might not be all there is to it.
-
- Another future option would be to fetch some of the microdescriptors
- anonymously (via a Tor circuit).
-
- Another crazy option (Roger's phrasing) is to do decoy fetches as
- well.
-
-4. Transition and deployment
-
- Phase one, the directory authorities should start voting on
- microdescriptors, and putting them in the consensus.
-
- Phase two, directory mirrors should learn how to serve them, and learn
- how to read the consensus to find out what they should be serving.
-
- Phase three, clients should start fetching and caching them instead
- of normal descriptors.
-
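Three rules from proposal 158 above can be sketched in Python: the digest encoding (sections 3.1 and 3.2), the tie-break for conflicting votes (section 3.1.2), and the client's choice between fetching "all" and fetching by digest (section 3.3). All function names are hypothetical. The 0.5 threshold mirrors the proposal's tentative "(half?)", and ordinary Python string ordering stands in for "lexically earliest"; both are assumptions.

```python
import base64
import hashlib
from collections import Counter


def microdesc_digest(microdesc: bytes) -> str:
    """Base64 of the SHA256 of the microdescriptor, trailing '=' omitted."""
    raw = hashlib.sha256(microdesc).digest()
    return base64.b64encode(raw).decode("ascii").rstrip("=")


def choose_consensus_digest(voted_digests):
    """Pick the most common digest; break ties with the lexically earliest.

    voted_digests must be non-empty (one entry per vote).
    """
    counts = Counter(voted_digests)
    top = max(counts.values())
    return min(d for d, c in counts.items() if c == top)


def fetch_path(missing, total_in_consensus, threshold=0.5):
    """Return the request path a client would use, or None if nothing is missing."""
    if not missing:
        return None
    if len(missing) > threshold * total_in_consensus:
        return "/tor/micro/all.z"
    # Digests are joined with '-' because '+' can occur inside base64.
    return "/tor/micro/d/" + "-".join(missing) + ".z"
```

A real authority would also log a warning whenever choose_consensus_digest() sees more than one distinct digest, since the proposal treats that case as a definite bug.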
diff --git a/doc/spec/proposals/159-exit-scanning.txt b/doc/spec/proposals/159-exit-scanning.txt
deleted file mode 100644
index 7090f2ed0..000000000
--- a/doc/spec/proposals/159-exit-scanning.txt
+++ /dev/null
@@ -1,142 +0,0 @@
-Filename: 159-exit-scanning.txt
-Title: Exit Scanning
-Author: Mike Perry
-Created: 13-Feb-2009
-Status: Open
-
-Overview:
-
-This proposal describes the implementation and integration of an
-automated exit node scanner for scanning the Tor network for malicious,
-misconfigured, firewalled or filtered nodes.
-
-Motivation:
-
-Tor exit nodes can be run by anyone with an Internet connection. Often,
-these users aren't fully aware of limitations of their networking
-setup. Content filters, antivirus software, advertisements injected by
-their service providers, malicious upstream providers, and the resource
-limitations of their computer or networking equipment have all been
-observed on the current Tor network.
-
-It is also possible that some nodes exist purely for malicious
-purposes. In the past, there have been intermittent instances of
-nodes spoofing SSH keys, as well as nodes being used for purposes of
-plaintext surveillance.
-
-While it is not realistic to expect to catch extremely targeted or
-completely passive malicious adversaries, the goal is to prevent
-malicious adversaries from deploying dragnet attacks against large
-segments of the Tor userbase.
-
-
-Scanning methodology:
-
-The first scans to be implemented are HTTP, HTML, Javascript, and
-SSL scans.
-
-The HTTP scan scrapes Google for common filetype urls such as exe, msi,
-doc, dmg, etc. It then fetches these urls through Non-Tor and Tor, and
-compares the SHA1 hashes of the resulting content.
-
-The SSL scan downloads certificates for all IPs a domain will locally
-resolve to and compares these certificates to those seen over Tor. The
-scanner notes if a domain had rotated certificates locally in the
-results for each scan.
-
-The HTML scan checks HTML, Javascript, and plugin content for
-modifications. Because of the dynamic nature of most of the web, the
-scanner has a number of mechanisms built in to filter out false
-positives that are used when a change is noticed between Tor and
-Non-Tor.
-
-All tests also share a URL-based false positive filter that
-automatically removes results retroactively if the number of failures
-exceeds a certain percentage of nodes tested with the URL.
-
-
-Deployment Stages:
-
-To avoid instances where bugs cause us to mark exit nodes as BadExit
-improperly, it is proposed that we begin use of the scanner in stages.
-
-1. Manual Review:
-
- In the first stage, basic scans will be run by a small number of
- people while we stabilize the scanner. The scanner has the ability
- to resume crashed scans, and to rescan nodes that fail various
- tests.
-
-2. Human Review:
-
- In the second stage, results will be automatically mailed to
- an email list of interested parties for review. We will also begin
- classifying failure types into three to four different severity
- levels, based on both the reliability of the test and the nature of
- the failure.
-
-3. Automatic BadExit Marking:
-
- In the final stage, the scanner will begin marking exits depending
- on the failure severity level in one of three different ways: by
- node idhex, by node IP, or by node IP mask. A potential fourth, less
- severe category of results may still be delivered via email only for
- review.
-
- BadExit markings will be delivered in batches upon completion
- of whole-network scans, so that the final false positive
- filter has an opportunity to filter out URLs that exhibit
- dynamic content beyond what we can filter.
-
-
-Specification of Exit Marking:
-
-Technically, BadExit could be marked via SETCONF AuthDirBadExit over
-the control port, but this would allow full access to the directory
-authority configuration and operation.
-
-The approved-routers file could also be used, but currently it only
-supports fingerprints, and it also contains other data unrelated to
-exit scanning that would be difficult to coordinate.
-
-Instead, we propose a new badexit-routers file with the following
-keywords:
-
- BadExitNet 1*[exitpattern from 2.3 in dir-spec.txt]
- BadExitFP 1*[hexdigest from 2.3 in dir-spec.txt]
-
-BadExitNet lines would follow the codepaths used by AuthDirBadExit to
-set authdir_badexit_policy, and BadExitFP would follow the codepaths
-from the approved-routers file's !badexit lines.
-
-The scanner would have exclusive ability to write, append, rewrite,
-and modify this file. Prior to building a new consensus vote, a
-participating Tor authority would read in a fresh copy.
-
-
-Security Implications:
-
-Aside from evading the scanner's detection, there are two additional
-high-level security considerations:
-
-1. Ensure nodes cannot be marked BadExit by an adversary at will
-
-It is possible individual website owners will be able to target certain
-Tor nodes, but once they begin to attempt to fail more than the URL
-filter percentage of the exits, their sites will be automatically
-discarded.
-
-Failing specific nodes is possible, but scanned results are fully
-reproducible, and BadExits should be rare enough that humans are never
-fully removed from the loop.
-
-State (cookies, cache, etc) does not otherwise persist in the scanner
-between exit nodes to enable one exit node to bias the results of a
-later one.
-
-2. Ensure that scanner compromise does not yield authority compromise
-
-Having a separate file that is under the exclusive control of the
-scanner allows us to heavily isolate the scanner from the Tor
-authority, potentially even running them on separate machines.
-
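The URL-based false positive filter described under "Scanning methodology" in proposal 159 can be sketched in Python. This is not the scanner's actual code: the function name and data layout are hypothetical, and the 10% default is an arbitrary placeholder because the proposal only says "a certain percentage".

```python
def filter_false_positives(results, max_failure_fraction=0.1):
    """Drop every URL whose failure rate across tested nodes exceeds the limit.

    results maps url -> {node_id: failed (bool)}.
    max_failure_fraction is a hypothetical default, not a value from the proposal.
    """
    kept = {}
    for url, per_node in results.items():
        if not per_node:
            continue  # URL was never tested; nothing to report
        failures = sum(1 for failed in per_node.values() if failed)
        if failures / len(per_node) > max_failure_fraction:
            continue  # too many exits "fail": blame the URL, not the exits
        kept[url] = per_node
    return kept
```

Running the filter only after a whole-network scan completes matches the batching described under "Automatic BadExit Marking": a site that tries to fail many exits at once is discarded before any exit is marked.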
diff --git a/doc/spec/proposals/160-bandwidth-offset.txt b/doc/spec/proposals/160-bandwidth-offset.txt
deleted file mode 100644
index 96935ade7..000000000
--- a/doc/spec/proposals/160-bandwidth-offset.txt
+++ /dev/null
@@ -1,105 +0,0 @@
-Filename: 160-bandwidth-offset.txt
-Title: Authorities vote for bandwidth offsets in consensus
-Author: Roger Dingledine
-Created: 4-May-2009
-Status: Finished
-Target: 0.2.2.x
-
-1. Motivation
-
- As part of proposal 141, we moved the bandwidth value for each relay
- into the consensus. Now clients can know how they should load balance
- even before they've fetched the corresponding relay descriptors.
-
- Putting the bandwidth in the consensus also lets the directory
- authorities choose more accurate numbers to advertise, if we come up
- with a better algorithm for deciding weightings.
-
- Our original plan was to teach directory authorities how to measure
- bandwidth themselves; then every authority would vote for the bandwidth
- it prefers, and we'd take the median of votes as usual.
-
- The problem comes when we have 7 authorities, and only a few of them
- have smarter bandwidth allocation algorithms. So long as the majority
- of them are voting for the number in the relay descriptor, the minority
- that have better numbers will be ignored.
-
-2. Options
-
- One fix would be to demand that every authority also run the
- new bandwidth measurement algorithms: in that case, part of the
- responsibility of being an authority operator is that you need to run
- this code too. But in practice we can't really require all current
- authority operators to do that; and if we want to expand the set of
- authority operators even further, it will become even more impractical.
- Also, bandwidth testing adds load to the network, so we don't really
- want to require that the number of concurrent bandwidth tests match
- the number of authorities we have.
-
- The better fix is to allow certain authorities to specify that they are
- voting on bandwidth measurements: more accurate bandwidth values that
- have actually been evaluated. In this way, authorities can vote on
- the median measured value if sufficient measured votes exist for a router,
- and otherwise fall back to the median value taken from the published router
- descriptors.
-
-3. Security implications
-
- If only some authorities choose to vote on an offset, then a majority of
- those voting authorities can arbitrarily change the bandwidth weighting
- for the relay. At the extreme, if there's only one offset-voting
- authority, then that authority can dictate which relays clients will
- find attractive.
-
- This problem isn't entirely new: we already have the worry wrt
- the subset of authorities that vote for BadExit.
-
- To make it not so bad, we should deploy at least three offset-voting
- authorities.
-
- Also, authorities that know how to vote for offsets should vote for
- an offset of zero for new nodes, rather than choosing not to vote on
- any offset in those cases.
-
-4. Design
-
- First, we need a new consensus method to support this new calculation.
-
- Now v3 votes can have an additional value on the "w" line:
- "w Bandwidth=X Measured=" INT.
-
- Once we're using the new consensus method, the new way to compute the
- Bandwidth weight is by checking if there are at least 3 "Measured"
- votes. If so, the median of these is taken. Otherwise, the median
- of the "Bandwidth=" values is taken, as described in Proposal 141.
-
- Then the actual consensus looks just the same as it did before,
- so clients never have to know that this additional calculation is
- happening.
-
-5. Implementation
-
- The Measured values will be read from a file provided by the scanners
- described in proposal 161. Files with a timestamp older than 3 days
- will be ignored.
-
- The file will be read in from dirserv_generate_networkstatus_vote_obj()
- in a location specified by a new config option "V3MeasuredBandwidths".
- A helper function will be called to populate new 'measured' and
- 'has_measured' fields of the routerstatus_t 'routerstatuses' list with
- values read from this file.
-
- An additional for_vote flag will be passed to
- routerstatus_format_entry() from format_networkstatus_vote(), which will
- indicate that the "Measured=" string should be appended to the "w Bandwidth="
- line with the measured value in the struct.
-
- routerstatus_parse_entry_from_string() will be modified to parse the
- "Measured=" lines into routerstatus_t struct fields.
-
- Finally, networkstatus_compute_consensus() will set rs_out.bandwidth
- to the median of the measured values if there are at least 3, otherwise
- it will use the bandwidth value median as normal.
-
-
-
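The rule in section 4 of proposal 160 (use the median of the "Measured" votes when at least 3 exist, otherwise the median of the "Bandwidth=" values) can be sketched in Python. The function names are hypothetical and this is not Tor's C implementation. Taking the low median for an even number of votes is an assumption; the proposal only says "median".

```python
from statistics import median_low

MIN_MEASURED_VOTES = 3  # "at least 3 Measured votes" per section 4


def parse_w_line(line):
    """Parse a vote's 'w Bandwidth=X [Measured=Y]' line into (bandwidth, measured)."""
    parts = line.split()
    if not parts or parts[0] != "w":
        raise ValueError("not a w line")
    fields = dict(item.split("=", 1) for item in parts[1:])
    bandwidth = int(fields["Bandwidth"])
    measured = int(fields["Measured"]) if "Measured" in fields else None
    return bandwidth, measured


def consensus_bandwidth(w_lines):
    """Combine one router's 'w' lines from all votes into a consensus value."""
    votes = [parse_w_line(line) for line in w_lines]
    measured = [m for _, m in votes if m is not None]
    if len(measured) >= MIN_MEASURED_VOTES:
        return median_low(measured)
    # Too few measuring authorities: fall back to the advertised values.
    return median_low([b for b, _ in votes])
```

With only two measuring authorities the measured values are ignored entirely, which is the behavior behind the advice in section 3 to deploy at least three such authorities.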
diff --git a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt b/doc/spec/proposals/161-computing-bandwidth-adjustments.txt
deleted file mode 100644
index d21982666..000000000
--- a/doc/spec/proposals/161-computing-bandwidth-adjustments.txt
+++ /dev/null
@@ -1,174 +0,0 @@
-Title: Computing Bandwidth Adjustments
-Filename: 161-computing-bandwidth-adjustments.txt
-Author: Mike Perry
-Created: 12-May-2009
-Target: 0.2.2.x
-Status: Finished
-
-
-1. Motivation
-
- There is high variance in the performance of the Tor network. Despite
- our efforts to balance load evenly across the Tor nodes, some nodes are
- significantly slower and more overloaded than others.
-
- Proposal 160 describes how we can augment the directory authorities to
- vote on measured bandwidths for routers. This proposal describes what
- goes into the measuring process.
-
-
-2. Measurement Selection
-
- The general idea is to determine a load factor representing the ratio
- of the capacity of measured nodes to the rest of the network. This load
- factor could be computed from three potentially relevant statistics:
- circuit failure rates, circuit extend times, or stream capacity.
-
- Circuit failure rates and circuit extend times appear to be
- non-linearly proportional to node load. We've observed that the same
- nodes when scanned at US nighttime hours (when load is presumably
- lower) exhibit almost no circuit failure, and significantly faster
- extend times than when scanned during the day.
-
- Stream capacity, however, is much more uniform, even during US
- nighttime hours. Moreover, it is a more intuitive representation of
- node capacity, and also less dependent upon distance and latency
- if amortized over large stream fetches.
-
-
-3. Average Stream Bandwidth Calculation
-
- The average stream bandwidths are obtained by dividing the network into
- slices of 50 nodes each, grouped according to advertised node bandwidth.
-
- Two hop circuits are built using nodes from the same slice, and a large
- file is downloaded via these circuits. The file sizes are set based
- on node percentile rank as follows:
-
- 0-10: 2M
- 10-20: 1M
- 20-30: 512k
- 30-50: 256k
- 50-100: 128k
-
- These sizes are based on measurements performed during test scans.
-
- This process is repeated until each node has been chosen to participate
- in at least 5 circuits.
-
-
-4. Ratio Calculation
-
- The ratios are calculated by dividing each measured value by the
- network-wide average.
-
-
-5. Ratio Filtering
-
- After the base ratios are calculated, a second pass is performed
- to remove any streams with nodes of ratios less than X=0.5 from
- the results of other nodes. In addition, all outlying streams
- with capacity of one standard deviation below a node's average
- are also removed.
-
- The final ratio result will be the greater of the unfiltered ratio
- and the filtered ratio.
-
-
-6. Pseudocode for Ratio Calculation Algorithm
-
- Here is the complete pseudocode for the ratio algorithm:
-
- Slices = {S | S is 50 nodes of similar consensus capacity}
- for S in Slices:
-   while exists node N in S with circ_chosen(N) < 7:
-     fetch_slice_file(build_2hop_circuit(N, (exit in S)))
-   for N in S:
-     BW_measured(N) = MEAN(b | b is bandwidth of a stream through N)
-     Bw_stddev(N) = STDDEV(b | b is bandwidth of a stream through N)
-   Bw_avg(S) = MEAN(b | b = BW_measured(N) for all N in S)
-   for N in S:
-     Normal_Streams(N) = {stream via N | bandwidth >= BW_measured(N)}
-     BW_Norm_measured(N) = MEAN(b | b is a bandwidth of Normal_Streams(N))
-
- Bw_net_avg(Slices) = MEAN(BW_measured(N) for all N in Slices)
- Bw_Norm_net_avg(Slices) = MEAN(BW_Norm_measured(N) for all N in Slices)
-
- for N in all Slices:
-   Bw_net_ratio(N) = BW_measured(N)/Bw_net_avg(Slices)
-   Bw_Norm_net_ratio(N) = BW_Norm_measured(N)/Bw_Norm_net_avg(Slices)
-
-   ResultRatio(N) = MAX(Bw_net_ratio(N), Bw_Norm_net_ratio(N))
-
-
-7. Security implications
-
- The ratio filtering will deal with cases of sabotage by dropping
- both very slow outliers in stream average calculations, as well
- as dropping streams that used very slow nodes from the calculation
- of other nodes.
-
- This scheme will not address nodes that try to game the system by
- providing better service to scanners. The scanners can be detected
- at the entry by IP address, and at the exit by the destination fetch
- IP.
-
- Measures can be taken to obfuscate and separate the scanners' source
- IP address from the directory authority IP address. For instance,
- scans can happen offsite and the results can be rsynced into the
- authorities. The destination server IP can also change.
-
- Neither of these methods is foolproof, but such nodes can already
- lie about their bandwidth to attract more traffic, so this solution
- does not set us back any in that regard.
-
-
-8. Parallelization
-
- Because each slice takes as long as 6 hours to complete, we will want
- to parallelize as much as possible. This will be done by concurrently
- running multiple scanners from each authority to deal with different
- segments of the network. Each scanner piece will continually loop
- over a portion of the network, outputting files of the form:
-
- node_id=<idhex> SP strm_bw=<BW_measured(N)> SP
- filt_bw=<BW_Norm_measured(N)> ns_bw=<CurrentConsensusBw(N)> NL
-
- The most recent file from each scanner will be periodically gathered
- by another script that uses them to produce network-wide averages
- and calculate ratios as per the algorithm in section 6. Because nodes
- may shift in capacity, they may appear in more than one slice and/or
- appear more than once in the file set. The most recently measured
- line will be chosen in this case.
-
-
-9. Integration with Proposal 160
-
- The final results will be produced for the voting mechanism
- described in Proposal 160 by multiplying the derived ratio by
- the average published consensus bandwidth during the course of the
- scan, and taking the weighted average with the previous consensus
- bandwidth:
-
- Bw_new = Round((Bw_current * Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))
-
- The Alpha parameter is a smoothing parameter intended to prevent
- rapid oscillation between loaded and unloaded conditions. It is
- currently fixed at 0.333.
-
- The Round() step consists of rounding to the 3 most significant figures
- in base10, and then rounding that result to the nearest 1000, with
- a minimum value of 1000.
-
- This will produce a new bandwidth value that will be output into a
- file consisting of lines of the form:
-
- node_id=<idhex> SP bw=<Bw_new> NL
-
- The first line of the file will contain a timestamp in UNIX time()
- seconds. This will be used by the authority to decide if the
- measured values are too old to use.
-
- This file can be either copied or rsynced into a directory readable
- by the directory authority.
-
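The ratio step (section 6) and the smoothing and rounding step (section 9) of proposal 161 can be sketched in Python. This is not the scanner's actual code and the function names are hypothetical. The sketch follows the section 6 pseudocode, which keeps streams at or above a node's mean. It assumes every node has at least one measured stream, and it rounds ties upward because the proposal does not specify a tie-breaking rule.

```python
from statistics import mean

ALPHA = 0.333  # smoothing parameter given in section 9


def result_ratios(streams):
    """Compute ResultRatio(N) for every node.

    streams maps node_id -> non-empty list of stream bandwidths through N.
    """
    bw_measured = {n: mean(bws) for n, bws in streams.items()}
    # Normal_Streams(N): streams at or above the node's own mean.
    bw_norm = {
        n: mean([b for b in bws if b >= bw_measured[n]])
        for n, bws in streams.items()
    }
    net_avg = mean(list(bw_measured.values()))
    norm_net_avg = mean(list(bw_norm.values()))
    return {
        n: max(bw_measured[n] / net_avg, bw_norm[n] / norm_net_avg)
        for n in streams
    }


def round_bw(value):
    """Round to 3 significant figures, then to the nearest 1000 (minimum 1000)."""
    value = max(int(value + 0.5), 0)
    digits = len(str(value))
    if digits > 3:
        factor = 10 ** (digits - 3)
        value = (value + factor // 2) // factor * factor  # ties round up (assumed)
    value = (value + 500) // 1000 * 1000
    return max(value, 1000)


def new_bandwidth(bw_current, bw_scan_avg, bw_ratio, alpha=ALPHA):
    """Bw_new = Round((Bw_current*Alpha + Bw_scan_avg*Bw_ratio)/(Alpha + 1))."""
    blended = (bw_current * alpha + bw_scan_avg * bw_ratio) / (alpha + 1)
    return round_bw(blended)
```

Because the rounding floors every result at 1000, very slow nodes all receive the same minimum value rather than a value proportional to their measured capacity.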
diff --git a/doc/spec/proposals/162-consensus-flavors.txt b/doc/spec/proposals/162-consensus-flavors.txt
deleted file mode 100644
index e3b697afe..000000000
--- a/doc/spec/proposals/162-consensus-flavors.txt
+++ /dev/null
@@ -1,188 +0,0 @@
-Filename: 162-consensus-flavors.txt
-Title: Publish the consensus in multiple flavors
-Author: Nick Mathewson
-Created: 14-May-2009
-Target: 0.2.2
-Status: Open
-
-Overview:
-
- This proposal describes a way to publish each consensus in
- multiple simultaneous formats, or "flavors". This will reduce the
- amount of time needed to deploy new consensus-like documents, and
- reduce the size of consensus documents in the long term.
-
-Motivation:
-
- In the future, we will almost surely want different fields and
- data in the network-status document. Examples include:
- - Publishing hashes of microdescriptors instead of hashes of
- full descriptors (Proposal 158).
- - Including different digests of descriptors, instead of the
- perhaps-soon-to-be-totally-broken SHA1.
-
- Note that in both cases, from the client's point of view, this
- information _replaces_ older information. If we're using a
- SHA256 hash, we don't need to see the SHA1. If clients only want
- microdescriptors, they don't (necessarily) need to see hashes of
- other things.
-
- Our past approach to cases like this has been to shovel all of
- the data into the consensus document. But this is rather poor
- for bandwidth. Adding a single SHA256 hash to a consensus for
- each router increases the compressed consensus size by 47%. In
- comparison, replacing a single SHA1 hash with a SHA256 hash for
- each listed router increases the consensus size by only 18%.
-
-Design in brief:
-
- Let the voting process remain as it is, until a consensus is
- generated. With future versions of the voting algorithm, instead
- of just a single consensus being generated, multiple consensus
- "flavors" are produced.
-
- Consensuses (all of them) include a list of which flavors are
- being generated. Caches fetch and serve all flavors of consensus
- that are listed, regardless of whether they can parse or validate
- them, and serve them to clients. Thus, once this design is in
- place, we won't need to deploy more cache changes in order to get
- new flavors of consensus to be cached.
-
- Clients download only the consensus flavor they want.
-
-A note on hashes:
-
- Everything in this document is specified to use SHA256, and to be
- upgradeable to use better hashes in the future.
-
-Spec modifications:
-
- 1. URLs and changes to the current consensus format.
-
- Every consensus flavor has a name consisting of a sequence of one
- or more alphanumeric characters and dashes. For compatibility, the
- current descriptor flavor is called "ns".
-
- The supported consensus flavors are defined as part of the
- authorities' consensus method.
-
- For each supported flavor, every authority calculates another
- consensus document of as-yet-unspecified format, and exchanges
- detached signatures for these documents as in the current consensus
- design.
-
- In addition to the consensus currently served at
- /tor/status-vote/(current|next)/consensus.z and
- /tor/status-vote/(current|next)/consensus/<FP1>+<FP2>+<FP3>+....z ,
- authorities serve another consensus of each flavor "F" from the
- locations /tor/status-vote/(current|next)/consensus-F.z and
- /tor/status-vote/(current|next)/consensus-F/<FP1>+....z.
-
- When caches serve these documents, they do so from the same
- locations.
-
- 2. Document format: generic consensus.
-
- The format of a flavored consensus is as-yet-unspecified, except
- that the first line is:
- "network-status-version" SP version SP flavor NL
-
- where version is 3 or higher, and the flavor is a string
- consisting of alphanumeric characters and dashes, matching the
- corresponding flavor listed in the unflavored consensus.
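
The first-line rule above can be checked mechanically. The following is a minimal sketch, not part of the proposal: the function name parse_first_line is hypothetical, and treating a first line without a flavor field as the "ns" flavor is an assumption drawn from section 1.

```python
import re

# Flavor names: one or more alphanumeric characters and dashes (section 1).
FLAVOR_RE = re.compile(r"^[A-Za-z0-9-]+$")


def parse_first_line(line):
    """Parse '"network-status-version" SP version [SP flavor] NL'.

    Returns (version, flavor). A missing flavor field is reported as
    "ns"; that default is an assumption of this sketch.
    """
    if not line.endswith("\n"):
        raise ValueError("first line must end with NL")
    parts = line[:-1].split(" ")
    if len(parts) not in (2, 3) or parts[0] != "network-status-version":
        raise ValueError("not a network-status-version line")
    if not (parts[1].isascii() and parts[1].isdigit()):
        raise ValueError("version must be a decimal integer")
    version = int(parts[1])
    if version < 3:
        raise ValueError("version must be 3 or higher")
    flavor = parts[2] if len(parts) == 3 else "ns"
    if not FLAVOR_RE.match(flavor):
        raise ValueError("flavor must be alphanumerics and dashes")
    return version, flavor
```

A cache that does not understand a flavor could still use a check like this to file the document under the right name.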
-
- 3. Document format: detached signatures.
-
- We amend the detached signature format to include more than one
- consensus-digest line, and more than one set of signatures.
-
- After the consensus-digest line, we allow more lines of the form:
- "additional-digest" SP flavor SP algname SP digest NL
-
- Before the directory-signature lines, we allow more entries of the form:
- "additional-signature" SP flavor SP algname SP identity SP
- signing-key-digest NL signature.
-
- [We do not use "consensus-digest" or "directory-signature" for flavored
- consensuses, since this could confuse older Tors.]
-
- The consensus-signatures URL should contain the signatures
- for _all_ flavors of consensus.
-
- 4. The consensus index:
-
- Authorities additionally generate and serve a consensus-index
- document. Its format is:
-
- Header ValidAfter ValidUntil Documents Signatures
-
- Header = "consensus-index" SP version NL
- ValidAfter = as in a consensus
- ValidUntil = as in a consensus
- Documents = Document*
- Document = "document" SP flavor SP SignedLength
- 1*(SP AlgorithmName "=" Digest) NL
- Signatures = Signature*
- Signature = "directory-signature" SP algname SP identity
- SP signing-key-digest NL signature
-
- There must be one Document line for each generated consensus flavor.
- Each Document line describes the length of the signed portion of
- a consensus (the signatures themselves are not included), along
- with one or more digests of that signed portion. Digests are
- given in hex. The algorithm "sha256" MUST be included; others
- are allowed.
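
As an illustration of the Document line grammar, here is a hedged sketch of how a cache might parse one line and check a fetched consensus against it. The names parse_document_line and matches_index are hypothetical, and the sketch assumes SignedLength counts bytes.

```python
import hashlib


def parse_document_line(line):
    """Parse '"document" SP flavor SP SignedLength 1*(SP Alg "=" Digest) NL'.

    Returns (flavor, signed_length, {algorithm: hex_digest}).
    """
    if not line.endswith("\n"):
        raise ValueError("line must end with NL")
    fields = line[:-1].split(" ")
    if len(fields) < 4 or fields[0] != "document":
        raise ValueError("not a document line")
    flavor, signed_length = fields[1], int(fields[2])
    digests = {}
    for item in fields[3:]:
        alg, sep, digest = item.partition("=")
        if not sep or not alg or not digest:
            raise ValueError("expected AlgorithmName=Digest")
        bytes.fromhex(digest)  # digests are given in hex; raises ValueError otherwise
        digests[alg] = digest.lower()
    if "sha256" not in digests:
        raise ValueError('the "sha256" digest MUST be included')
    return flavor, signed_length, digests


def matches_index(signed_portion, signed_length, digests):
    """Check only length and sha256 digest, as a cache would for an
    unrecognized flavor."""
    return (len(signed_portion) == signed_length
            and hashlib.sha256(signed_portion).hexdigest() == digests["sha256"])
```

Other digest algorithms are parsed but ignored by matches_index; a real implementation would presumably check every algorithm it recognizes.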
-
- The algname part of a signature describes what algorithm was
- used to hash the identity and signing keys, and to compute the
- signature. The algorithm "sha256" MUST be recognized;
- signatures with unrecognized algorithms MUST be ignored.
- (See below).
-
- The consensus index is made available at
- /tor/status-vote/(current|next)/consensus-index.z.
-
- Caches should fetch this document so they can check the
- correctness of the different consensus documents they fetch.
- They do not need to check anything about an unrecognized
- consensus document beyond its digest and length.
-
- 4.1. The "sha256" signature format.
-
- The 'SHA256' signature format for directory objects is defined as
- the RSA signature of the OAEP+-padded SHA256 digest of the item to
- be signed. When checking signatures, the signature MUST be treated
- as valid if the signature material begins with SHA256(document);
- this allows us to add other data later.
-
-Considerations:
-
- - We should not create a new flavor of consensus when adding a
- field instead wouldn't be too onerous.
-
- - We should not proliferate flavors lightly: clients will be
- distinguishable based on which flavor they download.
-
-Migration:
-
- - Stage one: authorities begin generating and serving
- consensus-index files.
-
- - Stage two: Caches begin downloading consensus-index files,
- validating them, and using them to decide what flavors of
- consensus documents to cache. They download all listed
- documents, and compare them to the digests given in the
- consensus.
-
- - Stage three: Once we want to make a significant change to the
- consensus format, we deploy another flavor of consensus at the
- authorities. This will immediately start getting cached by the
- caches, and clients can start fetching the new flavor without
- waiting a version or two for enough caches to begin supporting
- it.
-
-Acknowledgements:
-
- Aspects of this design and its applications to hash migration were
- heavily influenced by IRC conversations with Marian.
-
diff --git a/doc/spec/proposals/163-detecting-clients.txt b/doc/spec/proposals/163-detecting-clients.txt
deleted file mode 100644
index d838b1706..000000000
--- a/doc/spec/proposals/163-detecting-clients.txt
+++ /dev/null
@@ -1,115 +0,0 @@
-Filename: 163-detecting-clients.txt
-Title: Detecting whether a connection comes from a client
-Author: Nick Mathewson
-Created: 22-May-2009
-Target: 0.2.2
-Status: Open
-
-
-Overview:
-
- Some aspects of Tor's design require relays to distinguish
- connections from clients from connections that come from relays.
- The existing means for doing this is easy to spoof. We propose
- a better approach.
-
-Motivation:
-
- There are at least two reasons for which Tor servers want to tell
- which connections come from clients and which come from other
- servers:
-
- 1) Some exits, proposal 152 notwithstanding, want to disallow
- their use as single-hop proxies.
- 2) Some performance-related proposals involve prioritizing
- traffic from relays, or limiting traffic per client (but not
- per relay).
-
- Right now, we detect client vs server status based on how the
- client opens circuits. (Check out the code that implements the
- AllowSingleHopExits option if you want all the details.) This
- method is depressingly easy to fake, though. This document
- proposes better means.
-
-Goals:
-
- To make grabbing relay privileges at least as difficult as just
- running a relay.
-
- In the analysis below, "using server privileges" means taking any
- action that only servers are supposed to do, like delivering a
- BEGIN cell to an exit node that doesn't allow single hop exits,
- or claiming server-like amounts of bandwidth.
-
-Passive detection:
-
- A connection is definitely a client connection if it takes one of
- the TLS methods during setup that does not establish an identity
- key.
-
- A circuit is definitely a client circuit if it is initiated with
- a CREATE_FAST cell, though the node could be a client or a server.
-
- A node that's listed in a recent consensus is probably a server.
-
- A node to which we have successfully extended circuits from
- multiple origins is probably a server.
-
-Active detection:
-
- If a node doesn't try to use server privileges at all, we never
- need to care whether it's a server.
-
- When a node or circuit tries to use server privileges, if it is
- "definitely a client" as per above, we can refuse it immediately.
-
- If it's "probably a server" as per above, we can accept it.
-
- Otherwise, we have either a client, or a server that is neither
- listed in any consensus nor used by any other clients -- in other
- words, a new or private server.
-
- For these servers, we should attempt to build one or more test
- circuits through them. If enough of the circuits succeed, the
- node is a real relay. If not, it is probably a client.
-
- While we are waiting for the test circuits to succeed, we should
- allow a short grace period in which server privileges are
- permitted. When a test is done, we should remember its outcome
- for a while, so we don't need to do it again.
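
The decision procedure in the two sections above can be summarized in a few lines. This is a sketch only: the NodeEvidence record and the string verdicts are hypothetical, and the grace period and test-circuit machinery are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeEvidence:
    # Passive signals described under "Passive detection".
    no_identity_key_tls: bool = False            # TLS setup without an identity key
    used_create_fast: bool = False               # circuit began with CREATE_FAST
    in_recent_consensus: bool = False
    extended_from_multiple_origins: bool = False
    # Remembered outcome of earlier test circuits; None means "never tested".
    test_passed: Optional[bool] = None


def decide(e):
    """Return "refuse", "accept", or "test" for a request that tries to
    use server privileges."""
    if e.no_identity_key_tls or e.used_create_fast:
        return "refuse"   # definitely a client connection or circuit
    if e.in_recent_consensus or e.extended_from_multiple_origins:
        return "accept"   # probably a server
    if e.test_passed is None:
        return "test"     # new or private server, or a client: build test circuits
    return "accept" if e.test_passed else "refuse"
```

While the verdict is "test", the text above suggests permitting server privileges for a short grace period.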
-
-Why it's hard to do good testing:
-
- Doing a test circuit starting with an unlisted router requires
- only that we have an open connection for it. Doing a test
- circuit starting elsewhere _through_ an unlisted router -- though
- more reliable -- would require that we have a known address, port,
- identity key, and onion key for the router. Only the address and
- identity key are easily available via the current Tor protocol in
- all cases.
-
- We could fix this part by requiring that all servers support
- BEGIN_DIR and support downloading at least a current descriptor
- for themselves.
-
-Open questions:
-
- What are the thresholds for the needed numbers of circuits
- for us to decide that a node is a relay?
-
- [Suggested answer: two circuits from two distinct hosts.]
-
- How do we pick grace periods? How long do we remember the
- outcome of a test?
-
- [Suggested answer: 10 minute grace period; 48 hour memory of
- test outcomes.]
-
- If we can build circuits starting at a suspect node, but we don't
- have enough information to try extending circuits elsewhere
- through the node, should we conclude that the node is
- "server-like" or not?
-
- [Suggested answer: for now, just try making circuits through
- the node. Extend this to extending circuits as needed.]
-
diff --git a/doc/spec/proposals/164-reporting-server-status.txt b/doc/spec/proposals/164-reporting-server-status.txt
deleted file mode 100644
index 705f5f1a8..000000000
--- a/doc/spec/proposals/164-reporting-server-status.txt
+++ /dev/null
@@ -1,91 +0,0 @@
-Filename: 164-reporting-server-status.txt
-Title: Reporting the status of server votes
-Author: Nick Mathewson
-Created: 22-May-2009
-Target: 0.2.2
-Status: Open
-
-
-Overview:
-
- When a given node isn't listed in the directory, it isn't always easy
- to tell why. This proposal suggests a quick-and-dirty way for
- authorities to export not only how they voted, but why, and a way to
- collate the information.
-
-Motivation:
-
- Right now, if you want to know the reason why your server was listed
- a certain way in the Tor directory, the following steps are
- recommended:
-
- - Look through your log for reports of what the authority said
- when you tried to upload.
-
- - Look at the consensus; see if you're listed.
-
- - Wait a while, see if things get better.
-
- - Download the votes from all the authorities, and see how they
- voted. Try to figure out why.
-
- - If you think they'll listen to you, ask some authority
- operators to look you up in their mtbf files and logs to see
- why they voted as they did.
-
- This is far too hard.
-
-Solution:
-
- We should add a new vote-like information-only document that
- authorities serve on request. Call it a "vote info". It is
- generated at the same time as a vote, but used only for
- determining why a server voted as it did. It is served from
- /tor/status-vote-info/current/authority[.z]
-
- It differs from a vote in that:
-
- * Its vote-status field is 'vote-info'.
-
- * It includes routers that the authority would not include
- in its vote.
-
- For these, it includes an "omitted" line with an English
- message explaining why they were omitted.
-
- * For each router, it includes a line describing its WFU and
- MTBF. The format is:
-
- "stability <mtbf> up-since='date'"
- "uptime <wfu> down-since='date'"
-
- * It describes the WFU and MTBF thresholds it requires to
- vote for a given router in various roles in the header.
- The format is:
-
- "flag-requirement <flag-name> <field> <op> <value>"
-
- e.g.
-
- "flag-requirement Guard uptime > 80"
-
- It includes info on routers all of whose descriptors were
- uploaded but rejected over the past few hours. The
- "r" lines for these are the same as for regular routers.
- The other lines are omitted for these routers, and are
- replaced with a single "rejected" line, explaining (in
- English) why the router was rejected.
-
-
- A status site (like Torweather or Torstatus or another
- tool) can poll these files when they are generated, collate
- the data, and make it available to server operators.
-
-Risks:
-
- This document makes no provisions for caching these "vote
- info" documents. If many people wind up fetching them
- aggressively from the authorities, that would be bad.
-
-
-
diff --git a/doc/spec/proposals/165-simple-robust-voting.txt b/doc/spec/proposals/165-simple-robust-voting.txt
deleted file mode 100644
index f813285a8..000000000
--- a/doc/spec/proposals/165-simple-robust-voting.txt
+++ /dev/null
@@ -1,133 +0,0 @@
-Filename: 165-simple-robust-voting.txt
-Title: Easy migration for voting authority sets
-Author: Nick Mathewson
-Created: 2009-05-28
-Status: Open
-
-Overview:
-
- This proposal describes an easy-to-implement, easy-to-verify way to
- change the set of authorities without creating a "flag day" situation.
-
-Motivation:
-
- From proposal 134 ("More robust consensus voting with diverse
- authority sets") by Peter Palfrader:
-
- Right now there are about five authoritative directory servers
- in the Tor network, tho this number is expected to rise to about
- 15 eventually.
-
- Adding a new authority requires synchronized action from all
- operators of directory authorities so that at any time during the
- update at least half of all authorities are running and agree on
- who is an authority. The latter requirement is there so that the
- authorities can arrive at a common consensus: Each authority
- builds the consensus based on the votes from all authorities it
- recognizes, and so a different set of recognized authorities will
- lead to a different consensus document.
-
- In response to this problem, proposal 134 suggested that every
- candidate authority list in its vote whom it believes to be an
- authority. These A-says-B-is-an-authority relationships form a
- directed graph. Each authority then iteratively finds the largest
- clique in the graph and removes it, until it finds one containing
- itself. It then votes with this clique.
-
- Proposal 134 had some problems:
-
- - It had a security problem in that M hostile authorities in a
- clique could effectively kick out M-1 honest authorities. This
- could enable a minority of the original authorities to take over.
-
- - It was too complex in its implications to analyze well: it took us
- over a year to realize that it was insecure.
-
- - It tried to solve a bigger problem: general fragmentation of
- authority trust. Really, all we wanted to have was the ability to
- add and remove authorities without forcing a flag day.
-
-Proposed protocol design:
-
- A "Voting Set" is a set of authorities. Each authority has a list of
- the voting sets it considers acceptable. These sets are chosen
- manually by the authority operators. They must always contain the
- authority itself. Each authority lists all of these voting sets in
- its votes.
-
- Authorities exchange votes with every other authority in any of their
- voting sets.
-
- When it is time to calculate a consensus, an authority votes with
- whichever voting set it lists that is listed by the most members of
- that set. In other words, given two sets S1 and S2 that an authority
- lists, that authority will prefer to vote with S1 over S2 whenever
- the number of other authorities in S1 that themselves list S1 is
- higher than the number of other authorities in S2 that themselves
- list S2.
-
- For example, suppose authority A recognizes two sets, "A B C D" and
- "A E F G H". Suppose that the first set is recognized by all of A,
- B, C, and D, whereas the second set is recognized only by A, E, and
- F. Because the first set is recognized by more of the authorities in
- it than the other one, A will vote with the first set.
-
- Ties are broken in favor of some arbitrary function of the identity
- keys of the authorities in the set.
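
The selection rule above is small enough to sketch directly. The function name choose_voting_set and the data layout are hypothetical, and the tie-break shown (the sorted tuple of member identities) is just one arbitrary function of the identity keys, chosen for illustration.

```python
def choose_voting_set(me, listed_by):
    """Pick the voting set that authority `me` votes with.

    listed_by maps each authority to the collection of voting sets
    (frozensets of authority identities) that it lists in its vote.
    """
    def support(voting_set):
        # Number of *other* members of the set that themselves list it.
        return sum(1 for a in voting_set
                   if a != me and voting_set in listed_by.get(a, ()))

    # Highest support wins; ties fall back to an arbitrary but
    # deterministic ordering of the member identities.
    return max(listed_by[me],
               key=lambda vs: (support(vs), tuple(sorted(vs))))
```

Run on the example above, where "A B C D" is listed by A, B, C, and D and "A E F G H" only by A, E, and F, the first set has three supporters besides A and the second has two, so A votes with "A B C D".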
-
-How to migrate authority sets:
-
- In steady state, each authority operator should list only the current
- actual voting set as accepted.
-
- When we want to add an authority, each authority operator configures
- his or her server to list two voting sets: one containing all the old
- authorities, and one containing the old authorities and the new
- authority too. Once all authorities are listing the new set of
- authorities, they will start voting with that set because of its
- size.
-
- What if one or two authority operators are slow to list the new set?
- Then the other operators can stop listing the old set once there are
- enough authorities listing the new set to make its voting successful.
- (Note that these authorities not listing the new set will still have
- their votes counted, since they themselves will be members of the new
- set. They will only fail to sign the consensus generated by the
- other authorities who are using the new set.)
-
- When we want to remove an authority, the operators list two voting
- sets: one containing all the authorities, and one omitting the
- authority we want to remove. Once enough authorities list the new
- set as acceptable, we start having authority operators stop listing
- the old set. Once there are more listing the new set than the old
- set, the new set will win.
-
-Data format changes:
-
- Add a new 'voting-set' line to the vote document format. Allow it to
- occur any number of times. Its format is:
-
- voting-set SP 'fingerprint' SP 'fingerprint' ... NL
-
- where each fingerprint is the hex fingerprint of an identity key of
- an authority. Sort fingerprints in ascending order.
-
- When the consensus method is at least 'X' (decide this when we
- implement the proposal), add this line to the consensus format as
- well, before the first dir-source line. [This information is not
- redundant with the dir-source sections in the consensus: If an
- authority is recognized but didn't vote, that authority will appear in
- the voting-set line but not in the dir-source sections.]
-
- We don't need to list other information about authorities in our
- vote.
-
-Migration issues:
-
- We should keep track somewhere of which Tor client versions
- recognized which authorities.
-
-Acknowledgments:
-
- The design came out of an IRC conversation with Peter Palfrader. He
- had the basic idea first.
diff --git a/doc/spec/proposals/166-statistics-extra-info-docs.txt b/doc/spec/proposals/166-statistics-extra-info-docs.txt
deleted file mode 100644
index ab2716a71..000000000
--- a/doc/spec/proposals/166-statistics-extra-info-docs.txt
+++ /dev/null
@@ -1,391 +0,0 @@
-Filename: 166-statistics-extra-info-docs.txt
-Title: Including Network Statistics in Extra-Info Documents
-Author: Karsten Loesing
-Created: 21-Jul-2009
-Target: 0.2.2
-Status: Accepted
-
-Change history:
-
- 21-Jul-2009 Initial proposal for or-dev
-
-
-Overview:
-
- The Tor network has grown to almost two thousand relays and millions
- of casual users over the past few years. With growth has come
- increasing performance problems and attempts by some countries to
- block access to the Tor network. In order to address these problems,
- we need to learn more about the Tor network. This proposal suggests
- measuring additional statistics and including them in extra-info
- documents to help us understand the Tor network better.
-
-
-Introduction:
-
- As of May 2009, relays, bridges, and directories gather the following
- data for statistical purposes:
-
- - Relays and bridges count the number of bytes that they have pushed
- in 15-minute intervals over the past 24 hours. Relays and bridges
- include these data in extra-info documents that they send to the
- directory authorities whenever they publish their server descriptor.
-
- - Bridges further include a rough number of clients per country that
- they have seen in the past 48 hours in their extra-info documents.
-
- - Directories can be configured to count the number of clients they
- see per country in the past 24 hours and to write them to a local
- file.
-
- Since then, we have extended the network statistics in Tor. These statistics
- include:
-
- - Directories now gather more precise statistics about connecting
- clients. Fixes include measuring in intervals of exactly 24 hours,
- counting unsuccessful requests, measuring download times, etc. The
- directories append their statistics to a local file every 24 hours.
-
- - Entry guards count the number of clients per country per day like
- bridges do and write them to a local file every 24 hours.
-
- - Relays measure statistics of the number of cells in their circuit
- queues and how much time these cells spend waiting there. Relays
- write these statistics to a local file every 24 hours.
-
- - Exit nodes count the number of read and written bytes on exit
- connections per port as well as the number of opened exit streams
- per port in 24-hour intervals. Exit nodes write their statistics to
- a local file.
-
- The following four sections contain descriptions for adding these
- statistics to the relays' extra-info documents.
-
-
-Directory request statistics:
-
- The first type of statistics aims at measuring directory requests sent
- by clients to a directory mirror or directory authority. More
- precisely, these statistics aim at requests for v2 and v3 network
- statuses only. These directory requests are sent non-anonymously,
- either via HTTP-like requests to a directory's Dir port or tunneled
- over a 1-hop circuit.
-
- Measuring directory request statistics is useful for several reasons:
- First, the number of locally seen directory requests can be used to
- estimate the total number of clients in the Tor network. Second, the
- country-wise classification of requests using a GeoIP database can
- help counting the relative and absolute number of users per country.
- Third, the download times can give hints on the available bandwidth
- capacity at clients.
-
- Directory requests do not give any hints on the contents that clients
- send or receive over the Tor network. Every client requests network
- statuses from the directories, so there are no anonymity-related
- concerns in gathering these statistics. It might be, though, that clients
- wish to hide the fact that they are connecting to the Tor network.
- Therefore, IP addresses are resolved to country codes in memory,
- events are accumulated over 24 hours, and numbers are rounded up to
- multiples of 4 or 8.
-
- "dirreq-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- A "dirreq-stats-end" line, as well as any other "dirreq-*" line,
- is only added when the relay has opened its Dir port and after 24
- hours of measuring directory requests.
-
- "dirreq-v2-ips" CC=N,CC=N,... NL
- [At most once.]
- "dirreq-v3-ips" CC=N,CC=N,... NL
- [At most once.]
-
- List of mappings from two-letter country codes to the number of
- unique IP addresses that have connected from that country to
- request a v2/v3 network status, rounded up to the nearest multiple
- of 8. Only those IP addresses are counted that the directory can
- answer with a 200 OK status code.
-
- "dirreq-v2-reqs" CC=N,CC=N,... NL
- [At most once.]
- "dirreq-v3-reqs" CC=N,CC=N,... NL
- [At most once.]
-
- List of mappings from two-letter country codes to the number of
- requests for v2/v3 network statuses from that country, rounded up
- to the nearest multiple of 8. Only those requests are counted that
- the directory can answer with a 200 OK status code.
-
- "dirreq-v2-share" num% NL
- [At most once.]
- "dirreq-v3-share" num% NL
- [At most once.]
-
- The share of v2/v3 network status requests that the directory
- expects to receive from clients based on its advertised bandwidth
- compared to the overall network bandwidth capacity. Shares are
- formatted in percent with two decimal places. Shares are
- calculated as means over the whole 24-hour interval.
-
- "dirreq-v2-resp" status=num,... NL
- [At most once.]
- "dirreq-v3-resp" status=num,... NL
- [At most once.]
-
- List of mappings from response statuses to the number of requests
- for v2/v3 network statuses that were answered with that response
- status, rounded up to the nearest multiple of 4. Only response
- statuses with at least 1 response are reported. New response
- statuses can be added at any time. The current list of response
- statuses is as follows:
-
- "ok": a network status request is answered; this number
- corresponds to the sum of all requests as reported in
- "dirreq-v2-reqs" or "dirreq-v3-reqs", respectively, before
- rounding up.
- "not-enough-sigs": a version 3 network status is not signed by a
- sufficient number of requested authorities.
- "unavailable": a requested network status object is unavailable.
- "not-found": a requested network status is not found.
- "not-modified": a network status has not been modified since the
- If-Modified-Since time that is included in the request.
- "busy": the directory is busy.
-
- "dirreq-v2-direct-dl" key=val,... NL
- [At most once.]
- "dirreq-v3-direct-dl" key=val,... NL
- [At most once.]
- "dirreq-v2-tunneled-dl" key=val,... NL
- [At most once.]
- "dirreq-v3-tunneled-dl" key=val,... NL
- [At most once.]
-
- List of statistics about possible failures in the download process
- of v2/v3 network statuses. Requests are either "direct"
- HTTP-encoded requests over the relay's directory port, or
- "tunneled" requests using a BEGIN_DIR cell over the relay's OR
- port. The list of possible statistics can change, and statistics
- can be left out from reporting. The current list of statistics is
- as follows:
-
- Successful downloads and failures:
-
- "complete": a client has finished the download successfully.
- "timeout": a download did not finish within 10 minutes after
- starting to send the response.
- "running": a download is still running at the end of the
- measurement period and has been running for less than 10
- minutes since the directory started to send the response.
-
- Download times:
-
- "min", "max": smallest and largest measured bandwidth in B/s.
- "d[1-4,6-9]": 1st to 4th and 6th to 9th decile of measured
- bandwidth in B/s. For a given decile i, i/10 of all downloads
- had a smaller bandwidth than di, and (10-i)/10 of all downloads
- had a larger bandwidth than di.
- "q[1,3]": 1st and 3rd quartile of measured bandwidth in B/s. One
- fourth of all downloads had a smaller bandwidth than q1, one
- fourth of all downloads had a larger bandwidth than q3, and the
- remaining half of all downloads had a bandwidth between q1 and
- q3.
- "md": median of measured bandwidth in B/s. Half of the downloads
- had a smaller bandwidth than md, the other half had a larger
- bandwidth than md.
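
To make the rounding rule for the per-country lines concrete, here is a minimal sketch that builds a "dirreq-v3-ips" style line. The helper names are hypothetical, and emitting countries in alphabetical order is an assumption; the text above does not fix an order.

```python
def round_up(n, multiple):
    """Round n up to the nearest multiple (integer ceiling arithmetic)."""
    return -(-n // multiple) * multiple


def format_country_counts(keyword, counts):
    """Format 'keyword CC=N,CC=N,... NL' with counts rounded up to a
    multiple of 8. Countries with no observations are left out."""
    items = ",".join("%s=%d" % (cc, round_up(n, 8))
                     for cc, n in sorted(counts.items()) if n > 0)
    return "%s %s\n" % (keyword, items)
```

For example, raw counts of 1, 8, and 9 are reported as 8, 8, and 16, so a reader of the extra-info document cannot tell a single client from eight.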
-
-
-Entry guard statistics:
-
- Entry guard statistics include the number of clients per country and
- per day that are connecting directly to an entry guard.
-
- Entry guard statistics are important to learn more about the
- distribution of clients to countries. In the future, this knowledge
- can be useful to detect if there are or start to be any restrictions
- for clients connecting from specific countries.
-
- Information about which client connects to a given entry guard is very
- sensitive. It must not be combined with information about what
- content leaves the network at the exit nodes. Therefore,
- entry guard statistics need to be aggregated to prevent them from
- becoming useful for de-anonymization. Aggregation includes resolving
- IP addresses to country codes, counting events over 24-hour intervals,
- and rounding up numbers to the next multiple of 8.
-
- "entry-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- An "entry-stats-end" line, as well as any other "entry-*"
- line, is first added after the relay has been running for at least
- 24 hours.
-
- "entry-ips" CC=N,CC=N,... NL
- [At most once.]
-
- List of mappings from two-letter country codes to the number of
- unique IP addresses that have connected from that country to the
- relay and that are not known to be other relays, rounded up to the
- nearest multiple of 8.
-
-
-Cell statistics:
-
- The third type of statistics has to do with the time that cells spend
- in circuit queues. In order to gather these statistics, the relay
- memorizes when it puts a given cell in a circuit queue and when this
- cell is flushed. The relay further notes the life time of the circuit.
- These data are sufficient to determine the mean number of cells in a
- queue over time and the mean time that cells spend in a queue.
-
- Cell statistics are necessary to learn more about possible reasons for
- the poor network performance of the Tor network, especially high
- latencies. The same statistics are also useful to determine the
- effects of design changes by comparing today's data with future data.
-
- There are basically no privacy concerns from measuring cell
- statistics, regardless of a node being an entry, middle, or exit node.
-
- "cell-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- A "cell-stats-end" line, as well as any other "cell-*" line,
- is first added after the relay has been running for at least 24
- hours.
-
- "cell-processed-cells" num,...,num NL
- [At most once.]
-
- Mean number of processed cells per circuit, subdivided into
- deciles of circuits by the number of cells they have processed in
- descending order from loudest to quietest circuits.
-
- "cell-queued-cells" num,...,num NL
- [At most once.]
-
- Mean number of cells contained in queues by circuit decile. These
- means are calculated by 1) determining the mean number of cells in
- a single circuit between its creation and its termination and 2)
- calculating the mean for all circuits in a given decile as
- determined in "cell-processed-cells". Numbers have a precision of
- two decimal places.
-
- "cell-time-in-queue" num,...,num NL
- [At most once.]
-
- Mean time cells spend in circuit queues in milliseconds. Times are
- calculated by 1) determining the mean time cells spend in the
- queue of a single circuit and 2) calculating the mean for all
- circuits in a given decile as determined in
- "cell-processed-cells".
-
- "cell-circuits-per-decile" num NL
- [At most once.]
-
- Mean number of circuits that are included in any of the deciles,
- rounded up to the next integer.
-
-
-Exit statistics:
-
- The last type of statistics concerns exit nodes, which count the number
- of bytes written and read and the number of streams opened per port and
- per 24 hours. Exit port statistics can be measured by looking at
- headers of BEGIN and DATA cells. A BEGIN cell contains the exit port
- that is required for the exit node to open a new exit stream.
- Subsequent DATA cells coming from the client or being sent back to the
- client contain a length field stating how many bytes of application
- data are contained in the cell.
-
- Exit port statistics are important to measure in order to identify
- possible load-balancing problems with respect to exit policies. Exit
- nodes that permit more ports than others are very likely overloaded
- with traffic for those ports plus traffic for other ports. Improving
- load balancing in the Tor network improves the overall utilization of
- bandwidth capacity.
-
- Exit traffic is one of the most sensitive parts of network data in the
- Tor network. Even though these statistics do not require looking at
- traffic contents, statistics are aggregated so that they are not
- useful for de-anonymizing users. Only those ports are reported that
- have seen at least 0.1% of exiting or incoming bytes, numbers of bytes
- are rounded up to full kibibytes (KiB), and stream numbers are rounded
- up to the next multiple of 4.
-
- "exit-stats-end" YYYY-MM-DD HH:MM:SS (NSEC s) NL
- [At most once.]
-
- YYYY-MM-DD HH:MM:SS defines the end of the included measurement
- interval of length NSEC seconds (86400 seconds by default).
-
- An "exit-stats-end" line, as well as any other "exit-*" line, is
- first added after the relay has been running for at least 24 hours
- and only if the relay permits exiting (where exiting to a single
- port and IP address is sufficient).
-
- "exit-kibibytes-written" port=N,port=N,... NL
- [At most once.]
- "exit-kibibytes-read" port=N,port=N,... NL
- [At most once.]
-
- List of mappings from ports to the number of kibibytes that the
- relay has written to or read from exit connections to that port,
- rounded up to the next full kibibyte.
-
- "exit-streams-opened" port=N,port=N,... NL
- [At most once.]
-
- List of mappings from ports to the number of opened exit streams
- to that port, rounded up to the next multiple of 4.
-
-
-Implementation notes:
-
- Right now, relays that are configured accordingly write similar
- statistics to those described in this proposal to disk every 24 hours.
- With this proposal being implemented, relays include the contents of
- these files in extra-info documents.
-
- The following steps are necessary to implement this proposal:
-
- 1. The current format of [dirreq|entry|buffer|exit]-stats files needs
- to be adapted to the description in this proposal. This step
- basically means renaming keywords.
-
- 2. The timing of writing the four *-stats files should be unified, so
- that they are written exactly 24 hours after starting the
- relay. Right now, the measurement intervals for dirreq, entry, and
- exit stats start with the first observed request, and files are
- written when observing the first request that occurs more than 24
- hours after the beginning of the measurement interval. With this
- proposal, the measurement intervals should all start at the same
- time, and files should be written exactly 24 hours later.
-
- 3. It is advantageous to cache statistics in local files in the data
- directory until they are included in extra-info documents. The
- reason is that the 24-hour measurement interval can be very
- different from the 18-hour publication interval of extra-info
- documents. When a relay crashes after finishing a measurement
- interval, but before publishing the next extra-info document,
- statistics would get lost. Therefore, statistics are written to
- disk when finishing a measurement interval and read from disk when
- generating an extra-info document. Only the statistics that were
- appended to the *-stats files within the past 24 hours are included
- in extra-info documents. Further, the contents of the *-stats files
- need to be checked in the process of generating extra-info documents.
-
- 4. With the statistics patches being tested, the ./configure options
- should be removed and the statistics code be compiled by default.
- It is still required for relay operators to add configuration
- options (DirReqStatistics, ExitPortStatistics, etc.) to enable
- gathering statistics. However, in the near future, statistics shall
- be gathered by all relays by default, where requiring a
- ./configure option would be a barrier for many relay operators.
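A minimal Python sketch of the exit-statistics aggregation rules from the "Exit statistics" part of the proposal above: ports below 0.1% of bytes are dropped, byte counts are rounded up to full KiB, and stream counts are rounded up to the next multiple of 4. The function and variable names are hypothetical. Reading the 0.1% rule as "at least 0.1% of written bytes or at least 0.1% of read bytes" is an assumption, and this is not the Tor implementation.

```python
def ceil_div(a, b):
    """Integer division that rounds up (for non-negative a, positive b)."""
    return -(-a // b)


def exit_stats_lines(written, read, streams):
    """Build the three exit-* lines from per-port raw counters.

    written/read map port -> bytes, streams maps port -> opened streams.
    """
    total_written = sum(written.values())
    total_read = sum(read.values())

    def significant(port):
        # count * 1000 >= total is the integer form of "count >= 0.1% of total".
        w = written.get(port, 0)
        r = read.get(port, 0)
        return (w > 0 and w * 1000 >= total_written) or \
               (r > 0 and r * 1000 >= total_read)

    ports = sorted(p for p in set(written) | set(read) if significant(p))

    def fmt(keyword, values):
        return keyword + " " + ",".join(f"{p}={values[p]}" for p in ports)

    # Bytes are rounded up to full kibibytes (1 KiB = 1024 bytes).
    kib_written = {p: ceil_div(written.get(p, 0), 1024) for p in ports}
    kib_read = {p: ceil_div(read.get(p, 0), 1024) for p in ports}
    # Stream counts are rounded up to the next multiple of 4.
    opened = {p: ceil_div(streams.get(p, 0), 4) * 4 for p in ports}

    return [
        fmt("exit-kibibytes-written", kib_written),
        fmt("exit-kibibytes-read", kib_read),
        fmt("exit-streams-opened", opened),
    ]
```

Using integer comparisons for the threshold avoids floating-point edge cases when a port sits exactly at 0.1%.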
diff --git a/doc/spec/proposals/167-params-in-consensus.txt b/doc/spec/proposals/167-params-in-consensus.txt
deleted file mode 100644
index d23bc9c01..000000000
--- a/doc/spec/proposals/167-params-in-consensus.txt
+++ /dev/null
@@ -1,47 +0,0 @@
-Filename: 167-params-in-consensus.txt
-Title: Vote on network parameters in consensus
-Author: Roger Dingledine
-Created: 18-Aug-2009
-Status: Closed
-Implemented-In: 0.2.2
-
-0. History
-
-
-1. Overview
-
- Several of our new performance plans involve guessing how to tune
- clients and relays, yet we won't be able to learn whether we guessed
- the right tuning parameters until many people have upgraded. Instead,
- we should have directory authorities vote on the parameters, and teach
- Tors to read the currently recommended values out of the consensus.
-
-2. Design
-
- V3 votes should include a new "params" line after the known-flags
- line. It contains key=value pairs, where value is an integer.
-
- Consensus documents that are generated with a sufficiently new consensus
- method (7?) then include a params line that includes every key listed
- in any vote, and the median value for that key (in case of ties,
- we use the median closer to zero).
-
-2.1. Planned keys.
-
- The first planned parameter is "circwindow=101", which is the initial
- circuit packaging window that clients and relays should use. Putting
- it in the consensus will let us perform experiments with different
- values once enough Tors have upgraded -- see proposal 168.
-
- Later parameters might include a weighting for how much to favor quiet
- circuits over loud circuits in our round-robin algorithm; a weighting
- for how much to prioritize relays over clients if we use an incentive
- scheme like the gold-star design; and what fraction of circuits we
- should throw out from proposal 151.
-
-2.2. What about non-integers?
-
- I'm not sure how we would do median on non-integer values. Further,
- I don't have any non-integer values in mind yet. So I say we cross
- that bridge when we get to it.
-
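The median rule from section 2 of proposal 167 above can be sketched in Python as follows. The helper names are hypothetical, and the space-separated "params key=value ..." line syntax is an assumption (the proposal only says the line contains key=value pairs with integer values). With an even number of votes for a key there are two middle values, and the sketch picks the one closer to zero.

```python
def median_closer_to_zero(values):
    """Median of a non-empty list of integers; on ties pick the middle
    value with the smaller absolute value."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    low, high = ordered[n // 2 - 1], ordered[n // 2]
    return low if abs(low) <= abs(high) else high


def parse_params_line(line):
    """Parse an assumed 'params k1=v1 k2=v2' line into a dict of ints."""
    parts = line.split()
    if not parts or parts[0] != "params":
        raise ValueError("not a params line")
    result = {}
    for pair in parts[1:]:
        key, value = pair.split("=", 1)
        result[key] = int(value)
    return result


def consensus_params(votes):
    """Combine per-authority param dicts: every key listed in any vote
    appears in the result, with the median of the values voted for it."""
    collected = {}
    for vote in votes:
        for key, value in vote.items():
            collected.setdefault(key, []).append(value)
    return {key: median_closer_to_zero(vals)
            for key, vals in sorted(collected.items())}
```

Note that a key voted on by a single authority still makes it into the result under this reading, because the text says "every key listed in any vote".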
diff --git a/doc/spec/proposals/168-reduce-circwindow.txt b/doc/spec/proposals/168-reduce-circwindow.txt
deleted file mode 100644
index c10cf41e2..000000000
--- a/doc/spec/proposals/168-reduce-circwindow.txt
+++ /dev/null
@@ -1,134 +0,0 @@
-Filename: 168-reduce-circwindow.txt
-Title: Reduce default circuit window
-Author: Roger Dingledine
-Created: 12-Aug-2009
-Status: Open
-Target: 0.2.2
-
-0. History
-
-
-1. Overview
-
- We should reduce the starting circuit "package window" from 1000 to
- 101. The lower package window will mean that clients will only be able
- to receive 101 cells (~50KB) on a circuit before they need to send a
- 'sendme' acknowledgement cell to request 100 more.
-
- Starting with a lower package window on exit relays should save on
- buffer sizes (and thus memory requirements for the exit relay), and
- should save on queue sizes (and thus latency for users).
-
- Lowering the package window will induce an extra round-trip for every
- additional 50298 bytes of the circuit. This extra step is clearly a
- slow-down for large streams, but ultimately we hope that a) clients
- fetching smaller streams will see better response, and b) slowing
- down the large streams in this way will produce lower e2e latencies,
- so the round-trips won't be so bad.
-
-2. Motivation
-
- Karsten's torperf graphs show that the median download time for a 50KB
- file over Tor in mid 2009 is 7.7 seconds, whereas the median download
- time for 1MB and 5MB are around 50s and 150s respectively. The 7.7
- second figure is way too high, whereas the 50s and 150s figures are
- surprisingly low.
-
- The median round-trip latency appears to be around 2s, with 25% of
- the data points taking more than 5s. That's a lot of variance.
-
- We originally designed Tor with the goal of maximizing
- throughput. We figured that would also optimize other network properties
- like round-trip latency. Looks like we were wrong.
-
-3. Design
-
- Wherever we initialize the circuit package window, initialize it to
- 101 rather than 1000. Reducing it should be safe even when interacting
- with old Tors: the old Tors will receive the 101 cells and send back
- a sendme ack cell. They'll still have much higher deliver windows,
- but the rest of their deliver window will go unused.
-
- You can find the patch at arma/circwindow. It seems to work.
-
-3.1. Why not 100?
-
- Tor 0.0.0 through 0.2.1.19 have a bug where they only send the sendme
- ack cell after 101 cells rather than the intended 100 cells.
-
- Once 0.2.1.19 is obsolete we can change it back to 100 if we like. But
- hopefully we'll have moved to some datagram protocol long before
- 0.2.1.19 becomes obsolete.
-
-3.2. What about stream packaging windows?
-
- Right now the stream packaging windows start at 500. The goal was to
- set the stream window to half the circuit window, to provide a crude
- load balancing between streams on the same circuit. Once we lower
- the circuit packaging window, the stream packaging window basically
- becomes redundant.
-
- We could leave it in -- it isn't hurting much in either case. Or we
- could take it out -- people building other Tor clients would thank us
- for that step. Alas, people building other Tor clients are going to
- have to be compatible with current Tor clients, so in practice there's
- no point taking out the stream packaging windows.
-
-3.3. What about variable circuit windows?
-
- Once upon a time we imagined adapting the circuit package window to
- the network conditions. That is, we would start the window small,
- and raise it based on the latency and throughput we see.
-
- In theory that crude imitation of TCP's windowing system would allow
- us to adapt to fill the network better. In practice, I think we want
- to stick with the small window and never raise it. The low cap reduces
- the total throughput you can get from Tor for a given circuit. But
- that's a feature, not a bug.
-
-4. Evaluation
-
- How do we know this change is actually smart? It seems intuitive that
- it's helpful, and some smart systems people have agreed that it's
- a good idea (or said another way, they were shocked at how big the
- default package window was before).
-
- To get a more concrete sense of the benefit, though, Karsten has been
- running torperf side-by-side on exit relays with the old package window
- vs the new one. The results are mixed currently -- it is slightly faster
- for fetching 40KB files, and slightly slower for fetching 50KB files.
-
- I think it's going to be tough to get a clear conclusion that this is
- a good design just by comparing one exit relay running the patch. The
- trouble is that the other hops in the circuits are still getting bogged
- down by other clients introducing too much traffic into the network.
-
- Ultimately, we'll want to put the circwindow parameter into the
- consensus so we can test a broader range of values once enough relays
- have upgraded.
-
-5. Transition and deployment
-
- We should put the circwindow in the consensus (see proposal 167),
- with an initial value of 101. Then as more exit relays upgrade,
- clients should seamlessly get the better behavior.
-
- Note that upgrading the exit relay will only affect the "download"
- package window. An old client that's uploading lots of bytes will
- continue to use the old package window at the client side, and we
- can't throttle that window at the exit side without breaking protocol.
-
- The real question then is what we should backport to 0.2.1. Assuming
- this could be a big performance win, we can't afford to wait until
- 0.2.2.x comes out before starting to see the changes here. So we have
- two options as I see them:
- a) once clients in 0.2.2.x know how to read the value out of the
- consensus, and it's been tested for a bit, backport that part to
- 0.2.1.x.
- b) if it's too complex to backport, just pick a number, like 101, and
- backport that number.
-
- Clearly choice (a) is the better one if the consensus parsing part
- isn't very complex. Let's shoot for that, and fall back to (b) if the
- patch turns out to be so big that we reconsider.
-
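The arithmetic behind the overview of proposal 168 above can be checked with a short Python sketch. The figure of 498 payload bytes per cell is an assumption not stated in the proposal; it is the value for which 101 cells come to the 50298 bytes quoted there. The function names are hypothetical, and the model follows the proposal's own simplification of one extra round-trip per additional full window, ignoring acknowledgements already in flight.

```python
RELAY_PAYLOAD_BYTES = 498  # assumed data bytes per cell (101 * 498 = 50298)


def window_bytes(window_cells):
    """Bytes that fit into one package window of the given size."""
    return window_cells * RELAY_PAYLOAD_BYTES


def extra_round_trips(size_bytes, window_cells=101):
    """Extra round-trips under the simplified model: the first window is
    free, and each further window (or part of one) costs one round-trip."""
    if size_bytes <= 0:
        return 0
    per_window = window_bytes(window_cells)
    windows_needed = -(-size_bytes // per_window)  # ceiling division
    return windows_needed - 1
```

Under these assumptions a 1 MiB download needs 20 extra round-trips with a window of 101 cells, which is why the proposal expects large streams to slow down while small fetches benefit from shorter queues.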
diff --git a/doc/spec/proposals/169-eliminating-renegotiation.txt b/doc/spec/proposals/169-eliminating-renegotiation.txt
deleted file mode 100644
index 2c90f9c9e..000000000
--- a/doc/spec/proposals/169-eliminating-renegotiation.txt
+++ /dev/null
@@ -1,404 +0,0 @@
-Filename: 169-eliminating-renegotiation.txt
-Title: Eliminate TLS renegotiation for the Tor connection handshake
-Author: Nick Mathewson
-Created: 27-Jan-2010
-Status: Draft
-Target: 0.2.2
-
-1. Overview
-
- I propose a backward-compatible change to the Tor connection
- establishment protocol to avoid the use of TLS renegotiation.
-
- Rather than doing a TLS renegotiation to exchange certificates
- and authenticate the original handshake, this proposal takes an
- approach similar to Steven Murdoch's proposal 124, and uses Tor
- cells to finish authenticating the parties' identities once the
- initial TLS handshake is finished.
-
- Terminological note: I use "client" below to mean the Tor
- instance (a client or a relay) that initiates a TLS connection,
- and "server" to mean the Tor instance (a relay) that accepts it.
-
-2. Motivation and history
-
- In the original Tor TLS connection handshake protocol ("V1", or
- "two-cert"), parties that wanted to authenticate provided a
- two-cert chain of X.509 certificates during the handshake setup
- phase. Every party that wanted to authenticate sent these
- certificates.
-
- In the current Tor TLS connection handshake protocol ("V2", or
- "renegotiating"), the parties begin with a single certificate
- sent from the server (responder) to the client (initiator), and
- then renegotiate to a two-certs-from-each-authenticating party.
- We made this change to make Tor's handshake look like a browser
- speaking SSL to a webserver. (See proposal 130, and
- tor-spec.txt.) To tell whether to use the V1 or V2 handshake,
- servers look at the list of ciphers sent by the client. (This is
- ugly, but there's not much else in the ClientHello that they can
- look at.) If the list contains any cipher not used by the V1
- protocol, the server sends back a single cert and expects a
- renegotiation. If the client gets back a single cert, then it
- withholds its own certificates until the TLS renegotiation phase.
-
- In other words, initiator behavior now looks like this:
-
- - Begin TLS negotiation with V2 cipher list; wait for
- certificate(s).
- - If we get a certificate chain:
- - Then we are using the V1 handshake. Send our own
- certificate chain as part of this initial TLS handshake
- if we want to authenticate; otherwise, send no
- certificates. When the handshake completes, check
- certificates. We are now mutually authenticated.
-
- Otherwise, if we get just a single certificate:
- - Then we are using the V2 handshake. Do not send any
- certificates during this handshake.
- - When the handshake is done, immediately start a TLS
- renegotiation. During the renegotiation, expect
- a certificate chain from the server; send a certificate
- chain of our own if we want to authenticate ourselves.
- - After the renegotiation, check the certificates. Then
- send (and expect) a VERSIONS cell from the other side to
- establish the link protocol version.
-
- And V2 responder behavior now looks like this:
-
- - When we get a TLS ClientHello request, look at the cipher
- list.
- - If the cipher list contains only the V1 ciphersuites:
- - Then we're doing a V1 handshake. Send a certificate
- chain. Expect a possible client certificate chain in
- response.
- Otherwise, if we get other ciphersuites:
- - We're using the V2 handshake. Send back a single
- certificate and let the handshake complete.
- - Do not accept any data until the client has renegotiated.
- - When the client is renegotiating, send a certificate
- chain, and expect (possibly multiple) certificates in
- reply.
- - Check the certificates when the renegotiation is done.
- Then exchange VERSIONS cells.
-
- Late in 2009, researchers found a flaw in most applications' use
- of TLS renegotiation: Although TLS renegotiation does not
- reauthenticate any information exchanged before the renegotiation
- takes place, many applications were treating it as though it did,
- and assuming that data sent _before_ the renegotiation was
- authenticated with the credentials negotiated _during_ the
- renegotiation. This problem was exacerbated by the fact that
- most TLS libraries don't actually give you an obvious good way to
- tell where the renegotiation occurred relative to the datastream.
- Tor wasn't directly affected by this vulnerability, but its
- aftermath hurts us in a few ways:
-
- 1) OpenSSL has disabled renegotiation by default, and created
- a "yes we know what we're doing" option we need to set to
- turn it back on. (Two options, actually: one for openssl
- 0.9.8l and one for 0.9.8m and later.)
-
- 2) Some vendors have removed all renegotiation support from
- their versions of OpenSSL entirely, forcing us to tell
- users to either replace their versions of OpenSSL or to
- link Tor against a hand-built one.
-
- 3) Because of 1 and 2, I'd expect TLS renegotiation to become
- rarer and rarer in the wild, making our own use stand out
- more.
-
-3. Design
-
-3.1. The view in the large
-
- Taking a cue from Steven Murdoch's proposal 124, I propose that
- we move the work currently done by the TLS renegotiation step
- (that is, authenticating the parties to one another) and do it
- with Tor cells instead of with TLS.
-
- Using _yet another_ variant response from the responder (server),
- we allow the client to learn that it doesn't need to rehandshake
- and can instead use a cell-based authentication system. Once the
- TLS handshake is done, the client and server exchange VERSIONS
- cells to determine link protocol version (including
- handshake version). If they're using the handshake version
- specified here, the client and server arrive at link protocol
- version 3 (or higher), and use cells to exchange further
- authentication information.
-
-3.2. New TLS handshake variant
-
- We already used the list of ciphers from the clienthello to
- indicate whether the client can speak the V2 ("renegotiating")
- handshake or later, so we can't encode more information there.
-
- We can, however, change the DN in the certificate passed by the
- server back to the client. Currently, all V2 certificates are
- generated with CN values ending with ".net". I propose that we
- have the ".net" commonName ending reserved to indicate the V2
- protocol, and use commonName values ending with ".com" to
- indicate the V3 ("minimal") handshake described herein.
-
- Now, once the initial TLS handshake is done, the client can look
- at the server's certificate(s). If there is a certificate chain,
- the handshake is V1. If there is a single certificate whose
- subject commonName ends in ".net", the handshake is V2 and the
- client should try to renegotiate as it would currently.
- Otherwise, the client should assume that the handshake is V3+.
- [Servers should _only_ send ".com" addresses, to allow room for
- more signaling in the future.]
-
-3.3. Authenticating inside Tor
-
- Once the TLS handshake is finished, if the client renegotiates,
- then the server should go on as it does currently.
-
- If the client implements this proposal, however, and the server
- has shown it can understand the V3+ handshake protocol, the
- client immediately sends a VERSIONS cell to the server
- and waits to receive a VERSIONS cell in return. We negotiate
- the Tor link protocol version _before_ we proceed with the
- negotiation, in case we need to change the authentication
- protocol in the future.
-
- Once either party has seen the VERSIONS cell from the other, it
- knows which version they will pick (that is, the highest version
- shared by both parties' VERSIONS cells). All Tor instances using
- the handshake protocol described in 3.2 MUST support at least
- link protocol version 3 as described here.
-
- On learning the link protocol, the server then sends the client a
- CERT cell and a NETINFO cell. If the client wants to
- authenticate to the server, it sends a CERT cell, an AUTHENTICATE
- cell, and a NETINFO cell, or it may simply send a NETINFO cell if
- it does not want to authenticate.
-
- The CERT cell describes the keys that a Tor instance is claiming
- to have. It is a variable-length cell. Its payload format is:
-
- N: Number of certs in cell [1 octet]
- N times:
- CLEN [2 octets]
- Certificate [CLEN octets]
-
- Any extra octets at the end of a CERT cell MUST be ignored.
-
- Each certificate has the form:
-
- CertType [1 octet]
- CertPurpose [1 octet]
- PublicKeyLen [2 octets]
- PublicKey [PublicKeyLen octets]
- NotBefore [4 octets]
- NotAfter [4 octets]
- SignerID [HASH256_LEN octets]
- SignatureLen [2 octets]
- Signature [SignatureLen octets]
-
- where CertType is 1 (meaning "RSA/SHA256")
- CertPurpose is 1 (meaning "link certificate")
- PublicKey is the DER encoding of the ASN.1 representation
- of the RSA key of the subject of this certificate,
- NotBefore is a time in HOURS since January 1, 1970, 00:00
- UTC before which this certificate should not be
- considered valid.
- NotAfter is a time in HOURS since January 1, 1970, 00:00
- UTC after which this certificate should not be
- considered valid.
- SignerID is the SHA-256 digest of the public key signing
- this certificate
- and Signature is the signature of all other fields in
- this certificate, using SHA256 as described in proposal
- 158.
-
- While authenticating, a server need send only a self-signed
- certificate for its identity key. (Its TLS certificate already
- contains its link key signed by its identity key.) A client that
- wants to authenticate MUST send two certificates: one containing
- a public link key signed by its identity key, and one self-signed
- cert for its identity.
-
- Tor instances MUST ignore any certificate with an unrecognized
- CertType or CertPurpose, and MUST ignore extra bytes in the cert.
-
- The AUTHENTICATE cell proves to the server that the client with
- whom it completed the initial TLS handshake is the one possessing
- the link public key in its certificate. It is a variable-length
- cell. Its contents are:
-
- SignatureType [2 octets]
- SignatureLen [2 octets]
- Signature [SignatureLen octets]
-
- where SignatureType is 1 (meaning "RSA-SHA256") and Signature is
- an RSA-SHA256 signature of the HMAC-SHA256, using the TLS master
- secret key as its key, of the following elements:
-
- - The SignatureType field (0x00 0x01)
- - The NUL terminated ASCII string: "Tor certificate verification"
- - client_random, as sent in the Client Hello
- - server_random, as sent in the Server Hello
-
- Once the above handshake is complete, the client knows (from the
- initial TLS handshake) that it has a secure connection to an
- entity that controls a given link public key, and knows (from the
- CERT cell) that the link public key is a valid public key for a
- given Tor identity.
-
- If the client authenticates, the server learns from the CERT cell
- that a given Tor identity has a given current public link key.
- From the AUTHENTICATE cell, it knows that an entity with that
- link key knows the master secret for the TLS connection, and
- hence must be the party with whom it's talking, if TLS works.
-
-3.4. Security checks
-
- If the TLS handshake indicates a V2 or V3+ connection, the server
- MUST reject any connection from the client that does not begin
- with either a renegotiation attempt or a VERSIONS cell containing
- at least link protocol version "3". If the TLS handshake
- indicates a V3+ connection, the client MUST reject any connection
- where the server sends anything before the client has sent a
- VERSIONS cell, and any connection where the VERSIONS cell does
- not contain at least link protocol version "3".
-
- If link protocol version 3 is chosen:
-
- Clients and servers MUST check that all digests and signatures
- on the certificates in CERT cells they are given are as
- described above.
-
- After the VERSIONS cell, clients and servers MUST close the
- connection if anything besides a CERT or AUTH cell is sent
- before the
-
- CERT or AUTHENTICATE cells anywhere after the first NETINFO
- cell must be rejected.
-
- ... [write more here. What else?] ...
-
-3.5. Summary
-
- We now revisit the protocol outlines from section 2 to incorporate
- our changes. New or modified steps are marked with a *.
-
- The new initiator behavior now looks like this:
-
- - Begin TLS negotiation with V2 cipher list; wait for
- certificate(s).
- - If we get a certificate chain:
- - Then we are using the V1 handshake. Send our own
- certificate chain as part of this initial TLS handshake
- if we want to authenticate; otherwise, send no
- certificates. When the handshake completes, check
- certificates. We are now mutually authenticated.
- Otherwise, if we get just a single certificate:
- - Then we are using the V2 or the V3+ handshake. Do not
- send any certificates during this handshake.
- * When the handshake is done, look at the server's
- certificate's subject commonName.
- * If it ends with ".net", we're doing a V2 handshake:
- - Immediately start a TLS renegotiation. During the
- renegotiation, expect a certificate chain from the
- server; send a certificate chain of our own if we
- want to authenticate ourselves.
- - After the renegotiation, check the certificates. Then
- send (and expect) a VERSIONS cell from the other side
- to establish the link protocol version.
- * If it ends with anything else, assume a V3 or later
- handshake:
- * Send a VERSIONS cell, and wait for a VERSIONS cell
- from the server.
- * If we are authenticating, send CERT and AUTHENTICATE
- cells.
- * Send a NETINFO cell. Wait for a CERT and a NETINFO
- cell from the server.
- * If the CERT cell contains a valid self-identity cert,
- and the identity key in the cert can be used to check
- the signature on the x.509 certificate we got during
- the TLS handshake, then we know we connected to the
- server with that identity. If any of these checks
- fail, or the identity key was not what we expected,
- then we close the connection.
- * Once the NETINFO cell arrives, continue as before.
-
- And V3+ responder behavior now looks like this:
-
- - When we get a TLS ClientHello request, look at the cipher
- list.
-
- - If the cipher list contains only the V1 ciphersuites:
- - Then we're doing a V1 handshake. Send a certificate
- chain. Expect a possible client certificate chain in
- response.
- Otherwise, if we get other ciphersuites:
- We're using the V2 or V3+ handshake. Send back a single
- certificate whose subject commonName ends with ".com",
- and let the handshake complete.
- * If the client does anything besides renegotiate or send a
- VERSIONS cell, drop the connection.
- - If the client renegotiates immediately, it's a V2
- connection:
- - When the client is renegotiating, send a certificate
- chain, and expect (possibly multiple) certificates in
- reply.
- - Check the certificates when the renegotiation is done.
- Then exchange VERSIONS cells.
- * Otherwise we got a VERSIONS cell and it's a V3 handshake.
- * Send a VERSIONS cell, a CERT cell, an AUTHENTICATE
- cell, and a NETINFO cell.
- * Wait for the client to send cells in reply. If the
- client sends a CERT and an AUTHENTICATE and a NETINFO,
- use them to authenticate the client. If the client
- sends a NETINFO, it is unauthenticated. If it sends
- anything else before its NETINFO, it's rejected.
-
-4. Numbers to assign
-
- We need a version number for this link protocol. I've been
- calling it "3".
-
- We need to reserve command numbers for CERT and AUTH cells. I
- suggest that in link protocol 3 and higher, we reserve command
- numbers 128..240 for variable-length cells. (241-256 we can hold
- for future extensions.)
-
-5. Efficiency
-
- This protocol adds a round-trip step when the client sends a
- VERSIONS cell to the server, and waits for the {VERSIONS, CERT,
- NETINFO} response in turn. (The server then waits for the
- client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
- but it would have already been waiting for the client's NETINFO,
- so that's not an additional wait.)
-
- This is actually fewer round-trip steps than required before for
- TLS renegotiation, so that's a win.
-
-6. Open questions:
-
- - Should we use X.509 certificates instead of the certificate-ish
- things we describe here? They are more standard, but more ugly.
-
- - May we cache which certificates we've already verified? It
- might leak in timing whether we've connected with a given server
- before, and how recently.
-
- - Is there a better secret than the master secret to use in the
- AUTHENTICATE cell? Say, a portable one? Can we get at it for
- other libraries besides OpenSSL?
-
- - Does using the client_random and server_random data in the
- AUTHENTICATE message actually help us? How hard is it to pull
- them out of the OpenSSL data structure?
-
- - Can we give some way for clients to signal "I want to use the
- V3 protocol if possible, but I can't renegotiate, so don't give
- me the V2"? Clients currently have a fair idea of server
- versions, so they could potentially do the V3+ handshake with
- servers that support it, and fall back to V1 otherwise.
-
- - What should servers that don't have TLS renegotiation do? For
- now, I think they should just get it. Eventually we can
- deprecate the V2 handshake as we did with the V1 handshake.
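The CERT cell payload layout from section 3.3 of proposal 169 above can be sketched with Python's struct module. This is an illustration under stated assumptions: multi-octet fields are taken to be big-endian (network order), HASH256_LEN is taken to be 32, and all names are hypothetical. The sketch only handles framing; it does not verify signatures, digests, or validity times.

```python
import struct
from collections import namedtuple

HASH256_LEN = 32  # assumed length of a SHA-256 digest in octets

Cert = namedtuple(
    "Cert",
    "cert_type purpose public_key not_before not_after signer_id signature",
)


def pack_cert(cert):
    """Encode one certificate in the field order given in the proposal."""
    return (struct.pack("!BBH", cert.cert_type, cert.purpose,
                        len(cert.public_key))
            + cert.public_key
            + struct.pack("!II", cert.not_before, cert.not_after)  # hours
            + cert.signer_id
            + struct.pack("!H", len(cert.signature))
            + cert.signature)


def parse_cert(data):
    """Decode one certificate; extra bytes after the signature are ignored."""
    cert_type, purpose, key_len = struct.unpack_from("!BBH", data, 0)
    pos = 4
    public_key = data[pos:pos + key_len]
    pos += key_len
    not_before, not_after = struct.unpack_from("!II", data, pos)
    pos += 8
    signer_id = data[pos:pos + HASH256_LEN]
    pos += HASH256_LEN
    (sig_len,) = struct.unpack_from("!H", data, pos)
    pos += 2
    signature = data[pos:pos + sig_len]
    if len(public_key) != key_len or len(signature) != sig_len:
        raise ValueError("truncated certificate")
    return Cert(cert_type, purpose, public_key, not_before, not_after,
                signer_id, signature)


def pack_cert_cell(certs):
    """Encode a CERT cell payload: N, then N times (CLEN, Certificate)."""
    body = bytes([len(certs)])
    for cert in certs:
        encoded = pack_cert(cert)
        body += struct.pack("!H", len(encoded)) + encoded
    return body


def parse_cert_cell(payload):
    """Decode a CERT cell payload, skipping certificates whose CertType or
    CertPurpose is not 1 and ignoring octets after the last certificate."""
    count = payload[0]
    pos = 1
    certs = []
    for _ in range(count):
        (clen,) = struct.unpack_from("!H", payload, pos)
        pos += 2
        raw = payload[pos:pos + clen]
        if len(raw) != clen:
            raise ValueError("truncated CERT cell")
        pos += clen
        # Check type and purpose first: an unknown type may use another layout.
        if len(raw) < 2 or raw[0] != 1 or raw[1] != 1:
            continue
        certs.append(parse_cert(raw))
    return certs
```

Checking CertType and CertPurpose before decoding the remaining fields keeps the parser tolerant of future certificate formats, in line with the MUST-ignore rule in the text.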
diff --git a/doc/spec/proposals/170-user-path-config.txt b/doc/spec/proposals/170-user-path-config.txt
deleted file mode 100644
index fa74c76f7..000000000
--- a/doc/spec/proposals/170-user-path-config.txt
+++ /dev/null
@@ -1,95 +0,0 @@
-Title: Configuration options regarding circuit building
-Filename: 170-user-path-config.txt
-Author: Sebastian Hahn
-Created: 01-March-2010
-Status: Draft
-
-Overview:
-
- This document outlines how Tor handles the user configuration
- options to influence the circuit building process.
-
-Motivation:
-
- Tor's treatment of the configuration *Nodes options was surprising
- to many users, and quite a few conspiracy theories have crept up. We
- should update our specification and code to better describe and
- communicate what is going on during circuit building, and how we're
- honoring configuration. So far, we've been tracking a bugreport
- about this behaviour (
- https://bugs.torproject.org/flyspray/index.php?do=details&id=1090 )
- and Nick replied in a thread on or-talk (
- http://archives.seul.org/or/talk/Feb-2010/msg00117.html ).
-
- This proposal tries to document our intention for those configuration
- options.
-
-Design:
-
- Five configuration options are available to users to influence Tor's
- circuit building. EntryNodes and ExitNodes define a list of nodes
- that are used for the Entry/Exit position in all circuits. ExcludeNodes
- is a list of nodes that are used for no circuit, and
- ExcludeExitNodes is a list of nodes that aren't used as the last
- hop. StrictNodes defines Tor's behaviour in case of a conflict, for
- example when a node that is excluded is the only available
- introduction point. Setting StrictNodes to 1 breaks Tor's
- functionality in that case, and it will refuse to build such a
- circuit.
-
- Neither Nick's email nor bug 1090 has clear suggestions for how we
- should behave in each case, so I tried to come up with something
- that made sense to me.
-
-Security implications:
-
- Deviating from normal circuit building can break one's anonymity, so
- the documentation of the above options should contain a warning to
- make users aware of the pitfalls.
-
-Specification:
-
- It is proposed that the "User configuration" part of path-spec
- (section 2.2.2) be replaced with this:
-
- Users can alter the default behavior for path selection with
- configuration options. In case of conflicts (excluding and requiring
- the same node) the "StrictNodes" option is used to determine
- behaviour. If a node is both excluded and required via a
- configuration option, the exclusion takes preference.
-
- - If "ExitNodes" is provided, then every request requires an exit
- node on the ExitNodes list. If a request is supported by no nodes
- on that list, and "StrictNodes" is false, then Tor treats that
- request as if ExitNodes were not provided.
-
- - "EntryNodes" behaves analogously.
-
- - If "ExcludeNodes" is provided, then no circuit uses any of the
- nodes listed. If a circuit requires an excluded node to be used,
- and "StrictNodes" is false, then Tor uses the node in that
- position while not using any other of the excluded nodes.
-
- - If "ExcludeExitNodes" is provided, then Tor will not use the nodes
- listed for the exit position in a circuit. If a circuit requires
- an excluded node to be used in the exit position and "StrictNodes"
- is false, then Tor builds that circuit as if ExcludeExitNodes were
- not provided.
-
- - If a user tries to connect to or resolve a hostname of the form
- <target>.<servername>.exit and the "AllowDotExit" configuration
- option is set to 1, the request is rewritten to a request for
- <target>, and the request is only supported by the exit whose
- nickname or fingerprint is <servername>. If "AllowDotExit" is set
- to 0 (default), any request for <anything>.exit is denied.
-
- - When any of the *Nodes settings are changed, all circuits are
- expired immediately, to prevent a situation where a previously
- built circuit is used even though some of its nodes are now
- excluded.
-
-
-Compatibility:
-
- The old Strict*Nodes options are deprecated, and the StrictNodes
- option is new. Tor users may need to update their configuration file.
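The ExitNodes, exclusion, and StrictNodes rules in the proposed path-spec
text above can be read as a small filtering step. What follows is only a
sketch of one such reading, not Tor's actual code: pick_exit_candidates and
its parameters are hypothetical names, and "excluded" stands for the union
of ExcludeNodes and ExcludeExitNodes.

```python
def pick_exit_candidates(supporting, exit_nodes=None, excluded=frozenset(),
                         strict=False):
    """Return the nodes that may serve as exit for one request.

    supporting: nodes whose exit policy supports the request (hypothetical input)
    exit_nodes: the ExitNodes list, or None/empty if the option is not set
    excluded:   union of ExcludeNodes and ExcludeExitNodes (assumption)
    strict:     the StrictNodes setting
    """
    # Exclusion takes precedence over any "required" list.
    allowed = [n for n in supporting if n not in excluded]
    if not exit_nodes:
        return allowed
    preferred = [n for n in allowed if n in exit_nodes]
    # With StrictNodes set, an empty result means the request fails.
    if preferred or strict:
        return preferred
    # Otherwise behave as if ExitNodes were not provided.
    return allowed
```

With StrictNodes unset, a request that no listed exit supports falls back to
the ordinary candidate set; with StrictNodes set, the empty list signals that
no circuit should be built.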
diff --git a/doc/spec/proposals/171-separate-streams.txt b/doc/spec/proposals/171-separate-streams.txt
deleted file mode 100644
index 9842265db..000000000
--- a/doc/spec/proposals/171-separate-streams.txt
+++ /dev/null
@@ -1,357 +0,0 @@
-Filename: 171-separate-streams.txt
-Title: Separate streams across circuits by connection metadata
-Author: Robert Hogan, Jacob Appelbaum, Damon McCoy, Nick Mathewson
-Created: 21-Oct-2008
-Modified: 7-Dec-2010
-Status: Open
-
-Summary:
-
- We propose a new set of options to isolate unrelated streams from one
- another, putting them on separate circuits so that semantically
- unrelated traffic is not inadvertently made linkable.
-
-Motivation:
-
- Currently, Tor attaches regular streams (that is, ones not carrying
- rendezvous or directory traffic) to circuits based only on whether the
- circuit's current exit node supports the destination, and whether the
- circuit has been dirty (that is, in use) for too long.
-
- This means that traffic that would otherwise be unrelated sometimes
- gets sent over the same circuit, allowing the exit node to link such
- streams with certainty, and allowing other parties to link such
- streams probabilistically.
-
- Older versions of onion routing tried to address this problem by
- sending every stream over a separate circuit; performance issues made
- this unfeasible. Moreover, in the presence of a localized adversary,
- separating streams by circuits increases the odds that, for any given
- linked set of streams, at least one will go over a compromised
- circuit.
-
- Therefore we ought to look for ways to allow streams that ought to be
- linked to travel over a single circuit, while keeping streams that
- ought not be linked isolated to separate circuits.
-
-Discussion:
-
- Let's call a series of inherently-linked streams (like a set of
- streams downloading objects from the same webpage, or a browsing
- session where the user requests several related webpages) a "Session".
-
- "Sessions" are necessarily a fuzzy concept. While users typically
- consider some activities as wholly unrelated to each other ("My IM
- session has nothing to do with my web browsing!"), the boundaries
- between activities are sometimes hard to determine. If I'm reading
- lolcats in one browser tab and reading about treatments for an
- embarrassing disease in another, those are probably separate sessions.
- If I search for a forum, log in, read it for a while, and post a few
- messages on unrelated topics, that's probably all the same session.
-
- So with the proviso that no automated process can identify sessions
- 100% accurately, let's see which options we have available.
-
- Generally, all the streams on a session come from a single
- application. Unfortunately, isolating streams by application
- automatically isn't feasible, given the lack of any nice
- cross-platform way to tell which local process originated a given
- connection. (Yes, lsof works. But a quick review of the lsof code
- should be sufficient to scare you away from thinking there is a
- portable option, much less a portable O(1) option.) So instead, we'll
- have to use some other aspect of a Tor request as a proxy for the
- application.
-
- Generally, traffic from separate applications is not in the same
- session.
-
- With some applications (IRC, for example), each stream is a session.
-
- Some applications (most notably web browsing) can't be meaningfully
- split into sessions without inspecting the traffic itself and
- maintaining a lot of state.
-
- How well do ports correspond to sessions? Early versions of this
- proposal focused on using destination ports as a proxy for
- application, since a connection to port 22 for SSH is probably not in
- the same session as one to port 80. This works better with some
- applications than others, though: while SSH users typically
- know when they're on port 22 and when they aren't, a web browser can
- be coaxed (through img URLs or any number of related tricks) into
- connecting to any port at all. Moreover, when Tor gets a DNS lookup
- request, it doesn't know in advance which port the resulting address
- will be used to connect to.
-
- So in summary, each kind of traffic wants to follow different rules,
- and assuming the existence of a web browser and a hostile web page or
- exit node, we can't tell one kind of traffic from another by simply
- looking at the destination:port of the traffic.
-
- Fortunately, we're not doomed.
-
-Design:
-
- When a stream arrives at Tor, we have the following data to examine:
- 1) The destination address
- 2) The destination port (unless this is a DNS lookup)
- 3) The protocol used by the application to send the stream to Tor:
- SOCKS4, SOCKS4A, SOCKS5, or whatever local "transparent proxy"
- mechanism the kernel gives us.
- 4) The port used by the application to send the stream to Tor --
- that is, the SOCKSListenAddress or TransListenAddress that the
- application used, if we have more than one.
- 5) The SOCKS username and password, if any.
- 6) The source address and port for the application.
-
- We propose to use 3, 4, and 5 as a backchannel for applications to
- tell Tor about different sessions. Rather than running only one
- SOCKSPort, a Tor user who would prefer better session isolation should
- run multiple SOCKSPorts/TransPorts, and configure different
- applications to use separate ports. Applications that support SOCKS
- authentication can further be separated on a single port by their
- choice of username/password. Streams sent to separate ports or using
- different authentication information should never be sent over the
- same circuit. We allow each port to have its own settings for
- isolation based on destination port, destination address, or both.
-
- Handling DNS can be a challenge. We can get hostnames by one of three
- means:
-
- A) A SOCKS4a request, or a SOCKS5 request with a hostname. This
- case is handled trivially using the rules above.
- B) A RESOLVE request on a SOCKSPort. This case is handled using the
- rules above, except that port isolation can't work to isolate
- RESOLVE requests into a proper session, since we don't know which
- port will eventually be used when we connect to the returned
- address.
- C) A request on a DNSPort. We have no way of knowing which
- address/port will be used to connect to the requested address.
-
- When B or C is required but problematic, we could favor the use of
- AutomapHostsOnResolve.
-
-Interface:
-
- We propose that {SOCKS,Natd,Trans,DNS}ListenAddress be deprecated in
- favor of an expanded {SOCKS,Natd,Trans,DNS}Port syntax:
-
- ClientPortLine = OptionName SP (Addr ":")? Port (SP Options)?
- OptionName = "SOCKSPort" / "NatdPort" / "TransPort" / "DNSPort"
- Addr = An IPv4 address / an IPv6 address surrounded by brackets.
- If omitted, we default to 127.0.0.1
- Port = An integer from 1 through 65535 inclusive
- Options = Option
- Options = Options SP Option
- Option = IsolateOption / GroupOption
- GroupOption = "SessionGroup=" UINT
- IsolateOption = OptNo ("IsolateDestPort" / "IsolateDestAddr" /
- "IsolateSOCKSUser"/ "IsolateClientProtocol" /
- "IsolateClientAddr") OptPlural
- OptNo = "No" ?
- OptPlural = "s" ?
- SP = " "
- UINT = An unsigned integer
-
- All options are case-insensitive.
-
- The "IsolateSOCKSUser" and "IsolateClientAddr" options are on by
- default; "NoIsolateSOCKSUser" and "NoIsolateClientAddr" respectively
- turn them off. The IsolateDestPort and IsolateDestAddr and
- IsolateClientProtocol options are off by default. NoIsolateDestPort and
- NoIsolateDestAddr and NoIsolateClientProtocol have no effect.
-
- Given a set of ClientPortLines, streams must NOT be placed on the same
- circuit if ANY of the following hold:
-
- * They were sent to two different client ports, unless the two
- client ports both specify a "SessionGroup" option with the same
- integer value.
- * At least one was sent to a client port with IsolateDestPort
- active, and they have different destination ports.
- * At least one was sent to a client port with IsolateDestAddr
- active, and they have different destination addresses.
- * At least one was sent to a client port with IsolateClientProtocol
- active, and they use different protocols (where SOCKS4, SOCKS4a,
- SOCKS5, TransPort, NatdPort, and DNS are the protocols in question)
- * At least one was sent to a client port with IsolateSOCKSUser
- active, and they have different SOCKS username/password
- configurations. (For the purposes of this option, the
- username/password pair of ""/"" is distinct from SOCKS without
- authentication, and both are distinct from any non-SOCKS client's
- non-authentication.)
- * At least one was sent to a client port with IsolateClientAddr
- active, and they came from different client addresses. (For the
- purpose of this option, any local interface counts as the same
- address. So if the host is configured with addresses 10.0.0.1,
- 192.0.32.10, and 127.0.0.1, then traffic from those addresses can
- leave on the same circuit, but traffic from 10.0.0.2 (for
- example) could not share a circuit with any of them.)
-
- These rules apply regardless of whether the streams are active at the
- same time. In other words, if the rules say that streams A and B must
- not be on the same circuit, and stream A is attached to circuit X,
- then stream B must never be attached to circuit X, even if stream A is
- closed first.
-
-Alternative Interface:
-
- We're cramming a lot onto one line in the design above. Perhaps
- instead it would be a better idea to have grouped lines of the form:
-
- StreamGroup 1
- SOCKSPort 9050
- TransPort 9051
- IsolateDestPort 1
- IsolateClientProtocol 0
- EndStreamGroup
-
- StreamGroup 2
- SOCKSPort 9052
- DNSPort 9053
- IsolateDestAddr 1
- EndStreamGroup
-
- This would be equivalent to:
- SOCKSPort 9050 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
- TransPort 9051 SessionGroup=1 IsolateDestPort NoIsolateClientProtocol
- SOCKSPort 9052 SessionGroup=2 IsolateDestAddr
- DNSPort 9053 SessionGroup=2 IsolateDestAddr
-
- But it would let us extend the range of allowed options later without
- having client port lines grow without bound. For example, we might
- give different circuit building parameters to different session
- groups.
-
-Example of use:
-
- Suppose that we want to use a web browser, an IRC client, and an SSH
- client all at the same time. Let's assume that we want web traffic to
- be isolated from all other traffic, even if the browser makes
- connections to ports usually used for IRC or SSH. Let's also assume
- that IRC and SSH are both used for relatively long-lived connections,
- and we want to keep all IRC/SSH sessions separate from one another.
-
- In this case, we could say:
-
- SOCKSPort 9050
- SOCKSPort 9051 IsolateDestAddr IsolateDestPort
-
- We would then configure our browser to use 9050 and our IRC/SSH
- clients to use 9051.
-
-Advanced example of use, #1:
-
- Suppose that we have a bunch of applications, and we launch them all
- using torsocks, and we want to keep the applications isolated from
- one another. We just create a shell script, "torlaunch":
- #!/bin/bash
- export TORSOCKS_USERNAME="$1"
- exec torsocks "$@"
- And we configure our SOCKSPort with IsolateSOCKSUser.
-
- Or if we're on Linux and we want to isolate by application invocation,
- we would change the TORSOCKS_USERNAME line to:
-
- export TORSOCKS_USERNAME="`cat /proc/sys/kernel/random/uuid`"
-
-Advanced example of use, #2:
-
- Now suppose that we want to achieve the benefits of the first example
- of use, but we are stuck using transparent proxies. Let's suppose
- this is Linux.
-
- TransPort 9090
- TransPort 9091 IsolateDestAddr IsolateDestPort
- DNSPort 5353
- AutomapHostsOnResolve 1
-
- Here we use the iptables --cmd-owner filter to distinguish which
- command is originating the packets, directing traffic from our irc
- client and our SSH client to port 9091, and directing other traffic to
- 9090. Using AutomapHostsOnResolve will confuse ssh in its default
- configuration; we'll need to find a way around that.
-
-Security Risks:
-
- Disabling IsolateClientAddr is a pretty bad idea.
-
- Setting up a set of applications to use this system effectively is a
- big problem. It's likely that lots of people who try to do this will
- mess it up. We should try to see which setups are sensible, and see
- if we can provide good feedback to explain which streams are isolated
- how.
-
-Performance Risks:
-
- This proposal will result in clients building many more circuits than
- they do today. To avoid accidentally hammering the network, we should
- have in-process limits on the maximum circuit creation rate and the
- total maximum client circuits.
-
-Specification:
-
- The Tor client circuit selection process is not entirely specified.
- Any client circuit specification must take these changes into account.
-
-Implementation notes:
-
- The more obvious ways to implement the "find a good circuit to attach
- to" part of this proposal involve doing an O(n_circuits) operation
- every time we have a stream to attach. We already do such an
- operation, so it's not as if we need to hunt for fancy ways to make it
- O(1). What will be harder is implementing the "launch circuits as
- needed" part of the proposal. Still, it should come down to "a simple
- matter of programming."
-
- The SOCKS4 spec has the client provide authentication info when it
- connects; accepting such info is no problem. But the SOCKS5 spec has
- the client send a list of known auth methods, then has the server send
- back the authentication method it chooses. We'll need to update the
- SOCKS5 implementation so it can accept user/password authentication if
- it's offered.
-
- If we use the second syntax for describing these options, we'll want
- to add a new "section-based" entry type for the configuration parser.
- Not a huge deal; we already have kludged up something similar for
- hidden service configurations.
-
- Opening circuits for predicted ports has the potential to get a little
- more complicated; we can probably get away with the existing
- algorithm at first, though, and then see where its weak points are and
- look for better ones.
-
- Perhaps we can get our next-gen HTTP proxy to communicate browser tab
- or session info to Tor via authentication, or have torbutton do it
- directly. More design is needed here, though.
-
-Alternative designs:
-
- The implementation of this option may want to consider cases where the
- same exit node is shared by two or more circuits and
- IsolateDestPort is in force. Since one possible use of the option
- is to reduce the opportunity of Exit Nodes to attack traffic from the
- same source on multiple ports, the implementation may need to ensure
- that circuits reserved for the exclusive use of given ports do not
- share the same exit node. On the other hand, if our goal is only that
- streams should be unlinkable, deliberately shunting them to different
- exit nodes is unnecessary and slightly counterproductive.
-
- Earlier versions of this design included a mechanism to isolate
- _particular_ destination ports and addresses, so that traffic sent to,
- say, port 22 would never share a circuit with any traffic *not* sent to
- port 22. You can achieve this here by having all applications that
- send traffic to one of these ports use a separate SOCKSPort, and
- then setting IsolateDestPorts on that SOCKSPort.
-
-Future work:
-
- Nikita Borisov suggests that different session profiles -- so long as
- there aren't too many of them -- could well get different guard node
- allocations in order to prevent guard profiling. This can be done
- orthogonally to the rest of this proposal.
-
-Lingering questions:
-
- I suspect there are issues remaining with DNS and TransPort users, and
- that my "just use AutomapHostsOnResolve" suggestion may be
- insufficient.
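The stream-separation rules from the Interface section of proposal 171 above
can be sketched as a pairwise check. This is a minimal illustration under
stated assumptions, not Tor's implementation: ClientPort, Stream, auth_key and
must_isolate are hypothetical names, a client port is identified by its port
number alone, and the caller is assumed to map every local interface address
to one common client_addr value.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ClientPort:
    """One ClientPortLine; the defaults mirror the proposal's defaults."""
    port: int
    session_group: Optional[int] = None
    isolate_dest_port: bool = False
    isolate_dest_addr: bool = False
    isolate_client_protocol: bool = False
    isolate_socks_user: bool = True
    isolate_client_addr: bool = True


@dataclass(frozen=True)
class Stream:
    listener: ClientPort
    protocol: str              # e.g. "SOCKS4", "SOCKS5", "TransPort", "DNS"
    dest_addr: str
    dest_port: Optional[int]   # None for a DNS lookup
    client_addr: str           # all local interfaces map to one value
    socks_auth: Optional[Tuple[str, str]] = None  # None: no authentication


def auth_key(s):
    """Keep ""/"" auth, SOCKS without auth, and non-SOCKS clients distinct."""
    if not s.protocol.upper().startswith("SOCKS"):
        return ("non-socks",)
    if s.socks_auth is None:
        return ("socks-noauth",)
    return ("socks-auth",) + s.socks_auth


def must_isolate(a, b):
    """True if streams a and b must never share a circuit."""
    pa, pb = a.listener, b.listener
    if pa.port != pb.port:
        # Different client ports only mix when both name the same SessionGroup.
        if pa.session_group is None or pa.session_group != pb.session_group:
            return True

    def active(flag):
        # "At least one was sent to a client port with <flag> active".
        return getattr(pa, flag) or getattr(pb, flag)

    if active("isolate_dest_port") and a.dest_port != b.dest_port:
        return True
    if active("isolate_dest_addr") and a.dest_addr != b.dest_addr:
        return True
    if active("isolate_client_protocol") and a.protocol != b.protocol:
        return True
    if active("isolate_socks_user") and auth_key(a) != auth_key(b):
        return True
    if active("isolate_client_addr") and a.client_addr != b.client_addr:
        return True
    return False
```

Because the rules hold even after a stream closes, an implementation would
presumably record these attributes on the circuit itself and compare each new
stream against that record, rather than against the currently open streams.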
diff --git a/doc/spec/proposals/172-circ-getinfo-option.txt b/doc/spec/proposals/172-circ-getinfo-option.txt
deleted file mode 100644
index b7fd79c9a..000000000
--- a/doc/spec/proposals/172-circ-getinfo-option.txt
+++ /dev/null
@@ -1,138 +0,0 @@
-Filename: 172-circ-getinfo-option.txt
-Title: GETINFO controller option for circuit information
-Author: Damian Johnson
-Created: 03-June-2010
-Status: Accepted
-
-Overview:
-
- This details an additional GETINFO option that would provide information
- concerning a relay's current circuits.
-
-Motivation:
-
- The original proposal was for connection related information, but Jake made
- the excellent point that any information retrieved from the control port
- is...
-
- 1. completely ineffectual for auditing purposes since either (a) these
- results can be fetched from netstat already or (b) the information would
- only be provided via tor and can't be validated.
-
- 2. The more useful uses for connection information can be achieved with
- much less (and safer) information.
-
- Hence the proposal is now for circuit based rather than connection based
- information. This would strip the most controversial and sensitive data
- entirely (ip addresses, ports, and connection based bandwidth breakdowns)
- while still being useful for the following purposes:
-
- - Basic Relay Usage Questions
- How is the bandwidth I'm contributing broken down? Is it being evenly
- distributed or is someone hogging most of it? Do these circuits belong to
- the hidden service I'm running or something else? Now that I'm using exit
- policy X am I desirable as an exit, or are most people just using me as a
- relay?
-
- - Debugging
- Say a relay has a restrictive firewall policy for outbound connections,
- with the ORPort whitelisted but doesn't realize that tor needs random high
- ports. Tor would report success ("your orport is reachable - excellent")
- yet the relay would be nonfunctional. This proposed information would
- reveal numerous RELAY -> YOU -> UNESTABLISHED circuits, giving a good
- indicator of what's wrong.
-
- - Visualization
- A nice benefit of visualizing tor's behavior is that it becomes a helpful
- tool in puzzling out how tor works. For instance, tor spawns numerous
- client connections at startup (even if unused as a client). As a newcomer
- to tor these asymmetric (outbound only) connections mystified me for quite
- a while until Roger explained their use to me. The proposed
- TYPE_FLAGS would let controllers clearly label them as being client
- related, making their purpose a bit clearer.
-
- At the moment connection data can only be retrieved via commands like
- netstat, ss, and lsof. However, providing an alternative via the control
- port provides several advantages:
-
- - scrubbing for private data
- Raw connection data has no notion of what's sensitive and what is
- not. The relay's flags and cached consensus can be used to take
- educated guesses concerning which connections could possibly belong
- to client or exit traffic, but this is both difficult and inaccurate.
- Anything provided via the control port can be scrubbed to make sure we
- aren't providing anything we think relay operators should not see.
-
- - additional information
- All connection querying commands strictly provide the ip address and
- port of connections, and nothing else. However, for the uses listed
- above the far more interesting attributes are the circuit's type,
- bandwidth usage and uptime.
-
- - improved performance
- Querying connection data is an expensive activity, especially for
- busy relays or low end processors (such as mobile devices). Tor
- already internally knows its circuits, allowing for vastly quicker
- lookups.
-
- - cross platform capability
- The connection querying utilities mentioned above not only aren't
- available under Windows, but differ widely among different *nix
- platforms. FreeBSD in particular takes a very unique approach,
- dropping important options from netstat and assigning ss to a
- spreadsheet application instead. A controller interface, however,
- would provide a uniform means of retrieving this information.
-
-Security Implications:
-
- This is an open question. This proposal lacks the most controversial pieces
- of information (ip addresses and ports) and insight into potential threats
- this would pose would be very welcome!
-
-Specification:
-
- The following addition would be made to the control-spec's GETINFO section:
-
- "rcirc/id/<Circuit identity>" -- Provides entry for the associated relay
- circuit, formatted as:
- CIRC_ID=<circuit ID> CREATED=<timestamp> UPDATED=<timestamp> TYPE=<flag>
- READ=<bytes> WRITE=<bytes>
-
- None of the parameters contain whitespace, and additional results must be
- ignored to allow for future expansion. Parameters are defined as follows:
- CIRC_ID - Unique numeric identifier for the circuit this belongs to.
- CREATED - Unix timestamp (as seconds since the Epoch) for when the
- circuit was created.
- UPDATED - Unix timestamp for when this information was last updated.
- TYPE - Single character flags indicating attributes in the circuit:
- (E)ntry : has a connection that doesn't belong to a known Tor server,
- indicating that this is either the first hop or bridged
- E(X)it : has been used for at least one exit stream
- (R)elay : has been extended
- Rende(Z)vous : is being used for a rendezvous point
- (I)ntroduction : is being used for a hidden service introduction
- (N)one of the above: none of the above have happened yet.
- READ - Total bytes transmitted toward the exit over the circuit.
- WRITE - Total bytes transmitted toward the client over the circuit.
-
- "rcirc/all" -- The 'rcirc/id/*' output for all current circuits, joined by
- newlines.
-
- The following would be included for circ info update events.
-
-4.1.X. Relay circuit status changed
-
- The syntax is:
- "650" SP "RCIRC" SP CircID SP Notice [SP Created SP Updated SP Type SP
- Read SP Write] CRLF
-
- Notice =
- "NEW" / ; first information being provided for this circuit
- "UPDATE" / ; update for a previously reported circuit
- "CLOSED" ; notice that the circuit no longer exists
-
- Notice indicating that queryable information on a relay related circuit has
- changed. If the Notice parameter is either "NEW" or "UPDATE" then this
- provides the same fields that would be given by calling "GETINFO rcirc/id/"
- with the CircID.
-
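A controller could decode the "rcirc/id/<Circuit identity>" entry described
in proposal 172 above roughly as follows. This is a sketch under assumptions:
parse_rcirc and TYPE_FLAGS are hypothetical names, the sample values in the
usage note are made up, and TYPE is assumed to be able to carry several
single-character flags at once.

```python
# Flag letters taken from the TYPE description in the proposal.
TYPE_FLAGS = {
    "E": "entry",
    "X": "exit",
    "R": "relay",
    "Z": "rendezvous",
    "I": "introduction",
    "N": "none",
}


def parse_rcirc(line):
    """Parse one rcirc entry into a dict, ignoring unknown parameters."""
    fields = {}
    # Parameters contain no whitespace, so splitting on it is safe.
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return {
        "circ_id": int(fields["CIRC_ID"]),
        "created": int(fields["CREATED"]),   # Unix timestamp
        "updated": int(fields["UPDATED"]),   # Unix timestamp
        "types": [TYPE_FLAGS[c] for c in fields["TYPE"] if c in TYPE_FLAGS],
        "read": int(fields["READ"]),         # bytes toward the exit
        "write": int(fields["WRITE"]),       # bytes toward the client
    }
```

For instance, parse_rcirc("CIRC_ID=5 CREATED=1275500000 UPDATED=1275500060
TYPE=ER READ=1024 WRITE=2048") (given on one line) yields an entry whose
"types" list is ["entry", "relay"]. Extra KEY=value parameters are dropped,
which matches the requirement that additional results be ignored.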
diff --git a/doc/spec/proposals/173-getinfo-option-expansion.txt b/doc/spec/proposals/173-getinfo-option-expansion.txt
deleted file mode 100644
index 03e18ef8d..000000000
--- a/doc/spec/proposals/173-getinfo-option-expansion.txt
+++ /dev/null
@@ -1,101 +0,0 @@
-Filename: 173-getinfo-option-expansion.txt
-Title: GETINFO Option Expansion
-Author: Damian Johnson
-Created: 02-June-2010
-Status: Accepted
-
-Overview:
-
- Over the course of developing arm there have been numerous hacks and
- workarounds to glean pieces of basic, desirable information about the tor
- process. As per Roger's request I've compiled a list of these pain points
- to try and improve the control protocol interface.
-
-Motivation:
-
- The purpose of this proposal is to expose additional process and relay
- related information that is currently unavailable in a convenient,
- dependable, and/or platform independent way. Examples of this are...
-
- - The relay's total contributed bandwidth. This is a highly requested
- piece of information and, based on the following patch from pipe, looks
- trivial to include.
- http://www.mail-archive.com/or-talk@freehaven.net/msg13085.html
-
- - The process ID of the tor process. There is a high degree of guess work
- in obtaining this. Arm for instance uses pidof, netstat, and ps yet
- still fails on some platforms, and Orbot recently got a ticket about
- its own attempt to fetch it with ps:
- https://trac.torproject.org/projects/tor/ticket/1388
-
- This just includes the pieces of missing information I've noticed
- (suggestions or questions of their usefulness are welcome!).
-
-Security Implications:
-
- None that I'm aware of. From a security standpoint this seems decently
- innocuous.
-
-Specification:
-
- The following addition would be made to the control-spec's GETINFO section:
-
- "relay/bw-limit" -- Effective relayed bandwidth limit.
-
- "relay/burst-limit" -- Effective relayed burst limit.
-
- "relay/read-total" -- Total bytes relayed (download).
-
- "relay/write-total" -- Total bytes relayed (upload).
-
- "relay/flags" -- Space separated listing of flags currently held by the
- relay as reported by the currently cached consensus.
-
- "process/user" -- Username under which the tor process is running,
- providing an empty string if none exists.
-
- "process/pid" -- Process id belonging to the main tor process, -1 if none
- exists for the platform.
-
- "process/uptime" -- Total uptime of the tor process (in seconds).
-
- "process/uptime-reset" -- Time since last reset (startup, sighup, or RELOAD
- signal, in seconds).
-
- "process/descriptors-used" -- Count of file descriptors used.
-
- "process/descriptor-limit" -- File descriptor limit (getrlimit results).
-
- "ns/authority" -- Router status info (v2 directory style) for all
- recognized directory authorities, joined by newlines.
-
- "state/names" -- A space-separated list of all the keys supported by this
- version of Tor's state.
-
- "state/val/<key>" -- Provides the current state value belonging to the
- given key. If undefined, this provides the key's default value.
-
- "status/ports-seen" -- A summary of which ports we've seen exiting
- circuits connect to recently, formatted the same as the EXITS_SEEN status
- event described in Section 4.1.XX. This GETINFO option is currently
- available only for exit relays.
-
-4.1.XX. Per-port exit stats
-
- The syntax is:
- "650" SP "EXITS_SEEN" SP TimeStarted SP PortSummary CRLF
-
- We just generated a new summary of which ports we've seen exiting circuits
- connecting to recently. The controller could display this for the user, e.g.
- in their "relay" configuration window, to give them a sense of how they're
- being used (popularity of the various ports they exit to). Currently only
- exit relays will receive this event.
-
- TimeStarted is a quoted string indicating when the reported summary
- counts from (in GMT).
-
- The PortSummary keyword has as its argument a comma-separated, possibly
- empty set of "port=count" pairs. For example (without linebreak),
- 650-EXITS_SEEN TimeStarted="2008-12-25 23:50:43"
- PortSummary=80=16,443=8
-
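The PortSummary argument of the EXITS_SEEN event in proposal 173 above is a
comma-separated, possibly empty set of "port=count" pairs. A controller might
decode it along these lines; parse_port_summary and parse_exits_seen are
hypothetical helper names, and the sketch assumes the keyword=value layout
shown in the proposal's example line.

```python
def parse_port_summary(arg):
    """Turn "80=16,443=8" into {80: 16, 443: 8}; an empty string gives {}."""
    summary = {}
    if not arg:
        return summary
    for pair in arg.split(","):
        port, _, count = pair.partition("=")
        summary[int(port)] = int(count)
    return summary


def parse_exits_seen(line):
    """Extract the per-port counts from one EXITS_SEEN event line."""
    prefix = "PortSummary="
    # TimeStarted is a quoted string containing a space, but the PortSummary
    # value has none, so a whitespace split still keeps that token whole.
    for token in line.split():
        if token.startswith(prefix):
            return parse_port_summary(token[len(prefix):])
    raise ValueError("no PortSummary keyword in event line")
```

Applied to the example event above (joined onto one line), this returns a
mapping from port 80 to 16 and from port 443 to 8.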
diff --git a/doc/spec/proposals/174-optimistic-data-server.txt b/doc/spec/proposals/174-optimistic-data-server.txt
deleted file mode 100644
index d97c45e90..000000000
--- a/doc/spec/proposals/174-optimistic-data-server.txt
+++ /dev/null
@@ -1,242 +0,0 @@
-Filename: 174-optimistic-data-server.txt
-Title: Optimistic Data for Tor: Server Side
-Author: Ian Goldberg
-Created: 2-Aug-2010
-Status: Open
-
-Overview:
-
-When a SOCKS client opens a TCP connection through Tor (for an HTTP
-request, for example), the query latency is about 1.5x higher than it
-needs to be. Simply, the problem is that the sequence of data flows
-is this:
-
-1. The SOCKS client opens a TCP connection to the OP
-2. The SOCKS client sends a SOCKS CONNECT command
-3. The OP sends a BEGIN cell to the Exit
-4. The Exit opens a TCP connection to the Server
-5. The Exit returns a CONNECTED cell to the OP
-6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
-7. The SOCKS client sends some data (the GET request, for example)
-8. The OP sends a DATA cell to the Exit
-9. The Exit sends the GET to the server
-10. The Server returns the HTTP result to the Exit
-11. The Exit sends the DATA cells to the OP
-12. The OP returns the HTTP result to the SOCKS client
-
-Note that the Exit node knows that the connection to the Server was
-successful at the end of step 4, but is unable to send the HTTP query to
-the server until step 9.
-
-This proposal (as well as its upcoming sibling concerning the client
-side) aims to reduce the latency by allowing:
-1. SOCKS clients to optimistically send data before they are notified
- that the SOCKS connection has completed successfully
-2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
- state
-3. Exit nodes to accept and queue DATA cells while in the
- EXIT_CONN_STATE_CONNECTING state
-
-This particular proposal deals with #3.
-
-In this way, the flow would be as follows:
-
-1. The SOCKS client opens a TCP connection to the OP
-2. The SOCKS client sends a SOCKS CONNECT command, followed immediately
- by data (such as the GET request)
-3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA
- cells
-4. The Exit opens a TCP connection to the Server
-5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET
- request to the Server
-6. The OP returns a SOCKS CONNECTED notification to the SOCKS client,
- and the Server returns the HTTP result to the Exit
-7. The Exit sends the DATA cells to the OP
-8. The OP returns the HTTP result to the SOCKS client
-
-Motivation:
-
-This change will save one OP<->Exit round trip (down to one from two).
-There are still two SOCKS Client<->OP round trips (negligible time) and
-two Exit<->Server round trips. Depending on the ratio of the
-Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
-decrease the latency by 25 to 50 percent. Experiments validate these
-predictions. [Goldberg, PETS 2010 rump session; see
-https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
-
-Design:
-
-The current code actually correctly handles queued data at the Exit; if
-there is queued data in an EXIT_CONN_STATE_CONNECTING stream, that data
-will be immediately sent when the connection succeeds. If the
-connection fails, the data will be correctly ignored and freed. The
-problem with the current server code is that the server currently
-drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state.
-Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state,
-bad things happen because streams in that state don't yet have
-conn->write_event set, and so some existing sanity checks (any stream
-with queued data is at least potentially writable) are no longer sound.
-
-The solution is to simply not drop received DATA cells while in the
-EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this
-state, so that the OP cannot send more than one window's worth of data
-to be queued at the Exit. Finally, patch the sanity checks so that
-streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data
-can pass.
-
-If no clients ever send such optimistic data, the new code will never be
-executed, and the behaviour of Tor will not change. When clients begin
-to send optimistic data, the performance of those clients' streams will
-improve.
-
-After discussion with nickm, it seems best to just have the server
-version number be the indicator of whether a particular Exit supports
-optimistic data. (If a client sends optimistic data to an Exit which
-does not support it, the data will be dropped, and the client's request
-will fail to complete.) What do version numbers for hypothetical future
-protocol-compatible implementations look like, though?
-
-Security implications:
-
-Servers (for sure the Exit, and possibly others, by watching the
-pattern of packets) will be able to tell that a particular client
-is using optimistic data. This will be discussed more in the sibling
-proposal.
-
-On the Exit side, servers will be queueing a little bit extra data, but
-no more than one window. Clients today can cause Exits to queue that
-much data anyway, simply by establishing a Tor connection to a slow
-machine, and sending one window of data.
-
-Specification:
-
-tor-spec section 6.2 currently says:
-
- The OP waits for a RELAY_CONNECTED cell before sending any data.
- Once a connection has been established, the OP and exit node
- package stream data in RELAY_DATA cells, and upon receiving such
- cells, echo their contents to the corresponding TCP stream.
- RELAY_DATA cells sent to unrecognized streams are dropped.
-
-It is not clear exactly what an "unrecognized" stream is, but this last
-sentence would be changed to say that RELAY_DATA cells received on a
-stream that has processed a RELAY_BEGIN cell and has not yet issued a
-RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed
-immediately after a RELAY_CONNECTED cell is issued for the stream, or
-freed after a RELAY_END cell is issued for the stream.
-
-The earlier part of this section will be addressed in the sibling
-proposal.
-
-Compatibility:
-
-There are compatibility issues, as mentioned above. OPs MUST NOT send
-optimistic data to Exit nodes whose version numbers predate (something).
-OPs MAY send optimistic data to Exit nodes whose version numbers match
-or follow that value. (But see the question about independent server
-reimplementations, above.)
-
-Implementation:
-
-Here is a simple patch. It seems to work with both regular streams and
-hidden services, but there may be other corner cases I'm not aware of.
-(Do streams used for directory fetches, hidden services, etc. take a
-different code path?)
-
-diff --git a/src/or/connection.c b/src/or/connection.c
-index 7b1493b..f80cd6e 100644
---- a/src/or/connection.c
-+++ b/src/or/connection.c
-@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
- return;
- }
-
-- connection_start_writing(conn);
-+ /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
-+ * state, we don't want to try to write it right away, since
-+ * conn->write_event won't be set yet. Otherwise, write data from
-+ * this conn as the socket is available. */
-+ if (conn->state != EXIT_CONN_STATE_RESOLVING) {
-+ connection_start_writing(conn);
-+ }
- if (zlib) {
- conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
- } else {
-@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
- tor_assert(conn->s < 0);
-
- if (conn->outbuf_flushlen > 0) {
-- tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
-+ /* With optimistic data, we may have queued data in
-+ * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
-+ * */
-+ tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
-+ connection_is_writing(conn) || conn->write_blocked_on_bw ||
- (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
- }
-
-diff --git a/src/or/relay.c b/src/or/relay.c
-index fab2d88..e45ff70 100644
---- a/src/or/relay.c
-+++ b/src/or/relay.c
-@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
- relay_header_t rh;
- unsigned domain = layer_hint?LD_APP:LD_EXIT;
- int reason;
-+ int optimistic_data = 0; /* Set to 1 if we receive data on a stream
-+ that's in the EXIT_CONN_STATE_RESOLVING
-+ or EXIT_CONN_STATE_CONNECTING states.*/
-
- tor_assert(cell);
- tor_assert(circ);
-@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
- /* either conn is NULL, in which case we've got a control cell, or else
- * conn points to the recognized stream. */
-
-- if (conn && !connection_state_is_open(TO_CONN(conn)))
-- return connection_edge_process_relay_cell_not_open(
-- &rh, cell, circ, conn, layer_hint);
-+ if (conn && !connection_state_is_open(TO_CONN(conn))) {
-+ if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
-+ conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
-+ rh.command == RELAY_COMMAND_DATA) {
-+ /* We're going to allow DATA cells to be delivered to an exit
-+ * node in state EXIT_CONN_STATE_CONNECTING or
-+ * EXIT_CONN_STATE_RESOLVING. This speeds up HTTP, for example. */
-+ log_warn(domain, "Optimistic data received.");
-+ optimistic_data = 1;
-+ } else {
-+ return connection_edge_process_relay_cell_not_open(
-+ &rh, cell, circ, conn, layer_hint);
-+ }
-+ }
-
- switch (rh.command) {
- case RELAY_COMMAND_DROP:
-@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
- log_debug(domain,"circ deliver_window now %d.", layer_hint ?
- layer_hint->deliver_window : circ->deliver_window);
-
-- circuit_consider_sending_sendme(circ, layer_hint);
-+ if (!optimistic_data) {
-+ circuit_consider_sending_sendme(circ, layer_hint);
-+ }
-
- if (!conn) {
- log_info(domain,"data cell dropped, unknown stream (streamid %d).",
-@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
- stats_n_data_bytes_received += rh.length;
- connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
- rh.length, TO_CONN(conn));
-- connection_edge_consider_sending_sendme(conn);
-+ if (!optimistic_data) {
-+ connection_edge_consider_sending_sendme(conn);
-+ }
- return 0;
- case RELAY_COMMAND_END:
- reason = rh.length > 0 ?
-
-Performance and scalability notes:
-
-There may be more RAM used at Exit nodes, as mentioned above, but it is
-transient.
diff --git a/doc/spec/proposals/175-automatic-node-promotion.txt b/doc/spec/proposals/175-automatic-node-promotion.txt
deleted file mode 100644
index c990b3f06..000000000
--- a/doc/spec/proposals/175-automatic-node-promotion.txt
+++ /dev/null
@@ -1,238 +0,0 @@
-Filename: 175-automatic-node-promotion.txt
-Title: Automatically promoting Tor clients to nodes
-Author: Steven Murdoch
-Created: 12-Mar-2010
-Status: Draft
-
-1. Overview
-
- This proposal describes how Tor clients could determine when they
- have sufficient bandwidth capacity and are sufficiently reliable to
- become either bridges or Tor relays. When they meet these
- criteria, they will automatically promote themselves, based on user
- preferences. The proposal also defines the new controller messages
- and options which will control this process.
-
- Note that for the moment, only transitions between client and
- bridge are being considered. Transitions to public relay will
- be considered at a future date, but will use the same
- infrastructure for measuring capacity and reliability.
-
-2. Motivation and history
-
- Tor has a growing user-base and one of the major impediments to the
- quality of service offered is the lack of network capacity. This is
- particularly the case for bridges, because these are gradually
- being blocked, and thus no longer of use to people within some
- countries. By automatically promoting Tor clients to bridges, and
- perhaps also to full public relays, this proposal aims to solve
- these problems.
-
- Only Tor clients which are sufficiently useful should be promoted,
- and the process of determining usefulness should be performed
- without reporting the existence of the client to the central
- authorities. The criteria used for determining usefulness will be
- in terms of bandwidth capacity and uptime, but parameters should be
- specified in the directory consensus. State stored at the client
- should be in no more detail than necessary, to prevent sensitive
- information being recorded.
-
-3. Design
-
-3.x Opt-in state model
-
- Tor can be in one of five node-promotion states:
-
- - off (O): Currently a client, and will stay as such
- - auto (A): Currently a client, but will consider promotion
- - bridge (B): Currently a bridge, and will stay as such
- - auto-bridge (AB): Currently a bridge, but will consider promotion
- - relay (R): Currently a public relay, and will stay as such
-
- The state can be fully controlled from the configuration file or
- controller, but the normal state transitions are as follows:
-
- Any state -> off: User has opted out of node promotion
- Off -> any state: Only permitted with user consent
-
- Auto -> auto-bridge: Tor has detected that it is sufficiently
- reliable to be a *bridge*
- Auto -> bridge: Tor has detected that it is sufficiently reliable
- to be a *relay*, but the user has chosen to remain a *bridge*
- Auto -> relay: Tor has detected that it is sufficiently reliable
- to be a *relay*, and will skip being a *bridge*
- Auto-bridge -> relay: Tor has detected that it is sufficiently
- reliable to be a *relay*
-
- Note that this model does not support automatic demotion. If this
- is desirable, there should be some memory as to whether the
- previous state was relay, bridge, or auto-bridge. Otherwise the
- user may be prompted to become a relay, although he has opted to
- only be a bridge.
-
-3.x User interaction policy
-
- There are a variety of options for how to involve the user in the
- decision as to whether and when to perform node promotion. The
- choice also may be different when Tor is running from Vidalia (and
- thus can readily prompt the user for information), and standalone
- (where Tor can only log messages, which may or may not be read).
-
- The option requiring minimal user interaction is to automatically
- promote nodes according to reliability, and allow the user to opt
- out, by changing settings in the configuration file or Vidalia user
- interface.
-
- Alternatively, if a user interface is available, Tor could prompt
- the user when it detects that a transition is available, and allow
- the user to choose which of the available options to select. If
- Vidalia is not available, it still may be possible to solicit an
- email address on install, and contact the operator to ask whether
- a transition to bridge or relay is permitted.
-
- Finally, Tor could by default not make any transition, and the user
- would need to opt in by stating the maximum level (bridge or
- relay) to which the node may automatically promote itself.
-
-3.x Performance monitoring model
-
- To prevent a large number of clients activating as relays, but
- being too unreliable to be useful, clients should measure their
- performance. If this performance meets the parameterized acceptance
- criteria, a client should consider promotion. To measure
- reliability, this proposal adopts a simple user model:
-
- - A user decides to use Tor at times which follow a Poisson
- distribution
- - At each time, the user will be happy if the bridge chosen has
- adequate bandwidth and is reachable
- - If the chosen bridge is down or slow too many times, the user
- will consider Tor to be bad
-
- If we additionally assume that the recent history of relay
- performance matches the current performance, we can measure
- reliability by simulating this simple user.
-
- The following parameters are distributed to clients in the
- directory consensus:
-
- - min_bandwidth: Minimum self-measured bandwidth for a node to be
- considered useful, in bytes per second
- - check_period: How long, in seconds, to wait between checking
- reachability and bandwidth (on average)
- - num_samples: Number of recent samples to keep
- - num_useful: Minimum number of recent samples where the node was
- reachable and had at least min_bandwidth capacity, for a client
- to consider promoting to a bridge
-
- A different set of parameters may be used for considering when to
- promote a bridge to a full relay, but this will be the subject of a
- future revision of the proposal.
-
-3.x Performance monitoring algorithm
-
- The simulation described above can be implemented as follows:
-
- Every 60 seconds:
- 1. Tor generates a random floating point number x in
- the interval [0, 1).
- 2. If x > (1 / (check_period / 60)) GOTO end; otherwise:
- 3. Tor sets the value last_check to the current_time (in seconds)
- 4. Tor measures reachability
- 5. If the client is reachable, Tor measures its bandwidth
- 6. If the client is reachable and the bandwidth is >=
- min_bandwidth, the test has succeeded, otherwise it has failed.
- 7. Tor adds the test result to the end of a ring-buffer containing
- the last num_samples results: measurements_results
- 8. Tor saves last_check and measurements_results to disk
- 9. If the length of measurements_results == num_samples and
- the number of successes >= num_useful, Tor should consider
- promotion to a bridge
- end.
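As an illustration (not part of the proposal text), the nine steps above can be sketched as one function that a timer calls every 60 seconds. The dictionary layout, the `measure` callback (assumed to return a reachability flag and a bandwidth in bytes per second) and the injectable `rng` are all hypothetical. Saving to disk (step 8) is left out.

```python
import random
from collections import deque


def periodic_check(state, params, now, measure, rng=random.random):
    """Run one 60-second tick; return True if promotion should be considered.

    state:   {"last_check": seconds, "results": deque(maxlen=num_samples)}
    params:  consensus parameters named as in the proposal
    measure: hypothetical callback returning (reachable, bandwidth)
    """
    # Steps 1-2: only run a test on a random fraction of ticks, so that
    # tests happen every check_period seconds on average.
    x = rng()
    if x > 1.0 / (params["check_period"] / 60.0):
        return False

    # Step 3: remember when we last tested.
    state["last_check"] = now

    # Steps 4-6: the test succeeds only if the node is reachable and fast.
    reachable, bandwidth = measure()
    success = bool(reachable and bandwidth >= params["min_bandwidth"])

    # Step 7: the deque drops the oldest sample once it is full.
    state["results"].append(success)

    # Step 9: consider promotion once the buffer is full and good enough.
    results = state["results"]
    return (len(results) == params["num_samples"]
            and sum(results) >= params["num_useful"])
```

A `deque` with `maxlen` is used here as the ring buffer; any fixed-size circular store would do.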
-
- When Tor starts, it must fill in the samples for which it was not
- running. This can only happen once the consensus has downloaded,
- because the value of check_period is needed.
-
- 1. Tor generates a random number y from the Poisson distribution [1]
- with lambda = (current_time - last_check) * (1 / check_period)
- 2. Tor sets the value last_check to the current_time (in seconds)
- 3. Add y test failures to the ring buffer measurements_results
- 4. Tor saves last_check and measurements_results to disk
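As an illustration (not part of the proposal text), the start-up back-fill could look like the sketch below. It uses Knuth's multiplication method to draw the Poisson sample, which is one of several possible generators and is an assumption here. The state layout and function names are hypothetical, and saving to disk (step 4) is omitted.

```python
import math
import random
from collections import deque


def poisson_sample(lam, rng=random.random):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng()
        if p <= limit:
            return k
        k += 1


def backfill_missed_checks(state, check_period, now, rng=random.random):
    """Record a failure for each check we would have run while offline."""
    # Step 1: expected number of missed checks is elapsed / check_period.
    lam = (now - state["last_check"]) * (1.0 / check_period)
    y = poisson_sample(lam, rng)
    # Step 2: the gap has now been accounted for.
    state["last_check"] = now
    # Step 3: each missed check counts as a failed test.
    state["results"].extend([False] * y)
    return y
```

Knuth's method loses accuracy once lambda reaches the hundreds, because exp(-lambda) underflows. After a very long downtime an implementation might simply fill the whole ring buffer with failures instead.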
-
- In this way, a Tor client will measure its bandwidth and
- reachability every check_period seconds, on average. Provided
- check_period is sufficiently greater than a minute (say, at least an
- hour), the times of check will follow a Poisson distribution. [2]
-
- While this requires Tor to record the state of a client
- over time, it does not leak much information. Only a binary
- reachable/non-reachable is stored, and the timing of samples becomes
- increasingly fuzzy as the data becomes less recent.
-
- On IP address changes, Tor should clear the ring-buffer, because
- from the perspective of users with the old IP address, this node
- might as well be a new one with no history. This policy may change
- once we start allowing the bridge authority to hand out new IP
- addresses given the fingerprint.
-
-3.x Bandwidth measurement
-
- Tor needs to measure its bandwidth to test the usefulness as a
- bridge. A non-intrusive way to do this would be to passively measure
- the peak data transfer rate since the last reachability test. Once
- this exceeds min_bandwidth, Tor can set a flag that this node
- currently has sufficient bandwidth to pass the bandwidth component
- of the upcoming performance measurement.
-
- For the first version we may simply skip the bandwidth test,
- because the existing reachability test sends 500 kB over several
- circuits, and checks whether the node can transfer at least 50
- kB/s. This is probably good enough for a bridge, so this test
- might be sufficient to record a success in the ring buffer.
-
-3.x New options
-
-3.x New controller message
-
-4. Migration plan
-
- We should start by setting a high bandwidth and uptime requirement
- in the consensus, so as to avoid overloading the bridge authority
- with too many bridges. Once we are confident our systems can scale,
- the criteria can be gradually shifted down to gain more bridges.
-
-5. Related proposals
-
-6. Open questions:
-
- - What user interaction policy should we take?
-
- - When (if ever) should we turn a relay into an exit relay?
-
- - What should the rate limits be for auto-promoted bridges/relays?
- Should we prompt the user for this?
-
- - Perhaps the bridge authority should tell potential bridges
- whether to enable themselves, by taking into account whether
- their IP address is blocked
-
- - How do we explain the possible risks of running a bridge/relay?
- * Use of bandwidth/congestion
- * Publication of IP address
- * Blocking from IRC (even for non-exit relays)
-
- - What feedback should we give to bridge relays, to encourage them,
- e.g. number of recent users (what about reserve bridges)?
-
- - Can clients back off from doing these tests? (Yes, we should do
- this.)
-
-[1] For algorithms to generate random numbers from the Poisson
- distribution, see: http://en.wikipedia.org/wiki/Poisson_distribution#Generating_Poisson-distributed_random_variables
-[2] "The sample size n should be equal to or larger than 20 and the
- probability of a single success, p, should be smaller than or equal to
- .05. If n >= 100, the approximation is excellent if np is also <= 10."
- http://www.itl.nist.gov/div898/handbook/pmc/section3/pmc331.htm (e-Handbook of Statistical Methods)
-
-% vim: spell ai et:
diff --git a/doc/spec/proposals/176-revising-handshake.txt b/doc/spec/proposals/176-revising-handshake.txt
deleted file mode 100644
index db7ea4a66..000000000
--- a/doc/spec/proposals/176-revising-handshake.txt
+++ /dev/null
@@ -1,623 +0,0 @@
-Filename: 176-revising-handshake.txt
-Title: Proposed version-3 link handshake for Tor
-Author: Nick Mathewson
-Created: 31-Jan-2011
-Status: Draft
-Target: 0.2.3
-Supersedes: 169
-
-1. Overview
-
- I propose a (mostly) backward-compatible change to the Tor
- connection establishment protocol to avoid the use of TLS
- renegotiation, to avoid certain protocol fingerprinting attacks,
- and to make it easier to write Tor clients and servers.
-
- Rather than doing a TLS renegotiation to exchange certificates
- and authenticate the original handshake, this proposal takes an
- approach similar to Steven Murdoch's proposal 124 and my old
- proposal 169, and uses Tor cells to finish authenticating the
- parties' identities once the initial TLS handshake is finished.
-
- I discuss some alternative design choices and why I didn't make
- them in section 7; please have a quick look there before
- telling me that something is pointless or makes no sense.
-
- Terminological note: I use "client" below to mean the Tor
- instance (a client or a bridge or a relay) that initiates a TLS
- connection, and "server" to mean the Tor instance (a bridge or a
- relay) that accepts it.
-
-2. History and Motivation
-
- The _goals_ of the Tor link handshake have remained basically uniform
- since our earliest versions. They are:
-
- * Provide data confidentiality, data integrity
- * Provide forward secrecy
- * Allow responder authentication or bidirectional authentication.
- * Try to look like some popular too-important-to-block-at-whim
- encryption protocol, to avoid fingerprinting and censorship.
- * Try to be implementable -- on the client side at least! --
- by as many TLS implementations as possible.
-
- When we added the v2 handshake, we added another goal:
-
- * Remain compatible with older versions of the handshake
- protocol.
-
- In the original Tor TLS connection handshake protocol ("V1", or
- "two-cert"), parties that wanted to authenticate provided a
- two-cert chain of X.509 certificates during the handshake setup
- phase. Every party that wanted to authenticate sent these
- certificates. The security properties of this protocol are just
- fine; the problem was that our behavior of sending
- two-certificate chains made Tor easy to identify.
-
- In the current Tor TLS connection handshake protocol ("V2", or
- "renegotiating"), the parties begin with a single certificate
- sent from the server (responder) to the client (initiator), and
- then renegotiate so that each authenticating party sends two certs.
- We made this change to make Tor's handshake look like a browser
- speaking SSL to a webserver. (See proposal 130, and
- tor-spec.txt.) So from an observer's point of view, two parties
- performing the V2 handshake begin by making a regular TLS
- handshake with a single certificate, then renegotiate
- immediately.
-
- To tell whether to use the V1 or V2 handshake, the servers look
- at the list of ciphers sent by the client. (This is ugly, but
- there's not much else in the ClientHello that they can look at.)
- If the list contains any cipher not used by the V1 protocol, the
- server sends back a single cert and expects a renegotiation. If
- the client gets back a single cert, then it withholds its own
- certificates until the TLS renegotiation phase.
-
- In other words, V2-supporting initiator behavior currently looks
- like this:
-
- - Begin TLS negotiation with V2 cipher list; wait for
- certificate(s).
- - If we get a certificate chain:
- - Then we are using the V1 handshake. Send our own
- certificate chain as part of this initial TLS handshake
- if we want to authenticate; otherwise, send no
- certificates. When the handshake completes, check
- certificates. We are now mutually authenticated.
-
- Otherwise, if we get just a single certificate:
- - Then we are using the V2 handshake. Do not send any
- certificates during this handshake.
- - When the handshake is done, immediately start a TLS
- renegotiation. During the renegotiation, expect
- a certificate chain from the server; send a certificate
- chain of our own if we want to authenticate ourselves.
- - After the renegotiation, check the certificates. Then
- send (and expect) a VERSIONS cell from the other side to
- establish the link protocol version.
-
- And V2-supporting responder behavior now looks like this:
-
- - When we get a TLS ClientHello request, look at the cipher
- list.
- - If the cipher list contains only the V1 ciphersuites:
- - Then we're doing a V1 handshake. Send a certificate
- chain. Expect a possible client certificate chain in
- response.
- Otherwise, if we get other ciphersuites:
- - We're using the V2 handshake. Send back a single
- certificate and let the handshake complete.
- - Do not accept any data until the client has renegotiated.
- - When the client is renegotiating, send a certificate
- chain, and expect (possibly multiple) certificates in
- reply.
- - Check the certificates when the renegotiation is done.
- Then exchange VERSIONS cells.
-
- Late in 2009, researchers found a flaw in most applications' use
- of TLS renegotiation: Although TLS renegotiation does not
- reauthenticate any information exchanged before the renegotiation
- takes place, many applications were treating it as though it did,
- and assuming that data sent _before_ the renegotiation was
- authenticated with the credentials negotiated _during_ the
- renegotiation. This problem was exacerbated by the fact that
- most TLS libraries don't actually give you an obvious good way to
- tell where the renegotiation occurred relative to the datastream.
- Tor wasn't directly affected by this vulnerability, but the
- aftermath hurts us in a few ways:
-
- 1) OpenSSL has disabled renegotiation by default, and created
- a "yes we know what we're doing" option we need to set to
- turn it back on. (Two options, actually: one for openssl
- 0.9.8l and one for 0.9.8m and later.)
-
- 2) Some vendors have removed all renegotiation support from
- their versions of OpenSSL entirely, forcing us to tell
- users to either replace their versions of OpenSSL or to
- link Tor against a hand-built one.
-
- 3) Because of 1 and 2, I'd expect TLS renegotiation to become
- rarer and rarer in the wild, making our own use stand out
- more.
-
- Furthermore, there are other issues related to TLS and
- fingerprinting that we want to fix in any revised handshake:
-
- 1) We should make it easier to use self-signed certs, or maybe
- even existing HTTPS certificates, for the server side
- handshake, since most non-Tor SSL handshakes use either
- self-signed certificates or
-
- 2) We should make it harder to probe for a Tor server. Right
- now, you can just do a handshake with a server,
- renegotiate, then see if it gives you a VERSIONS cell.
- That's no good.
-
- 3) We should allow other changes in our use of TLS and in our
- certificates so as to resist fingerprinting based on how
- our certificates look.
-
-3. Design
-
-3.1. The view in the large
-
- Taking a cue from Steven Murdoch's proposal 124 and my old
- proposal 169, I propose that we move the work currently done by
- the TLS renegotiation step (that is, authenticating the parties
- to one another) and do it with Tor cells instead of with TLS
- alone.
-
- This section outlines the protocol; we go into more detail below.
-
- To tell the client that it can use the new cell-based
- authentication system, the server sends a "V3 certificate" during
- the initial TLS handshake. (More on what makes a certificate
- "v3" below.) If the client recognizes the format of the
- certificate and decides to pursue the V3 handshake, then instead
- of renegotiating immediately on completion of the initial TLS
- handshake, the client instead sends a VERSIONS cell (and the
- negotiation begins).
-
- So the flowchart on the server side is:
-
- Wait for a ClientHello.
- IF the client sends a ClientHello that indicates V1:
- - Send a certificate chain.
- - When the TLS handshake is done, if the client sent us a
- certificate chain, then check it.
- If the client sends a ClientHello that indicates V2 or V3:
- - Send a self-signed certificate or a CA-signed certificate
- - When the TLS handshake is done, wait for renegotiation or data.
- - If renegotiation occurs, the client is V2: send a
- certificate chain and maybe receive one. Check the
- certificate chain as in V1.
- - If the client sends data without renegotiating, it is
- starting the V3 handshake. Proceed with the V3
- handshake as below.
-
- And the client-side flowchart is:
-
- - Send a ClientHello with a set of ciphers that indicates V2/V3.
- - After the handshake is done:
- - If the server sent us a certificate chain, check it: we
- are using the V1 handshake.
- - If the server sent us a single "V2 certificate", we are
- using the v2 handshake: the client begins to renegotiate
- and proceeds as before.
- - Finally, if the server sent us a "v3 certificate", we are
- doing the V3 handshake below.
-
- And the cell-based part of the V3 handshake, in summary, is:
-
- C<->S: TLS handshake where S sends a "v3 certificate"
-
- In TLS:
-
- C->S: VERSIONS cell
- S->C: VERSIONS cell, CERT cell, AUTH_CHALLENGE cell, NETINFO cell
-
- C->S: Optionally: CERT cell, AUTHENTICATE cell
-
- A "CERT" cell contains a set of certificates; an "AUTHENTICATE"
- cell authenticates the client to the server. More on these
- later.
-
-3.2. Distinguishing V2 and V3 certificates
-
- In the protocol outline above, we require that the client can
- distinguish between v2 certificates (that is, those sent by
- current servers) and v3 certificates. We further require that
- existing clients will accept v3 certificates as they currently
- accept v2 certificates.
-
- Fortunately, current certificates have a few characteristics that
- make them fairly well-mannered as it is. We say that a certificate
- indicates a V2-only server if ALL of the following hold:
- * The certificate is not self-signed.
- * There is no DN field set in the certificate's issuer or
- subject other than "commonName".
- * The commonNames of the issuer and subject both end with
- ".net"
- * The public modulus is at most 1024 bits long.
-
- Otherwise, the client should assume that the server supports the
- V3 handshake.
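As an illustration (not part of the proposal text), the four-part test above can be written as a single predicate. The `CertSummary` type is hypothetical: it stands in for whatever fields an X.509 parser would extract, and no real certificate library is used.

```python
from dataclasses import dataclass


@dataclass
class CertSummary:
    """Hypothetical summary of the certificate fields the test looks at."""
    self_signed: bool
    issuer_dn_fields: set    # names of DN fields present in the issuer
    subject_dn_fields: set   # names of DN fields present in the subject
    issuer_cn: str
    subject_cn: str
    modulus_bits: int


def indicates_v2_only(cert):
    """Return True only if ALL four V2-only conditions hold."""
    only_common_name = (cert.issuer_dn_fields <= {"commonName"}
                        and cert.subject_dn_fields <= {"commonName"})
    return (not cert.self_signed
            and only_common_name
            and cert.issuer_cn.endswith(".net")
            and cert.subject_cn.endswith(".net")
            and cert.modulus_bits <= 1024)
```

Any certificate for which the predicate is false would be treated as a sign that the server supports the V3 handshake.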
-
- To the best of my knowledge, current clients will behave properly
- on receiving non-v2 certs during the initial TLS handshake so
- long as they eventually get the correct V2 cert chain during the
- renegotiation.
-
- The v3 requirements are easy to meet: any certificate designed to
- resist fingerprinting will likely be self-signed, or if it's
- signed by a CA, then the issuer will surely have more DN fields
- set. Certificates that aren't trying to resist fingerprinting
- can trivially become v3 by using a CN that doesn't end with .net,
- or using a key longer than 1024 bits.
-
-
-3.3. Authenticating via Tor cells: server authentication
-
- Once the TLS handshake is finished, if the client renegotiates,
- then the server should go on as it does currently.
-
- If the client implements this proposal, however, and the server
- has shown it can understand the V3+ handshake protocol, the
- client immediately sends a VERSIONS cell to the server
- and waits to receive a VERSIONS cell in return. We negotiate
- the Tor link protocol version _before_ we proceed with the
- negotiation, in case we need to change the authentication
- protocol in the future.
-
- Once either party has seen the VERSIONS cell from the other, it
- knows which version they will pick (that is, the highest version
- shared by both parties' VERSIONS cells). All Tor instances using
- the handshake protocol described in 3.2 MUST support at least
- link protocol version 3 as described here. If a version lower
- than 3 is negotiated with the V3 handshake in place, a Tor
- instance MUST close the connection.
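As an illustration (not part of the proposal text), the version choice described above reduces to taking the highest version that both VERSIONS cells list, then refusing anything below 3. The function name and the use of `None` to mean "close the connection" are hypothetical.

```python
def negotiate_link_version(ours, theirs, minimum=3):
    """Pick the highest link protocol version that both sides listed.

    Returns None when there is no shared version, or when the best
    shared version is below the minimum required by the V3 handshake.
    In both cases the caller should close the connection.
    """
    shared = set(ours) & set(theirs)
    if not shared:
        return None
    best = max(shared)
    return best if best >= minimum else None
```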
-
- On learning the link protocol, the server then sends the client a
- CERT cell and a NETINFO cell. If the client wants to
- authenticate to the server, it sends a CERT cell, an AUTHENTICATE
- cell, and a NETINFO cell, or it may simply send a NETINFO cell if
- it does not want to authenticate.
-
- The CERT cell describes the keys that a Tor instance is claiming
- to have. It is a variable-length cell. Its payload format is:
-
- N: Number of certs in cell [1 octet]
- N times:
- CertType [1 octet]
- CLEN [2 octets]
- Certificate [CLEN octets]
-
- Any extra octets at the end of a CERT cell MUST be ignored.
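As an illustration (not part of the proposal text), a CERT cell payload in this layout could be decoded as follows. Big-endian (network order) encoding of CLEN is an assumption, since this section does not state the byte order, and the function name is hypothetical.

```python
import struct


def parse_cert_cell(payload):
    """Decode a CERT cell payload into a list of (cert_type, cert_bytes).

    Layout: N [1 octet], then N times CertType [1 octet],
    CLEN [2 octets, assumed big-endian], Certificate [CLEN octets].
    """
    if len(payload) < 1:
        raise ValueError("empty CERT cell payload")
    count = payload[0]
    offset = 1
    certs = []
    for _ in range(count):
        if offset + 3 > len(payload):
            raise ValueError("truncated certificate header")
        cert_type, clen = struct.unpack_from("!BH", payload, offset)
        offset += 3
        if offset + clen > len(payload):
            raise ValueError("truncated certificate body")
        certs.append((cert_type, payload[offset:offset + clen]))
        offset += clen
    # Any octets left after the last certificate are ignored.
    return certs
```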
-
- CertType values are:
- 1: Link key certificate from RSA1024 identity
- 2: RSA1024 Identity certificate
- 3: RSA1024 AUTHENTICATE cell link certificate
-
- The certificate format is X509.
-
- To authenticate the server, the client MUST check the following:
- * The CERT cell contains exactly one CertType 1 "Link" certificate.
- * The CERT cell contains exactly one CertType 2 "ID"
- certificate.
- * Both certificates have validAfter and validUntil dates that
- are not expired.
- * The certified key in the Link certificate matches the
- link key that was used to negotiate the TLS connection.
- * The certified key in the ID certificate is a 1024-bit RSA key.
- * The certified key in the ID certificate was used to sign both
- certificates.
- * The link certificate is correctly signed with the key in the
- ID certificate
- * The ID certificate is correctly self-signed.
-
- If all of these conditions hold, then the client knows that it is
- connected to the server whose identity key is certified in the ID
- certificate. If any condition does not hold, the client closes
- the connection. If the client wanted to connect to a server with
- a different identity key, the client closes the connection.
-
-
- An AUTH_CHALLENGE cell is a variable-length cell with the following
- fields:
- Challenge [32 octets]
- It is sent from the server to the client. Clients MUST ignore
- unexpected bytes at the end of the cell. Servers MUST generate
- every challenge using a strong RNG or PRNG.
-
-3.4. Authenticating via Tor cells: Client authentication
-
- A client does not need to authenticate to the server. If it
- does not wish to, it responds to the server's valid CERT cell by
- sending a NETINFO cell: once it has gotten a valid NETINFO cell
- back, the client should consider the connection open, and the
- server should consider the connection as opened by an
- unauthenticated client.
-
- If a client wants to authenticate, it responds to the
- AUTH_CHALLENGE cell with a CERT cell and an AUTHENTICATE cell.
- The CERT cell is as a server would send, except that instead of
- sending a CertType 1 cert for an arbitrary link certificate, the
- client sends a CertType 3 cert for an RSA AUTHENTICATE key.
- (This difference is because we allow any link key type on a TLS
- link, but the protocol described here will only work for 1024-bit
- RSA keys. A later protocol version should extend the protocol
- here to work with non-1024-bit, non-RSA keys.)
-
- AuthType [2 octets]
- AuthLen [2 octets]
- Authentication [AuthLen octets]
-
-
- Servers MUST ignore extra bytes at the end of an AUTHENTICATE
- cell. If AuthType is 1 (meaning "RSA-SHA256-TLSSecret"), then the
- Authentication contains the following:
-
- Type: The characters "AUTH0001" [8 octets]
- CID: A SHA256 hash of the client's RSA1024 identity key [32 octets]
- SID: A SHA256 hash of the server's RSA1024 identity key [32 octets]
- SLOG: A SHA256 hash of all bytes sent from the server to the client
- as part of the negotiation up to and including the
- AUTH_CHALLENGE cell; that is, the VERSIONS cell,
- the CERT cell, and the AUTH_CHALLENGE cell. [32 octets]
- CLOG: A SHA256 hash of all bytes sent from the client to the
- server as part of the negotiation so far; that is, the
- VERSIONS cell and the CERT cell. [32 octets]
- SCERT: A SHA256 hash of the server's TLS link
- certificate. [32 octets]
- TLSSECRETS: Either 32 zero octets, or a SHA256 HMAC, using
- the TLS master secret as the secret key, of the following:
- - client_random, as sent in the TLS Client Hello
- - server_random, as sent in the TLS Server Hello
- - the NUL terminated ASCII string:
- "Tor V3 handshake TLS cross-certification"
- [32 octets]
- TIME: The time of day in seconds since the POSIX epoch. [8 octets]
- NONCE: A 16 byte value, randomly chosen by the client [16 octets]
- SIG: A signature of a SHA256 hash of all the previous fields
- using the client's "Authenticate" key as presented. (As
- always in Tor, we use OAEP-MGF1 padding; see tor-spec.txt
- section 0.3.)
- [variable length]
-
- To check the AUTHENTICATE cell, a server checks that all fields
- containing a hash contain the correct value, then verifies the
- signature. The server MUST ignore any extra bytes after
- the SHA256 hash.
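
The field layout above can be sketched in Python as follows. This is only an illustration of the fixed-length part of an AuthType 1 body (everything before SIG), not the reference implementation. The function names are hypothetical, and two details are assumptions that the text above does not settle: that keys and certificates are hashed in their encoded byte form, and that TIME is encoded big-endian. The signature step is left out.

```python
import hashlib
import hmac
import os
import struct
import time

# NUL-terminated label from the TLSSECRETS definition.
TLS_LABEL = b"Tor V3 handshake TLS cross-certification\x00"


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def tls_secrets(master_secret: bytes, client_random: bytes,
                server_random: bytes) -> bytes:
    """HMAC-SHA256, keyed with the TLS master secret, over
    client_random | server_random | label."""
    message = client_random + server_random + TLS_LABEL
    return hmac.new(master_secret, message, hashlib.sha256).digest()


def auth_body_unsigned(client_id_key: bytes, server_id_key: bytes,
                       server_log: bytes, client_log: bytes,
                       server_link_cert: bytes, tlssecrets: bytes = None,
                       now: int = None, nonce: bytes = None) -> bytes:
    """Build every field that precedes SIG (hypothetical helper)."""
    if tlssecrets is None:
        tlssecrets = bytes(32)        # the "32 zero octets" case
    if now is None:
        now = int(time.time())
    if nonce is None:
        nonce = os.urandom(16)
    body = b"".join([
        b"AUTH0001",                  # Type        [8 octets]
        sha256(client_id_key),        # CID         [32 octets]
        sha256(server_id_key),        # SID         [32 octets]
        sha256(server_log),           # SLOG        [32 octets]
        sha256(client_log),           # CLOG        [32 octets]
        sha256(server_link_cert),     # SCERT       [32 octets]
        tlssecrets,                   # TLSSECRETS  [32 octets]
        struct.pack(">Q", now),       # TIME        [8 octets], assumed big-endian
        nonce,                        # NONCE       [16 octets]
    ])
    assert len(body) == 224           # 8 + 6*32 + 8 + 16
    return body
```

A server checking the cell would recompute the same hashes from its own view of the handshake and compare them field by field before verifying SIG.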
-
- When possible (that is, when implemented using a C TLS API),
- implementations SHOULD include and verify the TLSSECRETS field.
-
-3.5. Responding to extra cells, and other security checks.
-
- If the handshake is a V3+ TLS handshake, both parties MUST reject
- any negotiated link version less than 3. Both parties MUST check
- this and close the connection if it is violated.
-
- If the handshake is not a V3+ TLS handshake, both parties MUST
- still advertise all link protocols they support in their versions
- cell. Both parties MUST close the link if it turns out they both
- would have supported version 3 or higher, but they somehow wound
- up using a v2 or v1 handshake. (More on this in section 6.4.)
-
- A server SHOULD NOT send any sequence of cells when starting a v3
- negotiation other than "VERSIONS, CERT, AUTH_CHALLENGE,
- NETINFO". A client SHOULD drop a CERT, AUTH_CHALLENGE, or
- NETINFO cell that appears at any other time or out of sequence.
-
- A client should not begin a v3 negotiation with any sequence
- other than "VERSIONS, NETINFO" or "VERSIONS, CERT, AUTHENTICATE,
- NETINFO". A server SHOULD drop a CERT, AUTH_CHALLENGE, or
- NETINFO cell that appears at any other time or out of sequence.
-
-4. Numbers to assign
-
- We need a version number for this link protocol. I've been
- calling it "3".
-
- We need to reserve command numbers for CERT, AUTH_CHALLENGE, and
- AUTHENTICATE. I suggest that in link protocol 3 and higher, we
- reserve a separate range of commands for variable-length cells.
-
-5. Efficiency
-
- This protocol adds a round-trip step when the client sends a
- VERSIONS cell to the server, and waits for the {VERSIONS, CERT,
- NETINFO} response in turn. (The server then waits for the
- client's {NETINFO} or {CERT, AUTHENTICATE, NETINFO} reply,
- but it would have already been waiting for the client's NETINFO,
- so that's not an additional wait.)
-
- This is actually fewer round-trip steps than required before for
- TLS renegotiation, so that's a win over v2.
-
-6. Security argument
-
- These aren't crypto proofs, since I don't write those. They are
- meant to be reasonably convincing.
-
-6.1. The server is authenticated
-
- TLS guarantees that if the TLS handshake completes successfully,
- the client knows that it is speaking to somebody who knows the
- private key corresponding to the public link key that was used in
- the TLS handshake.
-
- Because this public link key is signed by the server's identity
- key in the CERT cell, the client knows that somebody who holds
- the server's private identity key says that the server's public
- link key corresponds to the server's public identity key.
-
- Therefore, if the crypto works, and if TLS works, and if the keys
- aren't compromised, then the client is talking to somebody who
- holds the server's private identity key.
-
-6.2. The client is authenticated
-
- Once the server has checked the client's certificates, the server
- knows that somebody who knows the client's private identity key
- says that he is the one holding the private key corresponding to
- the client's presented link-authentication public key.
-
- Once the server has checked the signature in the AUTHENTICATE
- cell, the server knows that somebody holding the client's
- link-authentication private key signed the data in question. By
- the standard certification argument above, the server knows that
- somebody holding the client's private identity key signed the
- data in question.
-
- So the server's remaining question is: am I really talking to
- somebody holding the client's identity key, or am I getting a
- replayed or MITM'd AUTHENTICATE cell that was previously sent by
- the client?
-
- If the client included a non-zero TLSSECRET component, and the
- server is able to verify it, then the answer is easy: the server
- knows for certain that it is talking to the party with whom it
- did the TLS handshake, since if somebody else generated a correct
- TLSSECRET, they would have to know the master secret of the TLS
- connection, which would require them to have broken TLS.
-
- If the client was not able to include a non-zero TLSSECRET
- component, or the server can't check it, the answer is a little
- trickier. The server knows that it is not getting a replayed
- AUTHENTICATE cell, since the cell authenticates (among other
- stuff) the server's AUTH_CHALLENGE cell, which it has never used
- before. The server knows that it is not getting a MITM'd
- AUTHENTICATE cell, since the cell includes a hash of the server's
- link certificate, which nobody else should have been able to use
- in a successful TLS negotiation.
-
-6.3. MITM attacks won't work any better than they do against TLS
-
- TLS guarantees that a man-in-the-middle attacker can't read the
- content of a successfully negotiated encrypted connection, nor
- alter the content in any way other than truncating it, unless he
- compromises the session keys or one of the key-exchange secret
- keys used to establish that connection. Let's make sure we do at
- least that well.
-
- Suppose that a client Alice connects to an MITM attacker Mallory,
- thinking that she is connecting to some server Bob. Let's assume
- that the TLS handshake between Alice and Mallory finishes
- successfully and the v3 protocol is chosen. [If the v1 or v2
- protocol is chosen, those already resist MITM. If the TLS
- handshake doesn't complete, then Alice isn't connected to anybody.]
-
- During the v3 handshake, Mallory can't convince Alice that she is
- talking to Bob, since she should not be able to produce a CERT
- cell containing a certificate chain signed by Bob's identity key
- and used to authenticate the link key that Mallory used during
- TLS. (If Mallory used her own link key for the TLS handshake, it
- won't match anything Bob signed unless Bob is compromised.
- Mallory can't use any key that Bob _did_ produce a certificate
- for, since she doesn't know the private key.)
-
- Even if Alice fails to check the certificates from Bob, Mallory
- still can't convince Bob that she is really Alice. Assuming that
- Alice's keys aren't compromised, Mallory can't send a CERT cell
- with a cert chain from Alice's identity key to a key that Mallory
- controls, so if Mallory wants to impersonate Alice's identity
- key, she can only do so by sending an AUTHENTICATE cell really
- generated by Alice. Because Bob will check that the random bytes
- in the AUTH_CHALLENGE cell will influence the SLOG hash, Mallory
- needs to send Bob's challenge to Alice, and can't use any other
- AUTHENTICATE cell that Alice generated before. But because the
- AUTHENTICATE cell Alice will generate will include in the SCERT
- field a hash of the link certificate used by Mallory, Bob will
- reject it as not being valid to connect to him.
-
-6.4. Protocol downgrade attacks won't work.
-
- Assuming that Alice checks the certificates from Bob, she knows
- that Bob really sent her the VERSIONS cell that she received.
-
- Because the AUTHENTICATE cell from Alice includes signed hashes
- of the VERSIONS cells from Alice and Bob, Bob knows that Alice
- got the VERSIONS cell he sent and sent the VERSIONS cell that he
- received.
-
- But what about attempts to downgrade the protocol earlier in the
- handshake? Here TLS comes to the rescue: because the TLS
- Finished handshake message includes an authenticated digest of
- everything previously said during the handshake, an attacker
- can't replace the client's ciphersuite list (to trigger a
- downgrade to the v1 protocol) or the server's certificate [chain]
- (to trigger a downgrade to the v1 or v2 protocol).
-
-7. Design considerations
-
- I previously considered adding our own certificate format in
- order to avoid the pain associated with X509, but decided instead
- to simply use X509 since a correct Tor implementation will
- already need to have X509 code to handle the other handshake
- versions and to use TLS.
-
- The trickiest part of the design here is deciding what to stick
- in the AUTHENTICATE cell. Some of it is strictly necessary, and
- some of it is left there for security margin in case my other
- security arguments fail. Because of the CID and SID elements
- you can't use an AUTHENTICATE cell for anything other than
- authenticating a client ID to a server with an appropriate
- server ID. The SLOG and CLOG elements are there mostly to
- authenticate the VERSIONS cells and resist downgrade attacks
- once there are two versions of this. The presence of the
- AUTH_CHALLENGE field in the stuff authenticated in SLOG
- prevents replays and ensures that the AUTHENTICATE cell was
- really generated by somebody who is reading what the server is
- sending over the TLS connection. The SCERT element is meant to
- prevent MITM attacks. When the TLSSECRET field is
- used, it should prevent the use of the AUTHENTICATE cell for
- anything other than the TLS connection the client had in mind.
-
- A signature of the TLSSECRET element on its own should be
- sufficient to prevent the attacks we care about, but because we
- don't necessarily have access to the TLS master secret when using
- a non-C TLS library, we can't depend on it. I added it anyway
- so that, if there is some problem with the rest of the protocol,
- clients and servers that _are_ written in C (that is, the official
- Tor implementation) can still be secure.
-
- If the client checks the server's certificates and matches them
- to the TLS connection link key before proceeding with the
- handshake, then signing the contents of the AUTH_CHALLENGE cell
- would be sufficient to authenticate the client. But implementers
- of allegedly compatible Tor clients have in the past skipped
- certificate verification steps, and I didn't want a client's
- failure to verify certificates to mean that a server couldn't
- trust that he was really talking to the client. To prevent this,
- I added the TLS link certificate to the authenticated data: even
- if the Tor client code doesn't check any certificates, the TLS
- library code will still check that the certificate used in the
- handshake contains a link key that matches the one used in the
- handshake.
-
-8. Open questions:
-
- - May we cache which certificates we've already verified? It
- might leak in timing whether we've connected with a given server
- before, and how recently.
-
- - With which TLS libraries is it feasible to yoink client_random,
- server_random, and the master secret? If the answer is "All
- free C TLS libraries", great. If the answer is "OpenSSL only",
- not so great.
-
- - Should we do anything to check the timestamp in the AUTHENTICATE
- cell?
-
- - Can we give some way for clients to signal "I want to use the
- V3 protocol if possible, but I can't renegotiate, so don't give
- me the V2"? Clients currently have a fair idea of server
- versions, so they could potentially do the V3+ handshake with
- servers that support it, and fall back to V1 otherwise.
-
- - What should servers that don't have TLS renegotiation do? For
- now, I think they should just stick with V1. Eventually we can
- deprecate the V2 handshake as we did with the V1 handshake.
- When that happens, servers can be V3-only.
diff --git a/doc/spec/proposals/177-flag-abstention.txt b/doc/spec/proposals/177-flag-abstention.txt
deleted file mode 100644
index 0b4a9babb..000000000
--- a/doc/spec/proposals/177-flag-abstention.txt
+++ /dev/null
@@ -1,104 +0,0 @@
-Filename: 177-flag-abstention.txt
-Title: Abstaining from votes on individual flags
-Author: Nick Mathewson
-Created: 14 Feb 2011
-Status: Draft
-
-Overview:
-
- We should have a way for authorities to vote on flags in
- particular instances, without having to vote on that flag for all
- servers.
-
-Motivation:
-
- Suppose that the status of some router becomes controversial, and
- an authority wants to vote for or against the BadExit status of
- that router. Suppose also that the authority is not currently
- voting on the BadExit flag. If the authority wants to say that
- the router is or is not "BadExit", it cannot currently do so
- without voting yea or nay on the BadExit status of all other
- routers.
-
- Suppose that an authority wants to vote "Valid" or "Invalid" on a
- large number of routers, but does not have an opinion on some of
- them. Currently, it cannot do so: if it votes for the Valid flag
- anywhere, it votes for it everywhere.
-
-Design:
-
- We add a new line "extra-flags" in directory votes, to appear
- after "known-flags". It lists zero or more flags that an
- authority has occasional opinions on, but for which the authority
- will usually abstain. No flag may appear in both extra-flags and
- known-flags.
-
- In the router-status section for each directory vote, we allow an
- optional "s2" line to appear after the "s" line. It contains
- zero or more flag votes. A flag vote is of the form of one of
- "+", "-", or "/" followed by the name of a flag. "+" denotes a
- yea vote, and "-" denotes a nay vote, and "/" denotes an
- abstention. Authorities may omit most abstentions, except as
- noted below. No flag may appear in an s2 line unless it appears
- in the known-flags or extra-flags line. We retain the rule that no
- flag may appear in an s line unless it appears in the known-flags
- line.
-
- When using an appropriate consensus method to vote, we use these
- new rules to determine flags:
-
- A flag is listed in the consensus if it is in the known-flags
- section of at least one voter, and in the known-flags or
- extra-flags section of at least three voters (or half the
- authorities, whichever set is smaller).
-
- A single authority's vote for a given flag on a given router is
- interpreted as follows:
-
- - If the authority votes +Flag or -Flag or /Flag in the s2 line for
- that router, the vote is "yea" or "nay" or "abstain" respectively.
- - Otherwise, if the flag is listed on the "s" line for the
- router, then the vote is "yea".
- - Otherwise, if the flag is listed in the known-flags line,
- then the vote is "nay".
- - Otherwise, the vote is "abstain".
-
- A router is assigned a flag in the consensus iff the total "yeas"
- outnumber the total "nays".
-
- As an exception, this proposal does not affect the behavior of
- the "Named" and "Unnamed" flags; these are still treated as
- before. (An authority can already abstain from a single naming
- decision by not voting Named on any router with a given name.)
-
-Examples:
-
- Suppose that it becomes important to know which Tor servers are
- operated by burrowing marsupials. Some authority operators
- diligently research this question; others want to vote about
- individual routers on an ad hoc basis when they learn about a
- particular router's being e.g. located underground in New South
- Wales.
-
- If an authority usually has no opinions on the RunByWombats flag,
- it should list it in the "extra-flags" of its votes. If it
- occasionally wants to vote that a router is (or is not) run by
- wombats, it should list "s2 +RunByWombats" or "s2 -RunByWombats"
- for the routers in question. Otherwise it can omit the flag from
- its s and s2 lines entirely.
-
- If an authority usually has an opinion on the RunByWombats flag,
- but wants to abstain in some cases, it should list "RunByWombats"
- in the "known-flags" part of its votes, and include
- "RunByWombats" in the s line for every router that it believes is
- run by wombats. When it wants to vote that a router is not run
- by wombats, it should list the RunByWombats flag in neither the s
- nor the s2 line. When it wants to abstain, it should list "s2
- /RunByWombats".
-
- In both cases, when the new consensus method is used, a router
- will get listed as "RunByWombats" if there are more authorities
- that say it is run by wombats than there are authorities saying
- it is not run by wombats. (As now, "no" votes win ties.)
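
The per-authority interpretation rules and the yea/nay tally from the Design section can be sketched as follows. This is a minimal illustration; the function names and the list/set representation of the "s", "s2", and "known-flags" lines are assumptions made for the example. It does not cover the separate rule for deciding whether a flag is listed in the consensus at all, nor the Named/Unnamed exception.

```python
from typing import Iterable, Sequence


def interpret_vote(flag: str, known_flags: Iterable[str],
                   s_line: Sequence[str], s2_line: Sequence[str]) -> str:
    """Return "yea", "nay", or "abstain" for one authority and one router."""
    # An explicit +Flag, -Flag, or /Flag entry on the s2 line wins.
    for entry in s2_line:
        if entry and entry[0] in "+-/" and entry[1:] == flag:
            return {"+": "yea", "-": "nay", "/": "abstain"}[entry[0]]
    # Otherwise a flag on the s line is a yea vote.
    if flag in s_line:
        return "yea"
    # Otherwise a flag the authority lists in known-flags is a nay vote.
    if flag in known_flags:
        return "nay"
    # Otherwise (extra-flags only, or unknown) the authority abstains.
    return "abstain"


def router_gets_flag(votes: Sequence[str]) -> bool:
    """A router gets the flag iff yeas strictly outnumber nays."""
    return votes.count("yea") > votes.count("nay")
```

For instance, an authority that lists RunByWombats only in extra-flags and writes "s2 +RunByWombats" for a router votes yea for that router and abstains for every other router.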
-
-
diff --git a/doc/spec/proposals/ideas/xxx-auto-update.txt b/doc/spec/proposals/ideas/xxx-auto-update.txt
deleted file mode 100644
index dc9a857c1..000000000
--- a/doc/spec/proposals/ideas/xxx-auto-update.txt
+++ /dev/null
@@ -1,39 +0,0 @@
-
-Notes on an auto updater:
-
-steve wants a "latest" symlink so he can always just fetch that.
-
-roger worries that this will exacerbate the "what version are you
-using?" "latest." problem.
-
-weasel suggests putting the latest recommended version in dns. then
-we don't have to hit the website. it's got caching, it's lightweight,
-it scales. just put it in a TXT record or something.
-
-but, no dnssec.
-
-roger suggests a file on the https website that lists the latest
-recommended version (or filename or url or something like that).
-
-(steve seems to already be doing this with xerobank. he additionally
-suggests a little blurb that can be displayed to the user to describe
-what's new.)
-
-how to verify you're getting the right file?
-a) it's https.
-b) ship with a signing key, and use some openssl functions to verify.
-c) both
-
-andrew reminds us that we have a "recommended versions" line in the
-consensus directory already.
-
-if only we had some way to point out the "latest stable recommendation"
-from this list. we could list it first, or something.
-
-the recommended versions line also doesn't take into account which
-packages are available -- e.g. on Windows one version might be the best
-available, and on OS X it might be a different one.
-
-aren't there existing solutions to this? surely there is a beautiful,
-efficient, crypto-correct auto updater lib out there. even for windows.
-
diff --git a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt b/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt
deleted file mode 100644
index 6c9a3c71e..000000000
--- a/doc/spec/proposals/ideas/xxx-bridge-disbursement.txt
+++ /dev/null
@@ -1,174 +0,0 @@
-
-How to hand out bridges.
-
-Divide bridges into 'strategies' as they come in. Do this uniformly
-at random for now.
-
-For each strategy, we'll hand out bridges in a different way to
-clients. This document describes two strategies: email-based and
-IP-based.
-
-0. Notation:
-
- HMAC(k,v) : an HMAC of v using the key k.
-
- A|B: The string A concatenated with the string B.
-
-
-1. Email-based.
-
- Goal: bootstrap based on one or more popular email services' sybil
- prevention algorithms.
-
-
- Parameters:
- HMAC -- an HMAC function
- P -- a time period
- K -- the number of bridges to send in a period.
-
- Setup: Generate two nonces, N and M.
-
- As bridges arrive, put them into a ring according to HMAC(N,ID)
- where ID is the bridge's identity digest.
-
- Divide time into divisions of length P.
-
- When we get an email:
-
- If it's not from a supported email service, reject it.
-
- If we already sent a response to that email address (normalized)
- in this period, send _exactly_ the same response.
-
- If it is from a supported service, generate X = HMAC(M,PS|E) where E
- is the lowercased normalized email address for the user, and
- where PS is the start of the current period. Send
- the first K bridges in the ring after point X.
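
A minimal sketch of the ring lookup described above, assuming HMAC-SHA256 as the HMAC function, a decimal ASCII encoding of PS, and a simplified normalization (strip and lowercase). All names are hypothetical. The supported-service check and the "same response within a period" cache are not shown.

```python
import bisect
import hashlib
import hmac


def hmac_hex(key: bytes, value: bytes) -> str:
    # HMAC(k, v); SHA-256 is an assumption, the notes only say "an HMAC".
    return hmac.new(key, value, hashlib.sha256).hexdigest()


class BridgeRing:
    def __init__(self, ring_key: bytes):
        self.ring_key = ring_key      # the nonce N
        self.points = []              # sorted ring positions
        self.bridges = []             # bridge IDs, parallel to self.points

    def add(self, bridge_id: bytes) -> None:
        # Place the bridge on the ring at HMAC(N, ID).
        point = hmac_hex(self.ring_key, bridge_id)
        i = bisect.bisect_left(self.points, point)
        self.points.insert(i, point)
        self.bridges.insert(i, bridge_id)

    def first_k_after(self, point: str, k: int) -> list:
        # Walk clockwise from `point`, wrapping around the end of the ring.
        n = len(self.points)
        start = bisect.bisect_right(self.points, point)
        return [self.bridges[(start + i) % n] for i in range(min(k, n))]


def normalize(email: str) -> str:
    # Simplified stand-in for the normalization rules discussed below.
    return email.strip().lower()


def bridges_for_email(ring: BridgeRing, m_key: bytes, period_start: int,
                      email: str, k: int) -> list:
    # X = HMAC(M, PS|E), where "|" is concatenation.
    e = normalize(email).encode("utf-8")
    x = hmac_hex(m_key, str(period_start).encode("ascii") + e)
    return ring.first_k_after(x, k)
```

As long as the ring does not change during a period, the same normalized address maps to the same point X and so gets the same K bridges, which is the repeatability property debated in the bracketed notes that follow.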
-
- [If we want to make sure that repeat queries are given exactly the
- same results, then we can't let the ring change during the
- time period. For a long time period like a month, that's quite a
- hassle. How about instead just keeping a replay cache of addresses
- that have been answered, and sending them a "sorry, you already got
- your addresses for the time period; perhaps you should try these
- other fine distribution strategies while you wait?" response? This
- approach would also resolve the "Make sure you can't construct a
- distinct address to match an existing one" note below. -RD]
-
- [I think, if we get a replay, we need to send back the same
- answer as we did the first time, not say "try again."
- Otherwise we need to worry that an attacker can keep people
- from getting bridges by preemptively asking for them,
- or that an attacker may force them to prove they haven't
- gotten any bridges by asking. -NM]
-
- [While we're at it, if we do the replay cache thing and don't need
- repeatable answers, we could just pick K random answers from the
- pool. Is it beneficial that a bridge user who knows about a clump of
- nodes will be sharing them with other users who know about a similar
- (overlapping) clump? One good aspect is against an adversary who
- learns about a clump this way and watches those bridges to learn
- other users and discover *their* bridges: he doesn't learn about
- as many new bridges as he might if they were randomly distributed.
- A drawback is against an adversary who happens to pick two email
- addresses in P that include overlapping answers: he can measure
- the difference in clumps and estimate how quickly the bridge pool
- is growing. -RD]
-
- [Random is one more darn thing to implement; rings are already
- there. -NM]
-
- [If we make the period P be mailbox-specific, and make it a random
- value around some mean, then we make it harder for an attacker to
- know when to try using his small army of gmail addresses to gather
- another harvest. But we also make it harder for users to know when
- they can try again. -RD]
-
- [Letting the users know about when they can try again seems
- worthwhile. Otherwise users and attackers will all probe and
- probe and probe until they get an answer. No additional
- security will be achieved, but bandwidth will be lost. -NM]
-
- To normalize an email address:
- Start with the RFC822 address. Consider only the mailbox {???}
- portion of the address (username@domain). Put this into lowercase
- ascii.
-
- Questions:
- What to do with weird character encodings? Look up the RFC.
-
- Notes:
- Make sure that you can't force a single email address to appear
- in lots of different ways. IOW, if nickm@freehaven.net and
- NICKM@freehaven.net aren't treated the same, then I can get lots
- more bridges than I should.
-
- Make sure you can't construct a distinct address to match an
- existing one. IOW, if we treat nickm@X and nickm@Y as the same
- user, then anybody can register nickm@Z and use it to tell which
- bridges nickm@X got (or would get).
-
- Make sure that we actually check headers so we can't be trivially
- used to spam people.
-
-
-2. IP-based.
-
- Goal: avoid handing out all the bridges to users in a similar IP
- space and time.
-
- Parameters:
-
- T_Flush -- how long it should take a user on a single network to
- see a whole cluster of bridges.
-
- N_C
-
- K -- the number of bridges we hand out in response to a single
- request.
-
- Setup: using an AS map or a geoip map or some other flawed input
- source, divide IP space into "areas" such that surveying a large
- collection of "areas" is hard. For v0, use /24 address blocks.
-
- Group areas into N_C clusters.
-
- Generate secrets L, M, N.
-
- Set the period P such that P*(bridges-per-cluster/K) = T_Flush.
- Don't set P to greater than a week, or less than three hours.
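
As a worked example of the period rule, solving P*(bridges-per-cluster/K) = T_Flush for P and then applying the two bounds gives the sketch below. The function name and the choice of seconds as the unit are assumptions made for illustration.

```python
THREE_HOURS = 3 * 3600
ONE_WEEK = 7 * 24 * 3600


def period_seconds(t_flush_seconds: float, bridges_per_cluster: int,
                   k: int) -> float:
    # P = T_Flush * K / bridges-per-cluster, clamped to [3 hours, 1 week].
    p = t_flush_seconds * k / bridges_per_cluster
    return max(THREE_HOURS, min(ONE_WEEK, p))
```

With T_Flush of 30 days, 90 bridges per cluster, and K = 3, a user needs 30 periods to see the whole cluster, so P comes out to exactly one day.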
-
- When we get a bridge:
-
- Based on HMAC(L,ID), assign the bridge to a cluster. Within each
- cluster, keep the bridges in a ring based on HMAC(M,ID).
-
- [Should we re-sort the rings for each new time period, so the ring
- for a given cluster is based on HMAC(M,PS|ID)? -RD]
-
- When we get a connection:
-
- If it's http, redirect it to https.
-
- Let area be the incoming IP network. Let PS be the current
- period. Compute X = HMAC(N, PS|area). Return the next K bridges
- in the ring after X.
-
- [Don't we want to compute C = HMAC(key, area) to learn what cluster
- to answer from, and then X = HMAC(key, PS|area) to pick a point in
- that ring? -RD]
-
-
- Need to clarify that some HMACs are for rings, and some are for
- partitions. How rings scale is clear. How do we grow the number of
- partitions? Looking at successive bits from the HMAC output is one way.
-
-3. Open issues
-
- Denial of service attacks
- A good view of network topology
-
-at some point we should learn some reliability stats on our bridges. when
-we say above 'give out k bridges', we might give out 2 reliable ones and
-k-2 others. we count around the ring the same way we do now, to find them.
-
diff --git a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt b/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
deleted file mode 100644
index 757f5bc55..000000000
--- a/doc/spec/proposals/ideas/xxx-bwrate-algs.txt
+++ /dev/null
@@ -1,106 +0,0 @@
-# The following two algorithms
-
-
-# Algorithm 1
-# TODO: Burst and Relay/Regular differentiation
-
-BwRate = Bandwidth Rate in Bytes Per Second
-GlobalWriteBucket = 0
-GlobalReadBucket = 0
-Epoch = Token Fill Rate in seconds: suggest 50ms=.050
-SecondCounter = 0
-MinWriteBytes = Minimum number of bytes per write
-
-Every Epoch Seconds:
- UseMinWriteBytes = MinWriteBytes
- WriteCnt = 0
- ReadCnt = 0
- BytesRead = 0
-
- For Each Open OR Conn with pending write data:
- WriteCnt++
- For Each Open OR Conn:
- ReadCnt++
-
- BytesToRead = (BwRate*Epoch + GlobalReadBucket)/ReadCnt
- BytesToWrite = (BwRate*Epoch + GlobalWriteBucket)/WriteCnt
-
- if BwRate/WriteCnt < MinWriteBytes:
- # If we aren't likely to accumulate enough bytes in a second to
- # send a whole cell for our connections, send partials
- Log(NOTICE, "Too many ORCons to write full blocks. Sending short packets.")
- UseMinWriteBytes = 1
- # Other option: We could switch to plan 2 here
-
- # Service each writable ORConn. If there are any partial writes,
- # return remaining bytes from this epoch to the global pool
- For Each Open OR Conn with pending write data:
- ORConn->write_bucket += BytesToWrite
- if ORConn->write_bucket > UseMinWriteBytes:
- w = write(ORConn, MIN(len(ORConn->write_data), ORConn->write_bucket))
- # possible that w < len(ORConn->write_data) here due to TCP pushback.
- # We should restore the rest of the write_bucket to the global
- # buffer
- GlobalWriteBucket += (ORConn->write_bucket - w)
- ORConn->write_bucket = 0
-
- For Each Open OR Conn:
- r = read_nonblock(ORConn, BytesToRead)
- BytesRead += r
-
- SecondCounter += Epoch
- if SecondCounter < 1:
- # Save unused bytes from this epoch to be used later in the second
- GlobalReadBucket += (BwRate*Epoch - BytesRead)
- else:
- SecondCounter = 0
- GlobalReadBucket = 0
- GlobalWriteBucket = 0
- For Each ORConn:
- ORConn->write_bucket = 0
-
-
-
-# Alternate plan for Writing fairly. Reads would still be covered
-# by plan 1 as there is no additional network overhead for short reads,
-# so we don't need to try to avoid them.
-#
-# I think this is actually pretty similar to what we do now, but
-# with the addition that the bytes accumulate up to the second mark
-# and we try to keep track of our position in the write list here
-# (unless libevent is doing that for us already and I just don't see it)
-#
-# TODO: Burst and Relay/Regular differentiation
-
-# XXX: The inability to send single cells will cause us to block
-# on EXTEND cells for low-bandwidth node pairs..
-BwRate = Bandwidth Rate in Bytes Per Second
-WriteBytes = Bytes per write
-Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s)
-
-SecondCounter = 0
-GlobalWriteBucket = 0
-
-# New connections are inserted at Head-1 (the 'tail' of this circular list)
-# This is not 100% fifo for all node data, but it is the best we can do
-# without insane amounts of additional queueing complexity.
-WriteConnList = List of Open OR Conns with pending write data > WriteBytes
-WriteConnHead = 0
-
-Every Epoch Seconds:
- GlobalWriteBucket += BwRate*Epoch
- WriteListEnd = WriteConnHead
-
- do
- ORConn = WriteConnList[WriteConnHead]
- w = write(ORConn, WriteBytes)
- GlobalWriteBucket -= w
- WriteConnHead = (WriteConnHead + 1) % len(WriteConnList)
- while GlobalWriteBucket > 0 and WriteConnHead != WriteListEnd
-
- SecondCounter += Epoch
- if SecondCounter >= 1:
- SecondCounter = 0
- GlobalWriteBucket = 0
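
The alternate write plan can be simulated in Python as follows. This is a sketch under stated assumptions: each connection is modelled only by its count of pending bytes, the head index is assumed to wrap modulo the list length (the list is described as circular), and the function names are hypothetical.

```python
def epoch_seconds(write_bytes: float, bw_rate: float) -> float:
    # Epoch = MAX(MIN(WriteBytes/BwRate, .333s), .050s)
    return max(min(write_bytes / bw_rate, 0.333), 0.050)


def service_epoch(pending: list, head: int, bucket: int,
                  write_bytes: int) -> tuple:
    """One pass of the do/while loop. `pending` holds the bytes still
    queued on each connection and is updated in place. Returns the new
    (head, bucket)."""
    n = len(pending)
    if n == 0:
        return head, bucket
    end = head
    while True:
        w = min(write_bytes, pending[head])   # stand-in for write()
        pending[head] -= w
        bucket -= w
        head = (head + 1) % n                 # assumed wrap-around
        if bucket <= 0 or head == end:
            break
    return head, bucket
```

With WriteBytes of 4096 and BwRate of 20480 bytes per second the epoch is 0.2 seconds; very fast links hit the 50 ms floor and very slow ones hit the 333 ms ceiling. Because the head position is carried over between epochs, a connection that was skipped when the bucket ran dry is served first in the next epoch.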
-
-
diff --git a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt b/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
deleted file mode 100644
index e8489570f..000000000
--- a/doc/spec/proposals/ideas/xxx-choosing-crypto-in-tor-protocol.txt
+++ /dev/null
@@ -1,138 +0,0 @@
-Filename: xxx-choosing-crypto-in-tor-protocol.txt
-Title: Picking cryptographic standards in the Tor wire protocol
-Author: Marian
-Created: 2009-05-16
-Status: Draft
-
-Motivation:
-
- SHA-1 is horribly outdated and not suited for security critical
- purposes. SHA-2, RIPEMD-160, Whirlpool and Tiger are good options
- for a short-term replacement, but in the long run, we will
- probably want to upgrade to the winner or a semi-finalist of the
- SHA-3 competition.
-
- For a 2006 comparison of different hash algorithms, read:
- http://www.sane.nl/sane2006/program/final-papers/R10.pdf
-
- Other reading about SHA-1:
- http://www.schneier.com/blog/archives/2005/02/sha1_broken.html
- http://www.schneier.com/blog/archives/2005/08/new_cryptanalyt.html
- http://www.schneier.com/paper-preimages.html
-
- Additionally, AES has been theoretically broken for years. While
- the attack is still not efficient enough that the public sector
- has been able to prove that it works, we should probably consider
- the time between a theoretical attack and a practical attack as an
- opportunity to figure out how to upgrade to a better algorithm,
- such as Twofish.
-
- See:
- http://schneier.com/crypto-gram-0209.html#1
-
-Design:
-
- I suggest that nodes should publish in directories which
- cryptographic standards, such as hash algorithms and ciphers,
- they support. Clients communicating with nodes will then
- pick whichever of those cryptographic standards they prefer
- the most. In the case that the node does not publish which
- cryptographic standards it supports, the client should assume
- that the server supports the older standards, such as SHA-1
- and AES, until such time as we choose to desupport those
- standards.
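
The client-side selection rule in the paragraph above might look like the sketch below. The function name, the list representation, and the algorithm names used as examples are assumptions; the proposal does not define a directory format for the published list.

```python
from typing import Optional, Sequence


def pick_algorithm(client_preferences: Sequence[str],
                   node_supported: Optional[Sequence[str]],
                   legacy: Sequence[str]) -> Optional[str]:
    """Return the client's most preferred algorithm that the node supports.

    `node_supported` is None when the node publishes nothing, in which
    case the node is assumed to support only the legacy standards.
    """
    supported = legacy if node_supported is None else node_supported
    for algorithm in client_preferences:      # most preferred first
        if algorithm in supported:
            return algorithm
    return None                               # no common algorithm
```

For example, a client preferring ["SHA-256", "SHA-1"] would use SHA-256 with a node that publishes both, and fall back to SHA-1 with a node that publishes nothing, given a legacy list of ["SHA-1"].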
-
- Node to node communications could work similarly. However, in
- case they both support a set of algorithms but have different
- preferences, the disagreement would have to be resolved
- somehow. Two possibilities include:
- * the node requesting communications presents which
- cryptographic standards it supports in the request. The
- other node picks.
- * both nodes send each other lists of what they support and
- what version of Tor they are using. The newer node picks,
- based on the assumption that the newer node has the most up
- to date information about which hash algorithm is the best.
- Of course, the node could lie about its version, but then
- again, it could also maliciously choose only to support older
- algorithms.
-
- Using this method, we could potentially add server side support
- to hash algorithms and ciphers before we instruct clients to
- begin preferring those hash algorithms and ciphers. In this way,
- the clients could upgrade and the servers would already support
- the newly preferred hash algorithms and ciphers, even if the
- servers were still using older versions of Tor, so long as the
- older versions of Tor were at least new enough to have server
- side support.
-
- This would make quickly upgrading to new hash algorithms and
- ciphers easier. This could be very useful when new attacks
- are published.
-
- One concern is that client preferences could expose the client
- to segmentation attacks. To mitigate this, we suggest hardcoding
- preferences in the client, to prevent the client from choosing
- to use a new hash algorithm or cipher that no one else is using
- yet. While offering a preference might be useful in case a client
- with an older version of Tor wants to start using the newer hash
- algorithm or cipher that everyone else is using, if the client
- cares enough, he or she can just upgrade Tor.
-
- We may also have to worry about nodes which, through laziness or
- maliciousness, refuse to start supporting new hash algorithms or
- ciphers. This must be balanced with the need to maintain
- backward compatibility so the client will have a large selection
- of nodes to pick from. Adding new hash algorithms and ciphers
- long before we suggest nodes start using them can help mitigate
- this. However, eventually, once sufficient nodes support new
- standards, client side support for older standards should be
- disabled, particularly if there are practical rather than merely
- theoretical attacks.
-
- Server side support for older standards can be kept much longer
- than client side support, since clients using older hashes and
- ciphers are really only hurting themselves.
-
- If server side support for a hash algorithm or cipher is added
- but never preferred before we decide we don't really want it,
- support can be removed without having to worry about backward
- compatibility.
-
-Security implications:
- Improving cryptography will improve Tor's security. However, if
- clients pick different cryptographic standards, they could be
- partitioned based on their cryptographic preferences. We also
- need to worry about nodes refusing to support new standards.
- These issues are detailed above.
-
-Specification:
-
- Todo. Need better understanding of how Tor currently works or
- help from someone who does.
-
-Compatibility:
-
- This idea is intended to allow easier upgrading of cryptographic
- hash algorithms and ciphers while maintaining backwards
- compatibility. However, at some point, backwards compatibility
- with very old hashes and ciphers should be dropped for security
- reasons.
-
-Implementation:
-
- Todo.
-
-Performance and scalability notes:
-
- Better hashes and ciphers are sometimes a little more CPU intensive
- than weaker ones. For instance, on most computers AES is a little
- faster than Twofish. However, in that example, I consider Twofish's
- additional security worth the tradeoff.
-
-Acknowledgements:
-
- Discussed this on IRC with a few people, mostly Nick Mathewson.
- Nick was particularly helpful in explaining how Tor works,
- explaining goals, and providing various links to Tor
- specifications.
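
The client-side selection rule from the Design section of the proposal above can be sketched as follows. This is a minimal illustration, not part of the proposal: the function name, the algorithm labels, and the preference order are all hypothetical. It assumes a node advertises a list of supported algorithms, that a node advertising nothing is treated as supporting only the legacy set, and that the client's preference order is hardcoded to limit partitioning.

```python
# Hypothetical sketch of the proposal's client-side choice; every name and
# algorithm label here is an assumption made for illustration.

# Hardcoded client preference order, most preferred first.
CLIENT_HASH_PREFERENCES = ("sha256", "sha1")

# What a node is assumed to support when it publishes nothing.
LEGACY_HASHES = ("sha1",)


def pick_algorithm(preferences, advertised, legacy):
    """Return the client's most preferred algorithm that the node supports.

    `advertised` is the list published by the node, or None/empty if the
    node published nothing, in which case the legacy set is assumed.
    Returns None when there is no algorithm in common.
    """
    supported = advertised if advertised else legacy
    for algorithm in preferences:
        if algorithm in supported:
            return algorithm
    return None


if __name__ == "__main__":
    # A node that advertises both algorithms gets the client's first choice.
    print(pick_algorithm(CLIENT_HASH_PREFERENCES, ["sha1", "sha256"], LEGACY_HASHES))
    # A node that advertises nothing falls back to the legacy algorithm.
    print(pick_algorithm(CLIENT_HASH_PREFERENCES, None, LEGACY_HASHES))
```

Because the preference tuple is fixed in the client rather than user-configurable, two clients running the same release make the same choice for the same node, which is the mitigation the proposal suggests for segmentation attacks.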
diff --git a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt b/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt
deleted file mode 100644
index 76ba5c84b..000000000
--- a/doc/spec/proposals/ideas/xxx-controllers-intercept-extends.txt
+++ /dev/null
@@ -1,44 +0,0 @@
-Author: Geoff Goodell
-Title: Allow controller to manage circuit extensions
-Date: 12 March 2006
-
-History:
-
- This was once bug 268. Moving it into the proposal system for posterity.
-
-Test:
-
-Tor controllers should have a means of learning more about circuits built
-through Tor routers. Specifically, if a Tor controller is connected to a Tor
-router, it should be able to subscribe to a new class of events, perhaps
-"onion" or "router" events. A Tor router SHOULD then ensure that the
-controller is informed:
-
-(a) (NEW) when it receives a connection from some other location, in which
-case it SHOULD indicate (1) a unique identifier for the circuit, and (2) a
-ServerID in the event of an OR connection from another Tor router, and
-Hostname otherwise.
-
-(b) (REQUEST) when it receives a request to extend an existing circuit to a
-successive Tor router, in which case it SHOULD provide (1) the unique
-identifier for the circuit, (2) a Hostname (or, if possible, ServerID) of the
-previous Tor router in the circuit, and (3) a ServerID for the requested
-successive Tor router in the circuit;
-
-(c) (EXTEND) Tor will attempt to extend the circuit to some other router, in
-which case it SHOULD provide the same fields as provided for REQUEST.
-
-(d) (SUCCEEDED) The circuit has been successfully extended to some other
-router, in which case it SHOULD provide the same fields as provided for
-REQUEST.
-
-We also need a new configuration option analogous to _leavestreamsunattached,
-specifying whether the controller is to manage circuit extensions or not.
-Perhaps we can call it "_leavecircuitsunextended". When set to 0, Tor
-manages everything as usual. When set to 1, a circuit received by the Tor
-router cannot transition from "REQUEST" to "EXTEND" state without being
-directed by a new controller command. The controller command probably does
-not need any arguments, since circuits are extended per client source
-routing, and all that the controller does is accept or reject the extension.
-
-This feature can be used as a basis for enforcing routing policy.
diff --git a/doc/spec/proposals/ideas/xxx-crypto-migration.txt b/doc/spec/proposals/ideas/xxx-crypto-migration.txt
deleted file mode 100644
index 1c734229b..000000000
--- a/doc/spec/proposals/ideas/xxx-crypto-migration.txt
+++ /dev/null
@@ -1,384 +0,0 @@
-
-Title: Initial thoughts on migrating Tor to new cryptography
-Author: Nick Mathewson
-Created: 12 December 2010
-
-1. Introduction
-
- Tor currently uses AES-128, RSA-1024, and SHA1. Even though these
- ciphers were a decent choice back in 2003, and even though attacking
- these algorithms is by no means the best way for a well-funded
- adversary to attack users (correlation attacks are still cheaper, even
- with pessimistic assumptions about the security of each cipher), we
- will want to move to better algorithms in the future. Indeed, if
- migrating to a new ciphersuite were simple, we would probably have
- already moved to RSA-1024/AES-128/SHA256 or something like that.
-
- So it's a good idea to start figuring out how we can move to better
- ciphers. Unfortunately, this is a bit nontrivial, so before we start
- doing the design work here, we should start by examining the issues
- involved. Robert Ransom and I both decided to spend this weekend
- writing up documents of this type so that we can see how much two
- people working independently agree on. I know more Tor than Robert;
- Robert knows far more cryptography than I do. With luck we'll
- complement each other's work nicely.
-
- A note on scope: This document WILL NOT attempt to pick a new cipher
- or set of ciphers. Instead, it's about how to migrate to new ciphers
- in general. Any algorithms mentioned other than those we use today
- are just for illustration.
-
- Also, I don't much consider the importance of updating each particular
- usage; only the methods that you'd use to do it.
-
- Also, this isn't a complete proposal.
-
-2. General principles and tricks
-
- Before I get started, let's talk about some general design issues.
-
-2.1. Many algorithms or few?
-
- Protocols like TLS and OpenPGP allow a wide choice of cryptographic
- algorithms; so long as the sender and receiver (or the responder and
- initiator) have at least one mutually acceptable algorithm, they can
- converge upon it and send each other messages.
-
- This isn't the best choice for anonymity designs. If two clients
- support a different set of algorithms, then an attacker can tell them
- apart. A protocol with N ciphersuites would in principle split
- clients into 2**N-1 sets. (In practice, nearly all users will use the
- default, and most users who choose _not_ to use the default will do so
- without considering the loss of anonymity. See "Anonymity Loves
- Company: Usability and the Network Effect".)
-
- On the other hand, building only one ciphersuite into Tor has a flaw
- of its own: it has proven difficult to migrate to another one. So
- perhaps instead of specifying only a single new ciphersuite, we should
- specify more than one, with plans to switch over (based on a flag in
- the consensus or some other secure signal) once the first choice of
- algorithms starts looking iffy. This switch-based approach would seem
- especially easy for parameterizable stuff like key sizes.
-
-2.2. Waiting for old clients and servers to upgrade
-
- The easiest way to implement a shift in algorithms would be to declare
- a "flag day": once we have the new versions of the protocols
- implemented, pick a day by which everybody must upgrade to the new
- software. Before this day, the software would have the old behavior;
- after this day, it would use the improved behavior.
-
- Tor tries to avoid flag days whenever possible; they have well-known
- issues. First, since a number of our users don't automatically
- update, it can take a while for people to upgrade to new versions of
- our software. Second and more worryingly, it's hard to get adequate
- testing for new behavior that is off-by-default. Flag days in other
- systems have been known to leave whole networks more or less
- inoperable for months; we should not trust in our skill to avoid
- similar problems.
-
- So if we're avoiding flag days, what can we do?
-
- * We can add _support_ for new behavior early, and have clients use it
- where it's available. (Clients know the advertised versions of the
- Tor servers they use-- but see 2.3 below for a danger here, and 2.4
- for a bigger danger.)
-
- * We can remove misfeatures that _prevent_ deployment of new
- behavior. For instance, if a certain key length has an arbitrary
- 1024-bit limit, we can remove that arbitrary limitation.
-
- * Once an optional new behavior is ubiquitous enough, the authorities
- can stop accepting descriptors from servers that do not have it
- until they upgrade.
-
- It is far easier to remove arbitrary limitations than to make other
- changes; such changes are generally safe to back-port to older stable
- release series. But in general, it's much better to avoid any plans
- that require waiting for any version of Tor to no longer be in common
- use: a stable release can take on the order of 2.5 years to start
- dropping off the radar. Thandy might fix that, but even if a perfect
- Thandy release comes out tomorrow, we'll still have lots of older
- clients and relays not using it.
-
- We'll have to approach the migration problem on a case-by-case basis
- as we consider the algorithms used by Tor and how to change them.
-
-2.3. Early adopters and other partitioning dangers
-
- It's pretty much unavoidable that clients running software that speaks
- the new version of any protocol will be distinguishable from those
- that cannot speak the new version. This is inevitable, though we
- could try to minimize the number of such partitioning sets by having
- features turned on in the same release rather than one-at-a-time.
-
- Another option here is to have new protocols controlled by a
- configuration tri-state with values "on", "off", and "auto". The
- "auto" value means to look at the consensus to decide whether to use
- the feature; the other two values are self-explanatory. We'd ship
- clients with the feature set to "auto" by default, with people only
- using "on" for testing.
-
- If we're worried about early client-side implementations of a protocol
- turning out to be broken, we can have the consensus value say _which_
- versions should turn on the protocol.
-
-2.4. Avoid whole-circuit switches
-
- One risky kind of protocol migration is a feature that gets used only
- when all the routers in a circuit support it. If such a feature is
- implemented by few relays, then each relay learns a lot about the rest
- of the path by seeing it used. On the other hand, if the feature is
- implemented by most relays, then a relay learns a lot about the rest of
- the path when the feature is *not* used.
-
- It's okay to have a feature that can be only used if two consecutive
- routers in the path support it: each router knows the ones adjacent
- to it, after all, so knowing what version of Tor they're running is no
- big deal.
-
-2.5. The Second System Effect rears its ugly head
-
- Any attempt at improving Tor's crypto is likely to involve changes
- throughout the Tor protocol. We should be aware of the risks of
- falling into what Fred Brooks called the "Second System Effect": when
- redesigning a fielded system, it's always tempting to try to shovel in
- every possible change that one ever wanted to make to it.
-
- This is a fine time to make parts of our protocol that weren't
- previously versionable into ones that are easier to upgrade in the
- future. This probably _isn't_ time to redesign every aspect of the
- Tor protocol that anybody finds problematic.
-
-2.6. Low-hanging fruit and well-lit areas
-
- Not all parts of Tor are tightly covered. If it's possible to upgrade
- different parts of the system at different rates from one another, we
- should consider doing the stuff we can do easier, earlier.
-
- But remember the story of the policeman who finds a drunk under a
- streetlamp, staring at the ground? The cop asks, "What are you
- doing?" The drunk says, "I'm looking for my keys!" "Oh, did you drop
- them around here?" says the policeman. "No," says the drunk, "But the
- light is so much better here!"
-
- Or less proverbially: Simply because a change is easiest, does not
- mean it is the best use of our time. We should avoid getting bogged
- down solving the _easy_ aspects of our system unless they happen also
- to be _important_.
-
-2.7. Nice safe boring codes
-
- Let's avoid, to the extent that we can:
- - being the primary user of any cryptographic construction or
- protocol.
- - anything that hasn't gotten much attention in the literature.
- - anything we would have to implement from scratch
- - anything without a nice BSD-licensed C implementation
-
- Sometimes we'll have the choice of a more efficient algorithm or a
- more boring & well-analyzed one. We should not even consider trading
- conservative design for efficiency unless we are firmly in the
- critical path.
-
-2.8. Key restrictions
-
- Our spec says that RSA exponents should be 65537, but our code never
- checks for that. If we want to bolster resistance against collision
- attacks, we could check this requirement. To the best of my
- knowledge, nothing violates it except for tools like "shallot" that
- generate cute memorable .onion names. If we want to be nice to
- shallot users, we could check the requirement for everything *except*
- hidden service identity keys.
-
-3. Aspects of Tor's cryptography, and thoughts on how to upgrade them all
-
-3.1. Link cryptography
-
- Tor uses TLS for its link cryptography; it is easy to add more
- ciphersuites to the acceptable list, or increase the length of
- link-crypto public keys, or increase the length of the DH parameter,
- or sign the X509 certificates with any digest algorithm that OpenSSL
- clients will support. Current Tor versions do not check any of these
- against expected values.
-
- The identity key used to sign the second certificate in the current
- handshake protocol, however, is harder to change, since it needs to
- match up with what we see in the router descriptor for the router
- we're connecting to. See notes on router identity below. So long as
- the certificate chain is ultimately authenticated by a RSA-1024 key,
- it's not clear whether making the link RSA key longer on its own
- really improves matters or not.
-
- Recall also that for anti-fingerprinting reasons, we're thinking of
- revising the protocol handshake sometime in the 0.2.3.x timeframe.
- If we do that, that might be a good time to make sure that we aren't
- limited by the old identity key size.
-
-3.2. Circuit-extend crypto
-
- Currently, our code requires RSA onion keys to be 1024 bits long.
- Additionally, current nodes will not deliver an EXTEND cell unless it
- is the right length.
-
- For this, we might add a second, longer onion-key to router
- descriptors, and a second CREATE2 cell to open new circuits
- using this key type. It should contain not only the onionskin, but
- also information on onionskin version and ciphersuite. Onionskins
- generated for CREATE2 cells should use a larger DH group as well, and
- keys should be derived from DH results using a better digest algorithm.
-
- We should remove the length limit on EXTEND cells, backported to all
- supported stable versions; call these "EXTEND2" cells. Call these
- "lightly patched". Clients could use the new EXTEND2/CREATE2 format
- whenever using a lightly patched or new server to extend to a new
- server, and the old EXTEND/CREATE format otherwise.
-
- The new onion skin format should try to avoid the design oddities of
- our old one. Instead of its current iffy hybrid encryption scheme, it
- should probably do something more like a BEAR/LIONESS operation with a
- fixed key on the g^x value, followed by a public key encryption on the
- start of the encrypted data. (Robert reminded me about this
- construction.)
-
- The current EXTEND cell format ends with a router identity
- fingerprint, which is used by the extended-from router to authenticate
- the extended-to router when it connects. Changes to this will
- interact with changes to how long an identity key can be and to the
- link protocol; see notes on the link protocol above and about router
- identity below.
-
-3.2.1. Circuit-extend crypto: fast case
-
- When we do unauthenticated circuit extends with CREATE/CREATED_FAST,
- the two input values are combined with SHA1. I believe that's okay;
- using any entropy here at all is overkill.
-
-3.3. Relay crypto
-
- Upon receiving a relay cell, a router transforms the payload portion of
- the cell with the appropriate key, sees if it
- recognizes the cell (the recognized field is zero, the digest field is
- correct, the cell is outbound), and passes it on if not. It is
- possible for each hop in the circuit to handle the relay crypto
- differently; nobody but the client and the hop in question need to
- coordinate their operations.
-
- It's not clear, though, whether updating the relay crypto algorithms
- would help anything, unless we changed the whole relay cell processing
- format too. The stream cipher is good enough, and the use of 4 bytes
- of digest does not have enough bits to provide cryptographic strength,
- no matter what cipher we use.
-
- This is the likeliest area for the second-system effect to strike;
- there are lots of opportunities to try to be more clever than we are
- now.
-
-3.4. Router identity
-
- This is one of the hardest things to change. Right now, routers are
- identified by a "fingerprint" equal to the SHA1 hash of their 1024-bit
- identity key as given in their router descriptor. No existing Tor
- will accept any other size of identity key, or any other hash
- algorithm. The identity key itself is used:
- - To sign the router descriptors
- - To sign link-key certificates
- - To determine the least significant bits of circuit IDs used on a
- Tor instance's links (see tor-spec §5.1)
-
- The fingerprint is used:
- - To identify a router identity key in EXTEND cells
- - To identify a router identity key in bridge lines
- - Throughout the controller interface
- - To fetch bridge descriptors for a bridge
- - To identify a particular router throughout the codebase
- - In the .exit notation.
- - By the controller to identify nodes
- - To identify servers in the logs
- - Probably other places too
-
- To begin to allow other key types, key lengths, and hash functions, we
- would either need to wait till all current Tors are obsolete, or allow
- routers to have more than one identity for a while.
-
- To allow routers to have more than one identity, we need to
- cross-certify identity keys. We can do this trivially, in theory, by
- listing both keys in the router descriptor and having both identities
- sign the descriptor. In practice, we will need to analyze this pretty
- carefully to avoid attacks where one key is completely fake aimed to
- trick old clients somehow.
-
- Upgrading the hash algorithm once would be easy: just say that all
- new-type keys get hashed using the new hash algorithm. Remaining
- future-proof could be tricky.
-
- This is one of the hardest areas to update; "SHA1 of identity key" is
- assumed in so many places throughout Tor that we'll probably need a
- lot of design work to work with something else.
-
-3.5. Directory objects
-
- Fortunately, the problem is not so bad for consensuses themselves,
- because:
- - Authority identity keys are allowed to be RSA keys of any length;
- in practice I think they are all 3072 bits.
- - Authority signing keys are also allowed to be of any length.
- AFAIK the code works with longer signing keys just fine.
- - Currently, votes are hashed with both sha1 and sha256; adding
- more hash algorithms isn't so hard.
- - Microdescriptor consensuses are all signed using sha256. While
- regular consensuses are signed using sha1, exploitable collisions
- are hard to come up with, since once you had a collision, you
- would need to get a majority of other authorities to agree to
- generate it.
-
- Router descriptors are currently identified by SHA1 digests of their
- identity keys and descriptor digests in regular consensuses, and by
- SHA1 digests of identity keys and SHA256 digests of microdescriptors
- in microdesc consensuses. The consensus-flavors design allows us to
- generate new flavors of consensus that identify routers by new hashes
- of their identity keys. Alternatively, existing consensuses could be
- expanded to contain more hashes, though that would have some space
- concerns.
-
- Router descriptors themselves are signed using RSA-1024 identity keys
- and SHA1. For information on updating identity keys, see above.
-
- Router descriptors and extra-info documents cross-certify one another
- using SHA1.
-
- Microdescriptors are currently specified to contain exactly one
- onion key, of length 1024 bits.
-
-3.6. The directory protocol
-
- Most objects are indexed by SHA1 hash of an identity key or a
- descriptor object. Adding more hash types wouldn't be a huge problem
- at the directory cache level.
-
-3.7. The hidden service protocol
-
- Hidden services self-identify by a 1024-bit RSA key. Other key
- lengths are not supported. This key is turned into an 80 bit half
- SHA-1 hash for hidden service names.
-
- The most simple change here would be to set an interface for putting
- the whole ugly SHA1 hash in the hidden service name. Remember that
- this needs to coexist with the authentication system which also uses
- .onion hostnames; that hostnames top out around 255 characters and
- their components top out at 63.
-
- Currently, ESTABLISH_INTRO cells take a key length parameter, so in
- theory they allow longer keys. The rest of the protocol assumes that
- this will be hashed into a 20-byte SHA1 identifier. Changing that
- would require changes at the introduction point as well as the hidden
- service.
-
- The parsing code for hidden service descriptors currently enforces a
- 1024-bit identity key, though this does not seem to be described in
- the specification. Changing that would be at least as hard as doing
- it for regular identity keys.
-
- Fortunately, hidden services are nearly completely orthogonal to
- everything else.
-
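
Section 3.7 of the document above notes that a hidden service name is an 80-bit half of a SHA-1 hash, and that one simple change would be to put the whole SHA-1 hash in the name while staying under the 63-character limit on hostname components. The sketch below illustrates the sizes involved. It assumes the hash is taken over the encoded public key and that the result is base32-encoded in lowercase; the function names and the placeholder key bytes are hypothetical.

```python
import base64
import hashlib


def truncated_onion_label(public_key_bytes: bytes) -> str:
    """Label built from the first 80 bits (10 bytes) of the SHA-1 digest."""
    digest = hashlib.sha1(public_key_bytes).digest()
    return base64.b32encode(digest[:10]).decode("ascii").lower()


def full_onion_label(public_key_bytes: bytes) -> str:
    """Label built from the whole 160-bit (20-byte) SHA-1 digest."""
    digest = hashlib.sha1(public_key_bytes).digest()
    return base64.b32encode(digest).decode("ascii").lower()


if __name__ == "__main__":
    # Placeholder bytes standing in for an encoded RSA public key.
    key = b"example public key bytes (not a real key)"
    short = truncated_onion_label(key)
    full = full_onion_label(key)
    # Base32 carries 5 bits per character: 80 bits need 16 characters and
    # 160 bits need 32, both well under the 63-character component limit.
    print(len(short), short + ".onion")
    print(len(full), full + ".onion")
```

Since the truncated digest is a prefix of the full one, the short label is a prefix of the long label in this sketch, which is one way an old-style and a new-style name for the same key could be related.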
diff --git a/doc/spec/proposals/ideas/xxx-crypto-requirements.txt b/doc/spec/proposals/ideas/xxx-crypto-requirements.txt
deleted file mode 100644
index 8a8943a42..000000000
--- a/doc/spec/proposals/ideas/xxx-crypto-requirements.txt
+++ /dev/null
@@ -1,72 +0,0 @@
-Title: Requirements for Tor's circuit cryptography
-Author: Robert Ransom
-Created: 12 December 2010
-
-Overview
-
- This draft is intended to specify the meaning of 'secure' for a Tor
- circuit protocol, hopefully in enough detail that
- mathematically-inclined cryptographers can use this definition to
- prove that a Tor circuit protocol (or component thereof) is secure
- under reasonably well-accepted assumptions.
-
- Tor's current circuit protocol consists of the CREATE, CREATED, RELAY,
- DESTROY, CREATE_FAST, CREATED_FAST, and RELAY_EARLY cells (including
- all subtypes of RELAY and RELAY_EARLY cells). Tor currently has two
- circuit-extension handshake protocols: one consists of the CREATE and
- CREATED cells; the other, used only over the TLS connection to the
- first node in a circuit, consists of the CREATE_FAST and CREATED_FAST
- cells.
-
-Requirements
-
- 1. Every circuit-extension handshake protocol must provide forward
- secrecy -- the protocol must allow both the client and the relay to
- destroy, immediately after a circuit is closed, enough key material
- that no attacker who can eavesdrop on all handshake and circuit cells
- and who can seize and inspect the client and relay after the circuit
- is closed will be able to decrypt any non-handshake data sent along
- the circuit.
-
- In particular, the protocol must not require that a key which can be
- used to decrypt non-handshake data be stored for a predetermined
- period of time, as such a key must be written to persistent storage.
-
- 2. Every circuit-extension handshake protocol must specify what key
- material must be used only once in order to allow unlinkability of
- circuit-extension handshakes.
-
- 3. Every circuit-extension handshake protocol must authenticate the relay
- to the client -- an attacker who can eavesdrop on all handshake and
- circuit cells and who can participate in handshakes with the client
- must not be able to determine a symmetric session key that a circuit
- will use without either knowing a secret key corresponding to a
- handshake-authentication public key published by the relay or breaking
- a cryptosystem for which the relay published a
- handshake-authentication public key.
-
- 4. Every circuit-extension handshake protocol must ensure that neither
- the client nor the relay can cause the handshake to result in a
- predetermined symmetric session key.
-
- 5. Every circuit-extension handshake protocol should ensure that an
- attacker who can predict the relay's ephemeral secret input to the
- handshake and can eavesdrop on all handshake and circuit cells, but
- does not know a secret key corresponding to the
- handshake-authentication public key used in the handshake, cannot
- break the handshake-authentication public key's cryptosystem, and
- cannot predict the client's ephemeral secret input to the handshake,
- cannot predict the symmetric session keys used for the resulting
- circuit.
-
- 6. The circuit protocol must specify an end-to-end flow-control
- mechanism, and must allow for the addition of new mechanisms.
-
- 7. The circuit protocol should specify the statistics to be exchanged
- between circuit endpoints in order to support end-to-end flow control,
- and should specify how such statistics can be verified.
-
-
- 8. The circuit protocol should allow an endpoint to verify that the other
- endpoint is participating in an end-to-end flow-control protocol
- honestly.
diff --git a/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt b/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt
deleted file mode 100644
index 16484e637..000000000
--- a/doc/spec/proposals/ideas/xxx-draft-spec-for-TLS-normalization.txt
+++ /dev/null
@@ -1,360 +0,0 @@
-Filename: xxx-draft-spec-for-TLS-normalization.txt
-Title: Draft spec for TLS certificate and handshake normalization
-Author: Jacob Appelbaum, Gladys Shufflebottom
-Created: 16-Feb-2011
-Status: Draft
-
-
- Draft spec for TLS certificate and handshake normalization
-
-
- Overview
-
-Scope
-
-This document proposes fixes for problems with Tor's current TLS
-(Transport Layer Security) certificates and handshake; the fixes will
-reduce the distinguishability of Tor traffic from other encrypted traffic that
-uses TLS. It also addresses some of the fingerprinting attacks
-possible against the current Tor TLS protocol setup process.
-
-Motivation and history
-
-Censorship is an arms race and this is a step forward in the defense
-of Tor. This proposal outlines ideas to make it more difficult to
-fingerprint and block Tor traffic.
-
-Goals
-
-This proposal intends to normalize or remove easy-to-predict or static
-values in the Tor TLS certificates and in the Tor TLS setup process.
-These values can be used as criteria for the automated classification of
-encrypted traffic as Tor traffic. Network observers should not be able
-to trivially detect Tor merely by receiving or observing the certificate
-used or advertised by a Tor relay. I also propose the creation of
-a hard-to-detect covert channel through which a server can signal that it
-supports the third version ("V3") of the Tor handshake protocol.
-
-Non-Goals
-
-This document is not intended to solve all of the possible active or passive
-Tor fingerprinting problems. This document focuses on removing distinctive
-and predictable features of TLS protocol negotiation; we do not attempt to
-make guarantees about resisting other kinds of fingerprinting of Tor
-traffic, such as fingerprinting techniques related to timing or volume of
-transmitted data.
-
- Implementation details
-
-
-Certificate Issues
-
-The CN or commonName ASN1 field
-
-Tor generates certificates with a predictable commonName field; the
-field is within a given range of values that is specific to Tor.
-Additionally, the generated host names have other undesirable properties.
-The host names typically do not resolve in the DNS because the domain
-names referred to are generated at random. Although they are syntactically
-valid, they usually refer to domains that have never been registered by
-any domain name registrar.
-
-An example of the current commonName field: CN=www.s4ku5skci.net
-
-An example of OpenSSL’s asn1parse over a typical Tor certificate:
-
- 0:d=0 hl=4 l= 438 cons: SEQUENCE
- 4:d=1 hl=4 l= 287 cons: SEQUENCE
- 8:d=2 hl=2 l= 3 cons: cont [ 0 ]
- 10:d=3 hl=2 l= 1 prim: INTEGER :02
- 13:d=2 hl=2 l= 4 prim: INTEGER :4D3C763A
- 19:d=2 hl=2 l= 13 cons: SEQUENCE
- 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
- 32:d=3 hl=2 l= 0 prim: NULL
- 34:d=2 hl=2 l= 35 cons: SEQUENCE
- 36:d=3 hl=2 l= 33 cons: SET
- 38:d=4 hl=2 l= 31 cons: SEQUENCE
- 40:d=5 hl=2 l= 3 prim: OBJECT :commonName
- 45:d=5 hl=2 l= 24 prim: PRINTABLESTRING :www.vsbsvwu5b4soh4wg.net
- 71:d=2 hl=2 l= 30 cons: SEQUENCE
- 73:d=3 hl=2 l= 13 prim: UTCTIME :110123184058Z
- 88:d=3 hl=2 l= 13 prim: UTCTIME :110123204058Z
- 103:d=2 hl=2 l= 28 cons: SEQUENCE
- 105:d=3 hl=2 l= 26 cons: SET
- 107:d=4 hl=2 l= 24 cons: SEQUENCE
- 109:d=5 hl=2 l= 3 prim: OBJECT :commonName
- 114:d=5 hl=2 l= 17 prim: PRINTABLESTRING :www.s4ku5skci.net
- 133:d=2 hl=3 l= 159 cons: SEQUENCE
- 136:d=3 hl=2 l= 13 cons: SEQUENCE
- 138:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
- 149:d=4 hl=2 l= 0 prim: NULL
- 151:d=3 hl=3 l= 141 prim: BIT STRING
- 295:d=1 hl=2 l= 13 cons: SEQUENCE
- 297:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
- 308:d=2 hl=2 l= 0 prim: NULL
- 310:d=1 hl=3 l= 129 prim: BIT STRING
-
-I propose that we match OpenSSL's default self-signed certificates. I hypothesise
-that they are the most common self-signed certificates. If this turns out not
-to be the case, then we should use whatever the most common turns out to be.
-
-Certificate serial numbers
-
-Currently our generated certificate serial number is set to the number of
-seconds since the epoch at the time of the certificate's creation. I propose
-that we should ensure that our serial numbers are unrelated to the epoch,
-since the generation methods are potentially recognizable as Tor-related.
-
-Instead, I propose that we use a randomly generated number that is
-subsequently hashed with SHA-512 and then truncated to eight bytes[1].
-
-Random sixteen-byte values appear to be the upper bound for serial numbers as
-issued by Verisign and DigiCert. RapidSSL serial numbers appear to be three
-bytes long. Other common lengths appear to be between one and four bytes. The
-default OpenSSL certificates use eight bytes, and we should use this length
-for our self-signed certificates.
-
-This randomly generated serial number field may now serve as a covert channel
-that signals to the client that the OR will not support TLS renegotiation; this
-means that the client can expect to perform a V3 TLS handshake setup.
-Otherwise, if the serial number is a reasonable time since the epoch, we should
-assume the OR is using an earlier protocol version and hence that it expects
-renegotiation.
-
-We also have a need to signal properties with our certificates for a possible
-v3 handshake in the future. Therefore I propose that we match OpenSSL default
-self-signed certificates (a 64-bit random number), but reserve the two least-
-significant bits for signaling. For the moment, these two bits will be zero.
-
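-This means that an attacker may be able to distinguish Tor certificates from default
-OpenSSL certificates with a 75% probability, since a uniformly random serial number
-has both reserved bits set to zero only one time in four.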
-
-As a security note, care must be taken to ensure that supporting this
-covert channel will not lead to an attacker having a method to downgrade client
-behavior. This shouldn't be a risk because the TLS Finished message hashes over
-all the bytes of the handshake, including the certificates.
-
-Certificate fingerprinting issues expressed as base64 encoding
-
-It appears that all deployed Tor certificates have the following strings in
-common:
-
-MIIB
-CCA
-gAwIBAgIETU
-ANBgkqhkiG9w0BAQUFADA
-YDVQQDEx
-3d3cu
-
-As expected these values correspond to specific ASN.1 OBJECT IDENTIFIER (OID)
-properties (sha1WithRSAEncryption, commonName, etc) of how we generate our
-certificates.
-
-As an illustrated example of the bytes common to all certificates used within
-the Tor network within a single one-hour window, I have replaced each byte that
-varies with a wildcard ('.') character here:
-
------BEGIN CERTIFICATE-----
-MIIB..CCA..gAwIBAgIETU....ANBgkqhkiG9w0BAQUFADA.M..w..YDVQQDEx.3
-d3cu............................................................
-................................................................
-................................................................
-................................................................
-................................................................
-................................................................
-................................................................
-................................................................
-........................... <--- Variable length and padding
------END CERTIFICATE-----
-
-This fine ASCII art only illustrates the bytes that match exactly in all
-cases. In many of the remaining positions, a given byte is likely to take only
-one of a small set of values.
-
-Using the above strings, the EFF's certificate observatory may trivially
-discover all known relays, known bridges and unknown bridges in a single SQL
-query. I propose that we test our certificates to ensure that they do not have
-these kinds of statistical similarities unless the similarities are shared with
-a very large cross section of the internet's certificates.
-
-Certificate dating and validity issues
-
-TLS certificates found in the wild are generally found to be long-lived;
-they are frequently old and often even expired. The current Tor certificate
-validity time is a very small time window starting at generation time and
-ending shortly thereafter, as defined in or.h by MAX_SSL_KEY_LIFETIME
-(2*60*60).
-
-I propose that the certificate validity time length is extended to a period of
-twelve Earth months, possibly with a small random skew to be determined by the
-implementer. Tor should randomly set the start date in the past, within some
-currently unspecified window of time before the current date. This would
-more closely track the typical distribution of non-Tor TLS certificate
-expiration times.
-
-The certificate values, such as expiration, should not be used for anything
-relating to security; for example, if the OR presents an expired TLS
-certificate, this does not imply that the client should terminate the
-connection (as would be appropriate for an ordinary TLS implementation).
-Rather, I propose we use a TOFU style expiration policy - the certificate
-should never be trusted for more than a two hour window from first sighting.
-
-This policy should have two major impacts. The first is that an adversary will
-have to perform a differential analysis of all certificates for a given IP
-address rather than a single check. The second is that the server expiration
-time is enforced by the client and confirmed by keys rotating in the consensus.
-
-The expiration time should not be a fixed time that is simple to calculate by
-any Deep Packet Inspection device or it will become a new Tor TLS setup
-fingerprint.
-
-Proposed certificate form
-
-The following output from openssl asn1parse results from the proposed
-certificate generation algorithm. It matches the results of generating a
-default self-signed certificate:
-
- 0:d=0 hl=4 l= 513 cons: SEQUENCE
- 4:d=1 hl=4 l= 362 cons: SEQUENCE
- 8:d=2 hl=2 l= 9 prim: INTEGER :DBF6B3B864FF7478
- 19:d=2 hl=2 l= 13 cons: SEQUENCE
- 21:d=3 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
- 32:d=3 hl=2 l= 0 prim: NULL
- 34:d=2 hl=2 l= 69 cons: SEQUENCE
- 36:d=3 hl=2 l= 11 cons: SET
- 38:d=4 hl=2 l= 9 cons: SEQUENCE
- 40:d=5 hl=2 l= 3 prim: OBJECT :countryName
- 45:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU
- 49:d=3 hl=2 l= 19 cons: SET
- 51:d=4 hl=2 l= 17 cons: SEQUENCE
- 53:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName
- 58:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State
- 70:d=3 hl=2 l= 33 cons: SET
- 72:d=4 hl=2 l= 31 cons: SEQUENCE
- 74:d=5 hl=2 l= 3 prim: OBJECT :organizationName
- 79:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd
- 105:d=2 hl=2 l= 30 cons: SEQUENCE
- 107:d=3 hl=2 l= 13 prim: UTCTIME :110217011237Z
- 122:d=3 hl=2 l= 13 prim: UTCTIME :120217011237Z
- 137:d=2 hl=2 l= 69 cons: SEQUENCE
- 139:d=3 hl=2 l= 11 cons: SET
- 141:d=4 hl=2 l= 9 cons: SEQUENCE
- 143:d=5 hl=2 l= 3 prim: OBJECT :countryName
- 148:d=5 hl=2 l= 2 prim: PRINTABLESTRING :AU
- 152:d=3 hl=2 l= 19 cons: SET
- 154:d=4 hl=2 l= 17 cons: SEQUENCE
- 156:d=5 hl=2 l= 3 prim: OBJECT :stateOrProvinceName
- 161:d=5 hl=2 l= 10 prim: PRINTABLESTRING :Some-State
- 173:d=3 hl=2 l= 33 cons: SET
- 175:d=4 hl=2 l= 31 cons: SEQUENCE
- 177:d=5 hl=2 l= 3 prim: OBJECT :organizationName
- 182:d=5 hl=2 l= 24 prim: PRINTABLESTRING :Internet Widgits Pty Ltd
- 208:d=2 hl=3 l= 159 cons: SEQUENCE
- 211:d=3 hl=2 l= 13 cons: SEQUENCE
- 213:d=4 hl=2 l= 9 prim: OBJECT :rsaEncryption
- 224:d=4 hl=2 l= 0 prim: NULL
- 226:d=3 hl=3 l= 141 prim: BIT STRING
- 370:d=1 hl=2 l= 13 cons: SEQUENCE
- 372:d=2 hl=2 l= 9 prim: OBJECT :sha1WithRSAEncryption
- 383:d=2 hl=2 l= 0 prim: NULL
- 385:d=1 hl=3 l= 129 prim: BIT STRING
-
-
-Custom Certificates
-
-It should be possible for a Tor relay operator to use a specifically supplied
-certificate and secret key. This will allow a relay or bridge operator to use a
-certificate signed by any member of any geographically relevant certificate
-authority racket; it will also allow for any other user-supplied certificate.
-This may be desirable in some kinds of filtered networks or when attempting to
-avoid attracting suspicion by blending in with the TLS web server certificate
-crowd.
-
-Problematic Diffie–Hellman parameters
-
-We currently send a static Diffie–Hellman parameter, prime p (or “prime p
-outlaw”) as specified in RFC2409 as part of the TLS Server Hello response.
-
-The use of this prime in TLS negotiations may, as a result, be filtered and
-effectively banned by certain networks. We do not have to use this particular
-prime in all cases.
-
-While amusing to have the power to make specific prime numbers into a new class
-of numbers (cf. imaginary, irrational, illegal [3]) - our new friend prime p
-outlaw is not required.
-
-I propose that the function to initialize and generate DH parameters be
-split into two functions.
-
-First, init_dh_param() should be used only for OR-to-OR DH setup and
-communication. Second, it is proposed that we create a new function
-init_tls_dh_param() that will have a two-stage development process.
-
-The first stage init_tls_dh_param() will use the same prime that
-Apache2.x [4] sends (or “dh1024_apache_p”), and this change should be
-made immediately. This is a known good and safe prime number ((p-1)/2
-is also prime) that is currently not known to be blocked.
-
-The second stage init_tls_dh_param() should randomly generate a new prime on a
-regular basis; this is designed to make the prime difficult to outlaw or
-filter. Call this a shape-shifting or "Rakshasa" prime. This should be added
-to the 0.2.3.x branch of Tor. This prime can be generated at setup or execution
-time and probably does not need to be stored on disk. Rakshasa primes only
-need to be generated by Tor relays as Tor clients will never send them. Such
-a prime should absolutely not be shared between different Tor relays nor
-should it ever be static after the 0.2.3.x release.
-
-As a security precaution, care must be taken to ensure that we do not generate
-weak primes or known filtered primes. Both weak and filtered primes will
-undermine the TLS connection security properties. OpenSSH solves this issue
-dynamically in RFC 4419 [5] and may provide a solution that works reasonably
-well for Tor. More research in this area will be needed, including on the
-applicability of Miller-Rabin or AKS primality tests[6], which will probably
-need to be added to Tor.
-
-Practical key size
-
-Currently we use a 1024 bit long RSA modulus. I propose that we increase the
-RSA key size to 2048 as an additional channel to signal support for the V3
-handshake setup. 2048 appears to be the most common key size[0] above 1024.
-Additionally, the increase in modulus size provides a reasonable security boost
-with regard to key security properties.
-
-The implementer should increase the 1024 bit RSA modulus to 2048 bits.
-
-Possible future filtering nightmares
-
-At some point it may become cost effective or politically feasible for a network
-filter to simply block all signed or self-signed certificates without a known
-valid CA trust chain. This will break many applications on the internet and
-hopefully, our option for custom certificates will ensure that this step is
-simply avoided by the censors.
-
-The Rakshasa prime approach may cause censors to specifically allow only
-certain known and accepted DH parameters.
-
-
-Appendix: Other issues
-
-What other obvious TLS certificate issues exist? What other static values are
-present in the Tor TLS setup process?
-
-[0] http://archives.seul.org/or/dev/Jan-2011/msg00051.html
-[1] http://archives.seul.org/or/dev/Feb-2011/msg00016.html
-[2] http://archives.seul.org/or/dev/Feb-2011/msg00039.html
-[3] To be fair this is hardly a new class of numbers. History is rife with
- similar examples of inane authoritarian attempts at mathematical secrecy.
- Probably the most dramatic example is the story of Hippasus of
- Metapontum, pupil of the famous Pythagoras, who, legend goes, proved that
- the square root of 2 cannot be expressed as a fraction of whole numbers (now
- called an irrational number) and was assassinated for revealing this
- secret. Further reading on the subject may be found on Wikipedia:
- http://en.wikipedia.org/wiki/Hippasus
-
-[4] httpd-2.2.17/modules/ssl/ssl_engine_dh.c
-[5] http://tools.ietf.org/html/rfc4419
-[6] http://archives.seul.org/or/dev/Jan-2011/msg00037.html
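
The serial-number scheme proposed under "Certificate serial numbers" above can be sketched in Python as follows. This is only an illustration under stated assumptions: the names make_serial, has_reserved_bits_clear, SERIAL_BYTES and SIGNAL_MASK are hypothetical and not part of Tor, and the 64-byte seed length is an assumption, since the proposal does not say how much randomness to hash.

```python
import hashlib
import os

SERIAL_BYTES = 8     # eight bytes, the OpenSSL default length cited in the proposal
SIGNAL_MASK = 0b11   # the two least-significant bits reserved for signaling


def make_serial(signal_bits: int = 0, rng=os.urandom) -> int:
    """Return a 64-bit serial: SHA-512 of random bytes, truncated to eight
    bytes, with the two low bits replaced by signal_bits (zero for now)."""
    if not 0 <= signal_bits <= SIGNAL_MASK:
        raise ValueError("signal_bits must fit in two bits")
    seed = rng(64)  # assumed seed length; not specified by the proposal
    digest = hashlib.sha512(seed).digest()
    serial = int.from_bytes(digest[:SERIAL_BYTES], "big")
    # Clear the reserved bits, then set the requested signal value.
    return (serial & ~SIGNAL_MASK) | signal_bits


def has_reserved_bits_clear(serial: int) -> bool:
    """True if a serial is consistent with the 'both bits zero' convention."""
    return serial & SIGNAL_MASK == 0
```

A uniformly random 64-bit serial passes has_reserved_bits_clear() only one time in four, which is where the 75% figure in the proposal comes from.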
diff --git a/doc/spec/proposals/ideas/xxx-encrypted-services.txt b/doc/spec/proposals/ideas/xxx-encrypted-services.txt
deleted file mode 100644
index 3c2ac67fa..000000000
--- a/doc/spec/proposals/ideas/xxx-encrypted-services.txt
+++ /dev/null
@@ -1,66 +0,0 @@
-Filename: xxx-encrypted-services.txt
-Title: Encrypted services as a replacement to exit enclaving
-Author: Roger Dingledine
-Created: 2011-01-12
-Status: Draft
-
-We should offer a way to run a Tor hidden service where the server-side
-rendezvous circuits are just one hop.
-
-1. Motivation
-
- There are three Tor use cases that this idea addresses:
-
- 1) Indymedia wants to run an exit enclave that provides end-to-end
- authentication and encryption. They tried running an exit relay that
- just exits to themselves:
- https://trac.torproject.org/projects/tor/ticket/800
- but a) it handles lots of other traffic too since it's a relay, and
- b) exit enclaves don't actually work consistently, because the first
- connection from the user doesn't realize it should use the exit enclave.
-
- 2) Wikileaks uses Tor hidden services not to hide their service,
- but because the hidden service address provides a type of usability
- we didn't think much about: unlike a more normal address, a Tor
- hidden service address either works (meaning you get your end-to-end
- authentication and encryption) or it fails hard. With a hidden service
- address there's no way a user could accidentally submit their documents
- to Wikileaks without using Tor, but with normal Tor it's possible.
-
- 3) The Freenode IRC network wants to provide end-to-end encryption and
- authentication to its users, a) to handle the fact that the IRC protocol
- doesn't really provide much of that by default, and b) to funnel all
- their Tor users into a single location so they can handle the anonymous
- users better. They don't mind the fact that their service is hidden, but
- they'd rather have better performance for their users given the choice.
-
-2. Design
-
- It seems that the main changes required would be to a) make
- circuit_launch_by_extend_info() know to use 1 hop rather than the
- default, and know not to try to cannibalize a general 3-hop circ for
- these circuits, and b) add a way in the torrc file to specify that this
- service wants to be an encrypted service rather than a hidden service.
-
- I had originally pondered some sort of even more efficient "signed
- document saying this service is running at this Tor relay", which
- would be more efficient because it would cut out the rendezvous step.
- But by reusing the hidden service rendezvous infrastructure, we a)
- blend in with hidden services (and hidden service descriptors) and
- don't need to teach users (or their Tor clients) a new interface,
- and b) can offer the encrypted service on a non-relay.
-
- One design question to ponder: should we continue to use three-hop
- circuits for our introduction points, and for publishing our encrypted
- service descriptor? Probably.
-
-3. Security implications
-
- There's a possible second-order effect here since both encrypted
- services and hidden services will have foo.onion addresses and it's
- not clear based on the address whether the service will be hidden --
- if *some* .onion addresses are easy to track down, are we encouraging
- adversaries to attack all rendezvous points just in case?
-
-...
-
diff --git a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt b/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt
deleted file mode 100644
index d84094400..000000000
--- a/doc/spec/proposals/ideas/xxx-exit-scanning-outline.txt
+++ /dev/null
@@ -1,44 +0,0 @@
-1. Scanning process
- A. Non-HTML/JS HTTP mime types compared via SHA1 hash
- B. Dynamic HTTP content filtered at 4 levels:
- 1. IP change+Tor cookie utilization
- - Tor cookies replayed with new IP in case of changes
- 2. HTML Tag+Attribute+JS comparison
- - Comparisons made based only on "relevant" HTML tags
- and attributes
- 3. HTML Tag+Attribute+JS diffing
- - Tags, attributes and JS AST nodes that change during
- Non-Tor fetches pruned from comparison
- 4. URLs with > N% of node failures removed
- - results purged from filesystem at end of scan loop
- C. SSL scanning handles some forms of dynamic certs
- 1. Catalogs certs for all IPs resolved locally
- by getaddrinfo over the duration of the scan.
- - Updated each test.
- 2. If the domain presents a new cert for each IP, this
- is noted on the failure result for the node
- 3. If the same IP presents two different certs locally,
- the cert list is first refreshed, and if it happens
- again, discarded
- 4. A N% node failure filter also applies
- D. Scanner can be restarted from any point in the event
- of scanner or system crashes, or graceful shutdown.
- - Results+scan state pickled to filesystem continuously
-2. Cron job checks results periodically for reporting
- A. Divide failures into three types of BadExit based on type
- and frequency over time and incident rate
- B. write reject lines to approved-routers for those three types:
- 1. ID Hex based (for misconfig/network problems easily fixed)
- 2. IP based (for content modification)
- 3. IP+mask based (for continuous/egregious content modification)
- C. Emails results to tor-scanners@freehaven.net
-3. Human Review and Appeal
- A. ID Hex-based BadExit is meant to be easy to remove
- without needing to beg us.
- - Should this behavior be encouraged?
- B. Optionally can reserve IP based badexits for human review
- 1. Results are encapsulated fully on the filesystem and can be
- reviewed without network access
- 2. Soat has --rescan to rescan failed nodes from a data directory
- - New set of URLs used
-
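
Two of the checks in the outline above, the SHA1 comparison of static content (1.A) and the N% node-failure filter (1.B.4), might look roughly like this in Python. The function names and the shape of the failure counts are assumptions made for illustration; they are not taken from the scanner itself.

```python
import hashlib


def sha1_hex(body: bytes) -> str:
    """Return the hex SHA1 digest of a response body."""
    return hashlib.sha1(body).hexdigest()


def static_content_matches(direct_body: bytes, exit_body: bytes) -> bool:
    """Compare a non-Tor fetch with a fetch through an exit node by hash."""
    return sha1_hex(direct_body) == sha1_hex(exit_body)


def prune_unreliable_urls(failures: dict, nodes_tested: int,
                          max_failure_pct: float) -> set:
    """Keep only URLs that failed on at most max_failure_pct percent of nodes.

    failures maps each URL to the number of nodes on which it failed
    (an assumed data layout).
    """
    if nodes_tested <= 0:
        raise ValueError("nodes_tested must be positive")
    return {
        url for url, failed in failures.items()
        if 100.0 * failed / nodes_tested <= max_failure_pct
    }
```

A URL that fails on most nodes says more about the URL than about any single exit, which is why it is dropped instead of being counted against each node.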
diff --git a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt b/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt
deleted file mode 100644
index 49c6615a6..000000000
--- a/doc/spec/proposals/ideas/xxx-geoip-survey-plan.txt
+++ /dev/null
@@ -1,137 +0,0 @@
-
-
-Abstract
-
- This document explains how to estimate roughly how many Tor users there
- are, and how many there are in each country. Statistics are
- involved.
-
-Motivation
-
- There are a few reasons we need to keep track of which countries
- Tor users (in aggregate) are coming from:
-
- - Resource allocation. Knowing about underserved countries with
- lots of users can let us know about where we need to direct
- translation and outreach efforts.
-
- - Anticensorship. Sudden drops in usage on a national basis can
- indicate the arrival of a censorious firewall.
-
- - Sponsor outreach and self-evaluation. Many people and
- organizations who are interested in funding The Tor Project's
- work want to know that we're successfully serving parts of the
- world they're interested in, and that efforts to expand our
- userbase are actually succeeding. So do we.
-
-Goals
-
- We want to know approximately how many Tor users there are, and which
- countries they're in, even in the presence of a hypothetical
- "directory guard" feature. Some uncertainty is okay, but we'd like
- to be able to put a bound on the uncertainty.
-
- We need to make sure this information isn't exposed in a way that
- helps an adversary.
-
-Methods for current clients:
-
- Every client downloads network status documents. There are
- currently three methods (one hypothetical) for clients to get them.
- - 0.1.2.x clients (and earlier) fetch a v2 networkstatus
- document about every NETWORKSTATUS_CLIENT_DL_INTERVAL [30
- minutes].
-
- - 0.2.0.x clients fetch a v3 networkstatus consensus document
- at a random interval between when their current document is no
- longer freshest, and when their current document is about to
- expire.
-
- [In both of the above cases, clients choose a running
- directory cache at random with odds roughly proportional to
- its bandwidth. If they're just starting, they know a XXXX FIXME -NM]
-
- - In some future version, clients will choose directory caches
- to serve as their "directory guards" to avoid profiling
- attacks, similarly to how clients currently start all their
- circuits at guard nodes.
-
- We assume that a directory cache can tell which of these three
- categories a client is in by the format of its status request.
-
- A directory cache can be made to count distinct client IP
- addresses that make a certain request of it in a given timeframe,
- and total requests made to it over that timeframe. For the first
- two cases, a cache can get a picture of the overall
- number and countries of users in the network by dividing the IP
- count by the probability with which they (as a cache) would be
- chosen. Assuming that our listed bandwidth is such that we expect
- to be chosen with probability P for any given request, and we've
- been counting IPs for long enough that we expect the average
- client to have made N requests, they will have visited us at least
- once with probability P' = 1-(1-P)^N, and so we divide the IP
- counts we've seen by P' for our estimate. To estimate total
- number of clients of a given type, determine how many requests a
- client of that type will make over that time, and assume we'll
- have seen P of them.
-
- Both of these numbers are useful: the IP counts will give the
- total number of IPs connecting to the network, and the request
- counts will give the total number of users on the network at any
- given time.
-
- Notes:
- - [Over H hours, the N for V2 clients is 2*H, and the N for V3
- clients is currently around H/2 or H/3.]
-
- - (We should only count requests that we actually intend to answer;
- 503 requests shouldn't count.)
-
- - These measurements should also be taken at a directory
- authority if possible: their picture of the network is skewed
- by clients that fetch from them directly. These clients,
- however, are all the clients that are just bootstrapping
- (assuming that the fallback-consensus feature isn't yet used
- much).
-
- - These measurements also overestimate the V2 download rate if
- some downloads fail and clients retry them later after backing
- off.
-
-Methods for directory guards:
-
- If directory guards are in use, directory guards get a picture of
- all those users who chose them as a guard when they were listed
- as a good choice for a guard, and who are also on the network
- now. The cleanest data here will come from nodes that were listed
- as good new-guards choices for a while, and have not been so for a
- while longer (to study decay rates); nodes that have been listed
- as good new-guard choices consistently for a long time (to get a
- sample of the network); and nodes that have been listed as good
- new-guard choices only recently (to get a sample of new users and
- users whose guards have died out.)
-
- Since directory guards are currently unspecified, we'll need to
- make some guesses about how they'll turn out to work. Here are
- a couple of approaches that could work.
- - We could have clients pick completely new directory guards on
- a rolling basis every two months or so. This would ensure
- that staying as a guard for a while would be sufficient to
- see a sample of users. This is potentially advantageous for
- load-balancing the network as well, though it might lose some
- of the benefits of directory guard. We need to quantify the
- impact of this; it might not actually make stuff worse in
- practice, if most guards don't stay good guards for a month
- or two.
-
- - We could try to collect statistics at several directory
- guards and combine their statistics, but we would need to make
- sure that for all time, at least one of the directory guards
- had been recommended as a good choice for new guards. By
- looking at new-IP rates for guards, we could get an idea of
- user uptake; by looking at old-IP decay rates, we could get
- an idea of turnover. This approach would entail significant
- complexity, and we'd probably need to record more information
- than we'd really like to.
-
-
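
The estimate described under "Methods for current clients" above, P' = 1-(1-P)^N, can be worked through with a short Python sketch. The function names are hypothetical, and the numbers in the usage note are made-up inputs, not measurements.

```python
def visit_probability(p: float, n: float) -> float:
    """P' = 1 - (1 - P)^N: chance a client has picked this cache at least
    once, given per-request selection probability p and n requests."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return 1.0 - (1.0 - p) ** n


def estimate_total_ips(seen_ips: int, p: float, n: float) -> float:
    """Scale the distinct IPs seen at one cache up to a network-wide estimate."""
    return seen_ips / visit_probability(p, n)


def v2_requests_over(hours: float) -> float:
    """N for V2 clients, per the note above: about two requests per hour."""
    return 2.0 * hours
```

For example, a cache chosen with probability 0.01 per request that counts 1000 distinct V2 client IPs over 24 hours has N = 48 and P' of about 0.38, so estimate_total_ips(1000, 0.01, v2_requests_over(24)) gives roughly 2,600 IPs.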
diff --git a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt b/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt
deleted file mode 100644
index 336798cc0..000000000
--- a/doc/spec/proposals/ideas/xxx-grand-scaling-plan.txt
+++ /dev/null
@@ -1,97 +0,0 @@
-
-Right now as I understand it, there are n big scaling problems heading
-our way:
-
-1) Clients need to learn all the relay descriptors they could use. That's
-a lot of bytes through a potentially small pipe.
-2) Relays need to hold open TCP connections to most other relays.
-3) Clients need to learn the whole networkstatus. Even using v3, as
-the network grows that will become unwieldy.
-4) Dir mirrors need to mirror all the relay descriptors; eventually this
-will get big too.
-
-Here's my plan.
-
---------------------------------------------------------------------
-
-Piece one: download O(1) descriptors rather than O(n) descriptors.
-
-We need to change our circuit extend protocol so it fetches a relay
-descriptor at every 'extend' operation:
- - Client fetches networkstatus, picks guards, connects to one.
- - Client picks middle hop out of networkstatus, asks guard for
- its descriptor, then extends to it.
- - Client picks exit hop out of networkstatus, asks middle hop
- for its descriptor, then extends to it. Done.
-
-The client needs to ask for the descriptor even if it already has a
-copy, because otherwise we leak too much. Also, the descriptor needs to
-be padded to some large (but not too large) size to prevent the middle
-hops from guessing about it.
-
-The first step towards this is to instrument the current code to see
-how much of a win this would actually be -- I am guessing it is already
-a win even with the current number of descriptors.
-
-We also would need to assign the 'Exit' flag more usefully, and make
-clients pay attention to it when picking their last hop, since they
-don't actually know the exit policies of the relays they're choosing from.
-
-We also need to think harder about other implications -- for example,
-a relay with a tiny exit policy won't get the Exit flag, and thus won't
-ever get picked as an exit relay. Plus, our "enclave exit" model is out
-the window unless we figure out a cool trick.
-
-More generally, we'll probably want to compress the descriptors that we
-send back; maybe 8k is a good upper bound? I wonder if we could ask for
-several descriptors, and bundle back all of the ones that fit in the 8k?
-
-We'd also want to put the load balancing weights into the networkstatus,
-so clients can choose fast nodes more often without needing to see the
-descriptors. This is a good opportunity for the authorities to be able
-to put "more accurate" weights in if they learn to detect attacks. It
-also means we should consider running automated audits to make sure the
-authorities aren't trying to snooker everybody.
-
-I'm aiming to get Peter Palfrader to tackle this problem in mid 2008,
-but I bet he could use some help.
-
---------------------------------------------------------------------
-
-Piece two: inter-relay communication uses UDP
-
-If relays send packets to/from other relays via UDP, they don't need a
-new file descriptor (socket) for each such link. Thus we'll still need to keep state
-for each link, but we won't max out on sockets.
-
-Clearly a lot more work needs to be done here. Ian Goldberg has a student
-who has been working on it, and if all goes well we'll be chipping in
-some funding to continue that. Also, Camilo Viecco has been doing his
-PhD thesis on it.
-
---------------------------------------------------------------------
-
-Piece three: networkstatus documents get partitioned
-
-While the authorities should be expected to be able to handle learning
-about all the relays, there's no reason the clients or the mirrors need
-to. Authorities should put a cap on the number of relays listed in a
-single networkstatus, and split them when they get too big.
-
-We'd need a good way to have each authority come to the same conclusion
-about which partition a given relay goes into.
-
-Directory mirrors would then mirror all the relay descriptors in their
-partition. This is compatible with 'piece one' above, since clients in
-a given partition will only ask about descriptors in that partition.
-
-More complex versions of this design would involve overlapping partitions,
-but that would seem to start contradicting other parts of this proposal
-right quick.
-
-Nobody is working on this piece yet. It's hard to say when we'll need
-it, but it would be nice to have some more thought on it before the week
-that we need it.
-
---------------------------------------------------------------------
-
diff --git a/doc/spec/proposals/ideas/xxx-hide-platform.txt b/doc/spec/proposals/ideas/xxx-hide-platform.txt
deleted file mode 100644
index ad19fb1fd..000000000
--- a/doc/spec/proposals/ideas/xxx-hide-platform.txt
+++ /dev/null
@@ -1,37 +0,0 @@
-Filename: xxx-hide-platform.txt
-Title: Hide Tor Platform Information
-Author: Jacob Appelbaum
-Created: 24-July-2008
-Status: Draft
-
-
- Hiding Tor Platform Information
-
-0.0 Introduction
-
-The current Tor program publishes its specific Tor version and related OS
-platform information. This information could be misused by an attacker.
-
-0.1 Current Implementation
-
-Currently, the Tor binary sends data that looks like the following:
-
- Tor 0.2.0.26-rc (r14597) on Darwin Power Macintosh
- Tor 0.1.2.19 on Windows XP Service Pack 3 [workstation] {terminal services,
- single user}
-
-1.0 Suggested changes
-
-It would be useful to allow a user to configure the disclosure of such
-information. Such a change would be an option in the torrc file like so:
-
- HidePlatform Yes
-
-1.1 Suggested default behavior in the future
-
-If a user would like to disclose this information, they could configure their
-Tor to do so.
-
- HidePlatform No
-
-
diff --git a/doc/spec/proposals/ideas/xxx-pluggable-transport.txt b/doc/spec/proposals/ideas/xxx-pluggable-transport.txt
deleted file mode 100644
index 53ba9c630..000000000
--- a/doc/spec/proposals/ideas/xxx-pluggable-transport.txt
+++ /dev/null
@@ -1,312 +0,0 @@
-Filename: xxx-pluggable-transport.txt
-Title: Pluggable transports for circumvention
-Author: Jacob Appelbaum, Nick Mathewson
-Created: 15-Oct-2010
-Status: Draft
-
-Overview
-
- This proposal describes a way to decouple protocol-level obfuscation
- from the core Tor protocol in order to better resist client-bridge
- censorship. Our approach is to specify a means to add pluggable
- transport implementations to Tor clients and bridges so that they can
- negotiate a superencipherment for the Tor protocol.
-
-Scope
-
- This is a document about transport plugins; it does not cover
- discovery improvements, or bridgedb improvements. While these
- requirements might be solved by a program that also functions as a
- transport plugin, this proposal only covers the requirements and
- operation of transport plugins.
-
-Motivation
-
- Frequently, people want to try a novel circumvention method to help
- users connect to Tor bridges. Some of these methods are already
- pretty easy to deploy: if the user knows an unblocked VPN or open
- SOCKS proxy, they can just use that with the Tor client today.
-
- Less easy to deploy are methods that require participation by both the
- client and the bridge. In order of increasing sophistication, we
- might want to support:
-
- 1. A protocol obfuscation tool that transforms the output of a TLS
- connection into something that looks like HTTP as it leaves the
- client, and back to TLS as it arrives at the bridge.
- 2. An additional authentication step that a client would need to
- perform for a given bridge before being allowed to connect.
- 3. An information passing system that uses a side-channel in some
- existing protocol to convey traffic between a client and a bridge
- without the two of them ever communicating directly.
- 4. A set of clients to tunnel client->bridge traffic over an existing
- large p2p network, such that the bridge is known by an identifier
- in that network rather than by an IP address.
-
- We could in theory support these almost fine with Tor as it stands
- today: every Tor client can take a SOCKS proxy to use for its outgoing
- traffic, so a suitable client proxy could handle the client's traffic
- and connections on its behalf, while a corresponding program on the
- bridge side could handle the bridge's side of the protocol
- transformation. Nevertheless, there are some reasons to add support
- for transport plugins to Tor itself:
-
- 1. It would be good for bridges to have a standard way to advertise
- which transports they support, so that clients can have multiple
- local transport proxies, and automatically use the right one for
- the right bridge.
-
- 2. There are some changes to our architecture that we'll need for a
- system like this to work. For testing purposes, if a bridge blocks
- off its regular ORPort and instead has an obfuscated ORPort, the
- bridge authority has no way to test it. Also, unless the bridge
- has some way to tell that the bridge-side proxy at 127.0.0.1 is not
- the origin of all the connections it is relaying, it might decide
- that there are too many connections from 127.0.0.1, and start
- paring them down to avoid a DoS.
-
- 3. Censorship and anticensorship techniques often evolve faster than
- the typical Tor release cycle. As such, it's a good idea to
- provide ways to test out new anticensorship mechanisms on a more
- rapid basis.
-
- 4. Transport obfuscation is a relatively distinct problem
- from the other privacy problems that Tor tries to solve, and it
- requires a fairly distinct skill-set from hacking the rest of Tor.
- By decoupling transport obfuscation from the Tor core, we hope to
- encourage people working on transport obfuscation who would
- otherwise not be interested in hacking Tor.
-
- 5. Finally, we hope that defining a generic transport obfuscation plugin
- mechanism will be useful to other anticensorship projects.
-
-Non-Goals
-
- We're not going to talk about automatic verification of plugin
- correctness and safety via sandboxing, proof-carrying code, or
- whatever.
-
- We need to do more with discovery and distribution, but that's not
- what this proposal is about. We're pretty convinced that the problems
- are sufficiently orthogonal that we should be fine so long as we don't
- preclude a single program from implementing both transport and
- discovery extensions.
-
- This proposal is not about what transport plugins are the best ones
- for people to write. We do, however, make some general
- recommendations for plugin authors in an appendix.
-
- We've considered issues involved with completely replacing Tor's TLS
- with another encryption layer, rather than layering it inside the
- obfuscation layer. We describe how to do this in an appendix to the
- current proposal, though we are not currently sure whether it's a good
- idea to implement.
-
- We deliberately reject any design that would involve linking more code
- into Tor's process space.
-
-Design overview
-
- To write a new transport protocol, an implementer must provide two
- pieces: a "Client Proxy" to run at the initiator side, and a "Server
- Proxy" to run at the server side. These two pieces may or may not be
- implemented by the same program.
-
- Each client may run any number of Client Proxies. Each one acts like
- a SOCKS proxy that accepts connections on localhost. Each one
- runs on a different port, and implements one or more transport
- methods. If the protocol has any parameters, they are passed from Tor
- inside the regular username/password parts of the SOCKS protocol.
-
- Bridges (and maybe relays) may run any number of Server Proxies: these
- programs provide an interface like stunnel-server (or whatever the
- option is): they get connections from the network (typically by
- listening for connections on the network) and relay them to the
- Bridge's real ORPort.
-
- To configure one of these programs, it should be sufficient simply to
- list it in your torrc. The program tells Tor which transports it
- provides. The Tor consensus should carry a new approved version number that
- is specific for pluggable transport; this will allow Tor to know when a
- particular transport is known to be unsafe or non-functional.
-
- Bridges (and maybe relays) report in their descriptors which transport
- protocols they support. This information can be copied into bridge
- lines. Bridges using a transport protocol may have multiple bridge
- lines.
-
- Any methods that are wildly successful, we can bake into Tor.
-
-Specifications: Client behavior
-
- Bridge lines can now follow the extended format "bridge method
- address:port [[keyid=]id-fingerprint] [k=v] [k=v] [k=v]". To connect
- to such a bridge, a client must open a local connection to the SOCKS
- proxy for "method", and ask it to connect to address:port. If
- [id-fingerprint] is provided, it should expect the public identity key
- on the TLS connection to match the digest provided in
- [id-fingerprint]. If any [k=v] items are provided, they are
- configuration parameters for the proxy: Tor should separate them with
- semicolons and put them in the user and password fields of the request,
- splitting them across the fields as necessary. If a key or value
- must contain a semicolon or a backslash, it is escaped with a
- backslash.
-
- The "id-fingerprint" field is always provided in a field named
- "keyid", if it was given. Method names must be C identifiers.
-
- Example: if the bridge line is "bridge trebuchet www.example.com:3333
- rocks=20 height=5.6m" AND if the Tor client knows that the
- 'trebuchet' method is provided by a SOCKS5 proxy on
- 127.0.0.1:19999, the client should connect to that proxy, ask it to
- connect to www.example.com, and provide the string
- "rocks=20;height=5.6m" as the username, the password, or split
- across the username and password.
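
The parameter encoding described above can be sketched as follows. This
is only an illustration, not Tor's implementation: the function names are
hypothetical, the 255-byte field limit comes from the SOCKS5
username/password sub-negotiation (RFC 1929), and the sketch assumes the
proxy simply concatenates the two fields. The proposal does not say what
to send when one field would be empty, so that case is left open here.

```python
FIELD_LIMIT = 255  # max bytes in a SOCKS5 username or password field


def escape(text):
    """Backslash-escape backslashes and semicolons, as the proposal asks."""
    return text.replace("\\", "\\\\").replace(";", "\\;")


def encode_args(params):
    """Join (key, value) pairs from a bridge line into one k=v;k=v string."""
    return ";".join(f"{escape(k)}={escape(v)}" for k, v in params)


def split_fields(encoded, limit=FIELD_LIMIT):
    """Split the encoded string across the username and password fields.

    Assumption: the proxy rejoins the fields by plain concatenation.
    """
    data = encoded.encode("utf-8")
    if len(data) > 2 * limit:
        raise ValueError("parameters do not fit in the SOCKS fields")
    return data[:limit], data[limit:]


# The trebuchet bridge line from the example above.
username, password = split_fields(
    encode_args([("rocks", "20"), ("height", "5.6m")])
)
```

For the trebuchet example the whole string fits in the username field,
so the password field carries no parameter data.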
-
- There are two ways to tell Tor clients about protocol proxies:
- external proxies and managed proxies. An external proxy is configured
- with "ClientTransportPlugin trebuchet socks5 127.0.0.1:9999". This
- tells Tor that another program is already running to handle
- 'trebuchet' connections, and Tor doesn't need to worry about it. A
- managed proxy is configured with "ClientTransportPlugin trebuchet
- exec /usr/libexec/tor-proxies/trebuchet [options]", and tells Tor to launch
- an external program on-demand to provide a socks proxy for 'trebuchet'
- connections. The Tor client only launches one instance of each
- external program, even if the same executable is listed for more than
- one method.
-
- The same program can implement a managed or an external proxy: it just
- needs to take an argument saying which one to be.
-
-Client proxy behavior
-
- When launched from the command-line by a Tor client, a transport
- proxy needs to tell Tor which methods and ports it supports. It does
- this by printing one or more CMETHOD: lines to its stdout. These look
- like
-
- CMETHOD: trebuchet SOCKS5 127.0.0.1:19999 ARGS:rocks,height \
- OPT-ARGS:tensile-strength
-
- The ARGS field lists mandatory parameters that must appear in every
- bridge line for this method. The OPT-ARGS field lists optional
- parameters. If no ARGS or OPT-ARGS field is provided, Tor should not
- check the parameters in bridge lines for this method.
-
- The proxy should print a single "METHODS: DONE" line after it is
- finished telling Tor about the methods it provides.
-
- The transport proxy MUST exit cleanly when it receives a SIGTERM from
- Tor.
-
- The Tor client MUST ignore lines beginning with a keyword and a colon
- if it does not recognize the keyword.
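
A minimal sketch of how a client might read these lines, assuming each
CMETHOD line arrives as a single line (the backslash in the example above
is taken to be a display continuation). The function name and the
dictionary layout are hypothetical, and treating a missing
"METHODS: DONE" as an error is an assumption; the proposal does not say
what Tor should do in that case.

```python
def parse_proxy_output(lines):
    """Collect the methods a client proxy announces on its stdout."""
    methods = {}
    for raw in lines:
        keyword, sep, rest = raw.strip().partition(":")
        if not sep:
            continue  # not a "KEYWORD: ..." line
        if keyword == "METHODS" and rest.strip() == "DONE":
            return methods
        if keyword != "CMETHOD":
            continue  # unrecognized keyword: ignore, as required above
        fields = rest.split()
        if len(fields) < 3:
            continue  # malformed CMETHOD line; skipped in this sketch
        name, socks_version, address = fields[:3]
        host, _, port = address.rpartition(":")
        entry = {
            "socks": socks_version,
            "host": host,
            "port": int(port),
            "args": None,      # None means "do not check bridge-line args"
            "opt_args": None,
        }
        for extra in fields[3:]:
            if extra.startswith("ARGS:"):
                entry["args"] = extra[len("ARGS:"):].split(",")
            elif extra.startswith("OPT-ARGS:"):
                entry["opt_args"] = extra[len("OPT-ARGS:"):].split(",")
        methods[name] = entry
    raise ValueError("proxy output ended before METHODS: DONE")
```

Leaving "args" and "opt_args" as None when neither field is present
mirrors the rule that Tor should not check bridge-line parameters for
such a method.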
-
- In the future, if we need a control mechanism, we can use the
- stdin/stdout from Tor to the transport proxy.
-
- A transport proxy MUST handle SOCKS connect requests using the SOCKS
- version it advertises.
-
- Tor clients SHOULD NOT use any method from a client proxy unless it
- is both listed as a possible method for that proxy in torrc, and it
- is listed by the proxy as a method it supports.
-
- [XXXX say something about versioning.]
-
-Server behavior
-
- Server proxies are configured similarly to client proxies.
-
-
-
-Server proxy behavior
-
-
-
- [so, we can have this work like client proxies, where the bridge
- launches some programs, and they tell the bridge, "I am giving you
- method X with parameters Y"? Do you have to take all the methods? If
- not, which do you specify?]
-
- [Do we allow programs that get started independently?]
-
- [We'll need to figure out how this works with port forwarding. Is
- port forwarding the bridge's problem, the proxy's problem, or some
- combination of the two?]
-
- [If we're using the bridge authority/bridgedb system for distributing
- bridge info, the right place to advertise bridge lines is probably
- the extrainfo document. We also need a way to tell the bridge
- authority "don't give out a default bridge line for me"]
-
-Server behavior
-
-Bridge authority behavior
-
-Implementation plan
-
- Turn this into a draft proposal
-
- Circulate and discuss on or-dev.
-
- We should ship a couple of null plugin implementations in one or two
- popular, portable languages so that people get an idea of how to
- write the stuff.
-
- 1. We should have one that's just a proof of concept that does
- nothing but transfer bytes back and forth.
-
- 1. We should not do a rot13 one.
-
- 2. We should implement a basic proxy that does not transform the bytes at all
-
- 1. We should implement DNS or HTTP using other software (as goodell
- did years ago with DNS) as an example of wrapping existing code into
- our plugin model.
-
- 2. The obfuscated-ssh superencipherment is pretty trivial and pretty
- useful. It makes the protocol stringwise unfingerprintable.
-
- 1. Nick needs to be told firmly not to bikeshed the obfuscated-ssh
- superencipherment too badly
-
- 1. Go ahead, bikeshed my day
-
- 1. If we do a raw-traffic proxy, openssh tunnels would be the logical choice.
-
-Appendix: recommendations for transports
-
- Be free/open-source software. Also, if you think your code might
- someday do so well at circumvention that it should be implemented
- inside Tor, it should use the same license as Tor.
-
- Use libraries that Tor already requires. (You can rely on openssl and
- libevent being present if current Tor is present.)
-
- Be portable: most Tor users are on Windows, and most Tor developers
- are not, so designing your code for just one of these platforms will
- make it either get a small userbase, or poor auditing.
-
- Think secure: if your code is in a C-like language, and it's hard to
- read it and become convinced it's safe, then it's probably not safe.
-
- Think small: we want to minimize the bytes that a Windows user needs
- to download for a transport client.
-
- Specify: if you can't come up with a good explanation
-
- Avoid security-through-obscurity if possible. Specify.
-
- Resist trivial fingerprinting: There should be no good string or regex
- to search for to distinguish your protocol from protocols permitted by
- censors.
-
- Imitate a real profile: There are many ways to implement most
- protocols -- and in many cases, most possible variants of a given
- protocol won't actually exist in the wild.
-
-Appendix: Raw-traffic transports
-
- This section describes an optional extension to the proposal above.
- We are not sure whether it is a good idea.
diff --git a/doc/spec/proposals/ideas/xxx-port-knocking.txt b/doc/spec/proposals/ideas/xxx-port-knocking.txt
deleted file mode 100644
index 85c27ec52..000000000
--- a/doc/spec/proposals/ideas/xxx-port-knocking.txt
+++ /dev/null
@@ -1,91 +0,0 @@
-Filename: xxx-port-knocking.txt
-Title: Port knocking for bridge scanning resistance
-Author: Jacob Appelbaum
-Created: 19-April-2009
-Status: Draft
-
- Port knocking for bridge scanning resistance
-
-0.0 Introduction
-
-This document is a collection of ideas relating to improving scanning
-resistance for private bridge relays. This is intended to stop opportunistic
-network scanning and subsequent discovery of private bridge relays.
-
-
-0.1 Current Implementation
-
-Currently private bridges are only hidden by their obscurity. If you know
-a bridge's IP address, the bridge can be detected trivially and added to a block
-list.
-
-0.2 Configuring an external port knocking program to control the firewall
-
-It is currently possible for bridge operators to configure a port knocking
-daemon that controls access to the incoming OR port. This is currently out of
-scope for Tor and Tor configuration. This process requires the firewall to know
-the current nodes in the Tor network.
-
-1.0 Suggested changes
-
-Private bridge operators should be able to configure a method of hiding their
-relay. Only authorized users should be able to communicate with the private
-bridge. This should be done with Tor and if possible without the help of the
-firewall. It should be possible for a Tor user to enter a secret key into
-Tor or optionally Vidalia on a per bridge basis. This secret key should be
-used to authenticate the bridge user to the private bridge.
-
-1.x Issues with low ports and bind() for ORPort
-
-Tor opens low numbered ports during startup and then drops privileges. It is
-no longer possible to rebind to those lower ports after they are closed.
-
-1.x Issues with OS level packet filtering
-
-Tor does not know about any OS level packet filtering. Currently there are no
-packet filters that understand the Tor network in real time.
-
-1.x Possible partitioning of users by bridge operator
-
-Depending on implementation, it may be possible for bridge operators to
-uniquely identify users. This appears to be a general bridge issue when a
-bridge operator uniquely deploys bridges per user.
-
-2.0 Implementation ideas
-
-This is a suggested set of methods for port knocking.
-
-2.x Using SPA port knocking
-
-Single Packet Authentication port knocking encodes all required data into a
-single UDP packet. Improperly formatted packets may be simply discarded.
-Properly formatted packets should be processed and appropriate actions taken.
-
-2.x Using DNS as a transport for SPA
-
-It should be possible for Tor to bind to port 53 at startup and merely drop all
-packets that are not valid. UDP does not require a response and invalid packets
-will not trigger a response from Tor. With base32 encoding it should be
-possible to encode SPA as valid DNS requests. This should allow use of the
-public DNS infrastructure for authorization requests if desired.
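
As a rough illustration of the base32 idea, the sketch below packs an
opaque SPA payload into the labels of a DNS query name. Everything here
is an assumption made for the example: the function names, the zone
suffix, and the payload are hypothetical, and this draft does not define
a packet format. The only fixed constraints used are the usual DNS
limits of 63 characters per label and 253 characters per name.

```python
import base64

MAX_LABEL = 63   # longest single DNS label
MAX_NAME = 253   # longest DNS name in dotted text form


def spa_to_qname(payload, zone):
    """Encode payload bytes as base32 labels in front of a zone suffix."""
    if not payload:
        raise ValueError("empty payload")
    text = base64.b32encode(payload).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    name = ".".join(labels + [zone])
    if len(name) > MAX_NAME:
        raise ValueError("payload too large for one query name")
    return name


def qname_to_spa(name, zone):
    """Recover the payload bytes from a name built by spa_to_qname."""
    suffix = "." + zone
    if not name.endswith(suffix):
        raise ValueError("query is not under the expected zone")
    text = name[:-len(suffix)].replace(".", "").upper()
    text += "=" * (-len(text) % 8)  # restore the stripped base32 padding
    return base64.b32decode(text)
```

A name of at most 253 characters leaves room for roughly 150 bytes of
payload, which bounds how much authentication data a single query of
this shape could carry.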
-
-2.x Ghetto firewalling with opportunistic connection closing
-
-Until a user has authenticated with Tor, Tor only has a UDP listener. This
-listener should never send data in response; it should only open an ORPort
-when a user has successfully authenticated. After a user has authenticated
-with Tor to open an ORPort, only users who have authenticated will be able
-to use it. All other users as identified by their IP address will have their
-connection closed before any data is sent or received. This should be
-accomplished with an access policy. By default, the access policy should block
-all access to the ORPort.
-
-2.x Timing and reset of access policies
-
-Access to the ORPort is sensitive. The bridge should remove any exceptions
-to its access policy regularly when the ORPort is unused. Valid users should
-reauthenticate if they do not use the ORPort within a given time frame.
-
-2.x Additional considerations
-
-There are many. A format of the packet and the crypto involved is a good start.
diff --git a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt b/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt
deleted file mode 100644
index 81fed20af..000000000
--- a/doc/spec/proposals/ideas/xxx-rate-limit-exits.txt
+++ /dev/null
@@ -1,63 +0,0 @@
-
-1. Overview
-
- We should rate limit the volume of stream creations at exits:
-
-2.1. Per-circuit limits
-
- If a given circuit opens more than N streams in X seconds, further
- stream requests over the next Y seconds should fail with the reason
- 'resourcelimit'. Clients will automatically notice this and switch to
- a new circuit.
-
- The goal is to limit the effects of port scans on a given exit relay,
- so the relay's ISP won't get hassled as much.
-
- First thoughts for parameters would be N=100 streams in X=5 seconds
- causes 30 seconds of fails; and N=300 streams in X=30 seconds causes
- 30 seconds of fails.
-
- We could simplify by, instead of having a "for 30 seconds" parameter,
- just marking the circuit as forever failing new requests. (We don't want
- to just close the circuit because it may still have open streams on it.)
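
The per-circuit rule can be sketched as a sliding-window counter. This is
one possible reading, not a specification: the class and method names are
hypothetical, the defaults are the first suggested parameters (N=100,
X=5, Y=30), and the sketch assumes that the request which would be stream
N+1 inside the window is the first one to fail.

```python
from collections import deque


class CircuitStreamLimiter:
    """Per-circuit limit: more than n opens in x seconds blocks for y seconds."""

    def __init__(self, n=100, x=5.0, y=30.0):
        self.n = n
        self.x = x
        self.y = y
        self._opens = deque()                 # times of recent accepted opens
        self._blocked_until = float("-inf")   # end of the current penalty

    def allow(self, now):
        """Return True if a stream request arriving at time `now` may proceed."""
        if now < self._blocked_until:
            return False  # the exit would answer with reason 'resourcelimit'
        # Forget opens that have slid out of the x-second window.
        while self._opens and now - self._opens[0] >= self.x:
            self._opens.popleft()
        if len(self._opens) >= self.n:
            self._blocked_until = now + self.y
            return False
        self._opens.append(now)
        return True
```

The simplification mentioned above, failing all new requests on the
circuit forever, corresponds to passing y=float("inf").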
-
-2.2. Per-destination limits
-
- If a given circuit opens more than N1 streams in X seconds to a single
- IP address, or all the circuits combined open more than N2 streams,
- then we should fail further attempts to reach that address for a while.
-
- The goal is to limit the abuse that Tor exit relays can dish out
- to a single target either for socket DoS or for web crawling, in
- the hopes of a) not triggering their automated defenses, and b) not
- making them upset at Tor. Hopefully these self-imposed bans will be
- much shorter-lived than bans or barriers put up by the websites.
-
-3. Issues
-
-3.1. Circuit-creation overload
-
- Making clients move to new circuits more often will cause more circuit
- creation requests.
-
-3.2. How to pick the parameters?
-
- If we pick the numbers too low, then popular sites are effectively
- cut out of Tor. If we pick them too high, we don't do much good.
-
- Worse, picking them wrong isn't easy to fix, since the deployed Tor
- servers will ship with a certain set of numbers.
-
- We could put numbers (or "general settings") in the networkstatus
- consensus, and Tor exits would adapt more dynamically.
-
- We could also have a local config option about how aggressive this
- server should be with its parameters.
-
-4. Client-side limitations
-
- Perhaps the clients should have built-in rate limits too, so they avoid
- harassing the servers by default?
-
- Tricky if we want to get Tor clients in use at large enclaves.
-
diff --git a/doc/spec/proposals/ideas/xxx-using-spdy.txt b/doc/spec/proposals/ideas/xxx-using-spdy.txt
deleted file mode 100644
index d733a84b6..000000000
--- a/doc/spec/proposals/ideas/xxx-using-spdy.txt
+++ /dev/null
@@ -1,143 +0,0 @@
-Filename: xxx-using-spdy.txt
-Title: Using the SPDY protocol to improve Tor performance
-Author: Steven J. Murdoch
-Created: 03-Feb-2010
-Status: Draft
-Target:
-
-1. Overview
-
- The SPDY protocol [1] is an alternative method for transferring
- web content over TCP, designed to improve efficiency and
- performance. A SPDY-aware browser can already communicate with
- a SPDY-aware web server over Tor, because this only requires a TCP
- stream to be set up. However, a SPDY-aware browser cannot
- communicate with a non-SPDY-aware web server. This proposal
- outlines how Tor could support this latter case, and why it
- may be good for performance.
-
-2. Motivation
-
- About 90% of Tor traffic, by connection, is HTTP [2], but
- users report subjective performance to be poor. It would
- therefore be desirable to improve this situation. SPDY was
- designed to offer better performance than HTTP, in
- high-latency and/or low-bandwidth situations, and is therefore
- an option worth examining.
-
- If a user wishes to access a SPDY-enabled web server over Tor,
- all they need to do is to configure their SPDY-enabled browser
- (e.g. Google Chrome) to use Tor. However, there are few
- SPDY-enabled web servers, and even if there were high demand
- from Tor users, there would be little motivation for server
- operators to upgrade, for the benefit of only a small
- proportion of their users.
-
- The motivation of this proposal is to allow only the user to
- install a SPDY-enabled browser, and permit web servers to
- remain unmodified. Essentially, Tor would incorporate a proxy
- on the exit node, which communicates SPDY to the web browser
- and normal HTTP to the web server. This proxy would translate
- between the two transport protocols, and possibly perform
- other optimizations.
-
- SPDY currently offers five optimizations:
-
- 1) Multiplexed streams:
- An unlimited number of resources can be transferred
- concurrently, over a single TCP connection.
-
- 2) Request prioritization:
- The client can set a priority on each resource, to assist
- the server in re-ordering responses.
-
- 3) Compression:
- Both HTTP header and resource content can be compressed.
-
- 4) Server push:
- The server can offer the client resources which have not
- been requested, but which the server believes will be.
-
- 5) Server hint:
- The server can suggest that the client request further
- resources, before the main content is transferred.
-
- Tor currently effectively implements (1), by being able to put
- multiple streams on one circuit. SPDY however requires fewer
- round-trips to do the same. The other features are not
- implemented by Tor. Therefore it is reasonable to expect that
- an HTTP <-> SPDY proxy may improve Tor performance, by some
- amount.
-
- The consequences on caching need to be considered carefully.
- Most of the optimizations SPDY offers have no effect because
- the existing HTTP cache control headers are transmitted without
- modification. Server push is more problematic, because here
- the server may push a resource that the client already has.
-
-3. Design outline
-
- One way to implement the SPDY proxy is for Tor exit nodes to
- advertise this capability in their descriptor. The OP would
- then preferentially select these nodes when routing streams
- destined for port 80.
-
- Then, rather than sending the usual RELAY_BEGIN cell, the OP
- would send a RELAY_BEGIN_TRANSFORMED cell, with a parameter to
- indicate that the exit node should translate between SPDY and
- HTTP. The rest of the connection process would operate as
- usual.
-
- There would need to be some way of elegantly handling non-HTTP
- traffic which goes over port 80.
-
-4. Implementation status
-
- SPDY is under active development and both the specification
- and implementations are in a state of flux. Initial
- experiments with Google Chrome in SPDY-mode and server
- libraries indicate that more work is needed before they are
- production-ready. There is no indication that browsers other
- than Google Chrome will support SPDY (and no official
- statement as to whether Google Chrome will eventually enable
- SPDY by default).
-
- Implementing a full SPDY proxy would be non-trivial. Stream
- multiplexing and compression are supported by existing
- libraries and would be fairly simple to implement. Request
- prioritization would require some form of caching on the
- proxy-side. Server push and server hint would require content
- parsing to identify resources which should be treated
- specially.
-
-5. Security and policy implications
-
- A SPDY proxy would be a significant amount of code, and may
- pull in external libraries. This code will process potentially
- malicious data, both at the SPDY and HTTP sides. This proposal
- therefore increases the risk that exit nodes will be
- compromised by exploiting a bug in the proxy.
-
- This proposal would also be the first way in which Tor is
- modifying TCP stream data. Arguably this is still meta-data
- (HTTP headers), but there may be some concern that Tor should
- not be doing this.
-
- Torbutton only works with Firefox, but SPDY only works with
- Google Chrome. We should be careful not to recommend that
- users adopt a browser which harms their privacy in other ways.
-
-6. Open questions:
-
- - How difficult would this be to implement?
-
- - How much performance improvement would it actually result in?
-
- - Is there some way to rapidly develop a prototype which would
- answer the previous question?
-
-[1] SPDY: An experimental protocol for a faster web
- http://dev.chromium.org/spdy/spdy-whitepaper
-[2] Shining Light in Dark Places: Understanding the Tor Network
- Damon McCoy, Kevin Bauer, Dirk Grunwald, Tadayoshi Kohno, Douglas Sicker
- http://www.cs.washington.edu/homes/yoshi/papers/Tor/PETS2008_37.pdf
diff --git a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt b/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
deleted file mode 100644
index b3ca3eea5..000000000
--- a/doc/spec/proposals/ideas/xxx-what-uses-sha1.txt
+++ /dev/null
@@ -1,247 +0,0 @@
-Filename: xxx-what-uses-sha1.txt
-Title: Where does Tor use SHA-1 today?
-Authors: Nick Mathewson, Marian
-Created: 30-Dec-2008
-Status: Meta
-
-
-Introduction:
-
- Tor uses SHA-1 as a message digest. SHA-1 is showing its age:
- theoretical attacks for finding collisions against it get better
- every year or two, and it will likely be broken in practice before
- too long.
-
- According to smart crypto people, the SHA-2 functions (SHA-256, etc)
- share too much of SHA-1's structure to be very good. RIPEMD-160 is
- also based on flawed past hashes. Some people think other hash
- functions (e.g. Whirlpool and Tiger) are not as bad; most of these
- have not seen enough analysis to be used yet.
-
- Here is a 2006 paper about hash algorithms.
- http://www.sane.nl/sane2006/program/final-papers/R10.pdf
-
- (Todo: Ask smart crypto people.)
-
- By 2012, the NIST SHA-3 competition will be done, and with luck we'll
- have something good to switch to. But it's probably a bad idea to
- wait until 2012 to figure out _how_ to migrate to a new hash
- function, for two reasons:
- 1) It's not inconceivable we'll want to migrate in a hurry
- some time before then.
- 2) It's likely that migrating to a new hash function will
- require protocol changes, and it's easiest to make protocol
- changes backward compatible if we lay the groundwork in
- advance. It would suck to have to break compatibility with
- a big hard-to-test "flag day" protocol change.
-
- This document attempts to list everything Tor uses SHA-1 for today.
- This is the first step in getting all the design work done to switch
- to something else.
-
- This document SHOULD NOT be a clearinghouse of what to do about our
- use of SHA-1. That's better left for other individual proposals.
-
-
-Why now?
-
- The recent publication of "MD5 considered harmful today: Creating a
- rogue CA certificate" by Alexander Sotirov, Marc Stevens, Jacob
- Appelbaum, Arjen Lenstra, David Molnar, Dag Arne Osvik, and Benne de
- Weger has reminded me that:
-
- * You can't rely on theoretical attacks to stay theoretical.
- * It's quite unpleasant when theoretical attacks become practical
- and public on days you were planning to leave for vacation.
- * Broken hash functions (which SHA-1 is not quite yet AFAIU)
- should be dropped like hot potatoes. Failure to do so can make
- one look silly.
-
-
-Triage
-
- How severe are these problems? Let's divide them into these
- categories, where H(x) is the SHA-1 hash of x:
- PREIMAGE -- find any x such that H(x) has a chosen value
- -- A SHA-1 usage that only depends on preimage
- resistance
- * Also SECOND PREIMAGE. Given x, find a y not equal to
- x such that H(x) = H(y)
- COLLISION<role> -- A SHA-1 usage that depends on collision
- resistance, but the only party who could mount a
- collision-based attack is already in a trusted role
- (like a distribution signer or a directory authority).
- COLLISION -- find any x and y such that H(x) = H(y) -- A
- SHA-1 usage that depends on collision resistance
- and doesn't need the attacker to have any special keys.
-
- There is no need to put much effort into fixing PREIMAGE and SECOND
- PREIMAGE usages in the near-term: while there have been some
- theoretical results doing these attacks against SHA-1, they don't
- seem to be close to practical yet. To fix COLLISION<code-signing>
- usages is not too important either, since anyone who has the key to
- sign the code can mount far worse attacks. It would be good to fix
- COLLISION<authority> usages, since we try to resist bad authorities
- to a limited extent. The COLLISION usages are the most important
- to fix.
-
- Kelsey and Schneier published a theoretical second preimage attack
- against SHA-1 in 2005, so it would be a good idea to fix PREIMAGE
- and SECOND PREIMAGE usages after fixing COLLISION usages or where fixes
- require minimal effort.
-
- http://www.schneier.com/paper-preimages.html
-
- Additionally, we need to consider the impact of a successful attack
- in each of these cases. SHA-1 collisions are still expensive even
- if recent results are verified, and anybody with the resources to
- compute one also has the resources to mount a decent Sybil attack.
-
- Let's be pessimistic, and not assume that producing collisions of
- a given format is actually any harder than producing collisions at
- all.
-
-
-What Tor uses hashes for today:
-
-1. Infrastructure.
-
- A. Our X.509 certificates are signed with SHA-1.
- COLLISION
- B. TLS uses SHA-1 (and MD5) internally to generate keys.
- PREIMAGE?
- * At least breaking SHA-1 and MD5 simultaneously is
- much more difficult than breaking either
- independently.
- C. Some of the TLS ciphersuites we allow use SHA-1.
- PREIMAGE?
- D. When we sign our code with GPG, it might be using SHA-1.
- COLLISION<code-signing>
- * GPG 1.4 and up have writing support for SHA-2 hashes.
- This blog has help for converting:
- http://www.schwer.us/journal/2005/02/19/sha-1-broken-and-gnupg-gpg/
- E. Our GPG keys might be authenticated with SHA-1.
- COLLISION<code-signing-key-signing>
- F. OpenSSL's random number generator uses SHA-1, I believe.
- PREIMAGE
-
-2. The Tor protocol
-
- A. Everything we sign, we sign using SHA-1-based OAEP-MGF1.
- PREIMAGE?
- B. Our CREATE cell format uses SHA-1 for: OAEP padding.
- PREIMAGE?
- C. Our EXTEND cells use SHA-1 to hash the identity key of the
- target server.
- COLLISION
- D. Our CREATED cells use SHA-1 to hash the derived key data.
- ??
- E. The data we use in CREATE_FAST cells to generate a key is the
- length of a SHA-1.
- NONE
- F. The data we send back in a CREATED/CREATED_FAST cell is the length
- of a SHA-1.
- NONE
- G. We use SHA-1 to derive our circuit keys from the negotiated g^xy
- value.
- NONE
- H. We use SHA-1 to derive the digest field of each RELAY cell, but that's
- used more as a checksum than as a strong digest.
- NONE
-
-3. Directory services
-
- [All are COLLISION or COLLISION<authority> ]
-
- A. All signatures are generated on the SHA-1 of their corresponding
- documents, using PKCS1 padding.
- * In dir-spec.txt, section 1.3, it states,
- "SIGNATURE" Object contains a signature (using the signing key)
- of the PKCS1-padded digest of the entire document, taken from
- the beginning of the Initial item, through the newline after
- the Signature Item's keyword and its arguments."
- So our attacker, Malcom, could generate a collision for the hash
- that is signed. Thus, a second pre-image attack is possible.
- Vulnerable to regular collision attack only if key is stolen.
- If the key is stolen, Malcom could distribute two different
- copies of the document which have the same hash. Maybe useful
- for a partitioning attack?
- B. Router descriptors identify their corresponding extra-info documents
- by their SHA-1 digest.
- * A third party might use a second pre-image attack to generate a
- false extra-info document that has the same hash. The router
- itself might use a regular collision attack to generate multiple
- extra-info documents with the same hash, which might be useful
- for a partitioning attack.
- C. Fingerprints in router descriptors are taken using SHA-1.
- * The fingerprint must match the public key. Not sure what would
- happen if two routers had different public keys but the same
- fingerprint. There could perhaps be unpredictable behaviour.
- D. In router descriptors, routers in the same "Family" may be listed
- by server nicknames or hexdigests.
- * Does not seem critical.
- E. Fingerprints in authority certs are taken using SHA-1.
- F. Fingerprints in dir-source lines of votes and consensuses are taken
- using SHA-1.
- G. Networkstatuses refer to routers identity keys and descriptors by their
- SHA-1 digests.
- H. Directory-signature lines identify which key is doing the signing by
- the SHA-1 digests of the authority's signing key and its identity key.
- I. The following items are downloaded by the SHA-1 of their contents:
- XXXX list them
- J. The following items are downloaded by the SHA-1 of an identity key:
- XXXX list them too.
-
-4. The rendezvous protocol
-
- A. Hidden servers use SHA-1 to establish introduction points on relays,
- and relays use SHA-1 to check incoming introduction point
- establishment requests.
- B. Hidden servers use SHA-1 in multiple places when generating hidden
- service descriptors.
- * The permanent-id is the first 80 bits of the SHA-1 hash of the
- public key
- ** time-period performs calculations using the permanent-id
- * The secret-id-part is the SHA-1 hash of the time period, the
- descriptor-cookie, and replica.
- * Hash of introduction point's identity key.
- C. Hidden servers performing basic-type client authorization for their
- services use SHA-1 when encrypting introduction points contained in
- hidden service descriptors.
- D. Hidden service directories use SHA-1 to check whether a given hidden
- service descriptor may be published under a given descriptor
- identifier or not.
- E. Hidden servers use SHA-1 to derive .onion addresses of their
- services.
- * What's worse, it only uses the first 80 bits of the SHA-1 hash.
- However, the rend-spec.txt says we aren't worried about arbitrary
- collisions?
- F. Clients use SHA-1 to generate the current hidden service descriptor
- identifiers for a given .onion address.
- G. Hidden servers use SHA-1 to remember digests of the first parts of
- Diffie-Hellman handshakes contained in introduction requests in order
- to detect replays. See the RELAY_ESTABLISH_INTRO cell. We seem to be
- taking a hash of a hash here.
- H. Hidden servers use SHA-1 during the Diffie-Hellman key exchange with
- a connecting client.
-
-5. The bridge protocol
-
- XXXX write me
-
- A. Client may attempt to query for bridges where he knows a digest
- (probably SHA-1) before a direct query.
-
-6. The Tor user interface
-
- A. We log information about servers based on SHA-1 hashes of their
- identity keys.
- COLLISION
- B. The controller identifies servers based on SHA-1 hashes of their
- identity keys.
- COLLISION
- C. Nearly all of our configuration options that list servers allow SHA-1
- hashes of their identity keys.
- COLLISION
- E. The deprecated .exit notation uses SHA-1 hashes of identity keys
- COLLISION
diff --git a/doc/spec/proposals/reindex.py b/doc/spec/proposals/reindex.py
deleted file mode 100755
index 980bc0659..000000000
--- a/doc/spec/proposals/reindex.py
+++ /dev/null
@@ -1,117 +0,0 @@
-#!/usr/bin/python
-
-import re, os
-class Error(Exception): pass
-
-STATUSES = """DRAFT NEEDS-REVISION NEEDS-RESEARCH OPEN ACCEPTED META FINISHED
- CLOSED SUPERSEDED DEAD REJECTED""".split()
-REQUIRED_FIELDS = [ "Filename", "Status", "Title" ]
-CONDITIONAL_FIELDS = { "OPEN" : [ "Target" ],
-                       "ACCEPTED" : [ "Target "],
-                       "CLOSED" : [ "Implemented-In" ],
-                       "FINISHED" : [ "Implemented-In" ] }
-FNAME_RE = re.compile(r'^(\d\d\d)-.*[^\~]$')
-DIR = "."
-OUTFILE = "000-index.txt"
-TMPFILE = OUTFILE+".tmp"
-
-def indexed(seq):
-    n = 0
-    for i in seq:
-        yield n, i
-        n += 1
-
-def readProposal(fn):
-    fields = { }
-    f = open(fn, 'r')
-    lastField = None
-    try:
-        for lineno, line in indexed(f):
-            line = line.rstrip()
-            if not line:
-                return fields
-            if line[0].isspace():
-                fields[lastField] += " %s"%(line.strip())
-            else:
-                parts = line.split(":", 1)
-                if len(parts) != 2:
-                    raise Error("%s:%s: Neither field nor continuation"%
-                                (fn,lineno))
-                else:
-                    fields[parts[0]] = parts[1].strip()
-                    lastField = parts[0]
-
-        return fields
-    finally:
-        f.close()
-
-def checkProposal(fn, fields):
-    status = fields.get("Status")
-    need_fields = REQUIRED_FIELDS + CONDITIONAL_FIELDS.get(status, [])
-    for f in need_fields:
-        if not fields.has_key(f):
-            raise Error("%s has no %s field"%(fn, f))
-    if fn != fields['Filename']:
-        print `fn`, `fields['Filename']`
-        raise Error("Mismatched Filename field in %s"%fn)
-    if fields['Title'][-1] == '.':
-        fields['Title'] = fields['Title'][:-1]
-
-    status = fields['Status'] = status.upper()
-    if status not in STATUSES:
-        raise Error("I've never heard of status %s in %s"%(status,fn))
-    if status in [ "SUPERSEDED", "DEAD" ]:
-        for f in [ 'Implemented-In', 'Target' ]:
-            if fields.has_key(f): del fields[f]
-
-def readProposals():
-    res = []
-    for fn in os.listdir(DIR):
-        m = FNAME_RE.match(fn)
-        if not m: continue
-        if not fn.endswith(".txt"):
-            raise Error("%s doesn't end with .txt"%fn)
-        num = m.group(1)
-        fields = readProposal(fn)
-        checkProposal(fn, fields)
-        fields['num'] = num
-        res.append(fields)
-    return res
-
-def writeIndexFile(proposals):
-    proposals.sort(key=lambda f:f['num'])
-    seenStatuses = set()
-    for p in proposals:
-        seenStatuses.add(p['Status'])
-
-    out = open(TMPFILE, 'w')
-    inf = open(OUTFILE, 'r')
-    for line in inf:
-        out.write(line)
-        if line.startswith("====="): break
-    inf.close()
-
-    out.write("Proposals by number:\n\n")
-    for prop in proposals:
-        out.write("%(num)s %(Title)s [%(Status)s]\n"%prop)
-    out.write("\n\nProposals by status:\n\n")
-    for s in STATUSES:
-        if s not in seenStatuses: continue
-        out.write(" %s:\n"%s)
-        for prop in proposals:
-            if s == prop['Status']:
-                out.write(" %(num)s %(Title)s"%prop)
-                if prop.has_key('Target'):
-                    out.write(" [for %(Target)s]"%prop)
-                if prop.has_key('Implemented-In'):
-                    out.write(" [in %(Implemented-In)s]"%prop)
-                out.write("\n")
-    out.close()
-    os.rename(TMPFILE, OUTFILE)
-
-try:
-    os.unlink(TMPFILE)
-except OSError:
-    pass
-
-writeIndexFile(readProposals())
diff --git a/doc/spec/rend-spec.txt b/doc/spec/rend-spec.txt
deleted file mode 100644
index 3c14ebc66..000000000
--- a/doc/spec/rend-spec.txt
+++ /dev/null
@@ -1,966 +0,0 @@
-
- Tor Rendezvous Specification
-
-0. Overview and preliminaries
-
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
- NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
- "OPTIONAL" in this document are to be interpreted as described in
- RFC 2119.
-
- Read
- https://svn.torproject.org/svn/projects/design-paper/tor-design.html#sec:rendezvous
- before you read this specification. It will make more sense.
-
- Rendezvous points provide location-hidden services (server
- anonymity) for the onion routing network. With rendezvous points,
- Bob can offer a TCP service (say, a webserver) via the onion
- routing network, without revealing the IP of that service.
-
- Bob does this by anonymously advertising a public key for his
- service, along with a list of onion routers to act as "Introduction
- Points" for his service. He creates forward circuits to those
- introduction points, and tells them about his service. To
- connect to Bob, Alice first builds a circuit to an OR to act as
- her "Rendezvous Point." She then connects to one of Bob's chosen
- introduction points, and asks it to tell him about her Rendezvous
- Point (RP). If Bob chooses to answer, he builds a circuit to her
- RP, and tells it to connect him to Alice. The RP joins their
- circuits together, and begins relaying cells. Alice's 'BEGIN'
- cells are received directly by Bob's OP, which passes data to
- and from the local server implementing Bob's service.
-
- Below we describe a network-level specification of this service,
- along with interfaces to make this process transparent to Alice
- (so long as she is using an OP).
-
-0.1. Notation, conventions and prerequisites
-
- In the specifications below, we use the same notation and terminology
- as in "tor-spec.txt". The service specified here also requires the
- existence of an onion routing network as specified in that file.
-
- H(x) is a SHA1 digest of x.
- PKSign(SK,x) is a PKCS.1-padded RSA signature of x with SK.
- PKEncrypt(SK,x) is a PKCS.1-padded RSA encryption of x with SK.
- Public keys are all RSA, and encoded in ASN.1.
- All integers are stored in network (big-endian) order.
- All symmetric encryption uses AES in counter mode, except where
- otherwise noted.
-
- In all discussions, "Alice" will refer to a user connecting to a
- location-hidden service, and "Bob" will refer to a user running a
- location-hidden service.
-
- An OP is (as defined elsewhere) an "Onion Proxy" or Tor client.
-
- An OR is (as defined elsewhere) an "Onion Router" or Tor server.
-
- An "Introduction point" is a Tor server chosen to be Bob's medium-term
- 'meeting place'. A "Rendezvous point" is a Tor server chosen by Alice to
- be a short-term communication relay between her and Bob. All Tor servers
- potentially act as introduction and rendezvous points.
-
-0.2. Protocol outline
-
- 1. Bob->Bob's OP: "Offer IP:Port as public-key-name:Port". [configuration]
- (We do not specify this step; it is left to the implementor of
- Bob's OP.)
-
- 2. Bob's OP generates a long-term keypair.
-
- 3. Bob's OP->Introduction point via Tor: [introduction setup]
- "This public key is (currently) associated to me."
-
- 4. Bob's OP->directory service via Tor: publishes Bob's service descriptor
- [advertisement]
- "Meet public-key X at introduction point A, B, or C." (signed)
-
- 5. Out of band, Alice receives a z.onion:port address.
- She opens a SOCKS connection to her OP, and requests z.onion:port.
-
- 6. Alice's OP retrieves Bob's descriptor via Tor. [descriptor lookup.]
-
- 7. Alice's OP chooses a rendezvous point, opens a circuit to that
- rendezvous point, and establishes a rendezvous circuit. [rendezvous
- setup.]
-
- 8. Alice connects to the Introduction point via Tor, and tells it about
- her rendezvous point. (Encrypted to Bob.) [Introduction 1]
-
- 9. The Introduction point passes this on to Bob's OP via Tor, along the
- introduction circuit. [Introduction 2]
-
- 10. Bob's OP decides whether to connect to Alice, and if so, creates a
- circuit to Alice's RP via Tor. Establishes a shared circuit.
- [Rendezvous 1]
-
- 11. The Rendezvous point forwards Bob's confirmation to Alice's OP.
- [Rendezvous 2]
-
- 12. Alice's OP sends begin cells to Bob's OP. [Connection]
-
-0.3. Constants and new cell types
-
- Relay cell types
- 32 -- RELAY_COMMAND_ESTABLISH_INTRO
- 33 -- RELAY_COMMAND_ESTABLISH_RENDEZVOUS
- 34 -- RELAY_COMMAND_INTRODUCE1
- 35 -- RELAY_COMMAND_INTRODUCE2
- 36 -- RELAY_COMMAND_RENDEZVOUS1
- 37 -- RELAY_COMMAND_RENDEZVOUS2
- 38 -- RELAY_COMMAND_INTRO_ESTABLISHED
- 39 -- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED
- 40 -- RELAY_COMMAND_INTRODUCE_ACK
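
The relay command numbers listed above could be captured in code roughly as follows. This is only an illustrative sketch: the class name RendRelayCommand is hypothetical and does not come from the Tor source; only the numeric values and command names are taken from the list.

```python
from enum import IntEnum


class RendRelayCommand(IntEnum):
    """Rendezvous-related relay commands, numbered as in section 0.3."""
    ESTABLISH_INTRO = 32
    ESTABLISH_RENDEZVOUS = 33
    INTRODUCE1 = 34
    INTRODUCE2 = 35
    RENDEZVOUS1 = 36
    RENDEZVOUS2 = 37
    INTRO_ESTABLISHED = 38
    RENDEZVOUS_ESTABLISHED = 39
    INTRODUCE_ACK = 40


def command_name(value: int) -> str:
    """Return the spec-style name for a relay command number."""
    return "RELAY_COMMAND_" + RendRelayCommand(value).name
```

For example, command_name(34) yields the string "RELAY_COMMAND_INTRODUCE1", and an unknown number raises ValueError.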
-
-0.4. Version overview
-
- There are several parts in the hidden service protocol that have
- changed over time, each of them having its own version number, whereas
- other parts remained the same. The following list of potentially
- versioned protocol parts should help reduce some confusion:
-
- - Hidden service descriptor: the binary-based v0 was the default for a
- long time, and an ASCII-based v2 has been added by proposal 114. The
- v0 descriptor format has been deprecated in 0.2.2.1-alpha. See 1.3.
-
- - Hidden service descriptor propagation mechanism: currently related to
- the hidden service descriptor version -- v0 publishes to the original
- hs directory authorities, whereas v2 publishes to a rotating subset
- of relays with the "HSDir" flag; see 1.4 and 1.6.
-
- - Introduction protocol for how to generate an introduction cell:
- v0 specified a nickname for the rendezvous point and assumed the
- relay would know about it, whereas v2 now specifies IP address,
- port, and onion key so the relay doesn't need to already recognize
- it. See 1.8.
-
-1. The Protocol
-
-1.1. Bob configures his local OP.
-
- We do not specify a format for the OP configuration file. However,
- OPs SHOULD allow Bob to provide more than one advertised service
- per OP, and MUST allow Bob to specify one or more virtual ports per
- service. Bob provides a mapping from each of these virtual ports
- to a local IP:Port pair.
-
-1.2. Bob's OP establishes his introduction points.
-
- The first time the OP provides an advertised service, it generates
- a public/private keypair (stored locally).
-
- The OP chooses a small number of Tor servers as introduction points.
- The OP establishes a new introduction circuit to each introduction
- point. These circuits MUST NOT be used for anything but hidden service
- introduction. To establish the introduction, Bob sends a
- RELAY_COMMAND_ESTABLISH_INTRO cell, containing:
-
- KL Key length [2 octets]
- PK Bob's public key or service key [KL octets]
- HS Hash of session info [20 octets]
- SIG Signature of above information [variable]
-
- KL is the length of PK, in octets.
-
- To prevent replay attacks, the HS field contains a SHA-1 hash based on the
- shared secret KH between Bob's OP and the introduction point, as
- follows:
- HS = H(KH | "INTRODUCE")
- That is:
- HS = H(KH | [49 4E 54 52 4F 44 55 43 45])
- (KH, as specified in tor-spec.txt, is H(g^xy | [00]) .)
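
As a minimal sketch of the HS computation just described, assuming KH is already available as a 20-byte string (deriving it from g^xy is covered by tor-spec.txt and is not shown here). The function name intro_session_hash is hypothetical.

```python
import hashlib

# ASCII "INTRODUCE", i.e. the octets 49 4E 54 52 4F 44 55 43 45.
INTRODUCE_TAG = b"INTRODUCE"


def intro_session_hash(kh: bytes) -> bytes:
    """Compute HS = H(KH | "INTRODUCE"), where H is SHA-1."""
    return hashlib.sha1(kh + INTRODUCE_TAG).digest()
```

The result is always 20 octets, matching the size of the HS field in the cell layout above.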
-
- Upon receiving such a cell, the OR first checks that the signature is
- correct with the included public key. If so, it checks whether HS is
- correct given the shared state between Bob's OP and the OR. If either
- check fails, the OR discards the cell; otherwise, it associates the
- circuit with Bob's public key, and dissociates any other circuits
- currently associated with PK. On success, the OR sends Bob a
- RELAY_COMMAND_INTRO_ESTABLISHED cell with an empty payload.
-
- Bob's OP uses either Bob's public key or a freshly generated, single-use
- service key in the RELAY_COMMAND_ESTABLISH_INTRO cell, depending on the
- configured hidden service descriptor version. The public key is used for
- v0 descriptors, the service key for v2 descriptors. In the latter case, the
- service keys of all introduction points are included in the v2 hidden
- service descriptor together with the other introduction point information.
- The reason is that the introduction point does not need to and therefore
- should not know for which hidden service it works, so as to prevent it from
- tracking the hidden service's activity. If the hidden service is configured
- to publish both v0 and v2 descriptors, two separate sets of introduction
- points are established.
-
-1.3. Bob's OP generates service descriptors.
-
- For versions before 0.2.2.1-alpha, Bob's OP periodically generates and
- publishes a descriptor of type "V0".
-
- The "V0" descriptor contains:
-
- KL Key length [2 octets]
- PK Bob's public key [KL octets]
- TS A timestamp [4 octets]
- NI Number of introduction points [2 octets]
- Ipt A list of NUL-terminated ORs [variable]
- SIG Signature of above fields [variable]
-
- TS is the number of seconds elapsed since Jan 1, 1970.
-
- The members of Ipt may be either (a) nicknames, or (b) identity key
- digests, encoded in hex, and prefixed with a '$'. Clients must
- accept both forms. Services must only generate the second form.
- Once 0.0.9.x is obsoleted, we can drop the first form.
-
- [It's ok for Bob to advertise 0 introduction points. He might want
- to do that if he previously advertised some introduction points,
- and now he doesn't have any. -RD]
-
- Beginning with 0.2.0.10-alpha, Bob's OP encodes "V2" descriptors in
- addition to (or instead of) "V0" descriptors. The format of a "V2"
- descriptor is as follows:
-
- "rendezvous-service-descriptor" descriptor-id NL
-
- [At start, exactly once]
-
- Indicates the beginning of the descriptor. "descriptor-id" is a
- periodically changing identifier of 160 bits formatted as 32 base32
- chars that is calculated by the hidden service and its clients. The
- "descriptor-id" is calculated by performing the following operation:
-
- descriptor-id =
- H(permanent-id | H(time-period | descriptor-cookie | replica))
-
- "permanent-id" is the permanent identifier of the hidden service,
- consisting of 80 bits. It can be calculated by computing the hash value
- of the public hidden service key and truncating after the first 80 bits:
-
- permanent-id = H(public-key)[:10]
-
- Note: If Bob's OP has "stealth" authorization enabled (see Section 2.2),
- it uses the client key in place of the public hidden service key.
-
- "H(time-period | descriptor-cookie | replica)" is the (possibly
- secret) id part that is necessary to verify that the hidden service is
- the true originator of this descriptor and that is therefore contained
- in the descriptor, too. The descriptor ID can only be created by the
- hidden service and its clients, but the "signature" below can only be
- created by the service.
-
- "time-period" changes periodically as a function of time and
- "permanent-id". The current value for "time-period" can be calculated
- using the following formula:
-
- time-period = (current-time + permanent-id-byte * 86400 / 256)
- / 86400
-
- "current-time" contains the current system time in seconds since
- 1970-01-01 00:00, e.g. 1188241957. "permanent-id-byte" is the first
- (unsigned) byte of the permanent identifier (which is in network
- order), e.g. 143. Adding the product of "permanent-id-byte" and
- 86400 (seconds per day), divided by 256, prevents "time-period" from
- changing for all descriptors at the same time of the day. The result
- of the overall operation is a (network-ordered) 32-bit integer, e.g.
- 13753 or 0x000035B9 with the example values given above.
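
The time-period formula can be checked with a few lines of Python. This sketch assumes that both divisions are integer divisions, which is consistent with the worked example in the text; the function name time_period is hypothetical.

```python
SECONDS_PER_DAY = 86400


def time_period(current_time: int, permanent_id_byte: int) -> int:
    """Compute time-period from the current time and the first byte of
    the permanent identifier, using integer arithmetic throughout."""
    offset = permanent_id_byte * SECONDS_PER_DAY // 256
    return (current_time + offset) // SECONDS_PER_DAY


# Example values from the text: 1188241957 and 143 give 13753 (0x35B9).
example = time_period(1188241957, 143)
```

Because the offset depends on the first identifier byte, two services with different permanent-ids generally roll over to a new time-period at different times of day.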
-
- "descriptor-cookie" is an optional secret password of 128 bits that
- is shared between the hidden service provider and its clients. If the
- descriptor-cookie is left out, the input to the hash function is 128
- bits shorter.
-
- "replica" denotes the number of the replica. A service publishes
- multiple descriptors with different descriptor IDs in order to
- distribute them to different places on the ring.
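
Putting the pieces above together, the descriptor-id derivation might be sketched as follows. Several details are assumptions made for illustration rather than statements of this specification: time-period is encoded as a 4-byte big-endian integer, replica as a single byte, and the base32 output is lowercased. All function names are hypothetical.

```python
import base64
import hashlib
import struct


def permanent_id(public_key: bytes) -> bytes:
    """permanent-id = H(public-key)[:10], i.e. the first 80 bits of SHA-1."""
    return hashlib.sha1(public_key).digest()[:10]


def secret_id_part(time_period: int, replica: int,
                   descriptor_cookie: bytes = b"") -> bytes:
    """H(time-period | descriptor-cookie | replica).

    Assumption: 4-byte big-endian time-period, 1-byte replica. An empty
    cookie simply shortens the hash input, as the text describes.
    """
    data = struct.pack(">I", time_period) + descriptor_cookie + bytes([replica])
    return hashlib.sha1(data).digest()


def descriptor_id(perm_id: bytes, secret_part: bytes) -> bytes:
    """descriptor-id = H(permanent-id | secret-id-part)."""
    return hashlib.sha1(perm_id + secret_part).digest()


def to_base32(digest: bytes) -> str:
    """Format a 160-bit digest as 32 base32 characters."""
    return base64.b32encode(digest).decode("ascii").lower()
```

A 20-byte SHA-1 digest is exactly 160 bits, so its base32 form is 32 characters with no padding.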
-
- "version" version-number NL
-
- [Exactly once]
-
- The version number of this descriptor's format. In this case: 2.
-
- "permanent-key" NL a public key in PEM format
-
- [Exactly once]
-
- The public key of the hidden service which is required to verify the
- "descriptor-id" and the "signature".
-
- "secret-id-part" secret-id-part NL
-
- [Exactly once]
-
- The result of the following operation as explained above, formatted as
- 32 base32 chars. Using this secret id part, everyone can verify that
- the signed descriptor belongs to "descriptor-id".
-
- secret-id-part = H(time-period | descriptor-cookie | replica)
-
- "publication-time" YYYY-MM-DD HH:MM:SS NL
-
- [Exactly once]
-
- A timestamp when this descriptor has been created.
-
- "protocol-versions" version-string NL
-
- [Exactly once]
-
- A comma-separated list of recognized and permitted version numbers
- for use in INTRODUCE cells; these versions are described in section
- 1.8 below.
-
- "introduction-points" NL encrypted-string
-
- [At most once]
-
- A list of introduction points. If the optional "descriptor-cookie" is
- used, this list is encrypted with AES in CTR mode with a random
- initialization vector of 128 bits that is written to
- the beginning of the encrypted string, and the "descriptor-cookie" as
- secret key of 128 bits length.
-
- The string containing the introduction point data (either encrypted
- or not) is encoded in base64, and surrounded with
- "-----BEGIN MESSAGE-----" and "-----END MESSAGE-----".
-
- The unencrypted string may begin with:
-
- "service-authentication" auth-type auth-data NL
-
- [Any number]
-
- The service-specific authentication data can be used to perform
- client authentication. This data is independent of the selected
- introduction point as opposed to "intro-authentication" below. The
- format of auth-data (base64-encoded or PEM format) depends on
- auth-type. See section 2 of this document for details on auth
- mechanisms.
-
- Subsequently, an arbitrary number of introduction point entries may
- follow, each containing the following data:
-
- "introduction-point" identifier NL
-
- [At start, exactly once]
-
- The identifier of this introduction point: the base-32 encoded
- hash of this introduction point's identity key.
-
- "ip-address" ip-address NL
-
- [Exactly once]
-
- The IP address of this introduction point.
-
- "onion-port" port NL
-
- [Exactly once]
-
- The TCP port on which the introduction point is listening for
- incoming onion requests.
-
- "onion-key" NL a public key in PEM format
-
- [Exactly once]
-
- The public key that can be used to encrypt messages to this
- introduction point.
-
- "service-key" NL a public key in PEM format
-
- [Exactly once]
-
- The public key that can be used to encrypt messages to the hidden
- service.
-
- "intro-authentication" auth-type auth-data NL
-
- [Any number]
-
- The introduction-point-specific authentication data can be used
- to perform client authentication. This data depends on the
- selected introduction point as opposed to "service-authentication"
- above. The format of auth-data (base64-encoded or PEM format)
- depends on auth-type. See section 2 of this document for details
- on auth mechanisms.
-
- (This ends the fields in the encrypted portion of the descriptor.)
-
- [It's ok for Bob to advertise 0 introduction points. He might want
- to do that if he previously advertised some introduction points,
- and now he doesn't have any. -RD]
-
- "signature" NL signature-string
-
- [At end, exactly once]
-
- A signature of all fields above with the private key of the hidden
- service.
-
-1.3.1. Other descriptor formats we don't use.
-
- Support for the V0 descriptor format was dropped in 0.2.2.0-alpha-dev:
-
- KL Key length [2 octets]
- PK Bob's public key [KL octets]
- TS A timestamp [4 octets]
- NI Number of introduction points [2 octets]
- Ipt A list of NUL-terminated ORs [variable]
- SIG Signature of above fields [variable]
-
- KL is the length of PK, in octets.
- TS is the number of seconds elapsed since Jan 1, 1970.
-
- The members of Ipt may be either (a) nicknames, or (b) identity key
- digests, encoded in hex, and prefixed with a '$'.
-
- The V1 descriptor format was understood and accepted from
- 0.1.1.5-alpha-cvs to 0.2.0.6-alpha-dev, but no Tors generated it and
- it was removed:
-
- V Format byte: set to 255 [1 octet]
- V Version byte: set to 1 [1 octet]
- KL Key length [2 octets]
- PK Bob's public key [KL octets]
- TS A timestamp [4 octets]
- PROTO Protocol versions: bitmask [2 octets]
- NI Number of introduction points [2 octets]
- For each introduction point: (as in INTRODUCE2 cells)
- IP Introduction point's address [4 octets]
- PORT Introduction point's OR port [2 octets]
- ID Introduction point identity ID [20 octets]
- KLEN Length of onion key [2 octets]
- KEY Introduction point onion key [KLEN octets]
- SIG Signature of above fields [variable]
-
- A hypothetical "V1" descriptor, that has never been used but might
- be useful for historical reasons, contains:
-
- V Format byte: set to 255 [1 octet]
- V Version byte: set to 1 [1 octet]
- KL Key length [2 octets]
- PK Bob's public key [KL octets]
- TS A timestamp [4 octets]
- PROTO Rendezvous protocol versions: bitmask [2 octets]
- NA Number of auth mechanisms accepted [1 octet]
- For each auth mechanism:
- AUTHT The auth type that is supported [2 octets]
- AUTHL Length of auth data [1 octet]
- AUTHD Auth data [variable]
- NI Number of introduction points [2 octets]
- For each introduction point: (as in INTRODUCE2 cells)
- ATYPE An address type (typically 4) [1 octet]
- ADDR Introduction point's IP address [4 or 16 octets]
- PORT Introduction point's OR port [2 octets]
- AUTHT The auth type that is supported [2 octets]
- AUTHL Length of auth data [1 octet]
- AUTHD Auth data [variable]
- ID Introduction point identity ID [20 octets]
- KLEN Length of onion key [2 octets]
- KEY Introduction point onion key [KLEN octets]
- SIG Signature of above fields [variable]
-
- AUTHT specifies which authentication/authorization mechanism is
- required by the hidden service or the introduction point. AUTHD
- is arbitrary data that can be associated with an auth approach.
- Currently only AUTHT of [00 00] is supported, with an AUTHL of 0.
- See section 2 of this document for details on auth mechanisms.
-
-1.4. Bob's OP advertises his service descriptor(s).
-
- Bob's OP advertises his service descriptor to a fixed set of v0 hidden
- service directory servers and/or a changing subset of all v2 hidden service
- directories.
-
- For versions before 0.2.2.1-alpha, Bob's OP opens a stream to each v0
- directory server's directory port via Tor. (He may re-use old circuits for
- this.) Over this stream, Bob's OP makes an HTTP 'POST' request, to a URL
- "/tor/rendezvous/publish" relative to the directory server's root,
- containing as its body Bob's service descriptor.
-
- Upon receiving a descriptor, the directory server checks the signature,
- and discards the descriptor if the signature does not match the enclosed
- public key. Next, the directory server checks the timestamp. If the
- timestamp is more than 24 hours in the past or more than 1 hour in the
- future, or the directory server already has a newer descriptor with the
- same public key, the server discards the descriptor. Otherwise, the
- server discards any older descriptors with the same public key and
- version format, and associates the new descriptor with the public key.
- The directory server remembers this descriptor for at least 24 hours
- after its timestamp. At least every 18 hours, Bob's OP uploads a
- fresh descriptor.
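
The timestamp checks a v0 directory server applies can be sketched as below. Signature verification is assumed to have happened already and is not shown; the function and parameter names are hypothetical.

```python
from typing import Optional

MAX_AGE = 24 * 60 * 60      # at most 24 hours in the past
MAX_FUTURE = 60 * 60        # at most 1 hour in the future


def accept_v0_descriptor(timestamp: int, now: int,
                         newest_stored: Optional[int] = None) -> bool:
    """Return True if a descriptor with this timestamp should be stored.

    newest_stored is the timestamp of the newest descriptor already held
    for the same public key, or None if there is none.
    """
    if timestamp < now - MAX_AGE:
        return False            # too old
    if timestamp > now + MAX_FUTURE:
        return False            # too far in the future
    if newest_stored is not None and newest_stored > timestamp:
        return False            # a newer descriptor is already stored
    return True
```

Note that a descriptor with the same timestamp as the stored one is accepted here, since the text only rules out the case where the stored descriptor is newer.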
-
- If Bob's OP is configured to publish v2 descriptors, it does so to a
- changing subset of all v2 hidden service directories instead of the
- authoritative directory servers. Therefore, Bob's OP opens a stream via
- Tor to each responsible hidden service directory. (He may re-use old
- circuits for this.) Over this stream, Bob's OP makes an HTTP 'POST'
- request to a URL "/tor/rendezvous2/publish" relative to the hidden service
- directory's root, containing as its body Bob's service descriptor.
-
- At any time, there are 6 hidden service directories responsible for
- keeping replicas of a descriptor; they consist of 2 sets of 3 hidden
- service directories with consecutive onion IDs. Bob's OP learns about
- the complete list of hidden service directories by filtering the
- consensus status document received from the directory authorities. A
- hidden service directory is deemed responsible for all descriptor IDs in
- the interval from its direct predecessor, exclusive, to its own ID,
- inclusive; it further holds replicas for its 2 predecessors. A
- participant only trusts its own routing list and never learns about
- routing information from other parties.
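
One way to read the responsibility rule above is as a lookup on a sorted ring of directory identifiers: the first responsible directory is the first one whose ID is greater than or equal to the descriptor ID, wrapping around at the end, followed by its next two successors. The sketch below follows that reading; it is an interpretation, not code from Tor, and the function name responsible_hsdirs is hypothetical. Calling it once per replica descriptor ID would give the 2 sets of 3 directories.

```python
import bisect
from typing import List


def responsible_hsdirs(descriptor_id: bytes, hsdir_ids: List[bytes],
                       count: int = 3) -> List[bytes]:
    """Pick the directories responsible for one descriptor ID.

    hsdir_ids holds the identity digests of all relays with the HSDir
    flag, taken from the client's own copy of the consensus.
    """
    ring = sorted(hsdir_ids)
    if not ring:
        return []
    # Index of the first directory whose ID is >= descriptor_id
    # (interval is exclusive of the predecessor, inclusive of the ID).
    start = bisect.bisect_left(ring, descriptor_id)
    picked = min(count, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(picked)]
```

Sorting locally reflects the rule that a participant trusts only its own routing list.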
-
- Bob's OP publishes a new v2 descriptor once an hour or whenever its
- content changes. V2 descriptors can be found by clients within a given
- time period of 24 hours, after which they change their ID as described
- under 1.3. If a published descriptor would be valid for less than 60
- minutes (= 2 x 30 minutes to allow the server to be 30 minutes behind
- and the client 30 minutes ahead), Bob's OP publishes the descriptor
- under the IDs of both the current and the next publication period.
-
-1.5. Alice receives a z.onion address.
-
- When Alice receives a pointer to a location-hidden service, it is as a
- hostname of the form "z.onion", where z is a base-32 encoding of a
- 10-octet hash of Bob's service's public key, computed as follows:
-
- 1. Let H = H(PK).
- 2. Let H' = the first 80 bits of H, considering each octet from
- most significant bit to least significant bit.
- 3. Generate a 16-character encoding of H', using base32 as defined
- in RFC 3548.
-
- (We only use 80 bits instead of the 160 bits from SHA1 because we
- don't need to worry about arbitrary collisions, and because it will
- make handling the URLs more convenient.)
-
- [Yes, numbers are allowed at the beginning. See RFC 1123. -NM]
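
The three steps above can be sketched in Python as follows, assuming the public key is supplied in its ASN.1-encoded form as a byte string. Lowercasing the base32 output is an assumption made here for readability and is not stated in this section; the function name onion_address is hypothetical.

```python
import base64
import hashlib


def onion_address(public_key: bytes) -> str:
    """Derive "z.onion" from a service public key.

    1. H = SHA-1(PK)
    2. H' = first 80 bits (10 octets) of H
    3. z = base32(H'), which is 16 characters with no padding
    """
    digest = hashlib.sha1(public_key).digest()
    truncated = digest[:10]
    z = base64.b32encode(truncated).decode("ascii").lower()
    return z + ".onion"
```

Since 80 bits divide evenly into 5-bit base32 symbols, the encoded name is always exactly 16 characters.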
-
-1.6. Alice's OP retrieves a service descriptor.
-
- Alice's OP fetches the service descriptor from the fixed set of v0 hidden
- service directory servers and/or a changing subset of all v2 hidden service
- directories.
-
- For versions before 0.2.2.1-alpha, Alice's OP opens a stream to a directory
- server via Tor, and makes an HTTP GET request for the document
- '/tor/rendezvous/<z>', where '<z>' is replaced with the encoding of Bob's
- public key as described above. (She may re-use old circuits for this.) The
- directory replies with a 404 HTTP response if it does not recognize <z>,
- and otherwise returns Bob's most recently uploaded service descriptor.
-
- If Alice's OP receives a 404 response, it tries the other directory
- servers, and only fails the lookup if none recognize the public key hash.
-
- Upon receiving a service descriptor, Alice verifies with the same process
- as the directory server uses, described above in section 1.4.
-
- The directory server gives a 400 response if it cannot understand Alice's
- request.
-
- Alice should cache the descriptor locally, but should not use
- descriptors that are more than 24 hours older than their timestamp.
- [Caching may make her partitionable, but she fetched it anonymously,
- and we can't very well *not* cache it. -RD]
-
- If Alice's OP is running 0.2.1.10-alpha or higher, it fetches v2 hidden
- service descriptors. Versions before 0.2.2.1-alpha fetch both v0 and
- v2 descriptors in parallel. Similar to the description in section 1.4,
- Alice's OP fetches a v2 descriptor from a randomly chosen hidden service
- directory out of the changing subset of 6 nodes. If the request is
- unsuccessful, Alice retries the other remaining responsible hidden service
- directories in a random order. Alice relies on Bob to account for a potential
- clock skew between the two by possibly storing two sets of descriptors (see
- end of section 1.4).
-
- Alice's OP opens a stream via Tor to the chosen v2 hidden service
- directory. (She may re-use old circuits for this.) Over this stream,
- Alice's OP makes an HTTP 'GET' request for the document
- "/tor/rendezvous2/<z>", where z is replaced with the encoding of the
- descriptor ID. The directory replies with a 404 HTTP response if it does
- not recognize <z>, and otherwise returns Bob's most recently uploaded
- service descriptor.
-
-1.7. Alice's OP establishes a rendezvous point.
-
- When Alice requests a connection to a given location-hidden service,
- and Alice's OP does not have an established circuit to that service,
- the OP builds a rendezvous circuit. It does this by establishing
- a circuit to a randomly chosen OR, and sending a
- RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell to that OR. The body of that cell
- contains:
-
- RC Rendezvous cookie [20 octets]
-
- The rendezvous cookie is an arbitrary 20-byte value, chosen randomly by
- Alice's OP. Alice SHOULD choose a new rendezvous cookie for each new
- connection attempt.
-
- Upon receiving a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell, the OR associates
- the RC with the circuit that sent it. It replies to Alice with an empty
- RELAY_COMMAND_RENDEZVOUS_ESTABLISHED cell to indicate success.
-
- Alice's OP MUST NOT use the circuit which sent the cell for any purpose
- other than rendezvous with the given location-hidden service.
-
-1.8. Introduction: from Alice's OP to Introduction Point
-
- Alice builds a separate circuit to one of Bob's chosen introduction
- points, and sends it a RELAY_COMMAND_INTRODUCE1 cell containing:
-
- Cleartext
- PK_ID Identifier for Bob's PK [20 octets]
- Encrypted to Bob's PK: (in the v0 intro protocol)
- RP Rendezvous point's nickname [20 octets]
- RC Rendezvous cookie [20 octets]
- g^x Diffie-Hellman data, part 1 [128 octets]
- OR (in the v1 intro protocol)
- VER Version byte: set to 1. [1 octet]
- RP Rendezvous point nick or ID [42 octets]
- RC Rendezvous cookie [20 octets]
- g^x Diffie-Hellman data, part 1 [128 octets]
- OR (in the v2 intro protocol)
- VER Version byte: set to 2. [1 octet]
- IP Rendezvous point's address [4 octets]
- PORT Rendezvous point's OR port [2 octets]
- ID Rendezvous point identity ID [20 octets]
- KLEN Length of onion key [2 octets]
- KEY Rendezvous point onion key [KLEN octets]
- RC Rendezvous cookie [20 octets]
- g^x Diffie-Hellman data, part 1 [128 octets]
- OR (in the v3 intro protocol)
- VER Version byte: set to 3. [1 octet]
- AUTHT The auth type that is used [1 octet]
- AUTHL Length of auth data [2 octets]
- AUTHD Auth data [variable]
- TS A timestamp [4 octets]
- IP Rendezvous point's address [4 octets]
- PORT Rendezvous point's OR port [2 octets]
- ID Rendezvous point identity ID [20 octets]
- KLEN Length of onion key [2 octets]
- KEY Rendezvous point onion key [KLEN octets]
- RC Rendezvous cookie [20 octets]
- g^x Diffie-Hellman data, part 1 [128 octets]
-
- PK_ID is the hash of Bob's public key or the service key, depending on the
- hidden service descriptor version. In case of a v0 descriptor, Alice's OP
- uses Bob's public key. If Alice has downloaded a v2 descriptor, she uses
- the contained public key ("service-key").
-
- RP is NUL-padded and terminated. In version 0 of the intro protocol, RP
- must contain a nickname. In version 1, it must contain EITHER a nickname or
- an identity key digest that is encoded in hex and prefixed with a '$'.
-
- The hybrid encryption to Bob's PK works just like the hybrid
- encryption in CREATE cells (see tor-spec). Thus the payload of the
- version 0 RELAY_COMMAND_INTRODUCE1 cell on the wire will contain
- 20+42+16+20+20+128=246 bytes, and the version 1 and version 2
- introduction formats have other sizes.
-
- Through Tor 0.2.0.6-alpha, clients only generated the v0 introduction
- format, whereas hidden services have understood and accepted v0,
- v1, and v2 since 0.1.1.x. As of Tor 0.2.0.7-alpha and 0.1.2.18,
- clients switched to using the v2 intro format.
-
-1.9. Introduction: From the Introduction Point to Bob's OP
-
- If the Introduction Point recognizes PK_ID as a public key which has
- established a circuit for introductions as in 1.2 above, it sends the body
- of the cell in a new RELAY_COMMAND_INTRODUCE2 cell down the corresponding
- circuit. (If the PK_ID is unrecognized, the RELAY_COMMAND_INTRODUCE1 cell is
- discarded.)
-
- After sending the RELAY_COMMAND_INTRODUCE2 cell to Bob, the OR replies to
- Alice with an empty RELAY_COMMAND_INTRODUCE_ACK cell. If no
- RELAY_COMMAND_INTRODUCE2 cell can be sent, the OR replies to Alice with a
- non-empty cell to indicate an error. (The semantics of the cell body may be
- determined later; the current implementation sends a single '1' byte on
- failure.)
-
- When Bob's OP receives the RELAY_COMMAND_INTRODUCE2 cell, it decrypts it
- with the private key for the corresponding hidden service, and extracts the
- rendezvous point's nickname, the rendezvous cookie, and the value of g^x
- chosen by Alice.
-
-1.10. Rendezvous
-
- Bob's OP builds a new Tor circuit ending at Alice's chosen rendezvous
- point, and sends a RELAY_COMMAND_RENDEZVOUS1 cell along this circuit,
- containing:
- RC Rendezvous cookie [20 octets]
- g^y Diffie-Hellman [128 octets]
- KH Handshake digest [20 octets]
-
- (Bob's OP MUST NOT use this circuit for any other purpose.)
-
- If the RP recognizes RC, it relays the rest of the cell down the
- corresponding circuit in a RELAY_COMMAND_RENDEZVOUS2 cell, containing:
-
- g^y Diffie-Hellman [128 octets]
- KH Handshake digest [20 octets]
-
- (If the RP does not recognize the RC, it discards the cell and
- tears down the circuit.)
-
- When Alice's OP receives a RELAY_COMMAND_RENDEZVOUS2 cell on a circuit which
- has sent a RELAY_COMMAND_ESTABLISH_RENDEZVOUS cell but which has not yet
- received a reply, it uses g^y and H(g^xy) to complete the handshake as in
- the Tor circuit extend process: they establish a 60-octet string as
- K = SHA1(g^xy | [00]) | SHA1(g^xy | [01]) | SHA1(g^xy | [02])
- and generate
- KH = K[0..15]
- Kf = K[16..31]
- Kb = K[32..47]
-
- Subsequently, the rendezvous point passes relay cells, unchanged, from
- each of the two circuits to the other. When Alice's OP sends
- RELAY cells along the circuit, it first encrypts them with the
- Kf, then with all of the keys for the ORs in Alice's side of the circuit;
- and when Alice's OP receives RELAY cells from the circuit, it decrypts
- them with the keys for the ORs in Alice's side of the circuit, then
- decrypts them with Kb. Bob's OP does the same, with Kf and Kb
- interchanged.
-
-1.11. Creating streams
-
- To open TCP connections to Bob's location-hidden service, Alice's OP sends
- a RELAY_COMMAND_BEGIN cell along the established circuit, using the special
- address "", and a chosen port. Bob's OP chooses a destination IP and
- port, based on the configuration of the service connected to the circuit,
- and opens a TCP stream. From then on, Bob's OP treats the stream as an
- ordinary exit connection.
- [ Except he doesn't include addr in the connected cell or the end
- cell. -RD]
-
- Alice MAY send multiple RELAY_COMMAND_BEGIN cells along the circuit, to open
- multiple streams to Bob. Alice SHOULD NOT send RELAY_COMMAND_BEGIN cells
- for any other address along her circuit to Bob; if she does, Bob MUST reject
- them.
-
-2. Authentication and authorization.
-
- The rendezvous protocol as described in Section 1 provides a few options
- for implementing client-side authorization. There are two steps in the
- rendezvous protocol that can be used for performing client authorization:
- when downloading and decrypting parts of the hidden service descriptor and
- at Bob's Tor client before contacting the rendezvous point. A service
- provider can restrict access to his service at these two points to
- authorized clients only.
-
- There are currently two authorization protocols specified that are
- described in more detail below:
-
- 1. The first protocol allows a service provider to restrict access
- to clients with a previously received secret key only, but does not
- attempt to hide service activity from others.
-
- 2. The second protocol, although feasible only for a limited set of about
- 16 clients, performs client authorization and hides service activity
- from everyone but the authorized clients.
-
-2.1. Service with large-scale client authorization
-
- The first client authorization protocol aims at performing access control
- while consuming as few additional resources as possible. This is the "basic"
- authorization protocol. A service provider should be able to permit access
- to a large number of clients while denying access for everyone else.
- However, the price for scalability is that the service won't be able to hide
- its activity from unauthorized or formerly authorized clients.
-
- The main idea of this protocol is to encrypt the introduction-point part
- in hidden service descriptors to authorized clients using symmetric keys.
- This ensures that nobody else but authorized clients can learn which
- introduction points a service currently uses, nor can someone send a
- valid INTRODUCE1 message without knowing the introduction key. Therefore,
- a subsequent authorization at the introduction point is not required.
-
- A service provider generates symmetric "descriptor cookies" for his
- clients and distributes them outside of Tor. The suggested key size is
- 128 bits, so that descriptor cookies can be encoded in 22 base64 chars
- (which can hold up to 22 * 6 = 132 bits, leaving 4 bits to encode the
- authorization type (here: "0") and allow a client to distinguish this
- authorization protocol from others like the one proposed below).
- Typically, the contact information for a hidden service using this
- authorization protocol looks like this:
-
- v2cbb2l4lsnpio4q.onion Ll3X7Xgz9eHGKCCnlFH0uz
-
- When generating a hidden service descriptor, the service encrypts the
- introduction-point part with a single randomly generated symmetric
- 128-bit session key using AES-CTR as described for v2 hidden service
- descriptors in rend-spec. Afterwards, the service encrypts the session
- key to all descriptor cookies using AES. An authorized client should be
- able to efficiently find the session key that is encrypted for him/her,
- so 4-octet client IDs are derived by hashing the descriptor cookie together
- with the initialization vector. Descriptors are padded with fake entries so
- that the number of encrypted session keys is always a multiple of 16.
- Encrypted session keys are ordered by client IDs in order to conceal
- addition or removal of authorized clients by the service provider.
-
- ATYPE Authorization type: set to 1. [1 octet]
- ALEN Number of clients := 1 + ((clients - 1) div 16) [1 octet]
- for each symmetric descriptor cookie:
- ID Client ID: H(descriptor cookie | IV)[:4] [4 octets]
- SKEY Session key encrypted with descriptor cookie [16 octets]
- (end of client-specific part)
- RND Random data [(15 - ((clients - 1) mod 16)) * 20 octets]
- IV AES initialization vector [16 octets]
- IPOS Intro points, encrypted with session key [remaining octets]
-
- An authorized client needs to configure Tor to use the descriptor cookie
- when accessing the hidden service. Therefore, a user adds the contact
- information that she received from the service provider to her torrc
- file. Upon downloading a hidden service descriptor, Tor finds the
- encrypted introduction-point part and attempts to decrypt it using the
- configured descriptor cookie. (In the rare event of two or more client
- IDs being equal, a client tries to decrypt all of them.)
-
- Upon sending the introduction, the client includes her descriptor cookie
- as auth type "1" in the INTRODUCE2 cell that she sends to the service.
- The hidden service checks whether the included descriptor cookie is
- authorized to access the service and either responds to the introduction
- request, or not.
-
-2.2. Authorization for limited number of clients
-
- A second, more sophisticated client authorization protocol goes the extra
- mile of hiding service activity from unauthorized clients. This is the
- "stealth" authorization protocol. With all else being equal to the preceding
- authorization protocol, the second protocol publishes hidden service
- descriptors for each user separately and only needs to encrypt the
- introduction-point part of each descriptor to a single client. This allows the
- service to stop publishing descriptors for removed clients. As long as a
- removed client cannot link descriptors issued for other clients to the
- service, it cannot derive service activity any more. The downside of this
- approach is limited scalability. Even though the distributed storage of
- descriptors (cf. proposal 114) tackles the problem of limited scalability to
- a certain extent, this protocol should not be used for services with more
- than 16 clients. (In fact, Tor should refuse to advertise services for more
- than this number of clients.)
-
- A hidden service generates an asymmetric "client key" and a symmetric
- "descriptor cookie" for each client. The client key is used as
- replacement for the service's permanent key, so that the service uses a
- different identity for each of his clients. The descriptor cookie is used
- to store descriptors at changing directory nodes that are unpredictable
- for anyone but service and client, to encrypt the introduction-point
- part, and to be included in INTRODUCE2 cells. Once the service has
- created client key and descriptor cookie, he tells them to the client
- outside of Tor. The contact information string looks similar to the one
- used by the preceding authorization protocol (with the only difference
- that it has "1" encoded as auth-type in the remaining 4 of 132 bits
- instead of "0" as before).
-
- When creating a hidden service descriptor for an authorized client, the
- hidden service uses the client key and descriptor cookie to compute
- secret ID part and descriptor ID:
-
- secret-id-part = H(time-period | descriptor-cookie | replica)
-
- descriptor-id = H(client-key[:10] | secret-id-part)
-
- The hidden service also replaces permanent-key in the descriptor with
- client-key and encrypts introduction-points with the descriptor cookie.
-
- ATYPE Authorization type: set to 2. [1 octet]
- IV AES initialization vector [16 octets]
- IPOS Intro points, encr. with descriptor cookie [remaining octets]
-
- When uploading descriptors, the hidden service needs to make sure that
- descriptors for different clients are not uploaded at the same time (cf.
- Section 1.1) which is also a limiting factor for the number of clients.
-
- When a client is asked to establish a connection to a hidden service,
- it looks up whether it has any authorization data configured for that
- service. If the user has configured authorization data for authorization
- protocol "2", the descriptor ID is determined as described in the last
- paragraph. Upon receiving a descriptor, the client decrypts the
- introduction-point part using its descriptor cookie. Further, the client
- includes its descriptor cookie as auth-type "2" in INTRODUCE2 cells that
- it sends to the service.
-
-2.3. Hidden service configuration
-
- A hidden service that is meant to perform client authorization adds a
- new option HiddenServiceAuthorizeClient to its hidden service
- configuration. This option contains the authorization type which is
- either "basic" for the protocol described in 2.1 or "stealth" for the
- protocol in 2.2 and a comma-separated list of human-readable client
- names, so that Tor can create authorization data for these clients:
-
- HiddenServiceAuthorizeClient auth-type client-name,client-name,...
-
- If this option is configured, HiddenServiceVersion is automatically
- reconfigured to contain only version numbers of 2 or higher. There is
- a maximum of 512 client names for basic auth and a maximum of 16 for
- stealth auth.
-
- Tor stores all generated authorization data for the authorization
- protocols described in Sections 2.1 and 2.2 in a new file using the
- following file format:
-
- "client-name" human-readable client identifier NL
- "descriptor-cookie" 128-bit key ^= 22 base64 chars NL
-
- If the authorization protocol of Section 2.2 is used, Tor also generates
- and stores the following data:
-
- "client-key" NL a public key in PEM format
-
-2.4. Client configuration
-
- Clients need to make their authorization data known to Tor using another
- configuration option that contains a service name (mainly for the sake of
- convenience), the service address, and the descriptor cookie that is
- required to access a hidden service (the authorization protocol number is
- encoded in the descriptor cookie):
-
- HidServAuth service-name service-address descriptor-cookie
-
-3. Hidden service directory operation
-
- This section has been introduced with the v2 hidden service descriptor
- format. It describes all operations of the v2 hidden service descriptor
- fetching and propagation mechanism that are required for the protocol
- described in section 1 to succeed with v2 hidden service descriptors.
-
-3.1. Configuring as hidden service directory
-
- Every onion router that has its directory port open can decide whether it
- wants to store and serve hidden service descriptors. An onion router which
- is configured as such includes the "hidden-service-dir" flag in its router
- descriptors that it sends to directory authorities.
-
- The directory authorities include a new flag "HSDir" for routers that
- decided to provide storage for hidden service descriptors and that
- have been running for at least 24 hours.
-
-3.2. Accepting publish requests
-
- Hidden service directory nodes accept publish requests for v2 hidden service
- descriptors and store them in their local memory. (It is not necessary to
- make descriptors persistent, because after restarting, the onion router
- would not be accepted as a storing node anyway, because it has not been
- running for at least 24 hours.) All requests and replies are formatted as
- HTTP messages. Requests are initiated via BEGIN_DIR cells directed to
- the router's directory port, and formatted as HTTP POST requests to the URL
- "/tor/rendezvous2/publish" relative to the hidden service directory's root,
- containing as its body a v2 service descriptor.
-
- A hidden service directory node parses every received descriptor and only
- stores it when it thinks that it is responsible for storing that descriptor
- based on its own routing table. See section 1.4 for more information on how
- to determine responsibility for a certain descriptor ID.
-
-3.3. Processing fetch requests
-
- Hidden service directory nodes process fetch requests for hidden service
- descriptors by looking them up in their local memory. (They do not need to
- determine if they are responsible for the passed ID, because it does no harm
- if they deliver a descriptor for which they are not (any more) responsible.)
- All requests and replies are formatted as HTTP messages. Requests are
- initiated via BEGIN_DIR cells directed to the router's directory port,
- and formatted as HTTP GET requests for the document "/tor/rendezvous2/<z>",
- where z is replaced with the encoding of the descriptor ID.
-
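Editorial sketch (not part of the removed rend-spec.txt): the padding arithmetic in section 2.1 above can be checked with a few lines of Python. The function name `basic_auth_layout` and the returned field names are illustrative assumptions; only the ALEN and RND formulas and the 20-octet entry size (4-octet ID plus 16-octet SKEY) are taken from the text.

```python
ENTRY_LEN = 4 + 16  # ID (4 octets) + SKEY (16 octets), per section 2.1


def basic_auth_layout(clients: int) -> dict:
    """Return ALEN, the RND length, and the padded client-part length in octets.

    Hypothetical helper: the two formulas mirror the layout listed in
    section 2.1; nothing here is taken from an actual Tor implementation.
    """
    if clients < 1:
        raise ValueError("at least one client is required")
    alen = 1 + ((clients - 1) // 16)                    # count of 16-entry blocks
    rnd_len = (15 - ((clients - 1) % 16)) * ENTRY_LEN   # octets of fake entries
    total = clients * ENTRY_LEN + rnd_len               # real entries plus padding
    return {"alen": alen, "rnd_len": rnd_len, "total": total}


if __name__ == "__main__":
    for n in (1, 16, 17):
        print(n, basic_auth_layout(n))
```

Under these formulas the client-specific part always occupies a whole number of 16-entry blocks (ALEN * 16 * 20 octets), which is what lets the descriptor conceal the exact number of authorized clients.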
diff --git a/doc/spec/socks-extensions.txt b/doc/spec/socks-extensions.txt
deleted file mode 100644
index 62d86acd9..000000000
--- a/doc/spec/socks-extensions.txt
+++ /dev/null
@@ -1,78 +0,0 @@
-Tor's extensions to the SOCKS protocol
-
-1. Overview
-
- The SOCKS protocol provides a generic interface for TCP proxies. Client
- software connects to a SOCKS server via TCP, and requests a TCP connection
- to another address and port. The SOCKS server establishes the connection,
- and reports success or failure to the client. After the connection has
- been established, the client application uses the TCP stream as usual.
-
- Tor supports SOCKS4 as defined in [1], SOCKS4A as defined in [2], and
- SOCKS5 as defined in [3].
-
- The stickiest issue for Tor in supporting clients, in practice, is forcing
- DNS lookups to occur at the OR side: if clients do their own DNS lookup,
- the DNS server can learn which addresses the client wants to reach.
- SOCKS4 supports addressing by IPv4 address; SOCKS4A is a kludge on top of
- SOCKS4 to allow addressing by hostname; SOCKS5 supports IPv4, IPv6, and
- hostnames.
-
-1.1. Extent of support
-
- Tor supports the SOCKS4, SOCKS4A, and SOCKS5 standards, except as follows:
-
- BOTH:
- - The BIND command is not supported.
-
- SOCKS4,4A:
- - SOCKS4 usernames are ignored.
-
- SOCKS5:
- - The (SOCKS5) "UDP ASSOCIATE" command is not supported.
- - IPv6 is not supported in CONNECT commands.
- - Only the "NO AUTHENTICATION" (SOCKS5) authentication method [00] is
- supported.
-
-2. Name lookup
-
- As an extension to SOCKS4A and SOCKS5, Tor implements a new command value,
- "RESOLVE" [F0]. When Tor receives a "RESOLVE" SOCKS command, it initiates
- a remote lookup of the hostname provided as the target address in the SOCKS
- request. The reply is either an error (if the address couldn't be
- resolved) or a success response. In the case of success, the address is
- stored in the portion of the SOCKS response reserved for remote IP address.
-
- (We support RESOLVE in SOCKS4 too, even though it is unnecessary.)
-
- For SOCKS5 only, we support reverse resolution with a new command value,
- "RESOLVE_PTR" [F1]. In response to a "RESOLVE_PTR" SOCKS5 command with
- an IPv4 address as its target, Tor attempts to find the canonical
- hostname for that IPv4 record, and returns it in the "server bound
- address" portion of the reply.
- (This command was not supported before Tor 0.1.2.2-alpha.)
-
-3. Other command extensions.
-
- Tor 0.1.2.4-alpha added a new command value: "CONNECT_DIR" [F2].
- In this case, Tor will open an encrypted direct TCP connection to the
- directory port of the Tor server specified by address:port (the port
- specified should be the ORPort of the server). It uses a one-hop tunnel
- and a "BEGIN_DIR" relay cell to accomplish this secure connection.
-
- The F2 command value was removed in Tor 0.2.0.10-alpha in favor of a
- new use_begindir flag in edge_connection_t.
-
-4. HTTP-resistance
-
- Tor checks the first byte of each SOCKS request to see whether it looks
- more like an HTTP request (that is, it starts with a "G", "H", or "P"). If
- so, Tor returns a small webpage, telling the user that his/her browser is
- misconfigured. This is helpful for the many users who mistakenly try to
- use Tor as an HTTP proxy instead of a SOCKS proxy.
-
-References:
- [1] http://archive.socks.permeo.com/protocol/socks4.protocol
- [2] http://archive.socks.permeo.com/protocol/socks4a.protocol
- [3] SOCKS5: RFC1928
-
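Editorial sketch (not part of the removed socks-extensions.txt): section 2 above describes the "RESOLVE" [F0] command as an ordinary SOCKS5 request with a Tor-specific command value. The snippet below shows how such a request could be laid out on the wire, assuming the standard RFC 1928 request framing (version, command, reserved byte, address type, address, port). The function name `build_resolve_request` and the choice of 0 for the port field are assumptions; the text does not say what a client should place there.

```python
import struct

SOCKS5_VERSION = 0x05
CMD_RESOLVE = 0xF0       # Tor extension described in section 2
ATYP_DOMAINNAME = 0x03   # RFC 1928 address type for hostnames


def build_resolve_request(hostname: str) -> bytes:
    """Build a SOCKS5 request message carrying Tor's RESOLVE command.

    Assumption: the port field is set to 0, because a name lookup has no
    meaningful destination port.
    """
    name = hostname.encode("ascii")
    if not 1 <= len(name) <= 255:
        raise ValueError("hostname must be 1..255 bytes")
    # VER, CMD, RSV, ATYP, then a one-byte length prefix for the hostname.
    header = bytes([SOCKS5_VERSION, CMD_RESOLVE, 0x00, ATYP_DOMAINNAME, len(name)])
    return header + name + struct.pack("!H", 0)
```

This covers only the request message; a real client would first complete the SOCKS5 method negotiation, offering the "NO AUTHENTICATION" method [00] that section 1.1 names as the only supported one.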
diff --git a/doc/spec/tor-fw-helper-spec.txt b/doc/spec/tor-fw-helper-spec.txt
deleted file mode 100644
index 0068b2655..000000000
--- a/doc/spec/tor-fw-helper-spec.txt
+++ /dev/null
@@ -1,57 +0,0 @@
-
- Tor's (little) Firewall Helper specification
- Jacob Appelbaum
-
-0. Preface
-
- This document describes issues faced by Tor users who are behind NAT devices
- and wish to share their resources with the rest of the Tor network. It also
- explains a possible solution for some NAT devices.
-
-1. Overview
-
- Tor users often wish to relay traffic for the Tor network and their upstream
- firewall thwarts their attempted generosity. Automatic port forwarding
- configuration for many consumer NAT devices is often available via two common
- protocols: NAT-PMP[0] and UPnP[1].
-
-2. Implementation
-
- tor-fw-helper is a program that implements basic port forwarding requests; it
- may be used alone or called from Tor itself.
-
-2.1 Output format
-
- When tor-fw-helper has completed the requested action successfully, it will
- report the following message to standard output:
-
- tor-fw-helper: SUCCESS
-
- If tor-fw-helper was unable to complete the requested action successfully, it
- will report the following message to standard error:
-
- tor-fw-helper: FAILURE
-
- All informational messages are printed to standard output; all error messages
- are printed to standard error. Messages other than SUCCESS and FAILURE
- may be printed by any compliant tor-fw-helper.
-
-2.2 Output format stability
-
- The above SUCCESS and FAILURE messages are the only stable output formats
- provided by this specification. tor-fw-helper-spec compliant implementations
- must return SUCCESS or FAILURE as defined above.
-
-3. Security Concerns
-
- It is probably best to configure port forwarding by hand; in the process, we
- suggest disabling NAT-PMP and/or UPnP. This is of course absolutely confusing
- to users and so we support automatic, non-authenticated NAT port mapping
- protocols with compliant tor-fw-helper applications.
-
- NAT should not be considered a security boundary. NAT-PMP and UPnP are hacks
- to deal with the shortcomings of user education about TCP/IP, IPv4 shortages,
- and of course, NAT devices that suffer from horrible user interface design.
-
-[0] http://en.wikipedia.org/wiki/NAT_Port_Mapping_Protocol
-[1] http://en.wikipedia.org/wiki/Universal_Plug_and_Play
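Editorial sketch (not part of the removed tor-fw-helper-spec.txt): sections 2.1 and 2.2 above define exactly two stable output lines, SUCCESS on standard output and FAILURE on standard error. A caller could classify a helper run roughly as follows. The function name `helper_result` and the three-valued return are assumptions made for illustration; the spec does not say how callers should treat a run that prints neither line.

```python
from typing import Optional

SUCCESS_LINE = "tor-fw-helper: SUCCESS"
FAILURE_LINE = "tor-fw-helper: FAILURE"


def helper_result(stdout_text: str, stderr_text: str) -> Optional[bool]:
    """Classify captured helper output.

    Returns True on SUCCESS, False on FAILURE, and None when neither stable
    message appears (an assumption; the spec leaves this case open).
    """
    out_lines = [line.strip() for line in stdout_text.splitlines()]
    err_lines = [line.strip() for line in stderr_text.splitlines()]
    # Other informational or error messages are allowed, so match whole lines
    # and ignore everything else.
    if FAILURE_LINE in err_lines:
        return False
    if SUCCESS_LINE in out_lines:
        return True
    return None
```

Matching whole lines rather than substrings keeps the check tolerant of the extra messages that a compliant helper is allowed to print.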
diff --git a/doc/spec/tor-spec.txt b/doc/spec/tor-spec.txt
deleted file mode 100644
index 91ad561b8..000000000
--- a/doc/spec/tor-spec.txt
+++ /dev/null
@@ -1,1004 +0,0 @@
-
- Tor Protocol Specification
-
- Roger Dingledine
- Nick Mathewson
-
-Note: This document aims to specify Tor as implemented in 0.2.1.x. Future
-versions of Tor may implement improved protocols, and compatibility is not
-guaranteed. Compatibility notes are given for versions 0.1.1.15-rc and
-later; earlier versions are not compatible with the Tor network as of this
-writing.
-
-This specification is not a design document; most design criteria
-are not examined. For more information on why Tor acts as it does,
-see tor-design.pdf.
-
-0. Preliminaries
-
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
- NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
- "OPTIONAL" in this document are to be interpreted as described in
- RFC 2119.
-
-0.1. Notation and encoding
-
- PK -- a public key.
- SK -- a private key.
- K -- a key for a symmetric cipher.
-
- a|b -- concatenation of 'a' and 'b'.
-
- [A0 B1 C2] -- a three-byte sequence, containing the bytes with
- hexadecimal values A0, B1, and C2, in that order.
-
- All numeric values are encoded in network (big-endian) order.
-
- H(m) -- a cryptographic hash of m.
-
-0.2. Security parameters
-
- Tor uses a stream cipher, a public-key cipher, the Diffie-Hellman
- protocol, and a hash function.
-
- KEY_LEN -- the length of the stream cipher's key, in bytes.
-
- PK_ENC_LEN -- the length of a public-key encrypted message, in bytes.
- PK_PAD_LEN -- the number of bytes added in padding for public-key
- encryption, in bytes. (The largest number of bytes that can be encrypted
- in a single public-key operation is therefore PK_ENC_LEN-PK_PAD_LEN.)
-
- DH_LEN -- the number of bytes used to represent a member of the
- Diffie-Hellman group.
- DH_SEC_LEN -- the number of bytes used in a Diffie-Hellman private key (x).
-
- HASH_LEN -- the length of the hash function's output, in bytes.
-
- PAYLOAD_LEN -- The longest allowable cell payload, in bytes. (509)
-
- CELL_LEN -- The length of a Tor cell, in bytes.
-
-0.3. Ciphers
-
- For a stream cipher, we use 128-bit AES in counter mode, with an IV of all
- 0 bytes.
-
- For a public-key cipher, we use RSA with 1024-bit keys and a fixed
- exponent of 65537. We use OAEP-MGF1 padding, with SHA-1 as its digest
- function. We leave the optional "Label" parameter unset. (For OAEP
- padding, see ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-1/pkcs-1v2-1.pdf)
-
- For Diffie-Hellman, we use a generator (g) of 2. For the modulus (p), we
- use the 1024-bit safe prime from rfc2409 section 6.2 whose hex
- representation is:
-
- "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
- "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
- "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
- "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
- "49286651ECE65381FFFFFFFFFFFFFFFF"
-
- As an optimization, implementations SHOULD choose DH private keys (x) of
- 320 bits. Implementations that do this MUST NOT use any DH key more
- than once.
- [May other implementations reuse their DH keys?? -RD]
- [Probably not. Conceivably, you could get away with changing DH keys once
- per second, but there are too many oddball attacks for me to be
- comfortable that this is safe. -NM]
-
- For a hash function, we use SHA-1.
-
- KEY_LEN=16.
- DH_LEN=128; DH_SEC_LEN=40.
- PK_ENC_LEN=128; PK_PAD_LEN=42.
- HASH_LEN=20.
-
- When we refer to "the hash of a public key", we mean the SHA-1 hash of the
- DER encoding of an ASN.1 RSA public key (as specified in PKCS.1).
-
- All "random" values should be generated with a cryptographically strong
- random number generator, unless otherwise noted.
-
- The "hybrid encryption" of a byte sequence M with a public key PK is
- computed as follows:
- 1. If M is shorter than PK_ENC_LEN-PK_PAD_LEN bytes, pad and encrypt M with PK.
- 2. Otherwise, generate a KEY_LEN byte random key K.
- Let M1 = the first PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes of M,
- and let M2 = the rest of M.
- Pad and encrypt K|M1 with PK. Encrypt M2 with our stream cipher,
- using the key K. Concatenate these encrypted values.
- [XXX Note that this "hybrid encryption" approach does not prevent
- an attacker from adding or removing bytes to the end of M. It also
- allows attackers to modify the bytes not covered by the OAEP --
- see Goldberg's PET2006 paper for details. We will add a MAC to this
- scheme one day. -RD]
-
-0.4. Other parameter values
-
- CELL_LEN=512
-
-1. System overview
-
- Tor is a distributed overlay network designed to anonymize
- low-latency TCP-based applications such as web browsing, secure shell,
- and instant messaging. Clients choose a path through the network and
- build a ``circuit'', in which each node (or ``onion router'' or ``OR'')
- in the path knows its predecessor and successor, but no other nodes in
- the circuit. Traffic flowing down the circuit is sent in fixed-size
- ``cells'', which are unwrapped by a symmetric key at each node (like
- the layers of an onion) and relayed downstream.
-
-1.1. Keys and names
-
- Every Tor server has multiple public/private keypairs:
-
- - A long-term signing-only "Identity key" used to sign documents and
- certificates, and used to establish server identity.
- - A medium-term "Onion key" used to decrypt onion skins when accepting
- circuit extend attempts. (See 5.1.) Old keys MUST be accepted for at
- least one week after they are no longer advertised. Because of this,
- servers MUST retain old keys for a while after they're rotated.
- - A short-term "Connection key" used to negotiate TLS connections.
- Tor implementations MAY rotate this key as often as they like, and
- SHOULD rotate this key at least once a day.
-
- Tor servers are also identified by "nicknames"; these are specified in
- dir-spec.txt.
-
-2. Connections
-
- Connections between two Tor servers, or between a client and a server,
- use TLS/SSLv3 for link authentication and encryption. All
- implementations MUST support the SSLv3 ciphersuite
- "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA", and SHOULD support the TLS
- ciphersuite "TLS_DHE_RSA_WITH_AES_128_CBC_SHA" if it is available.
-
- There are three acceptable ways to perform a TLS handshake when
- connecting to a Tor server: "certificates up-front", "renegotiation", and
- "backwards-compatible renegotiation". ("Backwards-compatible
- renegotiation" is, as the name implies, compatible with both other
- handshake types.)
-
- Before Tor 0.2.0.21, only "certificates up-front" was supported. In Tor
- 0.2.0.21 or later, "backwards-compatible renegotiation" is used.
-
- In "certificates up-front", the connection initiator always sends a
- two-certificate chain, consisting of an X.509 certificate using a
- short-term connection public key and a second, self-signed X.509
- certificate containing its identity key. The other party sends a similar
- certificate chain. The initiator's ClientHello MUST NOT include any
- ciphersuites other than:
- TLS_DHE_RSA_WITH_AES_256_CBC_SHA
- TLS_DHE_RSA_WITH_AES_128_CBC_SHA
- SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
- SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA
-
- In "renegotiation", the connection initiator sends no certificates, and
- the responder sends a single connection certificate. Once the TLS
- handshake is complete, the initiator renegotiates the handshake, with each
- party sending a two-certificate chain as in "certificates up-front".
- The initiator's ClientHello MUST include at least one ciphersuite not in
- the list above. The responder SHOULD NOT select any ciphersuite besides
- those in the list above.
- [The above "should not" is because some of the ciphers that
- clients list may be fake.]
-
- In "backwards-compatible renegotiation", the connection initiator's
- ClientHello MUST include at least one ciphersuite other than those listed
- above. The connection responder examines the initiator's ciphersuite list
- to see whether it includes any ciphers other than those included in the
- list above. If extra ciphers are included, the responder proceeds as in
- "renegotiation": it sends a single certificate and does not request
- client certificates. Otherwise (in the case that no extra ciphersuites
- are included in the ClientHello) the responder proceeds as in
- "certificates up-front": it requests client certificates, and sends a
- two-certificate chain. In either case, once the responder has sent its
- certificate or certificates, the initiator counts them. If two
- certificates have been sent, it proceeds as in "certificates up-front";
- otherwise, it proceeds as in "renegotiation".
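The decision made by each side in "backwards-compatible renegotiation" can be sketched as follows. This is a minimal illustration, not Tor's implementation: the function names are hypothetical, and ciphersuites are represented by their names as strings rather than by their TLS code points.

```python
# Ciphersuites that a "certificates up-front" ClientHello may contain
# (copied from the list earlier in this section).
V1_CIPHERSUITES = frozenset({
    "TLS_DHE_RSA_WITH_AES_256_CBC_SHA",
    "TLS_DHE_RSA_WITH_AES_128_CBC_SHA",
    "SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA",
    "SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA",
})


def responder_handshake_mode(client_ciphersuites):
    """Pick the responder's behaviour from the initiator's ciphersuite list."""
    # Any ciphersuite outside the v1 list signals a renegotiation-capable peer.
    if any(name not in V1_CIPHERSUITES for name in client_ciphersuites):
        return "renegotiation"
    return "certificates up-front"


def initiator_handshake_mode(responder_cert_count):
    """Pick the initiator's behaviour by counting the responder's certificates."""
    if responder_cert_count == 2:
        return "certificates up-front"
    return "renegotiation"
```

The responder only inspects the ClientHello and the initiator only counts certificates, so neither side needs to know the other's Tor version in advance.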
-
- All new implementations of the Tor server protocol MUST support
- "backwards-compatible renegotiation"; clients SHOULD do this too. If
- this is not possible, new client implementations MUST support both
- "renegotiation" and "certificates up-front" and use the router's
- published link protocols list (see dir-spec.txt on the "protocols" entry)
- to decide which to use.
-
- In all of the above handshake variants, certificates sent in the clear
- SHOULD NOT include any strings to identify the host as a Tor server. In
- the "renegotiation" and "backwards-compatible renegotiation" steps, the
- initiator SHOULD choose a list of ciphersuites and TLS extensions
- to mimic one used by a popular web browser.
-
- Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys,
- or whose symmetric keys are less than KEY_LEN bits, or whose digests are
- less than HASH_LEN bits. Responders SHOULD NOT select any SSLv3
- ciphersuite other than those listed above.
-
- Even though the connection protocol is identical, we will think of the
- initiator as either an onion router (OR) if it is willing to relay
- traffic for other Tor users, or an onion proxy (OP) if it only handles
- local requests. Onion proxies SHOULD NOT provide long-term-trackable
- identifiers in their handshakes.
-
- In all handshake variants, once all certificates are exchanged, all
- parties receiving certificates must confirm that the identity key is as
- expected. (When initiating a connection, the expected identity key is
- the one given in the directory; when creating a connection because of an
- EXTEND cell, the expected identity key is the one given in the cell.) If
- the key is not as expected, the party must close the connection.
-
- When connecting to an OR, all parties SHOULD reject the connection if that
- OR has a malformed or missing certificate. When accepting an incoming
- connection, an OR SHOULD NOT reject incoming connections from parties with
- malformed or missing certificates. (However, an OR should not believe
- that an incoming connection is from another OR unless the certificates
- are present and well-formed.)
-
- [Before version 0.1.2.8-rc, ORs rejected incoming connections from ORs and
- OPs alike if their certificates were missing or malformed.]
-
- Once a TLS connection is established, the two sides send cells
- (specified below) to one another. Cells are sent serially. All
- cells are CELL_LEN bytes long. Cells may be sent embedded in TLS
- records of any size or divided across TLS records, but the framing
- of TLS records MUST NOT leak information about the type or contents
- of the cells.
-
- TLS connections are not permanent. Either side MAY close a connection
- if there are no circuits running over it and an amount of time
- (KeepalivePeriod, defaults to 5 minutes) has passed since the last time
- any traffic was transmitted over the TLS connection. Clients SHOULD
- also hold a TLS connection with no circuits open, if it is likely that a
- circuit will be built soon using that connection.
-
- (As an exception, directory servers may try to stay connected to all of
- the ORs -- though this will be phased out for the Tor 0.1.2.x release.)
-
- To avoid being trivially distinguished from servers, client-only Tor
- instances are encouraged but not required to use a two-certificate chain
- as well. Clients SHOULD NOT keep using the same certificates when
- their IP address changes. Clients MAY send no certificates at all.
-
-3. Cell Packet format
-
- The basic unit of communication for onion routers and onion
- proxies is a fixed-width "cell".
-
- On a version 1 connection, each cell contains the following
- fields:
-
- CircID [2 bytes]
- Command [1 byte]
- Payload (padded with 0 bytes) [PAYLOAD_LEN bytes]
-
- On a version 2 connection, all cells are as in version 1 connections,
- except for the initial VERSIONS cell, whose format is:
-
- Circuit [2 octets; set to 0]
- Command [1 octet; set to 7 for VERSIONS]
- Length [2 octets; big-endian integer]
- Payload [Length bytes]
-
- The CircID field determines which circuit, if any, the cell is
- associated with.
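As an illustration of the fixed-width layout, a version 1 cell could be packed and unpacked as below. This is a sketch under stated assumptions: CELL_LEN = 512 is assumed here (the constant is defined outside this section), numeric fields are assumed to be in network (big-endian) order, and the helper names are hypothetical.

```python
import struct

CELL_LEN = 512                  # assumed value; defined elsewhere in the spec
PAYLOAD_LEN = CELL_LEN - 3      # CircID (2 bytes) + Command (1 byte)


def pack_cell(circ_id, command, payload=b""):
    """Build one fixed-width cell, padding the payload with 0 bytes."""
    if len(payload) > PAYLOAD_LEN:
        raise ValueError("payload does not fit in one cell")
    header = struct.pack(">HB", circ_id, command)
    return header + payload.ljust(PAYLOAD_LEN, b"\x00")


def unpack_cell(cell):
    """Split a fixed-width cell into (CircID, Command, padded payload)."""
    if len(cell) != CELL_LEN:
        raise ValueError("cell has the wrong length")
    circ_id, command = struct.unpack(">HB", cell[:3])
    return circ_id, command, cell[3:]
```

The unpacked payload still contains its padding; how much of it is meaningful depends on the command.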
-
- The 'Command' field holds one of the following values:
- 0 -- PADDING (Padding) (See Sec 7.2)
- 1 -- CREATE (Create a circuit) (See Sec 5.1)
- 2 -- CREATED (Acknowledge create) (See Sec 5.1)
- 3 -- RELAY (End-to-end data) (See Sec 5.5 and 6)
- 4 -- DESTROY (Stop using a circuit) (See Sec 5.4)
- 5 -- CREATE_FAST (Create a circuit, no PK) (See Sec 5.1)
- 6 -- CREATED_FAST (Circuit created, no PK) (See Sec 5.1)
- 7 -- VERSIONS (Negotiate proto version) (See Sec 4)
- 8 -- NETINFO (Time and address info) (See Sec 4)
- 9 -- RELAY_EARLY (End-to-end data; limited) (See Sec 5.6)
-
- The interpretation of 'Payload' depends on the type of the cell.
- PADDING: Payload is unused.
- CREATE: Payload contains the handshake challenge.
- CREATED: Payload contains the handshake response.
- RELAY: Payload contains the relay header and relay body.
- DESTROY: Payload contains a reason for closing the circuit.
- (see 5.4)
- Upon receiving any other value for the command field, an OR must
- drop the cell. Since more cell types may be added in the future, ORs
- should generally not warn when encountering unrecognized commands.
-
- The payload is padded with 0 bytes.
-
- PADDING cells are currently used to implement connection keepalive.
- If there is no other traffic, ORs and OPs send one another a PADDING
- cell every few minutes.
-
- CREATE, CREATED, and DESTROY cells are used to manage circuits;
- see section 5 below.
-
- RELAY cells are used to send commands and data along a circuit; see
- section 6 below.
-
- VERSIONS and NETINFO cells are used to set up connections. See section 4
- below.
-
-4. Negotiating and initializing connections
-
-4.1. Negotiating versions with VERSIONS cells
-
- There are multiple instances of the Tor link connection protocol. Any
- connection negotiated using the "certificates up front" handshake (see
- section 2 above) is "version 1". In any connection where both parties
- have behaved as in the "renegotiation" handshake, the link protocol
- version is 2 or higher.
-
- To determine the version, in any connection where the "renegotiation"
- handshake was used (that is, where the server sent only one certificate
- at first and where the client did not send any certificates until
- renegotiation), both parties MUST send a VERSIONS cell immediately after
- the renegotiation is finished, before any other cells are sent. Parties
- MUST NOT send any other cells on a connection until they have received a
- VERSIONS cell.
-
- The payload in a VERSIONS cell is a series of big-endian two-byte
- integers. Both parties MUST select as the link protocol version the
- highest number contained both in the VERSIONS cell they sent and in the
- versions cell they received. If they have no such version in common,
- they cannot communicate and MUST close the connection.
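The negotiation rule above can be sketched as follows, using the VERSIONS cell layout from section 3. The function names are hypothetical, and returning None stands in for "close the connection".

```python
import struct

VERSIONS_COMMAND = 7


def pack_versions_cell(versions):
    """Build a VERSIONS cell: Circuit=0, Command=7, Length, then the versions."""
    payload = b"".join(struct.pack(">H", v) for v in versions)
    return struct.pack(">HBH", 0, VERSIONS_COMMAND, len(payload)) + payload


def parse_versions_payload(payload):
    """Decode a VERSIONS payload into a list of version numbers."""
    if len(payload) % 2 != 0:
        raise ValueError("VERSIONS payload must be a whole number of 2-byte integers")
    return [version for (version,) in struct.iter_unpack(">H", payload)]


def negotiate_link_version(sent, received):
    """Return the highest version listed by both sides, or None if there is none."""
    common = set(sent) & set(received)
    return max(common) if common else None
```

Both parties run the same computation on the same two lists, so they arrive at the same version without a further round trip.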
-
- Since the version 1 link protocol does not use the "renegotiation"
- handshake, implementations MUST NOT list version 1 in their VERSIONS
- cell.
-
-4.2. NETINFO cells
-
- If version 2 or higher is negotiated, each party sends the other a
- NETINFO cell. The cell's payload is:
-
- Timestamp [4 bytes]
- Other OR's address [variable]
- Number of addresses [1 byte]
- This OR's addresses [variable]
-
- The address format is a type/length/value sequence as given in section
- 6.4 below. The timestamp is a big-endian unsigned integer number of
- seconds since the Unix epoch.
-
- Implementations MAY use the timestamp value to help decide if their
- clocks are skewed. Initiators MAY use "other OR's address" to help
- learn which address their connections are originating from, if they do
- not know it. Initiators SHOULD use "this OR's address" to make sure
- that they have connected to another OR at its canonical address.
-
- [As of 0.2.0.23-rc, implementations use none of the above values.]
-
-
-5. Circuit management
-
-5.1. CREATE and CREATED cells
-
- Users set up circuits incrementally, one hop at a time. To create a
- new circuit, OPs send a CREATE cell to the first node, with the
- first half of the DH handshake; that node responds with a CREATED
- cell with the second half of the DH handshake plus the first 20 bytes
- of derivative key data (see section 5.2). To extend a circuit past
- the first hop, the OP sends an EXTEND relay cell (see section 5)
- which instructs the last node in the circuit to send a CREATE cell
- to extend the circuit.
-
- The payload for a CREATE cell is an 'onion skin', which consists
- of the first step of the DH handshake data (also known as g^x).
- This value is hybrid-encrypted (see 0.3) to Bob's onion key, giving
- an onion-skin of:
- PK-encrypted:
- Padding [PK_PAD_LEN bytes]
- Symmetric key [KEY_LEN bytes]
- First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes]
- Symmetrically encrypted:
- Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-KEY_LEN)
- bytes]
-
- The relay payload for an EXTEND relay cell consists of:
- Address [4 bytes]
- Port [2 bytes]
- Onion skin [DH_LEN+KEY_LEN+PK_PAD_LEN bytes]
- Identity fingerprint [HASH_LEN bytes]
-
- The port and address field denote the IPv4 address and port of the next
- onion router in the circuit; the public key hash is the hash of the PKCS#1
- ASN1 encoding of the next onion router's identity (signing) key. (See 0.3
- above.) Including this hash allows the extending OR to verify that it is
- indeed connected to the correct target OR, and prevents certain
- man-in-the-middle attacks.
-
- The payload for a CREATED cell, or the relay payload for an
- EXTENDED cell, contains:
- DH data (g^y) [DH_LEN bytes]
- Derivative key data (KH) [HASH_LEN bytes] <see 5.2 below>
-
- The CircID for a CREATE cell is an arbitrarily chosen 2-byte integer,
- selected by the node (OP or OR) that sends the CREATE cell. To prevent
- CircID collisions, when one node sends a CREATE cell to another, it chooses
- from only one half of the possible values based on the ORs' public
- identity keys: if the sending node has a lower key, it chooses a CircID with
- an MSB of 0; otherwise, it chooses a CircID with an MSB of 1.
-
- (An OP with no public key MAY choose any CircID it wishes, since an OP
- never needs to process a CREATE cell.)
-
- Public keys are compared numerically by modulus.
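The CircID rule can be illustrated with the sketch below. The function name is hypothetical, the identity keys are passed in as their integer moduli, and skipping CircID 0 is an assumption of this sketch (0 is the value carried by VERSIONS cells) rather than something this section requires.

```python
import secrets


def choose_circ_id(our_modulus, peer_modulus, in_use):
    """Pick an unused CircID from this node's half of the 16-bit space."""
    # The node with the numerically lower key uses MSB 0; the other uses MSB 1.
    high_bit = 0x0000 if our_modulus < peer_modulus else 0x8000
    free = [
        high_bit | low
        for low in range(0x8000)
        if (high_bit | low) != 0 and (high_bit | low) not in in_use
    ]
    if not free:
        raise RuntimeError("no free CircID on this connection")
    return secrets.choice(free)
```

Because the two ends of a connection draw from disjoint halves, they can both create circuits at the same time without choosing the same CircID.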
-
- As usual with DH, x and y MUST be generated randomly.
-
-5.1.1. CREATE_FAST/CREATED_FAST cells
-
- When initializing the first hop of a circuit, the OP has already
- established the OR's identity and negotiated a secret key using TLS.
- Because of this, it is not always necessary for the OP to perform the
- public key operations to create a circuit. In this case, the
- OP MAY send a CREATE_FAST cell instead of a CREATE cell for the first
- hop only. The OR responds with a CREATED_FAST cell, and the circuit is
- created.
-
- A CREATE_FAST cell contains:
-
- Key material (X) [HASH_LEN bytes]
-
- A CREATED_FAST cell contains:
-
- Key material (Y) [HASH_LEN bytes]
- Derivative key data [HASH_LEN bytes] (See 5.2 below)
-
- The values of X and Y must be generated randomly.
-
- If an OR sees a circuit created with CREATE_FAST, the OR is sure to be the
- first hop of a circuit. ORs SHOULD reject attempts to create streams with
- RELAY_BEGIN exiting the circuit at the first hop: letting Tor be used as a
- single hop proxy makes exit nodes a more attractive target for compromise.
-
-5.2. Setting circuit keys
-
- Once the handshake between the OP and an OR is completed, both can
- now calculate g^xy with ordinary DH. Before computing g^xy, both client
- and server MUST verify that the received g^x or g^y value is not degenerate;
- that is, it must be strictly greater than 1 and strictly less than p-1
- where p is the DH modulus. Implementations MUST NOT complete a handshake
- with degenerate keys. Implementations MUST NOT discard other "weak"
- g^x values.
-
- (Discarding degenerate keys is critical for security; if bad keys
- are not discarded, an attacker can substitute the server's CREATED
- cell's g^y with 0 or 1, thus creating a known g^xy and impersonating
- the server. Discarding other keys may allow attacks to learn bits of
- the private key.)
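The degeneracy test is a plain range check. A minimal sketch, with a hypothetical function name and the received value and modulus given as Python integers:

```python
def is_degenerate(public_value, p):
    """Return True if a received g^x or g^y must be rejected.

    Acceptable values are strictly greater than 1 and strictly less than p-1.
    """
    return not (1 < public_value < p - 1)
```

For example, with a toy modulus p = 23, the values 0, 1 and 22 are rejected while 2 and 21 are accepted.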
-
- If CREATE or EXTEND is used to extend a circuit, the client and server
- base their key material on K0=g^xy, represented as a big-endian unsigned
- integer.
-
- If CREATE_FAST is used, the client and server base their key material on
- K0=X|Y.
-
- From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes of
- derivative key data as
- K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ...
-
- The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward
- digest Df; the next HASH_LEN form the backward digest Db; the next
- KEY_LEN form Kf, and the final KEY_LEN form Kb. Excess bytes from K
- are discarded.
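The derivation and split can be sketched as below. This assumes H is SHA-1 with HASH_LEN = 20 and KEY_LEN = 16; those constants are defined outside this section, so treat them as assumptions. The names CircuitKeys and derive_circuit_keys are hypothetical.

```python
import hashlib
from collections import namedtuple

HASH_LEN = 20   # assumed: SHA-1 output size
KEY_LEN = 16    # assumed: 128-bit symmetric keys

CircuitKeys = namedtuple("CircuitKeys", "kh df db kf kb")


def derive_circuit_keys(k0):
    """Expand K0 into KH, Df, Db, Kf and Kb."""
    needed = KEY_LEN * 2 + HASH_LEN * 3
    k = b""
    counter = 0
    # K = H(K0 | [00]) | H(K0 | [01]) | H(K0 | [02]) | ...
    while len(k) < needed:
        k += hashlib.sha1(k0 + bytes([counter])).digest()
        counter += 1
    kh = k[:HASH_LEN]
    df = k[HASH_LEN:2 * HASH_LEN]
    db = k[2 * HASH_LEN:3 * HASH_LEN]
    kf = k[3 * HASH_LEN:3 * HASH_LEN + KEY_LEN]
    kb = k[3 * HASH_LEN + KEY_LEN:needed]   # bytes past `needed` are discarded
    return CircuitKeys(kh, df, db, kf, kb)
```

With these sizes, 92 bytes are needed, so five hash blocks are computed and the last 8 bytes are thrown away.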
-
- KH is used in the handshake response to demonstrate knowledge of the
- computed shared key. Df is used to seed the integrity-checking hash
- for the stream of data going from the OP to the OR, and Db seeds the
- integrity-checking hash for the data stream from the OR to the OP. Kf
- is used to encrypt the stream of data going from the OP to the OR, and
- Kb is used to encrypt the stream of data going from the OR to the OP.
-
-5.3. Creating circuits
-
- When creating a circuit through the network, the circuit creator
- (OP) performs the following steps:
-
- 1. Choose an onion router as an exit node (R_N), such that the onion
- router's exit policy includes at least one pending stream that
- needs a circuit (if there are any).
-
- 2. Choose a chain of (N-1) onion routers
- (R_1...R_N-1) to constitute the path, such that no router
- appears in the path twice.
-
- 3. If not already connected to the first router in the chain,
- open a new connection to that router.
-
- 4. Choose a circID not already in use on the connection with the
- first router in the chain; send a CREATE cell along the
- connection, to be received by the first onion router.
-
- 5. Wait until a CREATED cell is received; finish the handshake
- and extract the forward key Kf_1 and the backward key Kb_1.
-
- 6. For each subsequent onion router R (R_2 through R_N), extend
- the circuit to R.
-
- To extend the circuit by a single onion router R_M, the OP performs
- these steps:
-
- 1. Create an onion skin, encrypted to R_M's public onion key.
-
- 2. Send the onion skin in a relay EXTEND cell along
- the circuit (see section 5).
-
- 3. When a relay EXTENDED cell is received, verify KH, and
- calculate the shared keys. The circuit is now extended.
-
- When an onion router receives an EXTEND relay cell, it sends a CREATE
- cell to the next onion router, with the enclosed onion skin as its
- payload. As special cases, if the extend cell includes a digest of
- all zeroes, or asks to extend back to the relay that sent the extend
- cell, the circuit will fail and be torn down. The initiating onion
- router chooses some circID not yet used on the connection between the
- two onion routers. (But see section 5.1 above, concerning choosing
- circIDs based on the numeric order of the ORs' public identity keys.)
-
- When an onion router receives a CREATE cell, if it already has a
- circuit on the given connection with the given circID, it drops the
- cell. Otherwise, after receiving the CREATE cell, it completes the
- DH handshake, and replies with a CREATED cell. Upon receiving a
- CREATED cell, an onion router packs its payload into an EXTENDED relay
- cell (see section 5), and sends that cell up the circuit. Upon
- receiving the EXTENDED relay cell, the OP can retrieve g^y.
-
- (As an optimization, OR implementations may delay processing onions
- until a break in traffic allows time to do so without harming
- network latency too greatly.)
-
-5.3.1. Canonical connections
-
- It is possible for an attacker to launch a man-in-the-middle attack
- against a connection by telling OR Alice to extend to OR Bob at some
- address X controlled by the attacker. The attacker cannot read the
- encrypted traffic, but the attacker is now in a position to count all
- bytes sent between Alice and Bob (assuming Alice was not already
- connected to Bob.)
-
- To prevent this, when an OR gets an extend request, it SHOULD use an
- existing OR connection if the ID matches, and ANY of the following
- conditions hold:
- - The IP matches the requested IP.
- - The OR knows that the IP of the connection it's using is canonical
- because it was listed in the NETINFO cell.
- - The OR knows that the IP of the connection it's using is canonical
- because it was listed in the server descriptor.
-
- [This is not implemented in Tor 0.2.0.23-rc.]
-
-5.4. Tearing down circuits
-
- Circuits are torn down when an unrecoverable error occurs along
- the circuit, or when all streams on a circuit are closed and the
- circuit's intended lifetime is over. Circuits may be torn down
- either completely or hop-by-hop.
-
- To tear down a circuit completely, an OR or OP sends a DESTROY
- cell to the adjacent nodes on that circuit, using the appropriate
- direction's circID.
-
- Upon receiving an outgoing DESTROY cell, an OR frees resources
- associated with the corresponding circuit. If it's not the end of
- the circuit, it sends a DESTROY cell for that circuit to the next OR
- in the circuit. If the node is the end of the circuit, then it tears
- down any associated edge connections (see section 6.1).
-
- After a DESTROY cell has been processed, an OR ignores all data or
- destroy cells for the corresponding circuit.
-
- To tear down part of a circuit, the OP may send a RELAY_TRUNCATE cell
- signaling a given OR (Stream ID zero). That OR sends a DESTROY
- cell to the next node in the circuit, and replies to the OP with a
- RELAY_TRUNCATED cell.
-
- [Note: If an OR receives a TRUNCATE cell and it has any RELAY cells
- still queued on the circuit for the next node it will drop them
- without sending them. This is not considered conformant behavior,
- but it probably won't get fixed until a later version of Tor. Thus,
- clients SHOULD NOT send a TRUNCATE cell to a node running any current
- version of Tor if a) they have sent relay cells through that node,
- and b) they aren't sure whether those cells have been sent on yet.]
-
- When an unrecoverable error occurs along one connection in a
- circuit, the nodes on either side of the connection should, if they
- are able, act as follows: the node closer to the OP should send a
- RELAY_TRUNCATED cell towards the OP; the node farther from the OP
- should send a DESTROY cell down the circuit.
-
- The payload of a RELAY_TRUNCATED or DESTROY cell contains a single octet,
- describing why the circuit is being closed or truncated. When sending a
- TRUNCATED or DESTROY cell because of another TRUNCATED or DESTROY cell,
- the error code should be propagated. The origin of a circuit always sets
- this error code to 0, to avoid leaking its version.
-
- The error codes are:
- 0 -- NONE (No reason given.)
- 1 -- PROTOCOL (Tor protocol violation.)
- 2 -- INTERNAL (Internal error.)
- 3 -- REQUESTED (A client sent a TRUNCATE command.)
- 4 -- HIBERNATING (Not currently operating; trying to save bandwidth.)
- 5 -- RESOURCELIMIT (Out of memory, sockets, or circuit IDs.)
- 6 -- CONNECTFAILED (Unable to reach server.)
- 7 -- OR_IDENTITY (Connected to server, but its OR identity was not
- as expected.)
- 8 -- OR_CONN_CLOSED (The OR connection that was carrying this circuit
- died.)
- 9 -- FINISHED (The circuit has expired for being dirty or old.)
- 10 -- TIMEOUT (Circuit construction took too long)
- 11 -- DESTROYED (The circuit was destroyed w/o client TRUNCATE)
- 12 -- NOSUCHSERVICE (Request for unknown hidden service)
-
-5.5. Routing relay cells
-
- When an OR receives a RELAY or RELAY_EARLY cell, it checks the cell's
- circID and determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the cell.
-
- Otherwise, if the OR is not at the OP edge of the circuit (that is,
- either an 'exit node' or a non-edge node), it de/encrypts the payload
- with the stream cipher, as follows:
- 'Forward' relay cell (same direction as CREATE):
- Use Kf as key; decrypt.
- 'Back' relay cell (opposite direction from CREATE):
- Use Kb as key; encrypt.
- Note that in counter mode, decrypt and encrypt are the same operation.
-
- The OR then decides whether it recognizes the relay cell, by
- inspecting the payload as described in section 6.1 below. If the OR
- recognizes the cell, it processes the contents of the relay cell.
- Otherwise, it passes the decrypted relay cell along the circuit if
- the circuit continues. If the OR at the end of the circuit
- encounters an unrecognized relay cell, an error has occurred: the OR
- sends a DESTROY cell to tear down the circuit.
-
- When a relay cell arrives at an OP, the OP decrypts the payload
- with the stream cipher as follows:
- OP receives data cell:
- For I=N...1,
- Decrypt with Kb_I. If the payload is recognized (see
- section 6.1), then stop and process the payload.
-
- For more information, see section 6 below.
-
-5.6. Handling relay_early cells
-
- A RELAY_EARLY cell is designed to limit the length any circuit can reach.
- When an OR receives a RELAY_EARLY cell, and the next node in the circuit
- is speaking v2 of the link protocol or later, the OR relays the cell as a
- RELAY_EARLY cell. Otherwise, it relays it as a RELAY cell.
-
- If a node ever receives more than 8 RELAY_EARLY cells on a given
- outbound circuit, it SHOULD close the circuit. (For historical reasons,
- we don't limit the number of inbound RELAY_EARLY cells; they should
- be harmless anyway because clients won't accept extend requests. See
- bug 1038.)
-
- When speaking v2 of the link protocol or later, clients MUST only send
- EXTEND cells inside RELAY_EARLY cells. Clients SHOULD send the first ~8
- RELAY cells that are not targeted at the first hop of any circuit as
- RELAY_EARLY cells too, in order to partially conceal the circuit length.
-
- [In a future version of Tor, servers will reject any EXTEND cell not
- received in a RELAY_EARLY cell. See proposal 110.]
-
-6. Application connections and stream management
-
-6.1. Relay cells
-
- Within a circuit, the OP and the exit node use the contents of
- RELAY packets to tunnel end-to-end commands and TCP connections
- ("Streams") across circuits. End-to-end commands can be initiated
- by either edge; streams are initiated by the OP.
-
- The payload of each unencrypted RELAY cell consists of:
- Relay command [1 byte]
- 'Recognized' [2 bytes]
- StreamID [2 bytes]
- Digest [4 bytes]
- Length [2 bytes]
- Data [CELL_LEN-14 bytes]
-
- The relay commands are:
- 1 -- RELAY_BEGIN [forward]
- 2 -- RELAY_DATA [forward or backward]
- 3 -- RELAY_END [forward or backward]
- 4 -- RELAY_CONNECTED [backward]
- 5 -- RELAY_SENDME [forward or backward] [sometimes control]
- 6 -- RELAY_EXTEND [forward] [control]
- 7 -- RELAY_EXTENDED [backward] [control]
- 8 -- RELAY_TRUNCATE [forward] [control]
- 9 -- RELAY_TRUNCATED [backward] [control]
- 10 -- RELAY_DROP [forward or backward] [control]
- 11 -- RELAY_RESOLVE [forward]
- 12 -- RELAY_RESOLVED [backward]
- 13 -- RELAY_BEGIN_DIR [forward]
-
- 32..40 -- Used for hidden services; see rend-spec.txt.
-
- Commands labelled as "forward" must only be sent by the originator
- of the circuit. Commands labelled as "backward" must only be sent by
- other nodes in the circuit back to the originator. Commands marked
- as either can be sent either by the originator or other nodes.
-
- The 'recognized' field in any unencrypted relay payload is always set
- to zero; the 'digest' field is computed as the first four bytes of
- the running digest of all the bytes that have been destined for
- this hop of the circuit or originated from this hop of the circuit,
- seeded from Df or Db respectively (obtained in section 5.2 above),
- and including this RELAY cell's entire payload (taken with the digest
- field set to zero).
-
- When the 'recognized' field of a RELAY cell is zero, and the digest
- is correct, the cell is considered "recognized" for the purposes of
- decryption (see section 5.5 above).
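The header layout and the "recognized" test can be sketched as follows, operating on an already-decrypted payload. Assumptions: the running digest is SHA-1 seeded with Df or Db, PAYLOAD_LEN is 509, and fields are big-endian; all three come from outside this section. The function names are hypothetical.

```python
import hashlib
import struct

PAYLOAD_LEN = 509                        # assumed; CELL_LEN (512) minus 3
RELAY_HEADER = struct.Struct(">BHH4sH")  # command, recognized, stream, digest, length


def pack_relay_payload(command, stream_id, data, running_digest):
    """Build a relay payload and stamp it with the sender's running digest."""
    body_len = PAYLOAD_LEN - RELAY_HEADER.size
    if len(data) > body_len:
        raise ValueError("data does not fit in one relay cell")
    zeroed = RELAY_HEADER.pack(command, 0, stream_id, b"\x00" * 4, len(data))
    zeroed += data.ljust(body_len, b"\x00")
    # The digest covers the whole payload with the digest field set to zero.
    running_digest.update(zeroed)
    return zeroed[:5] + running_digest.digest()[:4] + zeroed[9:]


def check_recognized(payload, running_digest):
    """Return (command, stream_id, data) if recognized, else None."""
    command, recognized, stream_id, digest, length = RELAY_HEADER.unpack_from(payload)
    if recognized != 0:
        return None
    zeroed = payload[:5] + b"\x00" * 4 + payload[9:]
    trial = running_digest.copy()        # do not disturb the real digest yet
    trial.update(zeroed)
    if trial.digest()[:4] != digest:
        return None
    if length > len(payload) - RELAY_HEADER.size:
        raise ValueError("relay length field exceeds the payload")
    running_digest.update(zeroed)        # commit only once the cell is ours
    body = payload[RELAY_HEADER.size:]
    return command, stream_id, body[:length]
```

Testing against a copy of the digest keeps the real running digest untouched when a cell turns out to be destined for a later hop.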
-
- (The digest does not include any bytes from relay cells that do
- not start or end at this hop of the circuit. That is, it does not
- include forwarded data. Therefore if 'recognized' is zero but the
- digest does not match, the running digest at that node should
- not be updated, and the cell should be forwarded on.)
-
- All RELAY cells pertaining to the same tunneled stream have the
- same stream ID. StreamIDs are chosen arbitrarily by the OP. RELAY
- cells that affect the entire circuit rather than a particular
- stream use a StreamID of zero -- they are marked in the table above
- as "[control]" style cells. (Sendme cells are marked as "sometimes
- control" because they can include a StreamID or not depending
- on their purpose -- see Section 7.)
-
- The 'Length' field of a relay cell contains the number of bytes in
- the relay payload which contain real payload data. The remainder of
- the payload is padded with NUL bytes.
-
- If the RELAY cell is recognized but the relay command is not
- understood, the cell must be dropped and ignored. Its contents
- still count with respect to the digests, though.
-
-6.2. Opening streams and transferring data
-
- To open a new anonymized TCP connection, the OP chooses an open
- circuit to an exit that may be able to connect to the destination
- address, selects an arbitrary StreamID not yet used on that circuit,
- and constructs a RELAY_BEGIN cell with a payload encoding the address
- and port of the destination host. The payload format is:
-
- ADDRESS | ':' | PORT | [00]
-
- where ADDRESS can be a DNS hostname, or an IPv4 address in
- dotted-quad format, or an IPv6 address surrounded by square brackets;
- and where PORT is a decimal integer between 1 and 65535, inclusive.
-
- [What is the [00] for? -NM]
- [It's so the payload is easy to parse out with string funcs -RD]
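Building and parsing the RELAY_BEGIN payload might look like the sketch below. The function names are hypothetical, and the address is passed through as text without validating that it is a well-formed hostname or IP literal.

```python
def build_begin_payload(address, port):
    """Encode ADDRESS | ':' | PORT | [00]."""
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return f"{address}:{port}".encode("ascii") + b"\x00"


def parse_begin_payload(payload):
    """Decode a RELAY_BEGIN payload into (address, port)."""
    text, terminator, _ = payload.partition(b"\x00")
    if not terminator:
        raise ValueError("missing NUL terminator")
    # Split on the last colon so bracketed IPv6 addresses stay intact.
    address, colon, port_text = text.decode("ascii").rpartition(":")
    if not colon or not port_text.isdigit():
        raise ValueError("malformed address:port")
    port = int(port_text)
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return address, port
```

For example, build_begin_payload("[::1]", 443) yields b"[::1]:443\x00", and parsing it returns ("[::1]", 443).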
-
- Upon receiving this cell, the exit node resolves the address as
- necessary, and opens a new TCP connection to the target port. If the
- address cannot be resolved, or a connection can't be established, the
- exit node replies with a RELAY_END cell. (See 6.4 below.)
- Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose
- payload is in one of the following formats:
- The IPv4 address to which the connection was made [4 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- or
- Four zero-valued octets [4 octets]
- An address type (6) [1 octet]
- The IPv6 address to which the connection was made [16 octets]
- A number of seconds (TTL) for which the address may be cached [4 octets]
- [XXXX No version of Tor currently generates the IPv6 format.]
-
- [Tor servers before 0.1.2.0 set the TTL field to a fixed value. Later
- versions set the TTL to the last value seen from a DNS server, and expire
- their own cached entries after a fixed interval. This prevents certain
- attacks.]
-
- The OP waits for a RELAY_CONNECTED cell before sending any data.
- Once a connection has been established, the OP and exit node
- package stream data in RELAY_DATA cells, and upon receiving such
- cells, echo their contents to the corresponding TCP stream.
- RELAY_DATA cells sent to unrecognized streams are dropped.
-
- RELAY_DROP cells are long-range dummies; upon receiving such
- a cell, the OR or OP must drop it.
-
-6.2.1. Opening a directory stream
-
- If a Tor server is a directory server, it should respond to a
- RELAY_BEGIN_DIR cell as if it had received a BEGIN cell requesting a
- connection to its directory port. RELAY_BEGIN_DIR cells ignore exit
- policy, since the stream is local to the Tor process.
-
- If the Tor server is not running a directory service, it should respond
- with a REASON_NOTDIRECTORY RELAY_END cell.
-
- Clients MUST generate an all-zero payload for RELAY_BEGIN_DIR cells,
- and servers MUST ignore the payload.
-
- [RELAY_BEGIN_DIR was not supported before Tor 0.1.2.2-alpha; clients
- SHOULD NOT send it to routers running earlier versions of Tor.]
-
-6.3. Closing streams
-
- When an anonymized TCP connection is closed, or an edge node
- encounters an error on any stream, it sends a 'RELAY_END' cell along the
- circuit (if possible) and closes the TCP connection immediately. If
- an edge node receives a 'RELAY_END' cell for any stream, it closes
- the TCP connection completely, and sends nothing more along the
- circuit for that stream.
-
- The payload of a RELAY_END cell begins with a single 'reason' byte to
- describe why the stream is closing, plus optional data (depending on
- the reason.) The values are:
-
- 1 -- REASON_MISC (catch-all for unlisted reasons)
- 2 -- REASON_RESOLVEFAILED (couldn't look up hostname)
- 3 -- REASON_CONNECTREFUSED (remote host refused connection) [*]
- 4 -- REASON_EXITPOLICY (OR refuses to connect to host or port)
- 5 -- REASON_DESTROY (Circuit is being destroyed)
- 6 -- REASON_DONE (Anonymized TCP connection was closed)
- 7 -- REASON_TIMEOUT (Connection timed out, or OR timed out
- while connecting)
- 8 -- REASON_NOROUTE (Routing error while attempting to
- contact destination)
- 9 -- REASON_HIBERNATING (OR is temporarily hibernating)
- 10 -- REASON_INTERNAL (Internal error at the OR)
- 11 -- REASON_RESOURCELIMIT (OR has no resources to fulfill request)
- 12 -- REASON_CONNRESET (Connection was unexpectedly reset)
- 13 -- REASON_TORPROTOCOL (Sent when closing connection because of
- Tor protocol violations.)
- 14 -- REASON_NOTDIRECTORY (Client sent RELAY_BEGIN_DIR to a
- non-directory server.)
-
- (With REASON_EXITPOLICY, the 4-byte IPv4 address or 16-byte IPv6 address
- forms the optional data, along with a 4-byte TTL; no other reason
- currently has extra data.)
-
- OPs and ORs MUST accept reasons not on the above list, since future
- versions of Tor may provide more fine-grained reasons.
-
- Tors SHOULD NOT send any reason except REASON_MISC for a stream that they
- have originated.
-
- [*] Older versions of Tor also send this reason when connections are
- reset.
-
- --- [The rest of this section describes unimplemented functionality.]
-
- Because TCP connections can be half-open, we follow an equivalent
- to TCP's FIN/FIN-ACK/ACK protocol to close streams.
-
- An exit connection can have a TCP stream in one of three states:
- 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the purposes
- of modeling transitions, we treat 'CLOSED' as a fourth state,
- although connections in this state are not, in fact, tracked by the
- onion router.
-
- A stream begins in the 'OPEN' state. Upon receiving a 'FIN' from
- the corresponding TCP connection, the edge node sends a 'RELAY_FIN'
- cell along the circuit and changes its state to 'DONE_PACKAGING'.
- Upon receiving a 'RELAY_FIN' cell, an edge node sends a 'FIN' to
- the corresponding TCP connection (e.g., by calling
- shutdown(SHUT_WR)) and changes its state to 'DONE_DELIVERING'.
-
- When a stream already in 'DONE_DELIVERING' receives a 'FIN', it
- also sends a 'RELAY_FIN' along the circuit, and changes its state
- to 'CLOSED'. When a stream already in 'DONE_PACKAGING' receives a
- 'RELAY_FIN' cell, it sends a 'FIN' and changes its state to
- 'CLOSED'.
-
- If an edge node encounters an error on any stream, it sends a
- 'RELAY_END' cell (if possible) and closes the stream immediately.
-
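[Editor's sketch, not part of the deleted file.] The half-close
transitions described above can be written as a small table-driven
state machine. This is only an illustration of the unimplemented
design: the names StreamState, TRANSITIONS, and step are hypothetical,
and the "actions" are plain strings rather than real network calls.

```python
from enum import Enum


class StreamState(Enum):
    OPEN = "OPEN"
    DONE_PACKAGING = "DONE_PACKAGING"
    DONE_DELIVERING = "DONE_DELIVERING"
    CLOSED = "CLOSED"  # modeled only; not tracked by a real onion router


# (current state, event) -> (action to take, next state).
# "FIN" is a FIN from the local TCP connection; "RELAY_FIN" is a
# RELAY_FIN cell arriving on the circuit.
TRANSITIONS = {
    (StreamState.OPEN, "FIN"): ("send RELAY_FIN", StreamState.DONE_PACKAGING),
    (StreamState.OPEN, "RELAY_FIN"): ("send FIN", StreamState.DONE_DELIVERING),
    (StreamState.DONE_DELIVERING, "FIN"): ("send RELAY_FIN", StreamState.CLOSED),
    (StreamState.DONE_PACKAGING, "RELAY_FIN"): ("send FIN", StreamState.CLOSED),
}


def step(state, event):
    """Return (action, next_state) for one event, or raise on an
    event the text above does not describe for this state."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition for {event!r} in {state.name}")
```

A stream that sees a local FIN and then a RELAY_FIN passes through
OPEN, DONE_PACKAGING, and CLOSED in that order.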
-6.4. Remote hostname lookup
-
- To find the address associated with a hostname, the OP sends a
- RELAY_RESOLVE cell containing the hostname to be resolved with a NUL
- terminating byte. (For a reverse lookup, the OP sends a RELAY_RESOLVE
- cell containing an in-addr.arpa address.) The OR replies with a
- RELAY_RESOLVED cell containing a status byte, and any number of
- answers. Each answer is of the form:
- Type (1 octet)
- Length (1 octet)
- Value (variable-width)
- TTL (4 octets)
- "Length" is the length of the Value field.
- "Type" is one of:
- 0x00 -- Hostname
- 0x04 -- IPv4 address
- 0x06 -- IPv6 address
- 0xF0 -- Error, transient
- 0xF1 -- Error, nontransient
-
- If any answer has a type of 'Error', then no other answer may be given.
-
- The RELAY_RESOLVE cell must use a nonzero, distinct streamID; the
- corresponding RELAY_RESOLVED cell must use the same streamID. No stream
- is actually created by the OR when resolving the name.
-
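[Editor's sketch, not part of the deleted file.] A minimal parser for
the answer list described above, assuming the caller has already
removed the leading status byte and passes exactly the bytes that hold
the answers. It also assumes the TTL is an unsigned integer in network
(big-endian) byte order. The function and constant names are
hypothetical.

```python
import ipaddress
import struct

TYPE_HOSTNAME = 0x00
TYPE_IPV4 = 0x04
TYPE_IPV6 = 0x06
TYPE_ERROR_TRANSIENT = 0xF0
TYPE_ERROR_NONTRANSIENT = 0xF1
ERROR_TYPES = (TYPE_ERROR_TRANSIENT, TYPE_ERROR_NONTRANSIENT)


def parse_answers(data):
    """Parse a sequence of Type/Length/Value/TTL answers.

    Returns a list of (type, value, ttl) tuples. Addresses are
    rendered as strings; other values are returned as decoded text
    (hostnames) or raw bytes.
    """
    answers = []
    pos = 0
    while pos < len(data):
        if len(data) - pos < 2:
            raise ValueError("truncated answer header")
        atype, length = data[pos], data[pos + 1]
        pos += 2
        end = pos + length + 4  # Value, then the 4-octet TTL
        if end > len(data):
            raise ValueError("truncated answer body")
        value = data[pos:pos + length]
        (ttl,) = struct.unpack("!I", data[pos + length:end])
        pos = end
        if atype == TYPE_IPV4 and length == 4:
            value = str(ipaddress.IPv4Address(value))
        elif atype == TYPE_IPV6 and length == 16:
            value = str(ipaddress.IPv6Address(value))
        elif atype == TYPE_HOSTNAME:
            value = value.decode("ascii", "replace")
        answers.append((atype, value, ttl))
    # An error answer must be the only answer.
    if len(answers) > 1 and any(a[0] in ERROR_TYPES for a in answers):
        raise ValueError("error answer mixed with other answers")
    return answers
```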
-7. Flow control
-
-7.1. Link throttling
-
- Each client or relay should do appropriate bandwidth throttling to
- keep its user happy.
-
- Communicants rely on TCP's default flow control to push back when they
- stop reading.
-
- The mainline Tor implementation uses token buckets (one for reads,
- one for writes) for the rate limiting.
-
- Since 0.2.0.x, Tor has let the user specify an additional pair of
- token buckets for "relayed" traffic, so people can deploy a Tor relay
- with strict rate limiting, but also use the same Tor as a client. To
- avoid partitioning concerns we combine both classes of traffic over a
- given OR connection, and keep track of the last time we read or wrote
- a high-priority (non-relayed) cell. If it's been less than N seconds
- (currently N=30), we give the whole connection high priority, else we
- give the whole connection low priority. We also give low priority
- to reads and writes for connections that are serving directory
- information. See proposal 111 for details.
-
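[Editor's sketch, not part of the deleted file.] The token-bucket
limiter and the N=30 priority rule above might look roughly like this.
It is not the mainline Tor code: TokenBucket, connection_priority, and
their parameters are hypothetical, and time is passed in explicitly so
the behavior is easy to follow.

```python
HIGH_PRIORITY_SECONDS = 30  # the "N" from the text above


class TokenBucket:
    """One bucket; a limiter would keep one for reads and one for writes."""

    def __init__(self, rate, burst):
        self.rate = rate      # tokens (bytes) added per second
        self.burst = burst    # maximum tokens the bucket can hold
        self.tokens = burst

    def refill(self, elapsed_seconds):
        # Add tokens for the elapsed time, never exceeding the burst size.
        self.tokens = min(self.burst,
                          self.tokens + int(self.rate * elapsed_seconds))

    def consume(self, wanted):
        # Grant as many bytes as the bucket allows and return that count.
        granted = min(wanted, self.tokens)
        self.tokens -= granted
        return granted


def connection_priority(now, last_nonrelayed_cell_time, serves_directory=False):
    """Classify a whole OR connection as 'high' or 'low' priority."""
    if serves_directory:
        return "low"
    if now - last_nonrelayed_cell_time < HIGH_PRIORITY_SECONDS:
        return "high"
    return "low"
```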
-7.2. Link padding
-
- Link padding can be created by sending PADDING cells along the
- connection; relay cells of type "DROP" can be used for long-range
- padding.
-
- Currently nodes are not required to do any sort of link padding or
- dummy traffic. Because strong attacks exist even with link padding,
- and because link padding greatly increases the bandwidth requirements
- for running a node, we plan to leave out link padding until this
- tradeoff is better understood.
-
-7.3. Circuit-level flow control
-
- To control a circuit's bandwidth usage, each OR keeps track of two
- 'windows', consisting of how many RELAY_DATA cells it is allowed to
- originate (package for transmission), and how many RELAY_DATA cells
- it is willing to consume (receive for local streams). These limits
- do not apply to cells that the OR receives from one host and relays
- to another.
-
- Each 'window' value is initially set to 1000 data cells
- in each direction (cells that are not data cells do not affect
- the window). When an OR is willing to deliver more cells, it sends a
- RELAY_SENDME cell towards the OP, with stream ID zero. When an OR
- receives a RELAY_SENDME cell with stream ID zero, it increments its
- packaging window.
-
- Each of these cells increments the corresponding window by 100.
-
- The OP behaves identically, except that it must track a packaging
- window and a delivery window for every OR in the circuit.
-
- An OR or OP sends cells to increment its delivery window when the
- corresponding window value falls under some threshold (900).
-
- If a packaging window reaches 0, the OR or OP stops reading from
- TCP connections for all streams on the corresponding circuit, and
- sends no more RELAY_DATA cells until receiving a RELAY_SENDME cell.
-[this stuff is badly worded; copy in the tor-design section -RD]
-
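[Editor's sketch, not part of the deleted file.] The circuit-level
window bookkeeping above, for one end of one hop, could be modeled as
follows. The class and constant names are hypothetical. The sketch
reads "falls under some threshold (900)" as strictly less than 900,
which is an assumption; an implementation might compare with <=
instead.

```python
WINDOW_START = 1000      # initial value of each window, in data cells
WINDOW_INCREMENT = 100   # amount one circuit-level RELAY_SENDME adds
SENDME_THRESHOLD = 900   # send a SENDME when delivery falls under this


class CircuitWindows:
    def __init__(self):
        self.package_window = WINDOW_START
        self.deliver_window = WINDOW_START

    def can_package(self):
        # At 0 the node stops reading from the circuit's TCP streams.
        return self.package_window > 0

    def on_data_packaged(self):
        if not self.can_package():
            raise RuntimeError("packaging window exhausted")
        self.package_window -= 1

    def on_sendme_received(self):
        self.package_window += WINDOW_INCREMENT

    def on_data_delivered(self):
        """Account for one delivered RELAY_DATA cell.

        Returns True when a RELAY_SENDME (stream ID zero) should be
        sent; the delivery window is incremented at the same time.
        """
        self.deliver_window -= 1
        if self.deliver_window < SENDME_THRESHOLD:  # assumed strict "<"
            self.deliver_window += WINDOW_INCREMENT
            return True
        return False
```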
-7.4. Stream-level flow control
-
- Edge nodes use RELAY_SENDME cells to implement end-to-end flow
- control for individual connections across circuits. Similarly to
- circuit-level flow control, edge nodes begin with a window of cells
- (500) per stream, and increment the window by a fixed value (50)
- upon receiving a RELAY_SENDME cell. Edge nodes initiate RELAY_SENDME
- cells when both a) the window is <= 450, and b) there are fewer than
- ten cell payloads remaining to be flushed at that edge.
-
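[Editor's sketch, not part of the deleted file.] The two-part
condition for originating a stream-level RELAY_SENDME can be stated as
a single predicate. The function and constant names are hypothetical;
the numbers are the ones given in the text above.

```python
STREAM_WINDOW_START = 500     # initial per-stream window, in cells
STREAM_WINDOW_INCREMENT = 50  # amount one stream-level SENDME adds
STREAM_SENDME_WINDOW = 450    # condition (a): window <= this value
MAX_UNFLUSHED_PAYLOADS = 10   # condition (b): fewer than this remain


def should_send_stream_sendme(window, unflushed_payloads):
    """True when both conditions (a) and (b) hold for this stream."""
    return (window <= STREAM_SENDME_WINDOW
            and unflushed_payloads < MAX_UNFLUSHED_PAYLOADS)
```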
-A.1. Differences between spec and implementation
-
-- The current specification requires all ORs to have IPv4 addresses, but
- allows servers to exit and resolve to IPv6 addresses, and to declare IPv6
- addresses in their exit policies. The current codebase has no IPv6
- support at all.
-
diff --git a/doc/spec/version-spec.txt b/doc/spec/version-spec.txt
deleted file mode 100644
index 265717f40..000000000
--- a/doc/spec/version-spec.txt
+++ /dev/null
@@ -1,44 +0,0 @@
-
- HOW TOR VERSION NUMBERS WORK
-
-1. The Old Way
-
- Before 0.1.0, versions were of the format:
- MAJOR.MINOR.MICRO(status(PATCHLEVEL))?(-cvs)?
- where MAJOR, MINOR, MICRO, and PATCHLEVEL are numbers, status is one
- of "pre" (for an alpha release), "rc" (for a release candidate), or
- "." for a release. As a special case, "a.b.c" was equivalent to
- "a.b.c.0". We compare the elements in order (major, minor, micro,
- status, patchlevel, cvs), with "cvs" preceding non-cvs.
-
- We would start each development branch with a final version in mind:
- say, "0.0.8". Our first pre-release would be "0.0.8pre1", followed by
- (for example) "0.0.8pre2-cvs", "0.0.8pre2", "0.0.8pre3-cvs",
- "0.0.8rc1", "0.0.8rc2-cvs", and "0.0.8rc2". Finally, we'd release
- 0.0.8. The stable CVS branch would then be versioned "0.0.8.1-cvs",
- and any eventual bugfix release would be "0.0.8.1".
-
-2. The New Way
-
- After 0.1.0, versions are of the format:
- MAJOR.MINOR.MICRO(.PATCHLEVEL)(-status_tag)
- The stuff in parentheses is optional. As before, MAJOR, MINOR, MICRO,
- and PATCHLEVEL are numbers, with an absent number equivalent to 0.
- All versions should be distinguishable purely by those four
- numbers. The status tag is purely informational, and lets you know how
- stable we think the release is: "alpha" is pretty unstable; "rc" is a
- release candidate; and no tag at all means that we have a final
- release. If the tag ends with "-cvs" or "-dev", you're looking at a
- development snapshot that came after a given release. If we *do*
- encounter two versions that differ only by status tag, we compare them
- lexically.
-
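[Editor's sketch, not part of the deleted file.] The "new way" format
and its ordering rule can be illustrated with a short parser. It
compares the four numbers first and falls back to a plain lexical
comparison of the status tag, as described above. The names
parse_version and VERSION_RE are hypothetical, and the regular
expression is an assumption about what counts as well formed.

```python
import re

# MAJOR.MINOR.MICRO(.PATCHLEVEL)(-status_tag)
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?(?:-(.+))?$")


def parse_version(text):
    """Return (major, minor, micro, patchlevel, status_tag).

    An absent PATCHLEVEL is treated as 0 and an absent tag as "".
    The resulting tuples compare numerically on the first four fields
    and lexically on the tag.
    """
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"not a new-style version: {text!r}")
    major, minor, micro, patchlevel, tag = match.groups()
    return (int(major), int(minor), int(micro), int(patchlevel or 0), tag or "")
```

Sorting version strings with key=parse_version then orders them by
their numbers, so 0.1.1.2-alpha sorts before 0.1.1.10-alpha even
though a plain string comparison would put it after.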
- Now, we start each development branch with (say) 0.1.1.1-alpha. The
- patchlevel increments consistently as the status tag changes, for
- example, as in: 0.1.1.2-alpha, 0.1.1.3-alpha, 0.1.1.4-rc, 0.1.1.5-rc.
- Eventually, we release 0.1.1.6. The next patch release is 0.1.1.7.
-
- Between these releases, CVS is versioned with a -cvs tag: after
- 0.1.1.1-alpha comes 0.1.1.1-alpha-cvs, and so on. But starting with
- 0.1.2.1-alpha-dev, we switched to SVN and started using the "-dev"
- suffix instead of the "-cvs" suffix.