Diffstat (limited to 'doc/tor-spec.txt')
-rw-r--r-- doc/tor-spec.txt 299
1 files changed, 191 insertions, 108 deletions
diff --git a/doc/tor-spec.txt b/doc/tor-spec.txt
index a3e1f688f..588da76b5 100644
--- a/doc/tor-spec.txt
+++ b/doc/tor-spec.txt
@@ -31,6 +31,7 @@ protocols.
[We will move to AES once we can assume everybody will have it. -RD]
+
1. System overview
Tor is a connection-oriented anonymizing communication service. Users
@@ -40,7 +41,6 @@ flowing down the circuit is unwrapped by a symmetric key at each node,
which reveals the downstream node.
-
2. Connections
2.1. Establishing OR connections
@@ -217,6 +217,7 @@ which reveals the downstream node.
TOPIC_COMMAND_BEGIN cell to www.slashdot.org:80 , I can change the
address and port to point to a machine I control. -NM]
+
3. Cell Packet format
The basic unit of communication for onion routers and onion
@@ -261,9 +262,10 @@ which reveals the downstream node.
RELAY cells are used to send commands and data along a circuit; see
section 5 below.
+
4. Circuit management
-4.1. Setting up circuits
+4.1. CREATE and CREATED cells
Users set up circuits incrementally, one hop at a time. To create
a new circuit, users send a CREATE cell to the first node, with the
@@ -273,71 +275,71 @@ which reveals the downstream node.
which instructs the last node in the circuit to send a CREATE cell
to extend the circuit.
- CREATE cells contain the following:
+ The payload for a CREATE cell is an 'onion skin', consisting of:
+ RSA-encrypted data [128 bytes]
+ Symmetrically-encrypted data [16 bytes]
+ The RSA-encrypted portion contains:
+ Symmetric key [16 bytes]
+ First part of DH data (g^x) [112 bytes]
+ The symmetrically encrypted portion contains:
+ Second part of DH data (g^x) [16 bytes]
- [this stuff now wrong; haven't fixed the rest of the file either.]
+ The two parts of the DH data, once decrypted and concatenated, form
+ g^x as calculated by the client.
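A minimal sketch of the onion skin layout described above, before any encryption is applied. It only shows how the 16-byte symmetric key and the 128-byte g^x value are divided between the two portions; the function names are hypothetical, and the RSA and symmetric encryption steps are deliberately left out.

```python
SYM_KEY_LEN = 16     # symmetric key carried in the RSA-encrypted portion
DH_FIRST_LEN = 112   # first part of g^x, carried in the RSA-encrypted portion
DH_SECOND_LEN = 16   # second part of g^x, carried in the symmetric portion


def split_onion_skin_plaintext(sym_key: bytes, gx: bytes):
    """Return (rsa_plaintext, sym_plaintext) for one onion skin.

    Hypothetical helper: it arranges the bytes only and performs no
    encryption.
    """
    if len(sym_key) != SYM_KEY_LEN:
        raise ValueError("symmetric key must be 16 bytes")
    if len(gx) != DH_FIRST_LEN + DH_SECOND_LEN:
        raise ValueError("g^x must be 128 bytes")
    rsa_plaintext = sym_key + gx[:DH_FIRST_LEN]   # 16 + 112 = 128 bytes
    sym_plaintext = gx[DH_FIRST_LEN:]             # 16 bytes
    return rsa_plaintext, sym_plaintext


def join_dh_data(rsa_plaintext: bytes, sym_plaintext: bytes) -> bytes:
    """Recover g^x from the two decrypted portions."""
    return rsa_plaintext[SYM_KEY_LEN:] + sym_plaintext
```

The two plaintexts are 128 and 16 bytes, which matches the 144-byte onion skin size used elsewhere in this section.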
- Version [1 byte]
- Port [2 bytes]
- Address [4 bytes]
- Expiration time [4 bytes]
- Key seed material [16 bytes]
- [Total: 27 bytes]
+ The relay payload for an EXTEND relay cell consists of:
+ Address [4 bytes]
+ Port [2 bytes]
+ Onion skin [144 bytes]
- The port and address field denote the IPV4 address and port of
- the next onion router in the circuit, or are set to 0 for the
- last hop.
+ The port and address fields denote the IPv4 address and port of the
+ next onion router in the circuit.
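The EXTEND payload layout above could be packed as in the sketch below. Network (big-endian) byte order for the port is an assumption, since the text does not state a byte order, and the function names are hypothetical.

```python
import socket
import struct

ONION_SKIN_LEN = 144
EXTEND_PAYLOAD_LEN = 4 + 2 + ONION_SKIN_LEN


def pack_extend_payload(ipv4: str, port: int, onion_skin: bytes) -> bytes:
    """Pack Address [4 bytes] | Port [2 bytes] | Onion skin [144 bytes]."""
    if len(onion_skin) != ONION_SKIN_LEN:
        raise ValueError("onion skin must be 144 bytes")
    # Assumption: the port is sent in network byte order.
    return socket.inet_aton(ipv4) + struct.pack("!H", port) + onion_skin


def unpack_extend_payload(payload: bytes):
    """Return (ipv4, port, onion_skin) from a 150-byte EXTEND payload."""
    if len(payload) != EXTEND_PAYLOAD_LEN:
        raise ValueError("EXTEND payload must be 150 bytes")
    ipv4 = socket.inet_ntoa(payload[:4])
    (port,) = struct.unpack("!H", payload[4:6])
    return ipv4, port, payload[6:]
```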
- The expiration time is a number of seconds since the epoch (1
- Jan 1970); by default, it is set to the current time plus one
- day.
+4.2. Setting circuit keys
- When constructing an onion to create a circuit from OR_1,
- OR_2... OR_N, the onion creator performs the following steps:
+ Once the handshake between the OP and an OR is completed, both
+ sides can now calculate g^xy with ordinary DH. They divide the
+ last 32 bytes of this shared secret into two 16-byte keys, the
+ first of which (called Kf) is used to encrypt the stream of data
+ going from the OP to the OR, and the second of which (called Kb) is
+ used to encrypt the stream of data going from the OR to the OP.
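The key split described above can be sketched as follows. The function name is hypothetical, and the shared secret is assumed to be available as a byte string of at least 32 bytes.

```python
def split_circuit_keys(shared_secret: bytes):
    """Return (Kf, Kb) taken from the last 32 bytes of g^xy.

    Kf is the first 16 bytes of that tail (OP to OR direction) and
    Kb is the second 16 bytes (OR to OP direction).
    """
    if len(shared_secret) < 32:
        raise ValueError("shared secret must be at least 32 bytes")
    tail = shared_secret[-32:]
    return tail[:16], tail[16:]
```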
- 1. Let M = 100 random bytes.
+4.3. Creating circuits
- 2. For I=N downto 1:
-
- A. Create an onion layer L, setting Version=2,
- ExpirationTime=now + 1 day, and Seed=16 random bytes.
+ When creating a circuit through the network, the circuit creator
+ performs the following steps:
- If I=N, set Port=Address=0. Else, set Port and Address to
- the IPV4 port and address of OR_{I+1}.
+ 1. Choose a chain of N onion routers (R_1...R_N) to constitute
+ the path, such that no router appears in the path twice.
- B. Let M = L | M.
+ 2. If not already connected to the first router in the chain,
+ open a new connection to that router.
- C. Let K1_I = SHA1(Seed).
- Let K2_I = SHA1(K1_I).
- Let K3_I = SHA1(K2_I).
+ 3. Choose an ACI not already in use on the connection with the
+ first router in the chain. If our address/port pair is
+ numerically higher than the address/port pair of the other
+ side, then let the high bit of the ACI be 1, else 0.
- D. Encrypt the first 128 bytes of M with the RSA key of
- OR_I, using no padding. Encrypt the remaining portion of
- M with 3DES/OFB, using K1_I as a key and an all-0 IV.
+ 4. Send a CREATE cell along the connection, to be received by
+ the first onion router.
- 3. M is now the onion.
+ 5. Wait until a CREATED cell is received; finish the handshake
+ and extract the forward key Kf_1 and the back key Kb_1.
- To create a connection using the onion M, an OP or OR performs the
- following steps:
+ 6. For each subsequent onion router R (R_2 through R_N), extend
+ the circuit to R.
- 1. If not already connected to the first router in the chain,
- open a new connection to that router.
+ To extend the circuit by a single onion router R_M, the circuit
+ creator performs these steps:
- 2. Choose an ACI not already in use on the connection with the
- first router in the chain. If our address/port pair is
- numerically higher than the address/port pair of the other
- side, then let the high bit of the ACI be 1, else 0.
+ 1. Create an onion skin, encrypting the RSA-encrypted part with
+ R's public key.
- 3. To send M over the wire, prepend a 4-byte integer containing
- Len(M). Call the result M'. Let N=ceil(Len(M')/248).
- Divide M' into N chunks, such that:
- Chunk_I = M'[(I-1)*248:I*248] for 1 <= I <= N-1
- Chunk_N = M'[(N-1)*248:Len(M')]
+ 2. Encrypt and send the onion skin in a RELAY_EXTEND cell along
+ the circuit (see section 5).
- 4. Send N CREATE cells along the connection, setting the ACI
- on each to the selected ACI, setting the payload on each to
- the corresponding 'Chunk_I', and setting the length on each
- to the length of the payload.
+ 3. When a RELAY_EXTENDED cell is received, calculate the shared
+ keys. The circuit is now extended.
Upon receiving a CREATE cell along a connection, an OR performs
the following steps:
@@ -370,14 +372,29 @@ which reveals the downstream node.
choose a different ACI for this circuit on the connection
with the next OR.)
- As an optimization, OR implementations may delay processing onions
+ When an onion router receives an EXTEND relay cell, it sends a
+ CREATE cell to the next onion router, with the enclosed onion skin
+ as its payload. The initiating onion router chooses some random
+ ACI not yet used on the connection between the two onion routers.
+
+ Some time after receiving a CREATE cell, an onion router completes
+ the DH handshake, and replies with a CREATED cell, containing g^y
+ as its [128 byte] payload. Upon receiving a CREATED cell, an onion
+ router packs its payload into an EXTENDED relay cell (see section 5),
+ and sends that cell up the circuit. Upon receiving the EXTENDED
+ relay cell, the OP can retrieve g^y.
+
+ (As an optimization, OR implementations may delay processing onions
until a break in traffic allows time to do so without harming
- network latency too greatly.
+ network latency too greatly.)
4.2. Tearing down circuits
+ [Note: this section is untouched; the code doesn't seem to match
+ what I remembered discussing. Let's sort it out. -NM]
+
Circuits are torn down when an unrecoverable error occurs along
- the circuit, or when all topics on a circuit are closed and the
+ the circuit, or when all streams on a circuit are closed and the
circuit's intended lifetime is over.
To tear down a circuit, an OR or OP sends a DESTROY cell with that
@@ -394,55 +411,73 @@ which reveals the downstream node.
4.3. Routing data cells
- When an OR receives a DATA cell, it checks the cell's ACI and
+ When an OR receives a RELAY cell, it checks the cell's ACI and
determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the DATA cell.
+ connection. If not, the OR drops the RELAY cell.
Otherwise, if the OR is not at the OP edge of the circuit (that is,
either an 'exit node' or a non-edge node), it de/encrypts the length
field and the payload with 3DES/OFB, as follows:
- 'Forward' data cell (same direction as onion):
- Use K2 as key; encrypt.
- 'Back' data cell (opposite direction from onion):
- Use K3 as key; decrypt.
+ 'Forward' relay cell (same direction as CREATE):
+ Use Kf as key; encrypt.
+ 'Back' relay cell (opposite direction from CREATE):
+ Use Kb as key; decrypt.
+ If the OR recognizes the stream ID on the cell (it is either the ID
+ of an open stream or the signaling ID, zero), the OR processes the
+ contents of the relay cell. Otherwise, it passes the decrypted
+ relay cell along the circuit. [What if the circuit doesn't go any
+ farther?]
+
+
+ Otherwise, if the data cell is coming from the OP edge of the
+ circuit, the OP decrypts the length and payload fields with 3DES/OFB as
+ follows:
+ OP sends data cell to node R_M:
+ For I=1...M, decrypt with Kf_I.
- Otherwise, if the data cell has arrived to the OP edge of the circuit,
- the OP de/encrypts the length and payload fields with 3DES/OFB as
+ Otherwise, if the data cell is arriving at the OP edge of the
+ circuit, the OP encrypts the length and payload fields with 3DES/OFB as
follows:
- OP sends data cell:
- For I=1...N, decrypt with K2_I.
OP receives data cell:
- For I=N...1, encrypt with K3_I.
+ For I=N...1,
+ Encrypt with Kb_I. If the stream ID is a recognized
+ stream for R_I, or if the stream ID is the signaling
+ ID, zero, then process the payload.
- Edge nodes process the length and payload fields of DATA cells as
- described in section 5 below.
+ For more information, see section 5 below.
5. Application connections and stream management
5.1. Streams
- Within a circuit, the OP and the exit node use the contents of DATA
- packets to tunnel TCP connections ("Topics") across circuits.
- These connections are initiated by the OP.
-
- The first 4 bytes of each data cell are reserved as follows:
- Topic command [1 byte]
- Unused, set to 0. [1 byte]
- Topic ID [2 bytes]
-
- The recognized topic commands are:
- 1 -- TOPIC_BEGIN
- 2 -- TOPIC_DATA
- 3 -- TOPIC_END
- 4 -- TOPIC_CONNECTED
- 5 -- TOPIC_SENDME
-
- All DATA cells pertaining to the same tunneled connection have the
- same topic ID.
+ Within a circuit, the OP and the exit node use the contents of
+ RELAY packets to tunnel end-to-end commands and TCP connections
+ ("Streams") across circuits. End-to-end commands can be initiated
+ by either edge; streams are initiated by the OP.
+
+ The first 8 bytes of each relay cell are reserved as follows:
+ Relay command [1 byte]
+ Stream ID [7 bytes]
+
+ The recognized relay commands are:
+ 1 -- RELAY_BEGIN
+ 2 -- RELAY_DATA
+ 3 -- RELAY_END
+ 4 -- RELAY_CONNECTED
+ 5 -- RELAY_SENDME
+ 6 -- RELAY_EXTEND
+ 7 -- RELAY_EXTENDED
+
+ All RELAY cells pertaining to the same tunneled stream have the
+ same stream ID. Stream IDs are chosen randomly by the OP. A
+ stream ID is considered "recognized" on a circuit C by an OP or an
+ OR if it already has an existing stream established on that
+ circuit, or if the stream ID is equal to the signaling stream ID,
+ which is all zero: [00 00 00 00 00 00 00]
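A minimal sketch of reading the 8-byte relay header and applying the "recognized" rule above. The helper names and the use of a set to hold open stream IDs are illustrative assumptions, not part of the specification.

```python
RELAY_COMMANDS = {
    1: "RELAY_BEGIN",
    2: "RELAY_DATA",
    3: "RELAY_END",
    4: "RELAY_CONNECTED",
    5: "RELAY_SENDME",
    6: "RELAY_EXTEND",
    7: "RELAY_EXTENDED",
}

SIGNALING_STREAM_ID = bytes(7)  # seven zero bytes


def parse_relay_header(relay_payload: bytes):
    """Return (command, stream_id, rest) from a decrypted relay payload."""
    if len(relay_payload) < 8:
        raise ValueError("relay payload shorter than its 8-byte header")
    command = relay_payload[0]        # Relay command [1 byte]
    stream_id = relay_payload[1:8]    # Stream ID [7 bytes]
    return command, stream_id, relay_payload[8:]


def is_recognized(stream_id: bytes, open_streams: set) -> bool:
    """True for the signaling ID or an already-open stream on this circuit."""
    return stream_id == SIGNALING_STREAM_ID or stream_id in open_streams
```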
To create a new anonymized TCP connection, the OP sends a
- TOPIC_BEGIN data cell with a payload encoding the address and port
- of the destination host. The payload format is:
+ RELAY_BEGIN data cell with a payload encoding the address and port
+ of the destination host. The stream ID is zero. The payload format is:
ADDRESS | ':' | PORT | '\000'
where ADDRESS may be a DNS hostname, or an IPv4 address in
dotted-quad format; and where PORT is encoded in decimal.
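The payload format above can be illustrated with the sketch below. ASCII encoding and the function names are assumptions; the text only fixes the ADDRESS, ':', decimal PORT, NUL layout.

```python
def encode_begin_payload(address: str, port: int) -> bytes:
    """Build ADDRESS | ':' | PORT | NUL, with PORT in decimal."""
    if not 0 <= port <= 65535:
        raise ValueError("port out of range")
    return f"{address}:{port}".encode("ascii") + b"\x00"


def decode_begin_payload(payload: bytes):
    """Return (address, port) from a NUL-terminated begin payload."""
    if not payload.endswith(b"\x00"):
        raise ValueError("payload is not NUL-terminated")
    text = payload[:-1].decode("ascii")
    address, sep, port = text.rpartition(":")
    if not sep:
        raise ValueError("missing ':' separator")
    return address, int(port)
```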
@@ -450,29 +485,33 @@ which reveals the downstream node.
Upon receiving this packet, the exit node resolves the address as
necessary, and opens a new TCP connection to the target port. If
the address cannot be resolved, or a connection can't be
- established, the exit node replies with a TOPIC_END cell.
- Otherwise, the exit node replies with a TOPIC_CONNECTED cell.
+ established, the exit node replies with a RELAY_END cell.
+ Otherwise, the exit node replies with a RELAY_CONNECTED cell.
- The OP waits for a TOPIC_CONNECTED cell before sending any data.
+ The OP waits for a RELAY_CONNECTED cell before sending any data.
Once a connection has been established, the OP and exit node
- package stream data in TOPIC_DATA cells, and upon receiving such
+ package stream data in RELAY_DATA cells, and upon receiving such
cells, echo their contents to the corresponding TCP stream.
[XXX Mention zlib encoding. -NM]
When one side of the TCP stream is closed, the corresponding edge
- node sends a TOPIC_END cell along the circuit; upon receiving a
- TOPIC_END cell, the edge node closes the corresponding TCP stream.
-
+ node sends a RELAY_END cell along the circuit; upon receiving a
+ RELAY_END cell, the edge node closes the corresponding TCP stream.
[This should probably become:
When one side of the TCP stream is closed, the corresponding edge
- node sends a TOPIC_END cell along the circuit; upon receiving a
- TOPIC_END cell, the edge node closes its side of the corresponding
+ node sends a RELAY_END cell along the circuit; upon receiving a
+ RELAY_END cell, the edge node closes its side of the corresponding
TCP stream (by sending a FIN packet), but continues to accept and
package incoming data until both sides of the TCP stream are
- closed. At that point, the edge node sends a second TOPIC_END
+ closed. At that point, the edge node sends a second RELAY_END
cell, and drops its record of the topic. -NM]
+ For creation and handling of RELAY_EXTEND and RELAY_EXTENDED cells,
+ see section 4. For creation and handling of RELAY_SENDME cells,
+ see section 6.
+
+
6. Flow control
6.1. Link throttling
@@ -497,10 +536,19 @@ which reveals the downstream node.
6.3. Circuit flow control
To control a circuit's bandwidth usage, each node keeps track of
- how many data cells it is allowed to send to the next hop in the
- circuit. This 'window' value is initially set to 1000 data cells
+ two 'windows', consisting of how many RELAY_DATA cells it is
+ allowed to package for transmission, and how many RELAY_DATA cells
+ it is willing to deliver to a stream outside the network.
+ Each 'window' value is initially set to 500 data cells
in each direction (cells that are not data cells do not affect
- the window). Each edge node on a circuit sends a SENDME cell
+ the window).
+
+ [Note: I'm not touching the rest of this section... it looks in the
+ code as if RELAY_COMMAND_SENDME is now doing double duty for both
+ stream flow control and circuit flow control. I thought we wanted
+ two different notions of windows. -NM]
+
+ Each edge node on a circuit sends a SENDME cell
(with length=100) every time it has received 100 data cells on the
circuit. When a node receives a SENDME cell for a circuit, it increases
the circuit's window in the corresponding direction (that is, for
@@ -517,30 +565,65 @@ which reveals the downstream node.
6.4. Topic flow control
- Edge nodes use TOPIC_SENDME data cells to implement end-to-end flow
+ Edge nodes use RELAY_SENDME data cells to implement end-to-end flow
control for individual connections across circuits. As with circuit
flow control, edge nodes begin with a window of cells (500) per
topic, and increment the window by a fixed value (50) upon receiving
- a TOPIC_SENDME data cell. Edge nodes initiate TOPIC_SENDME data
+ a RELAY_SENDME data cell. Edge nodes initiate RELAY_SENDME data
cells when both a) the window is <= 450, and b) there are less than
ten cell payloads remaining to be flushed at that edge.
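One possible reading of the per-stream window rules above is sketched below. The split into a packaging window and a delivery window, the class name, and the choice to refill the delivery window when a SENDME is sent are all assumptions made for illustration; the text itself gives only the numbers 500, 50, 450 and ten.

```python
class StreamWindow:
    """Hypothetical per-stream flow-control state at one edge node."""

    START = 500       # initial window, in data cells
    INCREMENT = 50    # window credit carried by one SENDME
    FLUSH_LIMIT = 10  # send a SENDME only if fewer payloads are queued

    def __init__(self):
        self.package_window = self.START
        self.deliver_window = self.START

    def can_package(self) -> bool:
        return self.package_window > 0

    def on_data_packaged(self) -> None:
        """Account for one data cell packaged for transmission."""
        if not self.can_package():
            raise RuntimeError("package window exhausted")
        self.package_window -= 1

    def on_sendme_received(self) -> None:
        """The other edge granted more credit."""
        self.package_window += self.INCREMENT

    def on_data_delivered(self, queued_payloads: int) -> bool:
        """Account for one delivered data cell.

        Returns True when a SENDME should be sent: the window is at or
        below 450 and fewer than ten payloads are waiting to be flushed.
        """
        self.deliver_window -= 1
        if (self.deliver_window <= self.START - self.INCREMENT
                and queued_payloads < self.FLUSH_LIMIT):
            self.deliver_window += self.INCREMENT
            return True
        return False
```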
+
7. Directories and routers
7.1. Router descriptor format.
-Line format : address ORPort OPPort APPort DirPort bandwidth(bytes/s)
-followed by the router's public key.
-ORport is where the router listens for other routers (speaking cells)
-OPPort is where the router listens for onion proxies (speaking cells)
-APPort is where the router listens for applications (speaking socks)
-DirPort is where the router listens for directory download requests
+(Unless otherwise noted, tokens on the same line are space-separated.)
+
+Router ::= Router-Line Public-Key Signing-Key? Exit-Policy NL
+Router-Line ::= "router" address ORPort OPPort APPort DirPort bandwidth
+ NL
+Public-Key ::= a public key in PEM format NL
+Signing-Key ::= "signing-key" NL signing key in PEM format NL
+Exit-Policy ::= Exit-Line*
+Exit-Line ::= ("accept"|"reject") string NL
+
+ORPort ::= port where the router listens for other routers (speaking cells)
+OPPort ::= where the router listens for onion proxies (speaking cells)
+APPort ::= where the router listens for applications (speaking socks)
+DirPort ::= where the router listens for directory download requests
+bandwidth ::= maximum bandwidth, in bytes/s
+
Example:
-moria.mit.edu 9001 9011 9021 9031 100000
+router moria.mit.edu 9001 9011 9021 9031 100000
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
nZ7kVMRoiXCbjL6VAtNa4Zy1Af/GOm0iCIDpholeujQ95xew7rQnAgMA//8=
-----END RSA PUBLIC KEY-----
+signing-key
+-----BEGIN RSA PUBLIC KEY-----
+7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
+MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
+f/GOm0iCIDpholeujQ95xew7rnZ7kVMRoiXCbjL6VAtNa4Zy1AQnAgMA//8=
+-----END RSA PUBLIC KEY-----
+reject 18.0.0.0/24
+
+Note: The extra newline at the end of the router block is intentional.
+
+7.2. Directory format
+
+Directory ::= Directory-Header Directory-Router Router* Signature
+Directory-Header ::= "signed-directory" NL Software-Line NL
+Software-Line ::= "recommended-software" comma-separated-version-list
+Directory-Router ::= Router
+Signature ::= "directory-signature" NL "-----BEGIN SIGNATURE-----" NL
+ Base-64-encoded-signature NL "-----END SIGNATURE-----" NL
+
+Note: The router block for the directory server must appear first.
+The signature is computed by taking the SHA-1 hash of the
+directory, from the characters "signed-directory", through the newline
+after "directory-signature". This digest is then padded with PKCS #1,
+and signed with the directory server's signing key.
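The digest step described above can be sketched as follows. Only the hashing is shown; the PKCS #1 padding and the RSA signing are omitted. The function name is hypothetical, and treating the directory as ASCII text is an assumption.

```python
import hashlib


def directory_digest(directory_text: str) -> bytes:
    """SHA-1 over the span from "signed-directory" through the newline
    that follows "directory-signature"."""
    start = directory_text.index("signed-directory")
    marker = directory_text.index("directory-signature", start)
    end = directory_text.index("\n", marker) + 1  # include that newline
    return hashlib.sha1(directory_text[start:end].encode("ascii")).digest()
```

The signature block that follows "directory-signature" is excluded from the hashed span, so the digest can be recomputed by a verifier before checking the signature.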