path: root/doc/tor-spec.txt
author: Nick Mathewson <nickm@torproject.org> 2003-06-03 06:45:06 +0000
committer: Nick Mathewson <nickm@torproject.org> 2003-06-03 06:45:06 +0000
commit: f40ddfab2ef8ad02c5cdd3863b121304a80a8f15
tree: f0c8a73d04ab7dc6aa3f15359b2c68c6c7f46138 /doc/tor-spec.txt
parent: d3592af0428bd604041bb7524532e04d2799e479
Committing the parts of tor-spec I can write. There are still a
couple of points where the code doesn't match my understanding -- I can
write those, once I understand whether we're still going to do what I
thought. The rendezvous point spec is begun, but has turned out not to
be what we had talked about. Let's talk design tomorrow, Roger, and
I'll write down what we say.

svn:r305
Diffstat (limited to 'doc/tor-spec.txt')
-rw-r--r-- doc/tor-spec.txt 299
1 files changed, 191 insertions, 108 deletions
diff --git a/doc/tor-spec.txt b/doc/tor-spec.txt
index a3e1f688f..588da76b5 100644
--- a/doc/tor-spec.txt
+++ b/doc/tor-spec.txt
@@ -31,6 +31,7 @@ protocols.
[We will move to AES once we can assume everybody will have it. -RD]
+
1. System overview
Tor is a connection-oriented anonymizing communication service. Users
@@ -40,7 +41,6 @@ flowing down the circuit is unwrapped by a symmetric key at each node,
which reveals the downstream node.
-
2. Connections
2.1. Establishing OR connections
@@ -217,6 +217,7 @@ which reveals the downstream node.
TOPIC_COMMAND_BEGIN cell to www.slashdot.org:80 , I can change the
address and port to point to a machine I control. -NM]
+
3. Cell Packet format
The basic unit of communication for onion routers and onion
@@ -261,9 +262,10 @@ which reveals the downstream node.
RELAY cells are used to send commands and data along a circuit; see
section 5 below.
+
4. Circuit management
-4.1. Setting up circuits
+4.1. CREATE and CREATED cells
Users set up circuits incrementally, one hop at a time. To create
a new circuit, users send a CREATE cell to the first node, with the
@@ -273,71 +275,71 @@ which reveals the downstream node.
which instructs the last node in the circuit to send a CREATE cell
to extend the circuit.
- CREATE cells contain the following:
+ The payload for a CREATE cell is an 'onion skin', consisting of:
+ RSA-encrypted data [128 bytes]
+ Symmetrically-encrypted data [16 bytes]
+ The RSA-encrypted portion contains:
+ Symmetric key [16 bytes]
+ First part of DH data (g^x) [112 bytes]
+ The symmetrically encrypted portion contains:
+ Second part of DH data (g^x) [16 bytes]
- [this stuff now wrong; haven't fixed the rest of the file either.]
+ The two parts of the DH data, once decrypted and concatenated, form
+ g^x as calculated by the client.
- Version [1 byte]
- Port [2 bytes]
- Address [4 bytes]
- Expiration time [4 bytes]
- Key seed material [16 bytes]
- [Total: 27 bytes]
+ The relay payload for an EXTEND relay cell consists of:
+ Address [4 bytes]
+ Port [2 bytes]
+ Onion skin [144 bytes]
- The port and address field denote the IPV4 address and port of
- the next onion router in the circuit, or are set to 0 for the
- last hop.
+ The port and address fields denote the IPv4 address and port of the
+ next onion router in the circuit.
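The byte layout described above can be sketched as follows. This is only an illustration of the field sizes, not the implementation: the function names are hypothetical, network byte order for the port is an assumption, and no actual RSA or symmetric encryption is performed (the inputs stand in for the plaintext and the already-encrypted onion skin).

```python
import socket
import struct

DH_LEN = 128          # total bytes of g^x: 112 in the RSA part + 16 in the symmetric part
ONION_SKIN_LEN = 144  # 128 RSA-encrypted bytes + 16 symmetrically-encrypted bytes


def split_onion_skin_plaintext(sym_key: bytes, gx: bytes):
    """Split the symmetric key and g^x into the two plaintext portions.

    Returns (rsa_plaintext, sym_plaintext); encryption is left out of this sketch.
    """
    if len(sym_key) != 16 or len(gx) != DH_LEN:
        raise ValueError("expected a 16-byte key and 128 bytes of g^x")
    rsa_plaintext = sym_key + gx[:112]  # 16 + 112 = 128 bytes
    sym_plaintext = gx[112:]            # remaining 16 bytes of g^x
    return rsa_plaintext, sym_plaintext


def pack_extend_payload(ipv4: str, port: int, onion_skin: bytes) -> bytes:
    """Pack Address [4] | Port [2] | Onion skin [144].

    Big-endian port encoding is an assumption; the text does not state it.
    """
    if len(onion_skin) != ONION_SKIN_LEN:
        raise ValueError("onion skin must be 144 bytes")
    return socket.inet_aton(ipv4) + struct.pack("!H", port) + onion_skin
```

With these sizes an EXTEND relay payload comes to 4 + 2 + 144 = 150 bytes.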
- The expiration time is a number of seconds since the epoch (1
- Jan 1970); by default, it is set to the current time plus one
- day.
+4.2. Setting circuit keys
- When constructing an onion to create a circuit from OR_1,
- OR_2... OR_N, the onion creator performs the following steps:
+ Once the handshake between the OP and an OR is completed, both
+ parties can calculate g^xy with ordinary DH. They divide the
+ last 32 bytes of this shared secret into two 16-byte keys, the
+ first of which (called Kf) is used to encrypt the stream of data
+ going from the OP to the OR, and the second of which (called Kb) is
+ used to encrypt the stream of data going from the OR to the OP.
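As a minimal sketch of the key split described above (the function name is hypothetical, and the shared secret is assumed to be available as a byte string):

```python
def split_circuit_keys(shared_secret: bytes):
    """Return (Kf, Kb) taken from the last 32 bytes of the DH shared secret g^xy."""
    if len(shared_secret) < 32:
        raise ValueError("shared secret must be at least 32 bytes")
    tail = shared_secret[-32:]
    kf = tail[:16]  # forward key: OP -> OR
    kb = tail[16:]  # back key: OR -> OP
    return kf, kb
```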
- 1. Let M = 100 random bytes.
+4.3. Creating circuits
- 2. For I=N downto 1:
-
- A. Create an onion layer L, setting Version=2,
- ExpirationTime=now + 1 day, and Seed=16 random bytes.
+ When creating a circuit through the network, the circuit creator
+ performs the following steps:
- If I=N, set Port=Address=0. Else, set Port and Address to
- the IPV4 port and address of OR_{I+1}.
+ 1. Choose a chain of N onion routers (R_1...R_N) to constitute
+ the path, such that no router appears in the path twice.
- B. Let M = L | M.
+ 2. If not already connected to the first router in the chain,
+ open a new connection to that router.
- C. Let K1_I = SHA1(Seed).
- Let K2_I = SHA1(K1_I).
- Let K3_I = SHA1(K2_I).
+ 3. Choose an ACI not already in use on the connection with the
+ first router in the chain. If our address/port pair is
+ numerically higher than the address/port pair of the other
+ side, then let the high bit of the ACI be 1, else 0.
- D. Encrypt the first 128 bytes of M with the RSA key of
- OR_I, using no padding. Encrypt the remaining portion of
- M with 3DES/OFB, using K1_I as a key and an all-0 IV.
+ 4. Send a CREATE cell along the connection, to be received by
+ the first onion router.
- 3. M is now the onion.
+ 5. Wait until a CREATED cell is received; finish the handshake
+ and extract the forward key Kf_1 and the back key Kb_1.
- To create a connection using the onion M, an OP or OR performs the
- following steps:
+ 6. For each subsequent onion router R (R_2 through R_N), extend
+ the circuit to R.
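Step 3 above (choosing an ACI) might look like the sketch below. Several details are assumptions, since this section does not pin them down: a 16-bit ACI, comparing (address, port) as an integer pair, and picking the low bits at random. All names are hypothetical.

```python
import ipaddress
import random

ACI_BITS = 16  # assumption: the ACI width is not stated in this section


def addr_key(host: str, port: int):
    """Turn an IPv4 address/port pair into a numerically comparable tuple."""
    return (int(ipaddress.IPv4Address(host)), port)


def choose_aci(local, remote, in_use, rng=random):
    """Pick an unused ACI whose high bit encodes which side is numerically higher.

    local and remote are (host, port) pairs; in_use is the set of ACIs already
    taken on this connection. This sketch loops forever if every candidate is taken.
    """
    high = 1 if addr_key(*local) > addr_key(*remote) else 0
    low_bits = ACI_BITS - 1
    while True:
        aci = (high << low_bits) | rng.getrandbits(low_bits)
        if aci not in in_use:
            return aci
```

Because the two ends of a connection always disagree on the high bit, they cannot pick the same ACI for different circuits.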
- 1. If not already connected to the first router in the chain,
- open a new connection to that router.
+ To extend the circuit by a single onion router R_M, the circuit
+ creator performs these steps:
- 2. Choose an ACI not already in use on the connection with the
- first router in the chain. If our address/port pair is
- numerically higher than the address/port pair of the other
- side, then let the high bit of the ACI be 1, else 0.
+ 1. Create an onion skin, encrypting the RSA-encrypted part with
+ R's public key.
- 3. To send M over the wire, prepend a 4-byte integer containing
- Len(M). Call the result M'. Let N=ceil(Len(M')/248).
- Divide M' into N chunks, such that:
- Chunk_I = M'[(I-1)*248:I*248] for 1 <= I <= N-1
- Chunk_N = M'[(N-1)*248:Len(M')]
+ 2. Encrypt and send the onion skin in a RELAY_EXTEND cell along
+ the circuit (see section 5).
- 4. Send N CREATE cells along the connection, setting the ACI
- on each to the selected ACI, setting the payload on each to
- the corresponding 'Chunk_I', and setting the length on each
- to the length of the payload.
+ 3. When a RELAY_EXTENDED cell is received, calculate the shared
+ keys. The circuit is now extended.
Upon receiving a CREATE cell along a connection, an OR performs
the following steps:
@@ -370,14 +372,29 @@ which reveals the downstream node.
choose a different ACI for this circuit on the connection
with the next OR.)
- As an optimization, OR implementations may delay processing onions
+ When an onion router receives an EXTEND relay cell, it sends a
+ CREATE cell to the next onion router, with the enclosed onion skin
+ as its payload. The initiating onion router chooses some random
+ ACI not yet used on the connection between the two onion routers.
+
+ Some time after receiving a CREATE cell, an onion router completes
+ the DH handshake, and replies with a CREATED cell, containing g^y
+ as its [128 byte] payload. Upon receiving a CREATED cell, an onion
+ router packs its payload into an EXTENDED relay cell (see section 5),
+ and sends that cell up the circuit. Upon receiving the EXTENDED
+ relay cell, the OP can retrieve g^y.
+
+ (As an optimization, OR implementations may delay processing onions
until a break in traffic allows time to do so without harming
- network latency too greatly.
+ network latency too greatly.)
4.2. Tearing down circuits
+ [Note: this section is untouched; the code doesn't seem to match
+ what I remembered discussing. Let's sort it out. -NM]
+
Circuits are torn down when an unrecoverable error occurs along
- the circuit, or when all topics on a circuit are closed and the
+ the circuit, or when all streams on a circuit are closed and the
circuit's intended lifetime is over.
To tear down a circuit, an OR or OP sends a DESTROY cell with that
@@ -394,55 +411,73 @@ which reveals the downstream node.
4.3. Routing data cells
- When an OR receives a DATA cell, it checks the cell's ACI and
+ When an OR receives a RELAY cell, it checks the cell's ACI and
determines whether it has a corresponding circuit along that
- connection. If not, the OR drops the DATA cell.
+ connection. If not, the OR drops the RELAY cell.
Otherwise, if the OR is not at the OP edge of the circuit (that is,
either an 'exit node' or a non-edge node), it de/encrypts the length
field and the payload with 3DES/OFB, as follows:
- 'Forward' data cell (same direction as onion):
- Use K2 as key; encrypt.
- 'Back' data cell (opposite direction from onion):
- Use K3 as key; decrypt.
+ 'Forward' relay cell (same direction as CREATE):
+ Use Kf as key; encrypt.
+ 'Back' relay cell (opposite direction from CREATE):
+ Use Kb as key; decrypt.
+ If the OR recognizes the stream ID on the cell (it is either the ID
+ of an open stream or the signaling ID, zero), the OR processes the
+ contents of the relay cell. Otherwise, it passes the decrypted
+ relay cell along the circuit. [What if the circuit doesn't go any
+ farther?]
+
+
+ Otherwise, if the data cell is coming from the OP edge of the
+ circuit, the OP decrypts the length and payload fields with 3DES/OFB as
+ follows:
+ OP sends data cell to node R_M:
+ For I=1...M, decrypt with Kf_I.
- Otherwise, if the data cell has arrived to the OP edge of the circuit,
- the OP de/encrypts the length and payload fields with 3DES/OFB as
+ Otherwise, if the data cell is arriving at the OP edge of the
+ circuit, the OP encrypts the length and payload fields with 3DES/OFB as
follows:
- OP sends data cell:
- For I=1...N, decrypt with K2_I.
OP receives data cell:
- For I=N...1, encrypt with K3_I.
+ For I=N...1,
+ Encrypt with Kb_I. If the stream ID is a recognized
+ stream for R_I, or if the stream ID is the signaling
+ ID, zero, then process the payload.
- Edge nodes process the length and payload fields of DATA cells as
- described in section 5 below.
+ For more information, see section 5 below.
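The forward-direction layering can be illustrated with a toy sketch. It does NOT use 3DES/OFB: a SHA-1-based XOR keystream stands in for the cipher, and the keystream restarts for every call, whereas a real stream cipher would keep running state across cells. What it does share with OFB is that encryption and decryption are the same XOR operation, which is why the OP can "decrypt" with each Kf_I and each OR can "encrypt" to strip one layer. All names are hypothetical.

```python
import hashlib


def keystream(key: bytes, n: int) -> bytes:
    """Toy keystream (stand-in for 3DES/OFB): SHA-1 over key || counter."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha1(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def apply_layer(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def op_wrap_forward(payload: bytes, kf_keys) -> bytes:
    """OP side: for I=1...M, apply the layer for Kf_I (kf_keys = [Kf_1, ..., Kf_M])."""
    for kf in kf_keys:
        payload = apply_layer(kf, payload)
    return payload
```

Each router R_I on the path then calls apply_layer(Kf_I, cell); after R_M has done so, the original payload is exposed.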
5. Application connections and stream management
5.1. Streams
- Within a circuit, the OP and the exit node use the contents of DATA
- packets to tunnel TCP connections ("Topics") across circuits.
- These connections are initiated by the OP.
-
- The first 4 bytes of each data cell are reserved as follows:
- Topic command [1 byte]
- Unused, set to 0. [1 byte]
- Topic ID [2 bytes]
-
- The recognized topic commands are:
- 1 -- TOPIC_BEGIN
- 2 -- TOPIC_DATA
- 3 -- TOPIC_END
- 4 -- TOPIC_CONNECTED
- 5 -- TOPIC_SENDME
-
- All DATA cells pertaining to the same tunneled connection have the
- same topic ID.
+ Within a circuit, the OP and the exit node use the contents of
+ RELAY packets to tunnel end-to-end commands and TCP connections
+ ("Streams") across circuits. End-to-end commands can be initiated
+ by either edge; streams are initiated by the OP.
+
+ The first 8 bytes of each relay cell are reserved as follows:
+ Relay command [1 byte]
+ Stream ID [7 bytes]
+
+ The recognized relay commands are:
+ 1 -- RELAY_BEGIN
+ 2 -- RELAY_DATA
+ 3 -- RELAY_END
+ 4 -- RELAY_CONNECTED
+ 5 -- RELAY_SENDME
+ 6 -- RELAY_EXTEND
+ 7 -- RELAY_EXTENDED
+
+ All RELAY cells pertaining to the same tunneled stream have the
+ same stream ID. Stream IDs are chosen randomly by the OP. A
+ stream ID is considered "recognized" on a circuit C by an OP or an
+ OR if it already has an existing stream established on that
+ circuit, or if the stream ID is equal to the signaling stream ID,
+ which is all zero: [00 00 00 00 00 00 00]
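A minimal sketch of reading the 8-byte relay header and applying the "recognized" test described above. The function names and the use of a set for open streams are illustrative assumptions.

```python
RELAY_COMMANDS = {
    1: "RELAY_BEGIN",
    2: "RELAY_DATA",
    3: "RELAY_END",
    4: "RELAY_CONNECTED",
    5: "RELAY_SENDME",
    6: "RELAY_EXTEND",
    7: "RELAY_EXTENDED",
}

SIGNALING_STREAM_ID = bytes(7)  # seven zero bytes


def parse_relay_header(relay_payload: bytes):
    """Split a decrypted relay payload into (command, stream_id, body)."""
    if len(relay_payload) < 8:
        raise ValueError("relay payload shorter than the 8-byte header")
    command = relay_payload[0]
    stream_id = relay_payload[1:8]
    return command, stream_id, relay_payload[8:]


def is_recognized(stream_id: bytes, open_streams) -> bool:
    """True for the signaling ID or for a stream already open on this circuit."""
    return stream_id == SIGNALING_STREAM_ID or stream_id in open_streams
```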
To create a new anonymized TCP connection, the OP sends a
- TOPIC_BEGIN data cell with a payload encoding the address and port
- of the destination host. The payload format is:
+ RELAY_BEGIN data cell with a payload encoding the address and port
+ of the destination host. The stream ID is zero. The payload format is:
ADDRESS | ':' | PORT | '\000'
where ADDRESS may be a DNS hostname, or an IPv4 address in
dotted-quad format; and where PORT is encoded in decimal.
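The RELAY_BEGIN payload format above can be sketched as a pair of helpers. The names are hypothetical, and ASCII encoding of the hostname is an assumption.

```python
def encode_begin_payload(address: str, port: int) -> bytes:
    """Build ADDRESS | ':' | PORT | NUL, with PORT in decimal."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return f"{address}:{port}".encode("ascii") + b"\x00"


def decode_begin_payload(payload: bytes):
    """Parse a RELAY_BEGIN payload back into (address, port)."""
    body, nul, _ = payload.partition(b"\x00")
    if not nul:
        raise ValueError("missing terminating NUL")
    address, colon, port = body.decode("ascii").rpartition(":")
    if not colon or not address or not port.isdigit():
        raise ValueError("malformed ADDRESS:PORT")
    return address, int(port)
```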
@@ -450,29 +485,33 @@ which reveals the downstream node.
Upon receiving this packet, the exit node resolves the address as
necessary, and opens a new TCP connection to the target port. If
the address cannot be resolved, or a connection can't be
- established, the exit node replies with a TOPIC_END cell.
- Otherwise, the exit node replies with a TOPIC_CONNECTED cell.
+ established, the exit node replies with a RELAY_END cell.
+ Otherwise, the exit node replies with a RELAY_CONNECTED cell.
- The OP waits for a TOPIC_CONNECTED cell before sending any data.
+ The OP waits for a RELAY_CONNECTED cell before sending any data.
Once a connection has been established, the OP and exit node
- package stream data in TOPIC_DATA cells, and upon receiving such
+ package stream data in RELAY_DATA cells, and upon receiving such
cells, echo their contents to the corresponding TCP stream.
[XXX Mention zlib encoding. -NM]
When one side of the TCP stream is closed, the corresponding edge
- node sends a TOPIC_END cell along the circuit; upon receiving a
- TOPIC_END cell, the edge node closes the corresponding TCP stream.
-
+ node sends a RELAY_END cell along the circuit; upon receiving a
+ RELAY_END cell, the edge node closes the corresponding TCP stream.
[This should probably become:
When one side of the TCP stream is closed, the corresponding edge
- node sends a TOPIC_END cell along the circuit; upon receiving a
- TOPIC_END cell, the edge node closes its side of the corresponding
+ node sends a RELAY_END cell along the circuit; upon receiving a
+ RELAY_END cell, the edge node closes its side of the corresponding
TCP stream (by sending a FIN packet), but continues to accept and
package incoming data until both sides of the TCP stream are
- closed. At that point, the edge node sends a second TOPIC_END
+ closed. At that point, the edge node sends a second RELAY_END
cell, and drops its record of the topic. -NM]
+ For creation and handling of RELAY_EXTEND and RELAY_EXTENDED cells,
+ see section 4. For creation and handling of RELAY_SENDME cells,
+ see section 6.
+
+
6. Flow control
6.1. Link throttling
@@ -497,10 +536,19 @@ which reveals the downstream node.
6.3. Circuit flow control
To control a circuit's bandwidth usage, each node keeps track of
- how many data cells it is allowed to send to the next hop in the
- circuit. This 'window' value is initially set to 1000 data cells
+ two 'windows', consisting of how many RELAY_DATA cells it is
+ allowed to package for transmission, and how many RELAY_DATA cells
+ it is willing to deliver to a stream outside the network.
+ Each 'window' value is initially set to 500 data cells
in each direction (cells that are not data cells do not affect
- the window). Each edge node on a circuit sends a SENDME cell
+ the window).
+
+ [Note: I'm not touching the rest of this section... it looks in the
+ code as if RELAY_COMMAND_SENDME is now doing double duty for both
+ stream flow control and circuit flow control. I thought we wanted
+ two different notions of windows. -NM]
+
+ Each edge node on a circuit sends a SENDME cell
(with length=100) every time it has received 100 data cells on the
circuit. When a node receives a SENDME cell for a circuit, it increases
the circuit's window in the corresponding direction (that is, for
@@ -517,30 +565,65 @@ which reveals the downstream node.
6.4. Topic flow control
- Edge nodes use TOPIC_SENDME data cells to implement end-to-end flow
+ Edge nodes use RELAY_SENDME data cells to implement end-to-end flow
control for individual connections across circuits. As with circuit
flow control, edge nodes begin with a window of cells (500) per
topic, and increment the window by a fixed value (50) upon receiving
- a TOPIC_SENDME data cell. Edge nodes initiate TOPIC_SENDME data
+ a RELAY_SENDME data cell. Edge nodes initiate RELAY_SENDME data
cells when both a) the window is <= 450, and b) there are less than
ten cell payloads remaining to be flushed at that edge.
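One possible reading of the per-stream rule above, sketched from the receiving edge's point of view. The text leaves open exactly which window is tested and when it is decremented, so treat this as an assumption-laden illustration; the class and method names are hypothetical.

```python
class StreamDeliverWindow:
    """Receiver-side window for one stream, using the numbers quoted above."""

    START = 500       # initial window, in data cells
    INCREMENT = 50    # window credit carried by one SENDME
    THRESHOLD = 450   # send a SENDME only once the window is <= this
    MAX_PENDING = 10  # ...and fewer than this many payloads await flushing

    def __init__(self):
        self.window = self.START

    def on_data_cell_delivered(self):
        """Assumption: each delivered data cell consumes one unit of window."""
        self.window -= 1

    def maybe_sendme(self, pending_payloads: int) -> bool:
        """Return True (and restore credit) if a SENDME should be sent now."""
        if self.window <= self.THRESHOLD and pending_payloads < self.MAX_PENDING:
            self.window += self.INCREMENT
            return True
        return False
```

Under this reading, the sending edge would add 50 to its own window when the SENDME arrives.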
+
7. Directories and routers
7.1. Router descriptor format.
-Line format : address ORPort OPPort APPort DirPort bandwidth(bytes/s)
-followed by the router's public key.
-ORport is where the router listens for other routers (speaking cells)
-OPPort is where the router listens for onion proxies (speaking cells)
-APPort is where the router listens for applications (speaking socks)
-DirPort is where the router listens for directory download requests
+(Unless otherwise noted, tokens on the same line are space-separated.)
+
+Router ::= Router-Line Public-Key Signing-Key? Exit-Policy NL
+Router-Line ::= "router" address ORPort OPPort APPort DirPort bandwidth
+ NL
+Public-Key ::= a public key in PEM format NL
+Signing-Key ::= "signing-key" NL signing key in PEM format NL
+Exit-Policy ::= Exit-Line*
+Exit-Line ::= ("accept"|"reject") string NL
+
+ORPort ::= port where the router listens for other routers (speaking cells)
+OPPort ::= where the router listens for onion proxies (speaking cells)
+APPort ::= where the router listens for applications (speaking socks)
+DirPort ::= where the router listens for directory download requests
+bandwidth ::= maximum bandwidth, in bytes/s
+
Example:
-moria.mit.edu 9001 9011 9021 9031 100000
+router moria.mit.edu 9001 9011 9021 9031 100000
-----BEGIN RSA PUBLIC KEY-----
MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
nZ7kVMRoiXCbjL6VAtNa4Zy1Af/GOm0iCIDpholeujQ95xew7rQnAgMA//8=
-----END RSA PUBLIC KEY-----
+signing-key
+-----BEGIN RSA PUBLIC KEY-----
+7BvovoY3z4zk63NZVBErgKQUDkn3pp8n83xZgEf4GI27gdWIIwaBjEimuJlEY+7K
+MIGJAoGBAMBBuk1sYxEg5jLAJy86U3GGJ7EGMSV7yoA6mmcsEVU3pwTUrpbpCmwS
+f/GOm0iCIDpholeujQ95xew7rnZ7kVMRoiXCbjL6VAtNa4Zy1AQnAgMA//8=
+-----END RSA PUBLIC KEY-----
+reject 18.0.0.0/24
+
+Note: The extra newline at the end of the router block is intentional.
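A parser for the Router-Line production might be sketched like this. Only the first line of a router block is handled; the key and exit-policy lines are out of scope, and the type and function names are hypothetical.

```python
from typing import NamedTuple


class RouterLine(NamedTuple):
    address: str
    or_port: int
    op_port: int
    ap_port: int
    dir_port: int
    bandwidth: int  # maximum bandwidth, in bytes/s


def parse_router_line(line: str) -> RouterLine:
    """Parse: "router" address ORPort OPPort APPort DirPort bandwidth."""
    tokens = line.split()
    if len(tokens) != 7 or tokens[0] != "router":
        raise ValueError("not a router line")
    numbers = [int(t) for t in tokens[2:]]
    return RouterLine(tokens[1], *numbers)
```

For the example above, parse_router_line("router moria.mit.edu 9001 9011 9021 9031 100000") yields the address moria.mit.edu with ORPort 9001 and a bandwidth of 100000 bytes/s.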
+
+7.2. Directory format
+
+Directory ::= Directory-Header Directory-Router Router* Signature
+Directory-Header ::= "signed-directory" NL Software-Line NL
+Software-Line ::= "recommended-software" comma-separated-version-list
+Directory-Router ::= Router
+Signature ::= "directory-signature" NL "-----BEGIN SIGNATURE-----" NL
+ Base-64-encoded-signature NL "-----END SIGNATURE-----" NL
+
+Note: The router block for the directory server must appear first.
+The signature is computed by taking the SHA-1 hash of the
+directory, from the characters "signed-directory" through the newline
+after "directory-signature". This digest is then padded with PKCS #1,
+and signed with the directory server's signing key.
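The hashed span can be sketched as below. Only the digest step is shown; the PKCS #1 padding and RSA signing are omitted. The function name is hypothetical, and the sketch assumes an ASCII directory in which "directory-signature" first appears on the signature line.

```python
import hashlib


def directory_digest(directory: str) -> bytes:
    """SHA-1 over the text from "signed-directory" through the newline
    that follows "directory-signature"."""
    start = directory.index("signed-directory")
    marker = directory.index("directory-signature", start)
    end = directory.index("\n", marker) + 1  # include the newline itself
    return hashlib.sha1(directory[start:end].encode("ascii")).digest()
```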