Diffstat (limited to 'doc/spec/proposals/174-optimistic-data-server.txt')
-rw-r--r-- | doc/spec/proposals/174-optimistic-data-server.txt | 242 |
1 files changed, 242 insertions, 0 deletions
diff --git a/doc/spec/proposals/174-optimistic-data-server.txt b/doc/spec/proposals/174-optimistic-data-server.txt
new file mode 100644
index 000000000..d97c45e90
--- /dev/null
+++ b/doc/spec/proposals/174-optimistic-data-server.txt
@@ -0,0 +1,242 @@
+Filename: 174-optimistic-data-server.txt
+Title: Optimistic Data for Tor: Server Side
+Author: Ian Goldberg
+Created: 2-Aug-2010
+Status: Open
+
+Overview:
+
+When a SOCKS client opens a TCP connection through Tor (for an HTTP
+request, for example), the query latency is about 1.5x higher than it
+needs to be. Simply, the problem is that the sequence of data flows
+is this:
+
+1. The SOCKS client opens a TCP connection to the OP
+2. The SOCKS client sends a SOCKS CONNECT command
+3. The OP sends a BEGIN cell to the Exit
+4. The Exit opens a TCP connection to the Server
+5. The Exit returns a CONNECTED cell to the OP
+6. The OP returns a SOCKS CONNECTED notification to the SOCKS client
+7. The SOCKS client sends some data (the GET request, for example)
+8. The OP sends a DATA cell to the Exit
+9. The Exit sends the GET to the server
+10. The Server returns the HTTP result to the Exit
+11. The Exit sends the DATA cells to the OP
+12. The OP returns the HTTP result to the SOCKS client
+
+Note that the Exit node knows that the connection to the Server was
+successful at the end of step 4, but is unable to send the HTTP query to
+the server until step 9.
+
+This proposal (as well as its upcoming sibling concerning the client
+side) aims to reduce the latency by allowing:
+1. SOCKS clients to optimistically send data before they are notified
+   that the SOCKS connection has completed successfully
+2. OPs to optimistically send DATA cells on streams in the CONNECT_WAIT
+   state
+3. Exit nodes to accept and queue DATA cells while in the
+   EXIT_CONN_STATE_CONNECTING state
+
+This particular proposal deals with #3.
+
+In this way, the flow would be as follows:
+
+1. The SOCKS client opens a TCP connection to the OP
+2. The SOCKS client sends a SOCKS CONNECT command, followed immediately
+   by data (such as the GET request)
+3. The OP sends a BEGIN cell to the Exit, followed immediately by DATA
+   cells
+4. The Exit opens a TCP connection to the Server
+5. The Exit returns a CONNECTED cell to the OP, and sends the queued GET
+   request to the Server
+6. The OP returns a SOCKS CONNECTED notification to the SOCKS client,
+   and the Server returns the HTTP result to the Exit
+7. The Exit sends the DATA cells to the OP
+8. The OP returns the HTTP result to the SOCKS client
+
+Motivation:
+
+This change will save one OP<->Exit round trip (down to one from two).
+There are still two SOCKS Client<->OP round trips (negligible time) and
+two Exit<->Server round trips. Depending on the ratio of the
+Exit<->Server (Internet) RTT to the OP<->Exit (Tor) RTT, this will
+decrease the latency by 25 to 50 percent. Experiments validate these
+predictions. [Goldberg, PETS 2010 rump session; see
+https://thunk.cs.uwaterloo.ca/optimistic-data-pets2010-rump.pdf ]
+
+Design:
+
+The current code actually correctly handles queued data at the Exit; if
+there is queued data in a EXIT_CONN_STATE_CONNECTING stream, that data
+will be immediately sent when the connection succeeds. If the
+connection fails, the data will be correctly ignored and freed. The
+problem with the current server code is that the server currently
+drops DATA cells on streams in the EXIT_CONN_STATE_CONNECTING state.
+Also, if you try to queue data in the EXIT_CONN_STATE_RESOLVING state,
+bad things happen because streams in that state don't yet have
+conn->write_event set, and so some existing sanity checks (any stream
+with queued data is at least potentially writable) are no longer sound.
+
+The solution is to simply not drop received DATA cells while in the
+EXIT_CONN_STATE_CONNECTING state. Also do not send SENDME cells in this
+state, so that the OP cannot send more than one window's worth of data
+to be queued at the Exit. Finally, patch the sanity checks so that
+streams in the EXIT_CONN_STATE_RESOLVING state that have buffered data
+can pass.
+
+If no clients ever send such optimistic data, the new code will never be
+executed, and the behaviour of Tor will not change. When clients begin
+to send optimistic data, the performance of those clients' streams will
+improve.
+
+After discussion with nickm, it seems best to just have the server
+version number be the indicator of whether a particular Exit supports
+optimistic data. (If a client sends optimistic data to an Exit which
+does not support it, the data will be dropped, and the client's request
+will fail to complete.) What do version numbers for hypothetical future
+protocol-compatible implementations look like, though?
+
+Security implications:
+
+Servers (for sure the Exit, and possibly others, by watching the
+pattern of packets) will be able to tell that a particular client
+is using optimistic data. This will be discussed more in the sibling
+proposal.
+
+On the Exit side, servers will be queueing a little bit extra data, but
+no more than one window. Clients today can cause Exits to queue that
+much data anyway, simply by establishing a Tor connection to a slow
+machine, and sending one window of data.
+
+Specification:
+
+tor-spec section 6.2 currently says:
+
+    The OP waits for a RELAY_CONNECTED cell before sending any data.
+    Once a connection has been established, the OP and exit node
+    package stream data in RELAY_DATA cells, and upon receiving such
+    cells, echo their contents to the corresponding TCP stream.
+    RELAY_DATA cells sent to unrecognized streams are dropped.
+
+It is not clear exactly what an "unrecognized" stream is, but this last
+sentence would be changed to say that RELAY_DATA cells received on a
+stream that has processed a RELAY_BEGIN cell and has not yet issued a
+RELAY_END or a RELAY_CONNECTED cell are queued; that queue is processed
+immediately after a RELAY_CONNECTED cell is issued for the stream, or
+freed after a RELAY_END cell is issued for the stream.
+
+The earlier part of this section will be addressed in the sibling
+proposal.
+
+Compatibility:
+
+There are compatibility issues, as mentioned above. OPs MUST NOT send
+optimistic data to Exit nodes whose version numbers predate (something).
+OPs MAY send optimistic data to Exit nodes whose version numbers match
+or follow that value. (But see the question about independent server
+reimplementations, above.)
+
+Implementation:
+
+Here is a simple patch. It seems to work with both regular streams and
+hidden services, but there may be other corner cases I'm not aware of.
+(Do streams used for directory fetches, hidden services, etc. take a
+different code path?)
+
+diff --git a/src/or/connection.c b/src/or/connection.c
+index 7b1493b..f80cd6e 100644
+--- a/src/or/connection.c
++++ b/src/or/connection.c
+@@ -2845,7 +2845,13 @@ _connection_write_to_buf_impl(const char *string, size_t len,
+     return;
+   }
+
+-  connection_start_writing(conn);
++  /* If we receive optimistic data in the EXIT_CONN_STATE_RESOLVING
++   * state, we don't want to try to write it right away, since
++   * conn->write_event won't be set yet. Otherwise, write data from
++   * this conn as the socket is available. */
++  if (conn->state != EXIT_CONN_STATE_RESOLVING) {
++    connection_start_writing(conn);
++  }
+   if (zlib) {
+     conn->outbuf_flushlen += buf_datalen(conn->outbuf) - old_datalen;
+   } else {
+@@ -3382,7 +3388,11 @@ assert_connection_ok(connection_t *conn, time_t now)
+     tor_assert(conn->s < 0);
+
+   if (conn->outbuf_flushlen > 0) {
+-    tor_assert(connection_is_writing(conn) || conn->write_blocked_on_bw ||
++    /* With optimistic data, we may have queued data in
++     * EXIT_CONN_STATE_RESOLVING while the conn is not yet marked to writing.
++     * */
++    tor_assert(conn->state == EXIT_CONN_STATE_RESOLVING ||
++            connection_is_writing(conn) || conn->write_blocked_on_bw ||
+             (CONN_IS_EDGE(conn) && TO_EDGE_CONN(conn)->edge_blocked_on_circ));
+   }
+
+diff --git a/src/or/relay.c b/src/or/relay.c
+index fab2d88..e45ff70 100644
+--- a/src/or/relay.c
++++ b/src/or/relay.c
+@@ -1019,6 +1019,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+   relay_header_t rh;
+   unsigned domain = layer_hint?LD_APP:LD_EXIT;
+   int reason;
++  int optimistic_data = 0; /* Set to 1 if we receive data on a stream
++                              that's in the EXIT_CONN_STATE_RESOLVING
++                              or EXIT_CONN_STATE_CONNECTING states.*/
+
+   tor_assert(cell);
+   tor_assert(circ);
+@@ -1038,9 +1041,20 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+   /* either conn is NULL, in which case we've got a control cell, or else
+    * conn points to the recognized stream. */
+
+-  if (conn && !connection_state_is_open(TO_CONN(conn)))
+-    return connection_edge_process_relay_cell_not_open(
+-             &rh, cell, circ, conn, layer_hint);
++  if (conn && !connection_state_is_open(TO_CONN(conn))) {
++    if ((conn->_base.state == EXIT_CONN_STATE_CONNECTING ||
++         conn->_base.state == EXIT_CONN_STATE_RESOLVING) &&
++        rh.command == RELAY_COMMAND_DATA) {
++      /* We're going to allow DATA cells to be delivered to an exit
++       * node in state EXIT_CONN_STATE_CONNECTING or
++       * EXIT_CONN_STATE_RESOLVING. This speeds up HTTP, for example. */
++      log_warn(domain, "Optimistic data received.");
++      optimistic_data = 1;
++    } else {
++      return connection_edge_process_relay_cell_not_open(
++               &rh, cell, circ, conn, layer_hint);
++    }
++  }
+
+   switch (rh.command) {
+     case RELAY_COMMAND_DROP:
+@@ -1090,7 +1104,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+       log_debug(domain,"circ deliver_window now %d.", layer_hint ?
+                 layer_hint->deliver_window : circ->deliver_window);
+
+-      circuit_consider_sending_sendme(circ, layer_hint);
++      if (!optimistic_data) {
++        circuit_consider_sending_sendme(circ, layer_hint);
++      }
+
+       if (!conn) {
+         log_info(domain,"data cell dropped, unknown stream (streamid %d).",
+@@ -1107,7 +1123,9 @@ connection_edge_process_relay_cell(cell_t *cell, circuit_t *circ,
+       stats_n_data_bytes_received += rh.length;
+       connection_write_to_buf(cell->payload + RELAY_HEADER_SIZE,
+                               rh.length, TO_CONN(conn));
+-      connection_edge_consider_sending_sendme(conn);
++      if (!optimistic_data) {
++        connection_edge_consider_sending_sendme(conn);
++      }
+       return 0;
+     case RELAY_COMMAND_END:
+       reason = rh.length > 0 ?
+
+Performance and scalability notes:
+
+There may be more RAM used at Exit nodes, as mentioned above, but it is
+transient.
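
The queueing rule described in the proposal (DATA cells that arrive after
BEGIN but before CONNECTED are queued, the queue is flushed when CONNECTED
is issued and freed when END is issued, and no SENDME handling happens while
data is only queued) can be sketched as a small stand-alone state machine.
This is an illustration only, not the patch above and not Tor code: every
name in it (stream_t, stream_init, stream_recv_data, stream_connected,
stream_end, sendme_checks) is hypothetical, and SKETCH_WINDOW_CELLS is an
assumed placeholder for "one window" of cells.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All identifiers here are hypothetical; none are Tor's. */
#define SKETCH_WINDOW_CELLS 500  /* assumed stand-in for "one window" */

typedef enum { ST_CONNECTING, ST_OPEN, ST_CLOSED } stream_state_t;

typedef struct {
  stream_state_t state;
  char *queue;          /* bytes buffered while the connect is pending */
  size_t queue_len;
  int queued_cells;     /* cells accepted while connecting */
  size_t written;       /* bytes handed to the (simulated) server */
  int sendme_checks;    /* times the SENDME logic would have run */
} stream_t;

static void stream_init(stream_t *s) {
  memset(s, 0, sizeof(*s));
  s->state = ST_CONNECTING;
}

/* Handle one DATA cell. Returns 0 if accepted, -1 if dropped. */
static int stream_recv_data(stream_t *s, const char *payload, size_t len) {
  if (s->state == ST_CLOSED)
    return -1;
  if (s->state == ST_OPEN) {
    s->written += len;   /* normal path: forward the data ... */
    s->sendme_checks++;  /* ... and consider sending a SENDME */
    return 0;
  }
  /* Connecting: queue the data, but never run the SENDME logic, so the
   * sender cannot get more than one window queued here. */
  if (s->queued_cells >= SKETCH_WINDOW_CELLS)
    return -1;
  if (len > 0) {
    char *grown = realloc(s->queue, s->queue_len + len);
    if (!grown)
      return -1;
    memcpy(grown + s->queue_len, payload, len);
    s->queue = grown;
    s->queue_len += len;
  }
  s->queued_cells++;
  return 0;
}

/* Connect succeeded: flush the queue. Returns the bytes flushed. */
static size_t stream_connected(stream_t *s) {
  size_t flushed = s->queue_len;
  s->written += flushed;
  free(s->queue);
  s->queue = NULL;
  s->queue_len = 0;
  s->state = ST_OPEN;
  return flushed;
}

/* Stream ended (for example, the connect failed): discard queued data. */
static void stream_end(stream_t *s) {
  free(s->queue);
  s->queue = NULL;
  s->queue_len = 0;
  s->state = ST_CLOSED;
}
```

A caller would run stream_init, feed optimistic cells through
stream_recv_data, and then call either stream_connected or stream_end,
depending on how the connect attempt finished. Capping queued_cells at a
fixed count plays the role that withholding SENDME cells plays in the
patch: the sender's own window bounds how much the Exit has to buffer.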