Legend:
SPEC!!  - Not specified
SPEC    - Spec not finalized
NICK    - nick claims
ARMA    - arma claims
        - Not done
        * Top priority
        . Partially done
        o Done
        D Deferred
        X Abandoned

For 0.0.9:

   o Solve the MSVC nuisance where __FILE__ contains the full path.
     People are getting confused about why their errors are coming from
     C:\Documents and Settings\Nick Mathewson\My Documents\src\tor .
N&R. bring tor-spec up to date
N&R. make loglevels info,debug less noisy
N  . OS X package (and bundle?)
N  . Working RPMs
N  - Get win32 servers working, or find out why it isn't happening now.
     - Why can't win32 find a cpuworker?
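
For the __FILE__ item above, one way to keep full build paths out of log
messages is to print only the final path component. This is a minimal
sketch, not the fix that was actually committed: short_file_name() and
SHORT_FILE are hypothetical names introduced here for illustration.

```c
#include <string.h>

/* Hypothetical helper (not taken from the Tor source): return the part of
 * `path` after the last '/' or '\\', or `path` itself if it has neither. */
static const char *
short_file_name(const char *path)
{
  const char *slash = strrchr(path, '/');
  const char *backslash = strrchr(path, '\\');
  const char *last = slash;
  /* Both pointers, when set, point into the same string, so they can be
   * compared to find whichever separator comes later. */
  if (backslash && (!last || backslash > last))
    last = backslash;
  return last ? last + 1 : path;
}

/* Hypothetical macro: what a logging call could print instead of __FILE__. */
#define SHORT_FILE short_file_name(__FILE__)
```

A log macro could then pass SHORT_FILE where it currently passes __FILE__,
so the output is the same on MSVC and on compilers that already use
relative paths.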

************************ For Post 0.0.9 *****************************

https proxy for OR CONNECT stuff
choose entry node to be one you're already connected to?

Tier one:
   o Move to our new version system.
   - Changes for forward compatibility
     - If a version is later than the last in its series, but a version
       in the next series is recommended, that doesn't mean it's bad.
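
The forward-compatibility rule above could be checked roughly as follows.
This is only a sketch under stated assumptions: versions are reduced to
four integers, the first three are assumed to name the "series", and
version_t, version_cmp(), same_series() and version_is_acceptable() are
hypothetical names, not functions from the Tor source.

```c
#include <stddef.h>

/* Assumed layout: "0.0.9.2" -> {0, 0, 9, 2}; the first three fields are
 * taken to name the series. */
typedef struct { int major, minor, micro, patch; } version_t;

static int
version_cmp(const version_t *a, const version_t *b)
{
  if (a->major != b->major) return a->major < b->major ? -1 : 1;
  if (a->minor != b->minor) return a->minor < b->minor ? -1 : 1;
  if (a->micro != b->micro) return a->micro < b->micro ? -1 : 1;
  if (a->patch != b->patch) return a->patch < b->patch ? -1 : 1;
  return 0;
}

static int
same_series(const version_t *a, const version_t *b)
{
  return a->major == b->major && a->minor == b->minor &&
         a->micro == b->micro;
}

/* Return 1 if `me` should not be flagged as bad, given the list of
 * recommended versions `rec` of length `n`. */
static int
version_is_acceptable(const version_t *me, const version_t *rec, size_t n)
{
  int series_seen = 0, newer_in_series = 0, newer_anywhere = 0;
  size_t i;
  for (i = 0; i < n; ++i) {
    int c = version_cmp(&rec[i], me);
    if (c == 0)
      return 1;               /* explicitly recommended */
    if (c > 0)
      newer_anywhere = 1;
    if (same_series(&rec[i], me)) {
      series_seen = 1;
      if (c > 0)
        newer_in_series = 1;
    }
  }
  if (newer_in_series)
    return 0;                 /* an upgrade exists within our own series */
  if (series_seen)
    return 1;                 /* later than the last in our series: fine */
  return !newer_anywhere;     /* unlisted series: fine only if newest */
}
```

The middle case is the one the item describes: a version newer than
everything recommended in its own series is accepted even when a later
series also appears in the list.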

   - Bugfixes
     - fix dfc/weasel's intro point bug
     - when we haven't explicitly sent a socks reject, sending one in
       connection_about_to_close_connection() fails because we never give it
       a chance to flush. right answer is to do the socks reply manually in
       each appropriate case, and then about-to-close-connection can simply
       warn us if we forgot one.

   - Documentation
     - Convert man pages to pod, or whatever's right.  Alternatively, find
       a man2html that actually works.
     - Macintosh HOWTO page.

   - Evangelism
     - Get more nodes running on 80 and 443.
     - Get epic, aclu, etc running nodes.

   - Dirservers and server descs: small, backward-compatible changes
     - support hostnames as well as IPs for authdirservers.
     - If we have a trusted directory on port 80, stop falling back to
       forbidden ports when fascistfirewall blocks all good dirservers.
     - GPSLocation optional config string.

   - SOCKS enhancements
     - niels's "did it fail because conn refused or timeout or what"
       relay end feature.

   - Windows
N    - Make millisecond accuracy work on win32
     - Switch to WSA*Event code as a better poll replacement.  Or maybe just
       do libevent?

   - Code cleanup
     - Make more configuration variables into CSVs.
     - Make configure.in handle cross-compilation
       - Have NULL_REP_IS_ZERO_BYTES default to 1.
       - Make with-ssl-dir disable search for ssl.

   - Support
     - Bug tracker.

   - Exit hostname support
     - cache .foo.exit names better, or differently, or not.

   - IPv6 support
     - teach connection_ap_handshake_socks_reply() about ipv6 and friends
       so connection_ap_handshake_socks_resolved() doesn't also need
       to know about them.

   - Packaging
     - Figure out how to make the rpm not strip the binaries it makes.


Tier two:

   - Efficiency/speed improvements.
     - Handle pools of waiting circuits better.
     - Limit number of circuits that we preemptively generate based on past
       behavior; use same limits in circuit_expire_old_circuits().
     - Write limiting; configurable token buckets.
     - Switch to libevent?  Evaluate it first.
     - Make it harder to circumvent bandwidth caps: look at number of bytes
       sent across sockets, not number sent inside TLS stream.
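
For the write-limiting item above, a configurable token bucket might look
like the sketch below. The type and function names (token_bucket_t,
bucket_init(), bucket_refill(), bucket_take()) are hypothetical, and the
once-per-second refill granularity is an assumption made to keep the
example short.

```c
#include <stddef.h>

typedef struct {
  size_t rate;    /* bytes added per second (configurable) */
  size_t burst;   /* maximum bytes the bucket can hold (configurable) */
  size_t tokens;  /* bytes that may be written right now */
} token_bucket_t;

static void
bucket_init(token_bucket_t *b, size_t rate, size_t burst)
{
  b->rate = rate;
  b->burst = burst;
  b->tokens = burst;  /* start full so an idle connection can burst */
}

/* Credit `seconds` worth of tokens, never exceeding the burst size. */
static void
bucket_refill(token_bucket_t *b, size_t seconds)
{
  size_t add = b->rate * seconds;
  if (seconds && add / seconds != b->rate)
    add = b->burst;   /* multiplication overflowed: treat as "fill up" */
  if (add >= b->burst - b->tokens)
    b->tokens = b->burst;
  else
    b->tokens += add;
}

/* Consume up to `want` bytes; return how many may actually be written. */
static size_t
bucket_take(token_bucket_t *b, size_t want)
{
  size_t n = want < b->tokens ? want : b->tokens;
  b->tokens -= n;
  return n;
}
```

A caller would ask bucket_take() before each write and write only the
returned number of bytes. Counting bytes at the socket rather than inside
the TLS stream, as the last item suggests, would mean calling it with the
on-the-wire size.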


   - QOI
     - Let more config options (e.g. ORPort) change dynamically.

   - Dirservers and server descs: small, backward-compatible changes
     - make advertised_server_mode() ORs fetch dirs more often.
     - Implement If-Modified-Since for directories.

   - Big, incompatible re-architecting and decentralization of directory
     system.
     - Only the top of a directory needs to be signed.

   - Windows
N    - Clean up NT service code; make it work
     - Get a controller to launch tor and keep it on the system tray.
     - Win32 installer plus privoxy, sockscap/freecap, etc.

   - Controller enhancements.
     - controller should have 'getinfo' command to query about rephist,
       about rendezvous status, etc.

N  - Handle rendezvousing with unverified nodes.
     - Specify: Stick rendezvous point's key in INTRODUCE cell.
       Bob should _always_ use key from INTRODUCE cell.
     - Implement.

N  - IPv6 support (For exit addresses)
     - Spec issue: if a resolve returns an IPv4 and an IPv6 address,
       which to use?
     - Add to exit policy code
     - Make tor_gethostbyname into tor_getaddrinfo
     - Make everything that uses uint32_t as an IP address change to use
       a generalized address struct.
     - Change relay cell types to accept new addresses.
     - Add flag to serverdescs to tell whether IPv6 is supported.
     - When should servers 
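
The generalized address struct mentioned above could take a shape like the
following. This is a sketch only: gen_addr_t, gen_addr_parse() and
gen_addr_eq() are hypothetical names, and the code assumes a POSIX system
that provides inet_pton(); a win32 build would need a different parser.

```c
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical replacement for a bare uint32_t IPv4 address. */
typedef struct {
  int family;               /* AF_INET or AF_INET6; 0 if unset */
  union {
    struct in_addr v4;
    struct in6_addr v6;
  } addr;
} gen_addr_t;

/* Parse a textual IPv4 or IPv6 address into `out`.
 * Returns 0 on success, -1 if `s` is neither. */
static int
gen_addr_parse(gen_addr_t *out, const char *s)
{
  memset(out, 0, sizeof(*out));
  if (inet_pton(AF_INET, s, &out->addr.v4) == 1) {
    out->family = AF_INET;
    return 0;
  }
  if (inet_pton(AF_INET6, s, &out->addr.v6) == 1) {
    out->family = AF_INET6;
    return 0;
  }
  return -1;
}

/* Return 1 if both addresses have the same family and the same bytes. */
static int
gen_addr_eq(const gen_addr_t *a, const gen_addr_t *b)
{
  if (a->family != b->family)
    return 0;
  if (a->family == AF_INET)
    return a->addr.v4.s_addr == b->addr.v4.s_addr;
  return memcmp(&a->addr.v6, &b->addr.v6, sizeof(struct in6_addr)) == 0;
}
```

Callers that now compare or store a uint32_t would go through helpers like
these instead, so exit policy and relay cell code would not need to know
which family they are handling.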

   - Security fixes
     - christian grothoff's attack of infinite-length circuit.
       the solution is to have a separate 'extend-data' cell type
       which is used for the first N data cells, and only
       extend-data cells can be extend requests.
     - Make sure logged information is 'safe'.

   - Code cleanup
     . rename/rearrange functions for what file they're in
     - fix router_get_by_* functions so they can get ourselves too,
       and audit everything to make sure rend and intro points are
       just as likely to be us as not.

   - Bugfixes
     - hidserv offerers shouldn't need to define a SocksPort
       * figure out what breaks for this, and do it.
     - should retry exitpolicy end streams even if the end cell didn't
       resolve the address for you

   - tor should be able to have a pool of outgoing IP addresses
     that it is able to rotate through. (maybe)

   Packaging, docs, etc:
   - Exit node caching: tie into squid or other caching web proxy.

   Deferred until needed:
   - Do something to prevent spurious EXTEND cells from making middleman
     nodes connect all over.  Rate-limit failed connections, perhaps?
   - Limit to 2 dir, 2 OR, N SOCKS connections per IP.
   - Handle full buffers without totally borking
     * do this eventually, no rush.
   - Rate-limit OR and directory connections overall and per-IP and
     maybe per subnet.
   - DoS protection: TLS puzzles, public key ops, bandwidth exhaustion.
   - Have clients and dirservers preserve reputation info over
     reboots.
   - round detected bandwidth up to nearest 10KB?
   - client software should not upload its descriptor until:
     - you've been running for an hour
     - it's sufficiently satisfied with its bandwidth
     - it decides it is reachable
     - start counting again if your IP ever changes.
     - never regenerate identity keys, for now.
     - you can set a bit for not-being-an-OR.
     * no need to do this yet. few people define their ORPort.
   - authdirserver lists you as running iff:
     - he can connect to you
     - he has successfully extended to you
     - you have sufficient mean-time-between-failures
     * keep doing nothing for now.
   - Include HTTP status messages in logging (see parse_http_response).

   Blue sky or deferred indefinitely:
   - Support egd or other non-OS-integrated strong entropy sources
   - password protection for on-disk identity key
   - Possible to get autoconf to easily install things into ~/.tor?
   - server descriptor declares min log level, clients avoid servers
     that are too loggy.
   - put expiry date on onion-key, so people don't keep trying
     old ones that they could know are expired?
   - Add a notion of nickname->Pubkey binding that's not 'verification'
   - Conn key rotation.
   - Need a relay teardown cell, separate from one-way ends.

Big tasks that would demonstrate progress:

   - Facility to automatically choose long-term helper nodes; perhaps
     on by default for hidden services.
   - patch privoxy and socks protocol to pass strings to the browser.
   - patch tsocks with our current patches + gethostbyname, getpeername, etc.
   - make freecap (or whichever) do what we want.
   - scrubbing proxies for protocols other than http.
     - Find an smtp proxy?
     . Get socks4a support into Mozilla
N  - Reverse DNS: specify and implement.
   - figure out enclaves, e.g. so we know what to recommend that people
     do, and so running a tor server on your website is helpful.
     - Do enclaves for same IP only.
     - Resolve first, then if IP is an OR, extend to him first.
   - implement a trivial fun gui to demonstrate our control interface.

************************ Roadmap for 2004-2005 **********************

Hard problems that need to be solved:

  - Separating node discovery from routing.
  - Arranging membership management for independence.
    Sybil defenses without having a human bottleneck.
    How to gather random sample of nodes.
    How to handle nodelist recommendations.
    Consider incremental switches: a p2p tor with only 50 users has
      different anonymity properties than one with 10k users, and should
      be treated differently.
  - Measuring performance of other nodes. Measuring whether they're up.
  - Choosing exit node by meta-data, e.g. country.
  - Incentives to relay; incentives to exit.
  - Allowing dissidents to relay through Tor clients.
  - How to intercept, or not need to intercept, dns queries locally.
  - Improved anonymity:
    - Experiment with mid-latency systems. How do they impact usability,
      how do they impact safety?
    - Understand how powerful fingerprinting attacks are, and experiment
      with ways to foil them (long-range padding?).
    - Come up with practical approximations to picking entry and exit in
      different routing zones.
    - Find ideal churn rate for helper nodes; how safe is it?
    - What info squeaks by Privoxy? Are other scrubbers better?
    - Attacking freenet-gnunet/timing-delay-randomness-arguments.
    - Is abandoning the circuit the only option when an extend fails, or
      can we do something without impacting anonymity too much?
    - Is exiting from the middle of the circuit always a bad idea?

Sample Publicity Landmarks:

  - we have N servers / N users
  - we have servers at epic and aclu and foo
  - hidden services are robust and fast
  - a more decentralized design
  - tor win32 installer works
  - win32 tray icon for end-users
  - tor server works on win32
  - win32 service for servers
  - mac installer works

***************************Future tasks:****************************

Rendezvous and hidden services:
  make it fast:
    - preemptively build and start rendezvous circs.
    - preemptively build n-1 hops of intro circs?
    - cannibalize general circs?
  make it reliable:
    - standby/hotswap/redundant services.
    - store stuff to disk? dirservers forget service descriptors when
      they restart; nodes offering hidden services forget their chosen
      intro points when they restart.
  make it robust:
    - auth mechanisms to let midpoint and bob selectively choose
      connection requests.
  make it scalable:
    - robust decentralized storage for hidden service descriptors.
  make it accessible:
    - web proxy gateways to let normal people browse hidden services.

Tor scalability:
  Relax clique assumptions.
  Redesign how directories are handled.
    - Resolve directory agreement somehow.
  Find and remove bottlenecks
    - Address linear searches on e.g. circuit and connection lists.
  Reputation/memory system, so dirservers can measure people,
    and so other people can verify their measurements.
    - Need to measure via relay, so it's not distinguishable.
  Let dissidents get to Tor servers via Tor users. ("Backbone model")

Make it more correct:
  Handle half-open connections: right now we don't support all TCP
    streams, at least according to the protocol. But we handle all that
    we've seen in the wild.
  Support IPv6.

Efficiency/speed/robustness:
  Congestion control. Is our current design sufficient once we have heavy
    use? Need to measure and tweak, or maybe overhaul.
  Allow small cells and large cells on the same network?
  Cell buffering and resending. This will allow us to handle broken
    circuits as long as the endpoints don't break, plus will allow
    connection (tls session key) rotation.
  Implement Morphmix, so we can compare its behavior, complexity, etc.
  Use cpuworker for more heavy lifting.
    - Signing (and verifying) hidserv descriptors
    - Signing (and verifying) intro/rend requests
    - Signing (and verifying) router descriptors
    - Signing (and verifying) directories
    - Doing TLS handshake (this is very hard to separate out, though)
  Buffer size pool: allocate a maximum size for all buffers, not
    a maximum size for each buffer. So we don't have to give up as
    quickly (and kill the thickpipe!) when there's congestion.
  Other transport. HTTP, udp, rdp, airhook, etc. May have to do our own
    link crypto, unless we can bully openssl into it.
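
The "buffer size pool" idea above, one shared ceiling instead of a fixed
ceiling per buffer, can be sketched as simple accounting. The names
(buf_pool_t, pooled_buf_t, pooled_buf_grow(), pooled_buf_shrink()) are
hypothetical, and the sketch tracks sizes only; it does not allocate or
copy any data.

```c
#include <stddef.h>

/* One budget shared by every buffer attached to the pool. */
typedef struct {
  size_t limit;  /* maximum bytes across all buffers */
  size_t used;   /* bytes currently held across all buffers */
} buf_pool_t;

typedef struct {
  buf_pool_t *pool;
  size_t len;    /* bytes held by this buffer */
} pooled_buf_t;

/* Try to account for `extra` more bytes in `b`.
 * Returns 0 on success, -1 if the shared budget would be exceeded. */
static int
pooled_buf_grow(pooled_buf_t *b, size_t extra)
{
  if (extra > b->pool->limit - b->pool->used)
    return -1;
  b->pool->used += extra;
  b->len += extra;
  return 0;
}

/* Release up to `n` bytes from `b` back to the shared budget. */
static void
pooled_buf_shrink(pooled_buf_t *b, size_t n)
{
  if (n > b->len)
    n = b->len;
  b->len -= n;
  b->pool->used -= n;
}
```

With this arrangement a single congested connection can use most of the
budget while the others are idle, and it is refused only once the total is
exhausted.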