Metadata-Version: 1.1
Name: urllib3
Version: 1.9
Summary: HTTP library with thread-safe connection pooling, file post, and more.
Home-page: http://urllib3.readthedocs.org/
Author: Andrey Petrov
Author-email: andrey.petrov@shazow.net
License: MIT
Description: =======
        urllib3
        =======
        
        .. image:: https://travis-ci.org/shazow/urllib3.png?branch=master
                :target: https://travis-ci.org/shazow/urllib3
        
        Highlights
        ==========
        
        - Re-use the same socket connection for multiple requests
          (``HTTPConnectionPool`` and ``HTTPSConnectionPool``)
          (with optional client-side certificate verification).
        - File posting (``encode_multipart_formdata``).
        - Built-in redirection and retries (optional).
        - Supports gzip and deflate decoding.
        - Thread-safe and sanity-safe.
        - Works with AppEngine, gevent, and eventlib.
        - Tested on Python 2.6+, Python 3.2+, and PyPy, with 100% unit test coverage.
        - Small and easy to understand codebase perfect for extending and building upon.
          For a more comprehensive solution, have a look at
          `Requests `_ which is also powered by ``urllib3``.
        
        You might already be using urllib3!
        ===================================
        
        ``urllib3`` powers `many great Python libraries `_,
        including ``pip`` and ``requests``.
        
        What's wrong with urllib and urllib2?
        =====================================
        
        There are two critical features missing from the Python standard library:
        Connection re-using/pooling and file posting. It's not terribly hard to
        implement these yourself, but it's much easier to use a module that already
        did the work for you.
        
        The Python standard libraries ``urllib`` and ``urllib2`` have little to do
        with each other. They were designed to be independent and standalone, each
        solving a different scope of problems, and ``urllib3`` follows in a similar
        vein.
        
        Why do I want to reuse connections?
        ===================================
        
        Performance. When you normally do a urllib call, a separate socket
        connection is created with each request.
        By reusing existing sockets (supported since HTTP 1.1), the requests will
        take up fewer resources on the server's end, and also provide a faster
        response time at the client's end. With some simple benchmarks (see
        `test/benchmark.py `_
        ), downloading 15 URLs from google.com is about twice as fast when using
        HTTPConnectionPool (which uses 1 connection) as when using plain urllib
        (which uses 15 connections).
        
        This library is perfect for:
        
        - Talking to an API
        - Crawling a website
        - Any situation where being able to post files, handle redirection, and
          retry requests is useful. It's relatively lightweight, so it can be used
          for anything!
        
        Examples
        ========
        
        Go to `urllib3.readthedocs.org `_
        for more nice syntax-highlighted examples.
        
        But, long story short::
        
            import urllib3
        
            http = urllib3.PoolManager()
        
            r = http.request('GET', 'http://google.com/')
        
            print(r.status, r.data)
        
        The ``PoolManager`` will take care of reusing connections for you whenever
        you request the same host. For more fine-grained control of your connection
        pools, you should look at `ConnectionPool `_.
        
        Run the tests
        =============
        
        We use some external dependencies, multiple interpreters and code coverage
        analysis while running the test suite. Our ``Makefile`` handles much of this
        for you as long as you're running it `inside of a virtualenv `_::
        
            $ make test
            [... magically installs dependencies and runs tests on your virtualenv]
            Ran 182 tests in 1.633s
        
            OK (SKIP=6)
        
        Note that code coverage less than 100% is regarded as a failing run. Some
        platform-specific tests are skipped unless run on that platform. To make
        sure the code works on all of urllib3's supported platforms, you can run our
        ``tox`` suite::
        
            $ make test-all
            [... tox creates a virtualenv for every platform and runs tests inside of each]
            py26: commands succeeded
            py27: commands succeeded
            py32: commands succeeded
            py33: commands succeeded
            py34: commands succeeded
        
        Our test suite `runs continuously on Travis CI `_
        with every pull request.
        
        Contributing
        ============
        
        #.
           `Check for open issues `_ or open
           a fresh issue to start a discussion around a feature idea or a bug. There
           is a *Contributor Friendly* tag for issues that should be ideal for people
           who are not very familiar with the codebase yet.
        #. Fork the `urllib3 repository on GitHub `_
           to start making your changes.
        #. Write a test which shows that the bug was fixed or that the feature works
           as expected.
        #. Send a pull request and bug the maintainer until it gets merged and
           published. :) Make sure to add yourself to ``CONTRIBUTORS.txt``.
        
        Sponsorship
        ===========
        
        If your company benefits from this library, please consider `sponsoring its
        development `_.
        
        Changes
        =======
        
        1.9 (2014-07-04)
        ++++++++++++++++
        
        * Shuffled around development-related files. If you're maintaining a distro
          package of urllib3, you may need to tweak things. (Issue #415)
        * Unverified HTTPS requests will trigger a warning on the first request. See
          our new `security documentation `_ for details. (Issue #426)
        * New retry logic and ``urllib3.util.retry.Retry`` configuration object.
          (Issue #326)
        * All raised exceptions should now be wrapped in a
          ``urllib3.exceptions.HTTPException``-extending exception. (Issue #326)
        * All errors during a retry-enabled request should be wrapped in
          ``urllib3.exceptions.MaxRetryError``, including timeout-related exceptions
          which were previously exempt. Underlying error is accessible from the
          ``.reason`` property. (Issue #326)
        * ``urllib3.exceptions.ConnectionError`` renamed to
          ``urllib3.exceptions.ProtocolError``. (Issue #326)
        * Errors during response read (such as IncompleteRead) are now wrapped in
          ``urllib3.exceptions.ProtocolError``. (Issue #418)
        * Requesting an empty host will raise
          ``urllib3.exceptions.LocationValueError``. (Issue #417)
        * Catch read timeouts over SSL connections as
          ``urllib3.exceptions.ReadTimeoutError``. (Issue #419)
        * Apply socket arguments before connecting.
          (Issue #427)
        
        1.8.3 (2014-06-23)
        ++++++++++++++++++
        
        * Fix TLS verification when using a proxy in Python 3.4.1. (Issue #385)
        * Add ``disable_cache`` option to ``urllib3.util.make_headers``. (Issue #393)
        * Wrap ``socket.timeout`` exception with
          ``urllib3.exceptions.ReadTimeoutError``. (Issue #399)
        * Fixed proxy-related bug where connections were being reused incorrectly.
          (Issues #366, #369)
        * Added ``socket_options`` keyword parameter which allows defining the
          ``setsockopt`` configuration of new sockets. (Issue #397)
        * Removed ``HTTPConnection.tcp_nodelay`` in favor of
          ``HTTPConnection.default_socket_options``. (Issue #397)
        * Fixed ``TypeError`` bug in Python 2.6.4. (Issue #411)
        
        1.8.2 (2014-04-17)
        ++++++++++++++++++
        
        * Fix ``urllib3.util`` not being included in the package.
        
        1.8.1 (2014-04-17)
        ++++++++++++++++++
        
        * Fix AppEngine bug of HTTPS requests going out as HTTP. (Issue #356)
        * Don't install ``dummyserver`` into ``site-packages`` as it's only needed
          for the test suite. (Issue #362)
        * Added support for specifying ``source_address``. (Issue #352)
        
        1.8 (2014-03-04)
        ++++++++++++++++
        
        * Improved url parsing in ``urllib3.util.parse_url`` (properly parse '@' in
          username, and blank ports like 'hostname:').
        * New ``urllib3.connection`` module which contains all the HTTPConnection
          objects.
        * Several ``urllib3.util.Timeout``-related fixes. Also changed constructor
          signature to a more sensible order. [Backwards incompatible]
          (Issues #252, #262, #263)
        * Use ``backports.ssl_match_hostname`` if it's installed. (Issue #274)
        * Added ``.tell()`` method to ``urllib3.response.HTTPResponse`` which
          returns the number of bytes read so far. (Issue #277)
        * Support for platforms without threading. (Issue #289)
        * Expand default-port comparison in ``HTTPConnectionPool.is_same_host``
          to allow a pool with no specified port to be considered equal to an
          HTTP/HTTPS url with port 80/443 explicitly provided. (Issue #305)
        * Improved default SSL/TLS settings to avoid vulnerabilities.
          (Issue #309)
        * Fixed ``urllib3.poolmanager.ProxyManager`` not retrying on connect errors.
          (Issue #310)
        * Disable Nagle's Algorithm on the socket for non-proxies. A subset of
          requests will send the entire HTTP request ~200 milliseconds faster;
          however, some of the resulting TCP packets will be smaller. (Issue #254)
        * Increased maximum number of SubjectAltNames in
          ``urllib3.contrib.pyopenssl`` from the default 64 to 1024 in a single
          certificate. (Issue #318)
        * Headers are now passed and stored as a custom
          ``urllib3.collections_.HTTPHeaderDict`` object rather than a plain
          ``dict``. (Issue #329, #333)
        * Headers no longer lose their case on Python 3. (Issue #236)
        * ``urllib3.contrib.pyopenssl`` now uses the operating system's default CA
          certificates on inject. (Issue #332)
        * Requests with ``retries=False`` will immediately raise any exceptions
          without wrapping them in ``MaxRetryError``. (Issue #348)
        * Fixed open socket leak with SSL-related failures. (Issue #344, #348)
        
        1.7.1 (2013-09-25)
        ++++++++++++++++++
        
        * Added granular timeout support with new ``urllib3.util.Timeout`` class.
          (Issue #231)
        * Fixed Python 3.4 support. (Issue #238)
        
        1.7 (2013-08-14)
        ++++++++++++++++
        
        * More exceptions are now pickle-able, with tests. (Issue #174)
        * Fixed redirecting with relative URLs in Location header. (Issue #178)
        * Support for relative urls in ``Location: ...`` header. (Issue #179)
        * ``urllib3.response.HTTPResponse`` now inherits from ``io.IOBase`` for bonus
          file-like functionality. (Issue #187)
        * Passing ``assert_hostname=False`` when creating a HTTPSConnectionPool will
          skip hostname verification for SSL connections. (Issue #194)
        * New method ``urllib3.response.HTTPResponse.stream(...)`` which acts as a
          generator wrapped around ``.read(...)``. (Issue #198)
        * IPv6 url parsing enforces brackets around the hostname.
          (Issue #199)
        * Fixed thread race condition in
          ``urllib3.poolmanager.PoolManager.connection_from_host(...)`` (Issue #204)
        * ``ProxyManager`` requests now include non-default port in ``Host: ...``
          header. (Issue #217)
        * Added HTTPS proxy support in ``ProxyManager``. (Issue #170 #139)
        * New ``RequestField`` object can be passed to the ``fields=...`` param which
          can specify headers. (Issue #220)
        * Raise ``urllib3.exceptions.ProxyError`` when connecting to proxy fails.
          (Issue #221)
        * Use international headers when posting file names. (Issue #119)
        * Improved IPv6 support. (Issue #203)
        
        1.6 (2013-04-25)
        ++++++++++++++++
        
        * Contrib: Optional SNI support for Py2 using PyOpenSSL. (Issue #156)
        * ``ProxyManager`` automatically adds ``Host: ...`` header if not given.
        * Improved SSL-related code. ``cert_req`` now optionally takes a string like
          "REQUIRED" or "NONE". Similarly, ``ssl_version`` takes strings like
          "SSLv23". The string values reflect the suffix of the respective constant
          variable. (Issue #130)
        * Vendored ``socksipy`` now based on Anorov's fork which handles unexpectedly
          closed proxy connections and larger read buffers. (Issue #135)
        * Ensure the connection is closed if no data is received, fixes connection
          leak on some platforms. (Issue #133)
        * Added SNI support for SSL/TLS connections on Py32+. (Issue #89)
        * Tests fixed to be compatible with Py26 again. (Issue #125)
        * Added ability to choose SSL version by passing an ``ssl.PROTOCOL_*``
          constant to the ``ssl_version`` parameter of ``HTTPSConnectionPool``.
          (Issue #109)
        * Allow an explicit content type to be specified when encoding file fields.
          (Issue #126)
        * Exceptions are now pickleable, with tests. (Issue #101)
        * Fixed default headers not getting passed in some cases. (Issue #99)
        * Treat "content-encoding" header value as case-insensitive, per RFC 2616
          Section 3.5. (Issue #110)
        * "Connection Refused" SocketErrors will get retried rather than raised.
          (Issue #92)
        * Updated vendored ``six``, no longer overrides the global ``six`` module
          namespace. (Issue #113)
        * ``urllib3.exceptions.MaxRetryError`` contains a ``reason`` property holding
          the exception that prompted the final retry. If ``reason is None`` then it
          was due to a redirect. (Issue #92, #114)
        * Fixed ``PoolManager.urlopen()`` not redirecting more than once.
          (Issue #149)
        * Don't assume ``Content-Type: text/plain`` for multi-part encoding
          parameters that are not files. (Issue #111)
        * Pass ``strict`` param down to ``httplib.HTTPConnection``. (Issue #122)
        * Added mechanism to verify SSL certificates by fingerprint (md5, sha1) or
          against an arbitrary hostname (when connecting by IP or for misconfigured
          servers). (Issue #140)
        * Streaming decompression support. (Issue #159)
        
        1.5 (2012-08-02)
        ++++++++++++++++
        
        * Added ``urllib3.add_stderr_logger()`` for quickly enabling STDERR debug
          logging in urllib3.
        * Native full URL parsing (including auth, path, query, fragment) available
          in ``urllib3.util.parse_url(url)``.
        * Built-in redirect will switch method to 'GET' if status code is 303.
          (Issue #11)
        * ``urllib3.PoolManager`` strips the scheme and host before sending the
          request uri. (Issue #8)
        * New ``urllib3.exceptions.DecodeError`` exception for when automatic
          decoding, based on the Content-Type header, fails.
        * Fixed bug with pool depletion and leaking connections (Issue #76). Added
          explicit connection closing on pool eviction. Added
          ``urllib3.PoolManager.clear()``.
        * 99% -> 100% unit test coverage.
        
        1.4 (2012-06-16)
        ++++++++++++++++
        
        * Minor AppEngine-related fixes.
        * Switched from ``mimetools.choose_boundary`` to ``uuid.uuid4()``.
        * Improved url parsing. (Issue #73)
        * IPv6 url support. (Issue #72)
        
        1.3 (2012-03-25)
        ++++++++++++++++
        
        * Removed pre-1.0 deprecated API.
        * Refactored helpers into a ``urllib3.util`` submodule.
        * Fixed multipart encoding to support list-of-tuples for keys with multiple
          values.
          (Issue #48)
        * Fixed multiple Set-Cookie headers in response not getting merged properly
          in Python 3. (Issue #53)
        * AppEngine support with Py27. (Issue #61)
        * Minor ``encode_multipart_formdata`` fixes related to Python 3 strings vs
          bytes.
        
        1.2.2 (2012-02-06)
        ++++++++++++++++++
        
        * Fixed packaging bug of not shipping ``test-requirements.txt``. (Issue #47)
        
        1.2.1 (2012-02-05)
        ++++++++++++++++++
        
        * Fixed another bug related to when ``ssl`` module is not available.
          (Issue #41)
        * Location parsing errors now raise ``urllib3.exceptions.LocationParseError``
          which inherits from ``ValueError``.
        
        1.2 (2012-01-29)
        ++++++++++++++++
        
        * Added Python 3 support (tested on 3.2.2)
        * Dropped Python 2.5 support (tested on 2.6.7, 2.7.2)
        * Use ``select.poll`` instead of ``select.select`` for platforms that
          support it.
        * Use ``Queue.LifoQueue`` instead of ``Queue.Queue`` for more aggressive
          connection reusing. Configurable by overriding
          ``ConnectionPool.QueueCls``.
        * Fixed ``ImportError`` during install when ``ssl`` module is not available.
          (Issue #41)
        * Fixed ``PoolManager`` redirects between schemes (such as HTTP -> HTTPS) not
          completing properly. (Issue #28, uncovered by Issue #10 in v1.1)
        * Ported ``dummyserver`` to use ``tornado`` instead of ``webob`` +
          ``eventlet``. Removed extraneous unsupported dummyserver testing backends.
          Added socket-level tests.
        * More tests. Achievement Unlocked: 99% Coverage.
        
        1.1 (2012-01-07)
        ++++++++++++++++
        
        * Refactored ``dummyserver`` to its own root namespace module (used for
          testing).
        * Added hostname verification for ``VerifiedHTTPSConnection`` by vendoring in
          Py32's ``ssl_match_hostname``. (Issue #25)
        * Fixed cross-host HTTP redirects when using ``PoolManager``. (Issue #10)
        * Fixed ``decode_content`` being ignored when set through ``urlopen``.
          (Issue #27)
        * Fixed timeout-related bugs. (Issues #17, #23)
        
        1.0.2 (2011-11-04)
        ++++++++++++++++++
        
        * Fixed typo in ``VerifiedHTTPSConnection`` which would only present as a bug
          if you're using the object manually.
          (Thanks pyos)
        * Made RecentlyUsedContainer (and consequently PoolManager) more thread-safe
          by wrapping the access log in a mutex. (Thanks @christer)
        * Made RecentlyUsedContainer more dict-like (corrected ``__delitem__`` and
          ``__getitem__`` behaviour), with tests. Shouldn't affect core urllib3 code.
        
        1.0.1 (2011-10-10)
        ++++++++++++++++++
        
        * Fixed a bug where the same connection would get returned into the pool
          twice, causing extraneous "HttpConnectionPool is full" log warnings.
        
        1.0 (2011-10-08)
        ++++++++++++++++
        
        * Added ``PoolManager`` with LRU expiration of connections (tested and
          documented).
        * Added ``ProxyManager`` (needs tests, docs, and confirmation that it works
          with HTTPS proxies).
        * Added optional partial-read support for responses when
          ``preload_content=False``. You can now make requests and just read the
          headers without loading the content.
        * Made response decoding optional (default on, same as before).
        * Added optional explicit boundary string for ``encode_multipart_formdata``.
        * Convenience request methods are now inherited from ``RequestMethods``. Old
          helpers like ``get_url`` and ``post_url`` should be abandoned in favour of
          the new ``request(method, url, ...)``.
        * Refactored code to be even more decoupled, reusable, and extendable.
        * License header added to ``.py`` files.
        * Embiggened the documentation: Lots of Sphinx-friendly docstrings in the
          code and docs in ``docs/`` and on urllib3.readthedocs.org.
        * Embettered all the things!
        * Started writing this file.
        
        0.4.1 (2011-07-17)
        ++++++++++++++++++
        
        * Minor bug fixes, code cleanup.
        
        0.4 (2011-03-01)
        ++++++++++++++++
        
        * Better unicode support.
        * Added ``VerifiedHTTPSConnection``.
        * Added ``NTLMConnectionPool`` in contrib.
        * Minor improvements.
        
        0.3.1 (2010-07-13)
        ++++++++++++++++++
        
        * Added ``assert_host_name`` optional parameter. Now compatible with proxies.
        
        0.3 (2009-12-10)
        ++++++++++++++++
        
        * Added HTTPS support.
        * Minor bug fixes.
        * Refactored, broken backwards compatibility with 0.2.
        * API to be treated as stable from this version forward.
        
        0.2 (2008-11-17)
        ++++++++++++++++
        
        * Added unit tests.
        * Bug fixes.
        
        0.1 (2008-11-16)
        ++++++++++++++++
        
        * First release.
        
Keywords: urllib httplib threadsafe filepost http https ssl pooling
Platform: UNKNOWN
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 3
Classifier: Topic :: Internet :: WWW/HTTP
Classifier: Topic :: Software Development :: Libraries