Diffstat (limited to 'README.rst')
-rw-r--r-- README.rst 141
1 files changed, 24 insertions, 117 deletions
diff --git a/README.rst b/README.rst
index 08dfc50..093edc1 100644
--- a/README.rst
+++ b/README.rst
@@ -16,8 +16,8 @@ Because it relies on html5lib_, Bleach is as good as modern browsers at dealing
with weird, quirky HTML fragments. And *any* of Bleach's methods will fix
unbalanced or mis-nested tags.
-The version on `github <http://github.com/jsocol/bleach>`_ is the most
-up-to-date and contains the latest bug fixes.
+The version on GitHub_ is the most up-to-date and contains the latest bug
+fixes. You can find full documentation on `ReadTheDocs`_.
Basic Use
@@ -33,9 +33,6 @@ The simplest way to use Bleach is::
>>> bleach.linkify('an http://example.com url')
u'an <a href="http://example.com" rel="nofollow">http://example.com</a> url'
- >>> bleach.delinkify('a <a href="http://ex.mp">link</a>')
- u'a link'
-
*NB*: Bleach always returns a ``unicode`` object, whether you give it a
bytestring or a ``unicode`` object, but Bleach does not attempt to detect
incoming character encodings, and will assume UTF-8. If you are using a
@@ -43,117 +40,27 @@ different character encoding, you should convert from a bytestring to
``unicode`` before passing the text to Bleach.
-Customizing Bleach
-==================
-
-``clean()``, ``linkify()`` and ``delinkify()`` can take several optional
-keyword arguments to customize their behavior.
-
-
-``clean()``
------------
-
-``bleach.clean()`` is the primary tool in Bleach. It uses html5lib_ to parse a
-document fragment into a tree and does the sanitization during tokenizing,
-which is incredibly powerful and has several advantages over regular
-expression-based sanitization.
-
-``tags``
- A whitelist of HTML tags. Must be a list. Defaults to
- ``bleach.ALLOWED_TAGS``.
-``attributes``
- A whitelist of HTML attributes. Either a list, in which case all attributes
- are allowed on all elements, or a dict, with tag names as keys and lists of
- allowed attributes as values ('*' is a wildcard key to allow an attribute on
- any tag). Or it is possible to pass a callable instead of a list that accepts
- name and value of attribute and returns True of False. Defaults to
- ``bleach.ALLOWED_ATTRIBUTES``.
-``styles``
- A whitelist of allowed CSS properties within a ``style`` attribute. (Note
- that ``style`` attributes are not allowed by default.) Must be a list.
- Defaults to ``[]``.
-``strip``
- Strip disallowed HTML instead of escaping it. A boolean. Defaults to
- ``False``.
-``strip_comments``
- Strip HTML comments. A boolean. Defaults to ``True``.
-
-
-``linkify()``
--------------
-
-``bleach.linkify()`` turns things that look like URLs or (optionally) email
-addresses and turns them into links. It does this smartly, only looking in text
-nodes, and never within ``<a>`` tags.
-
-There are options that affect output, and some of these are also applied to
-links already found in the text. These are designed to allow you to set
-attributes like ``rel="nofollow"`` or ``target``, or push outgoing links
-through a redirection URL, and do this to links already in the text, as well.
-
-``nofollow``
- Add ``rel="nofollow"`` to non-relative links (both created by ``linkify()``
- and those already present in the text). Defaults to ``True``.
-``filter_url``
- A callable through which the ``href`` attribute of links (both created by
- ``linkify()`` and already present in the text) will be passed. Must accept a
- single argument and return a string.
-``filter_text``
- A callable through which the text of links (only those created by
- ``linkify``) will be passed. Must accept a single argument and return a
- string.
-``skip_pre``
- Do not create new links inside ``<pre>`` sections. Still follows
- ``nofollow``. Defaults to ``False``.
-``parse_email``
- Linkify email addresses with ``mailto:``. Defaults to ``False``.
-``target``
- Set a ``target`` attribute on links. Like ``nofollow``, if ``target`` is not
- ``None``, will set the attribute on links already in the text, as well.
- Defaults to ``None``.
-
-
-``delinkify()``
----------------
-
-``bleach.delinkify()`` is basically the opposite of ``linkify()``. It strips
-links out of text except, optionally, relative links, or links to domains
-you've whitelisted.
-
-``allow_domains``
- Allow links to the domains in this list. Set to ``None`` or an empty list to
- disallow all non-relative domains. See below for wildcards. Defaults to
- ``None``.
-``allow_relative``
- Allow relative links (i.e. those with no hostname). Defaults to ``False``.
-
-
-Wildcards
-^^^^^^^^^
-
-To allow links to a domain and its subdomains, ``allow_domains`` accepts two
-types of wildcard arguments in domains:
-
-``*``
- Allow a single level of subdomain. This can be anywhere in the hostname, even
- the TLD. This allows you to, for example, allow links to ``example.*``.
- ``*.example.com`` will match both ``foo.example.com`` and ``example.com``.
- ::
- >>> delinkify('<a href="http://foo.ex.mp">bar</a>', \
- ... allow_domains=['*.ex.*'])
- u'<a href="http://foo.ex.mp">bar</a>'
- >>> delinkify('<a href="http://ex.mp">bar</a>', allow_domains=['*.ex.mp'])
- u'<a href="http://ex.mp">bar</a>
-``**``
- To allow any number of *preceding* subdomains, you can start a hostname with
- ``**``. Note that unlike ``*``, ``**`` may only appear once, and only at the
- beginning of a hostname.
- ::
- >>> delinkify('<a href="http://a.b.ex.mp">t</a>', \
- ... allow_domains=['**.ex.mp'])
- u'<a href="http://a.b.ex.mp">t</a>'
- If ``**`` appears anywhere but the beginning of a hostname, ``delinkify``
- will throw ``bleach.ValidationError`` (which is a ``ValueError`` subclass,
- for easy catching).
+Installation
+------------
+
+Bleach is available on PyPI_, so you can install it with ``pip``::
+
+ $ pip install bleach
+
+Or with ``easy_install``::
+
+ $ easy_install bleach
+
+Or by cloning the repo from GitHub_::
+
+ $ git clone git://github.com/jsocol/bleach.git
+
+Then install it by running::
+
+ $ python setup.py install
+
.. _html5lib: http://code.google.com/p/html5lib/
+.. _GitHub: https://github.com/jsocol/bleach
+.. _ReadTheDocs: http://bleach.readthedocs.org/
+.. _PyPI: http://pypi.python.org/pypi/bleach
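
The *NB* retained in the README says Bleach assumes UTF-8 and that text in any other encoding should be converted to ``unicode`` first. A minimal sketch of that conversion step follows. The helper name ``to_text`` and the Latin-1 sample input are assumptions made for illustration and are not part of Bleach; the sketch uses only the standard library and is written for Python 3, where ``str`` plays the role of ``unicode``.

```python
def to_text(data, encoding="utf-8"):
    """Hypothetical helper: return text, decoding bytestrings explicitly.

    Text that is already decoded is returned unchanged.
    """
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


# Sample input encoded as Latin-1 (an assumption for this example).
raw = "caf\u00e9 <b>menu</b>".encode("latin-1")

# Decode with the encoding the data really uses before handing it to Bleach.
text = to_text(raw, encoding="latin-1")
```

After this step, ``text`` can be passed to ``bleach.clean()`` or ``bleach.linkify()`` without relying on the UTF-8 assumption.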