pyndexter.util

(Not documented)

class URI()

Parse a URI into its component parts. The query component is passed through cgi.parse_qs().

scheme://username:password@host/path?query#fragment

Each component is available as an attribute of the object.

TODO: Support "parameters"? These have never been seen in the wild:
scheme://username:password@host/path;parameters?query#fragment

PS. The standard library's urlparse module is not useful for this.
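
The splitting described above can be sketched with a single regular expression. This is only an illustration and not pyndexter's actual implementation: the pattern, the name `split_uri`, and the returned dictionary are all assumptions. It uses `urllib.parse.parse_qs`, the modern home of the function that older Python versions also exposed as `cgi.parse_qs`.

```python
import re
from urllib.parse import parse_qs

# Hypothetical pattern for scheme://username:password@host:port/path?query#fragment.
# It is not taken from pyndexter and ignores the ";parameters" component.
_URI_RE = re.compile(
    r'^(?:(?P<scheme>[A-Za-z][A-Za-z0-9+.-]*)://)?'
    r'(?:(?P<username>[^:@/]*)(?::(?P<password>[^@/]*))?@)?'
    r'(?P<host>[^/?#:]*)'
    r'(?::(?P<port>\d+))?'
    r'(?P<path>[^?#]*)'
    r'(?:\?(?P<query>[^#]*))?'
    r'(?:#(?P<fragment>.*))?$'
)


def split_uri(uri):
    """Split a URI string into a dict of components (illustrative helper)."""
    match = _URI_RE.match(uri)
    if match is None:
        raise ValueError('cannot parse URI: %r' % (uri,))
    # Components that are absent come back as empty strings.
    parts = {name: value or '' for name, value in match.groupdict().items()}
    # The query string becomes a dict mapping each key to a list of values.
    parts['query'] = parse_qs(parts['query'])
    return parts
```

Because `parse_qs` collects repeated keys into lists, a query such as `parm=1&parm=2` yields a single `parm` entry with two values, which matches the shape of the `query` attribute shown in the examples below.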

The URI constructor can be passed a string:

>>> u = URI('http://user:password@www.example.com/some/path?parm=1&parm=2&other=3#fragment')
>>> u
URI(u'http://user:password@www.example.com/some/path?other=3&parm=1&parm=2#fragment')
>>> u.scheme
'http'
>>> u.username
'user'
>>> u.password
'password'
>>> u.host
'www.example.com'
>>> u.path
'/some/path'
>>> u.query
{'parm': ['1', '2'], 'other': ['3']}
>>> u.fragment
'fragment'

...or the individual URI components as keyword arguments:

>>> URI(scheme='http', username='user', password='password', host='www.example.com', path='/some/path', query={'parm': [1, 2], 'other': [3]}, fragment='fragment')
URI(u'http://user:password@www.example.com/some/path?other=3&parm=1&parm=2#fragment')

...or finally, another URI object:

>>> v = URI(u)
>>> v == u
True
>>> v.query is u.query
False
>>> v
URI(u'http://user:password@www.example.com/some/path?other=3&parm=1&parm=2#fragment')

URI also normalises the path component:

>>> URI('http://www.example.com//some/../foo/path/')
URI(u'http://www.example.com/foo/path')
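
One way such path normalisation might be done is with the standard `posixpath.normpath`. This is a sketch under that assumption; the source does not say how URI normalises paths, and `normalise_path` is a hypothetical name.

```python
import posixpath


def normalise_path(path):
    """Collapse duplicate slashes, resolve '..' and drop a trailing slash."""
    if not path:
        return ''
    # posixpath.normpath preserves exactly two leading slashes, so reduce
    # any run of leading slashes to a single one before normalising.
    return posixpath.normpath('/' + path.lstrip('/'))
```

With this helper, `normalise_path('//some/../foo/path/')` returns `'/foo/path'`, the same path that appears in the example above.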

__init__(self, uri=None, scheme='', username='', password='', host='', port='', path='', query=, fragment='')

(Not documented)

__cmp__(self, other)

Compare two URI objects.

>>> u = URI('http://user:password@www.example.com/some/path?parm=1&parm=2&other=3#fragment')
>>> v = URI(u)
>>> u == v
True
>>> v.host = 'www.google.com'
>>> u == v
False
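
A comparison like this can be sketched by comparing the components as a tuple. The class below is a hypothetical stand-in, not the real URI class, and it uses `__eq__` rather than `__cmp__` because Python 3 no longer calls `__cmp__`.

```python
import copy


class MiniURI(object):
    """Hypothetical stand-in for URI that only stores and compares components."""

    _FIELDS = ('scheme', 'username', 'password', 'host', 'port',
               'path', 'query', 'fragment')

    def __init__(self, **components):
        for name in self._FIELDS:
            default = {} if name == 'query' else ''
            # Deep-copy so that two objects never share a query dict.
            setattr(self, name, copy.deepcopy(components.get(name, default)))

    def _key(self):
        # All components, in a fixed order, as one comparable tuple.
        return tuple(getattr(self, name) for name in self._FIELDS)

    def __eq__(self, other):
        if not isinstance(other, MiniURI):
            return NotImplemented
        return self._key() == other._key()

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result
```

Two `MiniURI` objects built from the same components compare equal, and changing `host` on one of them makes them unequal, mirroring the doctest above.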

__repr__(self)

(Not documented)

__str__(self)

(Not documented)

excerpt(text, terms, max_len=240, fuzz=60)

Generate an excerpt of a Document. Attempts to include as many terms as possible in the excerpt.
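
The idea of fitting as many terms as possible into the excerpt could be sketched as a search for the window of `max_len` characters that covers the most distinct terms. This is an assumption about the approach, not the documented algorithm; `excerpt_sketch` is a hypothetical name, and the `fuzz` parameter is left out because its meaning is not documented here.

```python
def excerpt_sketch(text, terms, max_len=240):
    """Return the slice of text, at most max_len long, covering the most terms."""
    lowered = text.lower()
    # Record (position, term) for every case-insensitive occurrence.
    hits = []
    for term in set(t.lower() for t in terms if t):
        start = lowered.find(term)
        while start != -1:
            hits.append((start, term))
            start = lowered.find(term, start + 1)
    if not hits:
        # No term occurs at all: fall back to the start of the text.
        return text[:max_len]
    hits.sort()
    best_start, best_count = hits[0][0], 0
    # Try a window starting at each hit; keep the one with most distinct terms.
    for i, (start, _) in enumerate(hits):
        covered = set()
        for pos, term in hits[i:]:
            if pos >= start + max_len:
                break
            if pos + len(term) <= start + max_len:
                covered.add(term)
        if len(covered) > best_count:
            best_start, best_count = start, len(covered)
    return text[best_start:best_start + max_len]
```

The sketch checks every candidate window against the hits that follow it, which is quadratic in the number of hits; that is acceptable for short documents but would need a sliding-window count for large ones.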