10. Brief Tour of the Standard Library



10.1. Operating System Interface


The os module provides dozens of functions for interacting with the
operating system:


>>> import os
>>> os.getcwd() # Return the current working directory
'C:\\Python311'
>>> os.chdir('/server/accesslogs') # Change current working directory
>>> os.system('mkdir today') # Run the command mkdir in the system shell
0


Be sure to use the import os style instead of from os import *. This
keeps os.open() from shadowing the built-in open() function, which
operates quite differently.


The built-in dir() and help() functions are useful as interactive
aids for working with large modules like os:


>>> import os
>>> dir(os)
<returns a list of all module functions>
>>> help(os)
<returns an extensive manual page created from the module's docstrings>


For daily file and directory management tasks, the shutil module provides
a higher level interface that is easier to use:


>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
'archive.db'
>>> shutil.move('/build/executables', 'installdir')
'installdir'
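
The same shutil calls can be tried safely inside a scratch directory. The following is a minimal sketch; the use of tempfile and the file names are assumptions made for illustration, not part of the example above:

```python
import os
import shutil
import tempfile

# Work inside a throwaway directory so nothing outside it is modified.
with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, 'data.db')      # hypothetical file name
    with open(source, 'w') as f:
        f.write('example contents')

    # copyfile() copies the file contents and returns the destination path.
    copy_path = shutil.copyfile(source, os.path.join(workdir, 'archive.db'))

    # move() into an existing directory places the file inside it
    # and returns the new path.
    os.mkdir(os.path.join(workdir, 'installdir'))
    moved_path = shutil.move(copy_path, os.path.join(workdir, 'installdir'))

    listing = sorted(os.listdir(workdir))
    moved_name = os.path.basename(moved_path)

print(listing)
print(moved_name)
```

The temporary directory and everything in it are deleted when the with block ends.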




10.2. File Wildcards


The glob module provides a function for making file lists from directory
wildcard searches:


>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']
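
glob can also search subdirectories when the pattern contains ** and recursive=True is passed. A small sketch, assuming a made-up directory layout created in a temporary directory:

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Build a tiny tree: two files at the top level, one in a subdirectory.
    os.makedirs(os.path.join(root, 'pkg'))
    for name in ('primes.py', 'notes.txt', os.path.join('pkg', 'quote.py')):
        open(os.path.join(root, name), 'w').close()

    # '*.py' matches only files directly inside root.
    top_level = sorted(
        os.path.relpath(path, root)
        for path in glob.glob(os.path.join(root, '*.py')))

    # '**' with recursive=True matches zero or more nested directories.
    everything = sorted(
        os.path.relpath(path, root)
        for path in glob.glob(os.path.join(root, '**', '*.py'), recursive=True))

print(top_level)
print(everything)
```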




10.3. Command Line Arguments


Common utility scripts often need to process command line arguments. These
arguments are stored in the sys module’s argv attribute as a list. For
instance, the following output results from running python demo.py one two three
at the command line:


>>> import sys
>>> print(sys.argv)
['demo.py', 'one', 'two', 'three']


The argparse module provides a more sophisticated mechanism to process
command line arguments. The following script extracts one or more filenames
and an optional number of lines to be displayed:


import argparse

parser = argparse.ArgumentParser(
    prog='top',
    description='Show top lines from each file')
parser.add_argument('filenames', nargs='+')
parser.add_argument('-l', '--lines', type=int, default=10)
args = parser.parse_args()
print(args)


When run at the command line with python top.py --lines=5 alpha.txt beta.txt,
the script sets args.lines to 5 and args.filenames
to ['alpha.txt', 'beta.txt'].
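
The parser can also be exercised without a command line, because parse_args() accepts an explicit list of strings in place of sys.argv. A sketch of that approach (the build_parser helper is a name introduced here for illustration):

```python
import argparse

def build_parser():
    """Return the same parser as the top.py script (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog='top',
        description='Show top lines from each file')
    parser.add_argument('filenames', nargs='+')
    parser.add_argument('-l', '--lines', type=int, default=10)
    return parser

# An explicit argument list is parsed instead of sys.argv[1:].
args = build_parser().parse_args(['--lines=5', 'alpha.txt', 'beta.txt'])
print(args.lines, args.filenames)

# Omitting --lines falls back to the declared default.
defaults = build_parser().parse_args(['alpha.txt'])
print(defaults.lines)
```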




10.4. Error Output Redirection and Program Termination


The sys module also has attributes for stdin, stdout, and stderr.
The latter is useful for emitting warnings and error messages to make them
visible even when stdout has been redirected:


>>> sys.stderr.write('Warning, log file not found starting a new one\n')
Warning, log file not found starting a new one


The most direct way to terminate a script is to use sys.exit().
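
One common way to combine the two ideas is to have a main() function return an exit status and pass it to sys.exit(). The sketch below is illustrative only: the main() function and its usage message are assumptions. It also shows that sys.exit() works by raising SystemExit, which is caught here so the example keeps running:

```python
import sys

def main(argv):
    """Return an exit status: 0 for success, non-zero for failure."""
    if len(argv) < 2:
        # Diagnostics go to stderr so they stay visible if stdout is redirected.
        print('usage: demo.py FILE', file=sys.stderr)
        return 2
    print('processing', argv[1])
    return 0

# sys.exit() raises SystemExit; catching it here is only for demonstration.
try:
    sys.exit(main(['demo.py']))
except SystemExit as exc:
    status = exc.code

print('exit status:', status)
```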




10.5. String Pattern Matching


The re module provides regular expression tools for advanced string
processing. For complex matching and manipulation, regular expressions offer
succinct, optimized solutions:


>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'


When only simple capabilities are needed, string methods are preferred because
they are easier to read and debug:


>>> 'tea for too'.replace('too', 'two')
'tea for two'




10.6. Mathematics


The math module gives access to the underlying C library functions for
floating point math:


>>> import math
>>> math.cos(math.pi / 4)
0.7071067811865476
>>> math.log(1024, 2)
10.0


The random module provides tools for making random selections:


>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10) # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random() # random float
0.17970987693706186
>>> random.randrange(6) # random integer chosen from range(6)
4


The statistics module calculates basic statistical properties
(the mean, median, variance, etc.) of numeric data:


>>> import statistics
>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> statistics.mean(data)
1.6071428571428572
>>> statistics.median(data)
1.25
>>> statistics.variance(data)
1.3720238095238095


The SciPy project <https://scipy.org> has many other modules for numerical
computations.




10.7. Internet Access


There are a number of modules for accessing the internet and processing internet
protocols. Two of the simplest are urllib.request for retrieving data
from URLs and smtplib for sending mail:


>>> from urllib.request import urlopen
>>> with urlopen('http://worldtimeapi.org/api/timezone/etc/UTC.txt') as response:
...     for line in response:
...         line = line.decode()             # Convert bytes to a str
...         if line.startswith('datetime'):
...             print(line.rstrip())         # Remove trailing newline
...
datetime: 2022-01-01T01:36:47.689215+00:00

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
... """To: jcaesar@example.org
... From: soothsayer@example.org
...
... Beware the Ides of March.
... """)
>>> server.quit()


(Note that the second example needs a mailserver running on localhost.)




10.8. Dates and Times


The datetime module supplies classes for manipulating dates and times in
both simple and complex ways. While date and time arithmetic is supported, the
focus of the implementation is on efficient member extraction for output
formatting and manipulation. The module also supports objects that are timezone
aware.


>>> # dates are easily constructed and formatted
>>> from datetime import date
>>> now = date.today()
>>> now
datetime.date(2003, 12, 2)
>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'

>>> # dates support calendar arithmetic
>>> birthday = date(1964, 7, 31)
>>> age = now - birthday
>>> age.days
14368
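
Timezone-aware objects, mentioned above, carry their UTC offset with them. A minimal sketch using a fixed offset (the five-hour offset and the timestamp are chosen only for illustration):

```python
from datetime import datetime, timedelta, timezone

# An aware datetime records which offset from UTC it is expressed in.
utc_time = datetime(2003, 12, 2, 15, 30, tzinfo=timezone.utc)

# A fixed offset five hours behind UTC (an arbitrary example value).
five_behind = timezone(timedelta(hours=-5))
local_time = utc_time.astimezone(five_behind)

print(local_time.isoformat())
# Aware datetimes compare by the instant they represent, not by wall-clock fields.
print(local_time == utc_time)
```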




10.9. Data Compression


Common data archiving and compression formats are directly supported by modules
including: zlib, gzip, bz2, lzma, zipfile and
tarfile.


>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'
>>> zlib.crc32(s)
226805979
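
The other compression modules follow a similar pattern. As a sketch, here is the equivalent round trip with gzip; the input is repeated 100 times (an arbitrary choice) so that the size reduction is easier to see:

```python
import gzip

# Repetitive input compresses well; the repeat count is arbitrary.
text = b'witch which has which witches wrist watch' * 100

packed = gzip.compress(text)        # bytes in gzip format
restored = gzip.decompress(packed)  # back to the original bytes

print(len(text), len(packed))
print(restored == text)
```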




10.10. Performance Measurement


Some Python users develop a deep interest in knowing the relative performance of
different approaches to the same problem. Python provides a measurement tool
that answers those questions immediately.


For example, it may be tempting to use the tuple packing and unpacking feature
instead of the traditional approach to swapping arguments. The timeit
module quickly demonstrates a modest performance advantage:


>>> from timeit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
0.57535828626024577
>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
0.54962537085770791


In contrast to timeit’s fine level of granularity, the profile and
pstats modules provide tools for identifying time-critical sections in
larger blocks of code.
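
Timings vary from run to run, so a single measurement can mislead. One hedged approach is to repeat the measurement and compare the minimum of each series. In this sketch the two swap functions and the repeat counts are assumptions chosen to keep the run short:

```python
import timeit

def swap_with_temp(a, b):
    t = a
    a = b
    b = t
    return a, b

def swap_with_tuple(a, b):
    a, b = b, a
    return a, b

# repeat() performs the whole measurement several times; the minimum is
# the figure least disturbed by other activity on the machine.
temp_best = min(timeit.repeat(lambda: swap_with_temp(1, 2),
                              number=10_000, repeat=3))
tuple_best = min(timeit.repeat(lambda: swap_with_tuple(1, 2),
                               number=10_000, repeat=3))

print('temp variable: %.6f s' % temp_best)
print('tuple packing: %.6f s' % tuple_best)
```

The absolute numbers depend on the machine and Python version, so no particular output is shown.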




10.11. Quality Control


One approach for developing high quality software is to write tests for each
function as it is developed and to run those tests frequently during the
development process.


The doctest module provides a tool for scanning a module and validating
tests embedded in a program’s docstrings. Test construction is as simple as
cutting-and-pasting a typical call along with its results into the docstring.
This improves the documentation by providing the user with an example and it
allows the doctest module to make sure the code remains true to the
documentation:


def average(values):
    """Computes the arithmetic mean of a list of numbers.

    >>> print(average([20, 30, 70]))
    40.0
    """
    return sum(values) / len(values)

import doctest
doctest.testmod()   # automatically validate the embedded tests


The unittest module is not as effortless as the doctest module,
but it allows a more comprehensive set of tests to be maintained in a separate
file:


import unittest

class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        with self.assertRaises(ZeroDivisionError):
            average([])
        with self.assertRaises(TypeError):
            average(20, 30, 70)

unittest.main()  # Calling from the command line invokes all tests




10.12. Batteries Included


Python has a “batteries included” philosophy. This is best seen through the
sophisticated and robust capabilities of its larger packages. For example:



  • The xmlrpc.client and xmlrpc.server modules make implementing
    remote procedure calls into an almost trivial task. Despite the modules’
    names, no direct knowledge or handling of XML is needed.


  • The email package is a library for managing email messages, including
    MIME and other RFC 2822-based message documents. Unlike smtplib and
    poplib which actually send and receive messages, the email package has
    a complete toolset for building or decoding complex message structures
    (including attachments) and for implementing internet encoding and header
    protocols.


  • The json package provides robust support for parsing this
    popular data interchange format. The csv module supports
    direct reading and writing of files in Comma-Separated Value format,
    commonly supported by databases and spreadsheets. XML processing is
    supported by the xml.etree.ElementTree, xml.dom and
    xml.sax packages. Together, these modules and packages
    greatly simplify data interchange between Python applications and
    other tools.


  • The sqlite3 module is a wrapper for the SQLite database
    library, providing a persistent database that can be updated and
    accessed using slightly nonstandard SQL syntax.


  • Internationalization is supported by a number of modules including
    gettext, locale, and the codecs package.
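
As a small illustration of the data interchange modules in the list above, the sketch below round-trips the same records through json and csv. The records and field names are made up, and an in-memory buffer stands in for a real file:

```python
import csv
import io
import json

records = [{'name': 'alpha', 'size': 3}, {'name': 'beta', 'size': 5}]

# JSON preserves the structure and the numeric types.
encoded = json.dumps(records)
decoded = json.loads(encoded)

# CSV is written to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['name', 'size'])
writer.writeheader()
writer.writerows(records)

# Reading it back yields strings, because CSV has no notion of types.
buffer.seek(0)
rows = list(csv.DictReader(buffer))

print(decoded)
print(rows)
```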
















11. Brief Tour of the Standard Library — Part II


This second tour covers more advanced modules that support professional
programming needs. These modules rarely occur in small scripts.



11.1. Output Formatting


The reprlib module provides a version of repr() customized for
abbreviated displays of large or deeply nested containers:


>>> import reprlib
>>> reprlib.repr(set('supercalifragilisticexpialidocious'))
"{'a', 'c', 'd', 'e', 'f', 'g', ...}"


The pprint module offers more sophisticated control over printing both
built-in and user defined objects in a way that is readable by the interpreter.
When the result is longer than one line, the “pretty printer” adds line breaks
and indentation to more clearly reveal data structure:


>>> import pprint
>>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
...     'yellow'], 'blue']]]
...
>>> pprint.pprint(t, width=30)
[[[['black', 'cyan'],
   'white',
   ['green', 'red']],
  [['magenta', 'yellow'],
   'blue']]]


The textwrap module formats paragraphs of text to fit a given screen
width:


>>> import textwrap
>>> doc = """The wrap() method is just like fill() except that it returns
... a list of strings instead of one big string with newlines to separate
... the wrapped lines."""
...
>>> print(textwrap.fill(doc, width=40))
The wrap() method is just like fill()
except that it returns a list of strings
instead of one big string with newlines
to separate the wrapped lines.


The locale module accesses a database of culture-specific data formats.
The grouping argument of locale’s format_string() function provides a direct
way of formatting numbers with group separators:


>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
'English_United States.1252'
>>> conv = locale.localeconv()          # get a mapping of conventions
>>> x = 1234567.8
>>> locale.format_string("%d", x, grouping=True)
'1,234,567'
>>> locale.format_string("%s%.*f", (conv['currency_symbol'],
...                      conv['frac_digits'], x), grouping=True)
'$1,234,567.80'




11.2. Templating


The string module includes a versatile Template class
with a simplified syntax suitable for editing by end-users. This allows users
to customize their applications without having to alter the application.


The format uses placeholder names formed by $ with valid Python identifiers
(alphanumeric characters and underscores). Surrounding the placeholder with
braces allows it to be followed by more alphanumeric letters with no intervening
spaces. Writing $$ creates a single escaped $:


>>> from string import Template
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
'Nottinghamfolk send $10 to the ditch fund.'


The substitute() method raises a KeyError when a
placeholder is not supplied in a dictionary or a keyword argument. For
mail-merge style applications, user supplied data may be incomplete and the
safe_substitute() method may be more appropriate —
it will leave placeholders unchanged if data is missing:


>>> t = Template('Return the $item to $owner.')
>>> d = dict(item='unladen swallow')
>>> t.substitute(d)
Traceback (most recent call last):
...
KeyError: 'owner'
>>> t.safe_substitute(d)
'Return the unladen swallow to $owner.'


Template subclasses can specify a custom delimiter. For example, a batch
renaming utility for a photo browser may elect to use percent signs for
placeholders such as the current date, image sequence number, or file format:


>>> import time, os.path
>>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
>>> class BatchRename(Template):
...     delimiter = '%'
...
>>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format): ')
Enter rename style (%d-date %n-seqnum %f-format): Ashley_%n%f

>>> t = BatchRename(fmt)
>>> date = time.strftime('%d%b%y')
>>> for i, filename in enumerate(photofiles):
...     base, ext = os.path.splitext(filename)
...     newname = t.substitute(d=date, n=i, f=ext)
...     print('{0} --> {1}'.format(filename, newname))
...
img_1074.jpg --> Ashley_0.jpg
img_1076.jpg --> Ashley_1.jpg
img_1077.jpg --> Ashley_2.jpg


Another application for templating is separating program logic from the details
of multiple output formats. This makes it possible to substitute custom
templates for XML files, plain text reports, and HTML web reports.
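
A sketch of that separation: the data is prepared once and each output format is just a different template. The report contents and the three templates are invented for illustration, and note that Template performs no escaping for HTML or XML:

```python
from string import Template

# The program logic produces plain data (hypothetical values).
report = {'title': 'Disk usage', 'value': '82%'}

# Each output format is described only by its template.
formats = {
    'text': Template('$title: $value'),
    'html': Template('<p><b>$title</b>: $value</p>'),
    'xml': Template('<metric name="$title">$value</metric>'),
}

rendered = {name: tmpl.substitute(report) for name, tmpl in formats.items()}

for name, output in rendered.items():
    print(name, '->', output)
```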




11.3. Working with Binary Data Record Layouts


The struct module provides pack() and
unpack() functions for working with variable length binary
record formats. The following example shows
how to loop through header information in a ZIP file without using the
zipfile module. Pack codes "H" and "I" represent two and four
byte unsigned numbers respectively. The "<" indicates that they are
standard size and in little-endian byte order:


import struct

with open('myfile.zip', 'rb') as f:
    data = f.read()

start = 0
for i in range(3):                      # show the first 3 file headers
    start += 14
    fields = struct.unpack('<IIIHH', data[start:start+16])
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

    start += 16
    filename = data[start:start+filenamesize]
    start += filenamesize
    extra = data[start:start+extra_size]
    print(filename, hex(crc32), comp_size, uncomp_size)

    start += extra_size + comp_size     # skip to the next header
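
The format string can be explored without a ZIP file at hand. The following sketch packs five made-up values with the same '<IIIHH' layout and unpacks them again:

```python
import struct

# '<' selects little-endian, standard sizes;
# 'I' is a 4-byte unsigned int, 'H' a 2-byte unsigned int.
record_format = '<IIIHH'

# The field values are arbitrary examples.
packed = struct.pack(record_format, 0xDEADBEEF, 1024, 4096, 8, 0)

size = struct.calcsize(record_format)   # 3 * 4 + 2 * 2 bytes
fields = struct.unpack(record_format, packed)

print(size, len(packed))
print(packed[:4])     # least significant byte comes first
print(fields)
```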




11.4. Multi-threading


Threading is a technique for decoupling tasks which are not sequentially
dependent. Threads can be used to improve the responsiveness of applications
that accept user input while other tasks run in the background. A related use
case is running I/O in parallel with computations in another thread.


The following code shows how the high level threading module can run
tasks in the background while the main program continues to run:


import threading, zipfile

class AsyncZip(threading.Thread):
    def __init__(self, infile, outfile):
        threading.Thread.__init__(self)
        self.infile = infile
        self.outfile = outfile

    def run(self):
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f.write(self.infile)
        f.close()
        print('Finished background zip of:', self.infile)

background = AsyncZip('mydata.txt', 'myarchive.zip')
background.start()
print('The main program continues to run in foreground.')

background.join()    # Wait for the background task to finish
print('Main program waited until background was done.')


The principal challenge of multi-threaded applications is coordinating threads
that share data or other resources. To that end, the threading module provides
a number of synchronization primitives including locks, events, condition
variables, and semaphores.


While those tools are powerful, minor design errors can result in problems that
are difficult to reproduce. So, the preferred approach to task coordination is
to concentrate all access to a resource in a single thread and then use the
queue module to feed that thread with requests from other threads.
Applications using Queue objects for inter-thread communication and
coordination are easier to design, more readable, and more reliable.
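
A minimal sketch of that pattern: a single worker thread owns the results list, and other code only communicates with it through a Queue. The squaring task and the None sentinel used to stop the worker are choices made for this illustration:

```python
import queue
import threading

requests = queue.Queue()
results = []          # modified only by the worker thread

def worker():
    while True:
        item = requests.get()
        if item is None:          # sentinel: no more work
            requests.task_done()
            break
        results.append(item * item)
        requests.task_done()

thread = threading.Thread(target=worker)
thread.start()

# Any thread may submit work; the queue handles the locking.
for n in range(5):
    requests.put(n)
requests.put(None)

requests.join()       # block until every queued item has been processed
thread.join()
print(results)
```

Because only the worker touches results, no explicit lock is needed around it.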




11.5. Logging


The logging module offers a full featured and flexible logging system.
At its simplest, log messages are sent to a file or to sys.stderr:


import logging
logging.debug('Debugging information')
logging.info('Informational message')
logging.warning('Warning:config file %s not found', 'server.conf')
logging.error('Error occurred')
logging.critical('Critical error -- shutting down')


This produces the following output:


WARNING:root:Warning:config file server.conf not found
ERROR:root:Error occurred
CRITICAL:root:Critical error -- shutting down


By default, informational and debugging messages are suppressed and the output
is sent to standard error. Other output options include routing messages
through email, datagrams, sockets, or to an HTTP Server. New filters can select
different routing based on message priority: DEBUG, INFO, WARNING, ERROR,
and CRITICAL.


The logging system can be configured directly from Python or can be loaded from
a user editable configuration file for customized logging without altering the
application.
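
Configuring logging directly from Python might look like the sketch below. The logger name and the in-memory stream are assumptions made so that the result can be inspected; a real application would more likely attach a file or stderr handler:

```python
import io
import logging

# Capture output in memory instead of writing to stderr or a file.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))

logger = logging.getLogger('tour.example')   # hypothetical logger name
logger.setLevel(logging.DEBUG)               # let debug messages through
logger.addHandler(handler)
logger.propagate = False                     # keep messages away from the root logger

logger.debug('Debugging information')
logger.warning('config file %s not found', 'server.conf')

output = stream.getvalue()
print(output, end='')
```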




11.6. Weak References


Python does automatic memory management (reference counting for most objects and
garbage collection to eliminate cycles). The memory is freed shortly
after the last reference to it has been eliminated.


This approach works fine for most applications but occasionally there is a need
to track objects only as long as they are being used by something else.
Unfortunately, just tracking them creates a reference that makes them permanent.
The weakref module provides tools for tracking objects without creating a
reference. When the object is no longer needed, it is automatically removed
from a weakref table and a callback is triggered for weakref objects. Typical
applications include caching objects that are expensive to create:


>>> import weakref, gc
>>> class A:
...     def __init__(self, value):
...         self.value = value
...     def __repr__(self):
...         return str(self.value)
...
>>> a = A(10)                   # create a reference
>>> d = weakref.WeakValueDictionary()
>>> d['primary'] = a            # does not create a reference
>>> d['primary']                # fetch the object if it is still alive
10
>>> del a                       # remove the one reference
>>> gc.collect()                # run garbage collection right away
0
>>> d['primary']                # entry was automatically removed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    d['primary']                # entry was automatically removed
  File "C:/python311/lib/weakref.py", line 46, in __getitem__
    o = self.data[key]()
KeyError: 'primary'




11.7. Tools for Working with Lists


Many data structure needs can be met with the built-in list type. However,
sometimes there is a need for alternative implementations with different
performance trade-offs.


The array module provides an array() object that is like
a list that stores only homogeneous data and stores it more compactly. The
following example shows an array of numbers stored as two byte unsigned binary
numbers (typecode "H") rather than the usual 16 bytes per entry for regular
lists of Python int objects:


>>> from array import array
>>> a = array('H', [4000, 10, 700, 22222])
>>> sum(a)
26932
>>> a[1:3]
array('H', [10, 700])


The collections module provides a deque() object
that is like a list with faster appends and pops from the left side but slower
lookups in the middle. These objects are well suited for implementing queues
and breadth first tree searches:


>>> from collections import deque
>>> d = deque(["task1", "task2", "task3"])
>>> d.append("task4")
>>> print("Handling", d.popleft())
Handling task1


unsearched = deque([starting_node])
def breadth_first_search(unsearched):
    node = unsearched.popleft()
    for m in gen_moves(node):
        if is_goal(m):
            return m
        unsearched.append(m)
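
The fragment above relies on helpers that are not defined here. A self-contained variation is sketched below, assuming a small hand-written graph stored in a dict; it is not the tutorial's own function, and it adds a loop and a set of already seen nodes so the search terminates:

```python
from collections import deque

# A small directed graph invented for this example.
graph = {
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['d', 'e'],
    'd': ['f'],
    'e': ['f'],
    'f': [],
}

def breadth_first_search(start, goal):
    """Return the nodes in the order visited, or None if goal is unreachable."""
    unsearched = deque([start])
    seen = {start}
    order = []
    while unsearched:
        node = unsearched.popleft()      # take the oldest entry from the left
        order.append(node)
        if node == goal:
            return order
        for m in graph[node]:
            if m not in seen:
                seen.add(m)
                unsearched.append(m)     # newer nodes wait at the right
    return None

print(breadth_first_search('a', 'f'))
```

Taking from the left and appending to the right is what makes the search visit nodes level by level.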


In addition to alternative list implementations, the library also offers other
tools such as the bisect module with functions for manipulating sorted
lists:


>>> import bisect
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
>>> bisect.insort(scores, (300, 'ruby'))
>>> scores
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]


The heapq module provides functions for implementing heaps based on
regular lists. The lowest valued entry is always kept at position zero. This
is useful for applications which repeatedly access the smallest element but do
not want to run a full list sort:


>>> from heapq import heapify, heappop, heappush
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> heapify(data) # rearrange the list into heap order
>>> heappush(data, -5) # add a new entry
>>> [heappop(data) for i in range(3)] # fetch the three smallest entries
[-5, 0, 1]




11.8. Decimal Floating Point Arithmetic


The decimal module offers a Decimal datatype for
decimal floating point arithmetic. Compared to the built-in float
implementation of binary floating point, the class is especially helpful for



  • financial applications and other uses which require exact decimal
    representation,


  • control over precision,


  • control over rounding to meet legal or regulatory requirements,


  • tracking of significant decimal places, or


  • applications where the user expects the results to match calculations done by
    hand.



For example, calculating a 5% tax on a 70 cent phone charge gives different
results in decimal floating point and binary floating point. The difference
becomes significant if the results are rounded to the nearest cent:


>>> from decimal import *
>>> round(Decimal('0.70') * Decimal('1.05'), 2)
Decimal('0.74')
>>> round(.70 * 1.05, 2)
0.73


The Decimal result keeps a trailing zero, automatically
inferring four place significance from multiplicands with two place
significance. Decimal reproduces mathematics as done by hand and avoids
issues that can arise when binary floating point cannot exactly represent
decimal quantities.


Exact representation enables the Decimal class to perform
modulo calculations and equality tests that are unsuitable for binary floating
point:


>>> Decimal('1.00') % Decimal('.10')
Decimal('0.00')
>>> 1.00 % 0.10
0.09999999999999995

>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
True
>>> sum([0.1]*10) == 1.0
False


The decimal module provides arithmetic with as much precision as needed:


>>> getcontext().prec = 36
>>> Decimal(1) / Decimal(7)
Decimal('0.142857142857142857142857142857142857')
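
Control over rounding, listed among the benefits above, is available through quantize() and the rounding constants. A brief sketch; the amounts are examples only:

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

cent = Decimal('0.01')

# The product keeps four decimal places: Decimal('0.7350').
amount = Decimal('0.70') * Decimal('1.05')

half_up = amount.quantize(cent, rounding=ROUND_HALF_UP)
down = amount.quantize(cent, rounding=ROUND_DOWN)

# A tie shows the difference between the two "half" modes.
tie = Decimal('0.725')
tie_half_up = tie.quantize(cent, rounding=ROUND_HALF_UP)
tie_half_even = tie.quantize(cent, rounding=ROUND_HALF_EVEN)

print(amount, half_up, down)
print(tie_half_up, tie_half_even)
```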















12. Virtual Environments and Packages



12.1. Introduction


Python applications will often use packages and modules that don’t
come as part of the standard library. Applications will sometimes
need a specific version of a library, because the application may
require that a particular bug has been fixed or the application may be
written using an obsolete version of the library’s interface.


This means it may not be possible for one Python installation to meet
the requirements of every application. If application A needs version
1.0 of a particular module but application B needs version 2.0, then
the requirements are in conflict and installing either version 1.0 or 2.0
will leave one application unable to run.


The solution for this problem is to create a virtual environment, a
self-contained directory tree that contains a Python installation for a
particular version of Python, plus a number of additional packages.


Different applications can then use different virtual environments.
To resolve the earlier example of conflicting requirements,
application A can have its own virtual environment with version 1.0
installed while application B has another virtual environment with version 2.0.
If application B requires a library be upgraded to version 3.0, this will
not affect application A’s environment.




12.2. Creating Virtual Environments


The module used to create and manage virtual environments is called
venv. venv installs the version of Python from which the command was run.
If you have multiple versions of Python on your system, you can select a
specific one by running the matching interpreter, for example a versioned
executable such as python3.11 instead of python3.


To create a virtual environment, decide upon a directory where you want to
place it, and run the venv module as a script with the directory path:


python3 -m venv tutorial-env


This will create the tutorial-env directory if it doesn’t exist,
and also create directories inside it containing a copy of the Python
interpreter and various supporting files.


A common directory location for a virtual environment is .venv.
This name keeps the directory typically hidden in your shell and thus
out of the way while giving it a name that explains why the directory
exists. It also prevents clashing with .env environment variable
definition files that some tooling supports.


Once you’ve created a virtual environment, you may activate it.


On Windows, run:


tutorial-env\Scripts\activate.bat


On Unix or macOS, run:


source tutorial-env/bin/activate


(This script is written for the bash shell. If you use the
csh or fish shells, there are alternate
activate.csh and activate.fish scripts you should use
instead.)


Activating the virtual environment will change your shell’s prompt to show what
virtual environment you’re using, and modify the environment so that running
python will get you that particular version and installation of Python.
For example:


$ source ~/envs/tutorial-env/bin/activate
(tutorial-env) $ python
Python 3.5.1 (default, May 6 2016, 10:59:36)
...
>>> import sys
>>> sys.path
['', '/usr/local/lib/python35.zip', ...,
'~/envs/tutorial-env/lib/python3.5/site-packages']
>>>


To deactivate a virtual environment, type:


deactivate


into the terminal.
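
From inside Python, one way to check whether the interpreter is running in a virtual environment is to compare sys.prefix with sys.base_prefix. A minimal sketch (the helper name is introduced here for illustration):

```python
import sys

def in_virtual_environment():
    """Return True when running inside a venv-style virtual environment."""
    # In a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix

print(sys.executable)
print(in_virtual_environment())
```

The output depends on how the interpreter was started, so none is shown here.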




12.3. Managing Packages with pip


You can install, upgrade, and remove packages using a program called
pip. By default pip will install packages from the Python
Package Index, <https://pypi.org>. You can browse the Python
Package Index by going to it in your web browser.


pip has a number of subcommands: “install”, “uninstall”,
“freeze”, etc. (Consult the Installing Python Modules guide for
complete documentation for pip.)


You can install the latest version of a package by specifying a package’s name:


(tutorial-env) $ python -m pip install novas
Collecting novas
Downloading novas-3.1.1.3.tar.gz (136kB)
Installing collected packages: novas
Running setup.py install for novas
Successfully installed novas-3.1.1.3


You can also install a specific version of a package by giving the
package name followed by == and the version number:


(tutorial-env) $ python -m pip install requests==2.6.0
Collecting requests==2.6.0
Using cached requests-2.6.0-py2.py3-none-any.whl
Installing collected packages: requests
Successfully installed requests-2.6.0


If you re-run this command, pip will notice that the requested
version is already installed and do nothing. You can supply a
different version number to get that version, or you can run
python -m pip install --upgrade to upgrade the package to the latest version:


(tutorial-env) $ python -m pip install --upgrade requests
Collecting requests
Installing collected packages: requests
Found existing installation: requests 2.6.0
Uninstalling requests-2.6.0:
Successfully uninstalled requests-2.6.0
Successfully installed requests-2.7.0


python -m pip uninstall followed by one or more package names will
remove the packages from the virtual environment.


python -m pip show will display information about a particular package:


(tutorial-env) $ python -m pip show requests
---
Metadata-Version: 2.0
Name: requests
Version: 2.7.0
Summary: Python HTTP for Humans.
Home-page: http://python-requests.org
Author: Kenneth Reitz
Author-email: me@kennethreitz.com
License: Apache 2.0
Location: /Users/akuchling/envs/tutorial-env/lib/python3.4/site-packages
Requires:


python -m pip list will display all of the packages installed in
the virtual environment:


(tutorial-env) $ python -m pip list
novas (3.1.1.3)
numpy (1.9.2)
pip (7.0.3)
requests (2.7.0)
setuptools (16.0)


python -m pip freeze will produce a similar list of the installed packages,
but the output uses the format that python -m pip install expects.
A common convention is to put this list in a requirements.txt file:


(tutorial-env) $ python -m pip freeze > requirements.txt
(tutorial-env) $ cat requirements.txt
novas==3.1.1.3
numpy==1.9.2
requests==2.7.0


The requirements.txt can then be committed to version control and
shipped as part of an application. Users can then install all the
necessary packages with install -r:


(tutorial-env) $ python -m pip install -r requirements.txt
Collecting novas==3.1.1.3 (from -r requirements.txt (line 1))
...
Collecting numpy==1.9.2 (from -r requirements.txt (line 2))
...
Collecting requests==2.7.0 (from -r requirements.txt (line 3))
...
Installing collected packages: novas, numpy, requests
Running setup.py install for novas
Successfully installed novas-3.1.1.3 numpy-1.9.2 requests-2.7.0
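
Installed versions can also be queried from Python through the standard library's importlib.metadata module. The sketch below wraps the lookup in a helper (a name introduced here) that returns None for packages that are not installed; the package names are only examples:

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string of package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# The result depends on what is installed in the active environment.
print(installed_version('requests'))
print(installed_version('no-such-package-for-this-example'))
```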


pip has many more options. Consult the Installing Python Modules
guide for complete documentation for pip. When you’ve written
a package and want to make it available on the Python Package Index,
consult the Distributing Python Modules guide.















13. What Now?


Reading this tutorial has probably reinforced your interest in using Python —
you should be eager to apply Python to solving your real-world problems. Where
should you go to learn more?


This tutorial is part of Python’s documentation set. Some other documents in
the set are:



  • The Python Standard Library:


    You should browse through this manual, which gives complete (though terse)
    reference material about types, functions, and the modules in the standard
    library. The standard Python distribution includes a lot of additional code.
    There are modules to read Unix mailboxes, retrieve documents via HTTP, generate
    random numbers, parse command-line options, compress data,
    and many other tasks. Skimming through the Library Reference will give you an
    idea of what’s available.



  • Installing Python Modules explains how to install additional modules written
    by other Python users.


  • The Python Language Reference: A detailed explanation of Python’s syntax and
    semantics. It’s heavy reading, but is useful as a complete guide to the
    language itself.



More Python resources:



  • https://www.python.org: The major Python web site. It contains code,
    documentation, and pointers to Python-related pages around the web.


  • https://docs.python.org: Fast access to Python’s documentation.


  • https://pypi.org: The Python Package Index, previously also nicknamed
    the Cheese Shop [1], is an index of user-created Python modules that are available
    for download. Once you begin releasing code, you can register it here so that
    others can find it.


  • https://code.activestate.com/recipes/langs/python/: The Python Cookbook is a
    sizable collection of code examples, larger modules, and useful scripts.
    Particularly notable contributions are collected in a book also titled Python
    Cookbook (O’Reilly & Associates, ISBN 0-596-00797-3.)


  • https://pyvideo.org collects links to Python-related videos from
    conferences and user-group meetings.


  • https://scipy.org: The Scientific Python project includes modules for fast
    array computations and manipulations plus a host of packages for such
    things as linear algebra, Fourier transforms, non-linear solvers,
    random number distributions, statistical analysis and the like.



For Python-related questions and problem reports, you can post to the newsgroup
comp.lang.python, or send them to the mailing list at
python-list@python.org. The newsgroup and mailing list are gatewayed, so
messages posted to one will automatically be forwarded to the other. There are
hundreds of postings a day, asking (and
answering) questions, suggesting new features, and announcing new modules.
Mailing list archives are available at https://mail.python.org/pipermail/.


Before posting, be sure to check the list of
Frequently Asked Questions (also called the FAQ). The
FAQ answers many of the questions that come up again and again, and may
already contain the solution for your problem.


Footnotes



1

“Cheese Shop” is a Monty Python sketch: a customer enters a cheese shop,
but whatever cheese he asks for, the clerk says it’s missing.











14. Interactive Input Editing and History Substitution








Some versions of the Python interpreter support editing of the current input
line and history substitution, similar to facilities found in the Korn shell and
the GNU Bash shell. This is implemented using the GNU Readline library,
which supports various styles of editing. This library has its own
documentation which we won’t duplicate here.



14.1. Tab Completion and History Editing


Completion of variable and module names is
automatically enabled at interpreter startup so
that the Tab key invokes the completion function; it looks at
Python statement names, the current local variables, and the available
module names. For dotted expressions such as string.a, it will evaluate
the expression up to the final '.' and then suggest completions from
the attributes of the resulting object. Note that this may execute
application-defined code if an object with a __getattr__() method
is part of the expression. The default configuration also saves your
history into a file named .python_history in your user directory.
The history will be available again during the next interactive interpreter
session.
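
The attribute lookup described above can be tried outside the prompt with the standard rlcompleter module, which provides a completer of this kind. The following is a minimal sketch, not the interpreter's own startup code; the helper name completions and the choice of the string module as the object to complete are assumptions made for illustration:

```python
import rlcompleter
import string

def completions(text, namespace):
    """Collect every completion offered for text (hypothetical helper)."""
    completer = rlcompleter.Completer(namespace)
    found = []
    state = 0
    while True:
        # complete() returns one match per state and None when exhausted.
        match = completer.complete(text, state)
        if match is None:
            break
        found.append(match)
        state += 1
    return found

# "string" is evaluated first, then its attributes starting with "a" are listed.
matches = completions("string.a", {"string": string})
print(matches)
```

Because the part before the final dot is evaluated, passing a namespace that contains objects with a custom __getattr__() would run that code, which is the caveat the paragraph above mentions.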




14.2. Alternatives to the Interactive Interpreter


This facility is an enormous step forward compared to earlier versions of the
interpreter; however, some wishes are left: It would be nice if the proper
indentation were suggested on continuation lines (the parser knows if an indent
token is required next). The completion mechanism might use the interpreter’s
symbol table. A command to check (or even suggest) matching parentheses,
quotes, etc., would also be useful.


One alternative enhanced interactive interpreter that has been around for quite
some time is IPython, which features tab completion, object exploration and
advanced history management. It can also be thoroughly customized and embedded
into other applications. Another similar enhanced interactive environment is
bpython.










15. Floating Point Arithmetic: Issues and Limitations








Floating-point numbers are represented in computer hardware as base 2 (binary)
fractions. For example, the decimal fraction 0.125
has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction 0.001
has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only
real difference being that the first is written in base 10 fractional notation,
and the second in base 2.


Unfortunately, most decimal fractions cannot be represented exactly as binary
fractions. A consequence is that, in general, the decimal floating-point
numbers you enter are only approximated by the binary floating-point numbers
actually stored in the machine.


The problem is easier to understand at first in base 10. Consider the fraction
1/3. You can approximate that as a base 10 fraction:


0.3


or, better,


0.33


or, better,


0.333


and so on. No matter how many digits you’re willing to write down, the result
will never be exactly 1/3, but will be an increasingly better approximation of
1/3.


In the same way, no matter how many base 2 digits you’re willing to use, the
decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base
2, 1/10 is the infinitely repeating fraction


0.0001100110011001100110011001100110011001100110011...


Stop at any finite number of bits, and you get an approximation. On most
machines today, floats are approximated using a binary fraction with
the numerator using the first 53 bits starting with the most significant bit and
with the denominator as a power of two. In the case of 1/10, the binary fraction
is 3602879701896397 / 2 ** 55 which is close to but not exactly
equal to the true value of 1/10.
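
The repeating binary expansion of 1/10 can be reproduced with exact rational arithmetic. This is a small sketch using repeated doubling; the function name binary_digits is hypothetical and the input is assumed to lie in the range 0 <= value < 1:

```python
from fractions import Fraction

def binary_digits(value, count):
    """Return the first count binary digits after the point of value.

    Assumes 0 <= value < 1 and an exact type such as Fraction.
    """
    digits = []
    for _ in range(count):
        value *= 2
        bit = 1 if value >= 1 else 0   # the integer part is the next digit
        digits.append(str(bit))
        value -= bit
    return "".join(digits)

tenth_bits = binary_digits(Fraction(1, 10), 24)
eighth_bits = binary_digits(Fraction(1, 8), 3)
print("0." + tenth_bits)    # the 0011 group keeps repeating
print("0." + eighth_bits)   # 1/8 terminates after three digits
```

Cutting the first expansion off at any length leaves a remainder, which is why a 53-bit numerator can only approximate 1/10.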


Many users are not aware of the approximation because of the way values are
displayed. Python only prints a decimal approximation to the true decimal
value of the binary approximation stored by the machine. On most machines, if
Python were to print the true decimal value of the binary approximation stored
for 0.1, it would have to display


>>> 0.1
0.1000000000000000055511151231257827021181583404541015625


That is more digits than most people find useful, so Python keeps the number
of digits manageable by displaying a rounded value instead:


>>> 1 / 10
0.1


Just remember, even though the printed result looks like the exact value
of 1/10, the actual stored value is the nearest representable binary fraction.


Interestingly, there are many different decimal numbers that share the same
nearest approximate binary fraction. For example, the numbers 0.1 and
0.10000000000000001 and
0.1000000000000000055511151231257827021181583404541015625 are all
approximated by 3602879701896397 / 2 ** 55. Since all of these decimal
values share the same approximation, any one of them could be displayed
while still preserving the invariant eval(repr(x)) == x.
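
A quick way to check this claim is to compare the stored ratios of the three literals. The sketch below assumes IEEE-754 doubles and Python 3.1 or later for the short repr; the variable names are illustrative only:

```python
candidates = [
    0.1,
    0.10000000000000001,
    0.1000000000000000055511151231257827021181583404541015625,
]

# All three literals are rounded to the same stored binary fraction.
ratios = {value.as_integer_ratio() for value in candidates}
print(ratios)

# Whichever literal was typed, repr() gives a string that round-trips.
x = candidates[2]
shown = repr(x)
round_trips = eval(shown) == x
print(shown, round_trips)
```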


Historically, the Python prompt and built-in repr() function would choose
the one with 17 significant digits, 0.10000000000000001. Starting with
Python 3.1, Python (on most systems) is now able to choose the shortest of
these and simply display 0.1.


Note that this is in the very nature of binary floating-point: this is not a bug
in Python, and it is not a bug in your code either. You’ll see the same kind of
thing in all languages that support your hardware’s floating-point arithmetic
(although some languages may not display the difference by default, or in all
output modes).


For more pleasant output, you may wish to use string formatting to produce a limited number of significant digits:


>>> import math
>>> format(math.pi, '.12g')  # give 12 significant digits
'3.14159265359'

>>> format(math.pi, '.2f')   # give 2 digits after the point
'3.14'

>>> repr(math.pi)
'3.141592653589793'


It’s important to realize that this is, in a real sense, an illusion: you’re
simply rounding the display of the true machine value.


One illusion may beget another. For example, since 0.1 is not exactly 1/10,
summing three values of 0.1 may not yield exactly 0.3, either:


>>> .1 + .1 + .1 == .3
False


Also, since 0.1 cannot get any closer to the exact value of 1/10 and
0.3 cannot get any closer to the exact value of 3/10, pre-rounding with
the round() function cannot help:


>>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
False


Though the numbers cannot be made closer to their intended exact values,
the round() function can be useful for post-rounding so that results
with inexact values become comparable to one another:


>>> round(.1 + .1 + .1, 10) == round(.3, 10)
True
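
Post-rounding is one way to compare inexact results. Another option, not mentioned in the text above and shown here only as an alternative, is a tolerance-based comparison with math.isclose(). A minimal sketch, with illustrative tolerance values:

```python
import math

total = 0.1 + 0.1 + 0.1

exact_match = total == 0.3                 # False: the stored values differ
close_match = math.isclose(total, 0.3)     # True with the default rel_tol of 1e-09

# A relative tolerance alone cannot succeed against zero,
# so an absolute tolerance is supplied for that case.
near_zero_rel = math.isclose(1e-12, 0.0)
near_zero_abs = math.isclose(1e-12, 0.0, abs_tol=1e-9)

print(exact_match, close_match, near_zero_rel, near_zero_abs)
```

Which tolerance is appropriate depends on the scale of the values being compared, so the numbers above should be treated as placeholders.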


Binary floating-point arithmetic holds many surprises like this. The problem
with “0.1” is explained in precise detail below, in the “Representation Error”
section. See The Perils of Floating Point
for a more complete account of other common surprises.


As that account says near the end, “there are no easy answers.” Still, don’t be unduly
wary of floating-point! The errors in Python float operations are inherited
from the floating-point hardware, and on most machines are on the order of no
more than 1 part in 2**53 per operation. That’s more than adequate for most
tasks, but you do need to keep in mind that it’s not decimal arithmetic and
that every float operation can suffer a new rounding error.
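
The size of a single rounding error can be measured exactly with the fractions module. The sketch below assumes IEEE-754 doubles with round-to-nearest; the helper name relative_error is hypothetical:

```python
from fractions import Fraction

def relative_error(computed, exact):
    """Exact relative error of a float against an exact rational value."""
    return abs(Fraction(computed) - exact) / abs(exact)

bound = Fraction(1, 2 ** 53)

# Error introduced by rounding the literal 0.1 to the nearest double.
err_literal = relative_error(0.1, Fraction(1, 10))

# Error introduced by one addition, measured against the exact sum
# of the two values that are actually stored.
err_add = relative_error(0.1 + 0.2, Fraction(0.1) + Fraction(0.2))

print(float(err_literal), float(err_add), float(bound))
```

Each step stays within the bound, but the errors of successive operations can add up, which is the point of the warning above.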


While pathological cases do exist, for most casual use of floating-point
arithmetic you’ll see the result you expect in the end if you simply round the
display of your final results to the number of decimal digits you expect.
str() usually suffices, and for finer control see the str.format()
method’s format specifiers in Format String Syntax.


For use cases which require exact decimal representation, try using the
decimal module which implements decimal arithmetic suitable for
accounting applications and high-precision applications.
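
As a brief illustration of the decimal module, the sketch below repeats the 0.1 sum exactly and rounds a monetary amount. The price, the tax rate and the precision of 50 digits are made-up values for the example:

```python
from decimal import Decimal, localcontext

# Build values from strings so the decimal digits are taken as written.
total = Decimal("0.1") + Decimal("0.1") + Decimal("0.1")
sum_is_exact = total == Decimal("0.3")

# Accounting-style rounding to two decimal places.
price = Decimal("19.99")
tax = (price * Decimal("0.0825")).quantize(Decimal("0.01"))

# Higher precision, limited to this block by a local context.
with localcontext() as ctx:
    ctx.prec = 50
    seventh = Decimal(1) / Decimal(7)

print(sum_is_exact, tax, seventh)
```

Constructing a Decimal from a float such as Decimal(0.1) would carry over the binary approximation, so strings or integers are the safer inputs.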


Another form of exact arithmetic is supported by the fractions module
which implements arithmetic based on rational numbers (so numbers like
1/3 can be represented exactly).
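
A short sketch of rational arithmetic with the fractions module follows; the variable names are illustrative:

```python
from fractions import Fraction

third = Fraction(1, 3)
thirds_sum_to_one = third + third + third == 1     # exact, no rounding

tenth = Fraction(1, 10)
from_string = Fraction("0.1") == tenth             # parsed as a decimal string
from_float = Fraction(0.1) == tenth                # the float was already rounded

print(thirds_sum_to_one, from_string, from_float)
```

The last comparison is False because Fraction(0.1) captures the stored binary value rather than the decimal literal.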


If you are a heavy user of floating-point operations you should take a look
at the NumPy package and many other packages for mathematical and
statistical operations supplied by the SciPy project. See <https://scipy.org>.


Python provides tools that may help on those rare occasions when you really
do want to know the exact value of a float. The
float.as_integer_ratio() method expresses the value of a float as a
fraction:


>>> x = 3.14159
>>> x.as_integer_ratio()
(3537115888337719, 1125899906842624)


Since the ratio is exact, it can be used to losslessly recreate the
original value:


>>> x == 3537115888337719 / 1125899906842624
True


The float.hex() method expresses a float in hexadecimal (base
16), again giving the exact value stored by your computer:


>>> x.hex()
'0x1.921f9f01b866ep+1'


This precise hexadecimal representation can be used to reconstruct
the float value exactly:


>>> x == float.fromhex('0x1.921f9f01b866ep+1')
True


Since the representation is exact, it is useful for reliably porting values
across different versions of Python (platform independence) and exchanging
data with other languages that support the same format (such as Java and C99).
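
The following sketch shows one way such an exchange might look: floats are written out as hexadecimal strings and read back, then compared with a short decimal format that drops digits. The sample values and the '.6g' format are arbitrary choices for the example:

```python
values = [0.1, 1 / 3, 3.14159, 1e-300, float("inf")]

# Hexadecimal text preserves every bit of each value.
encoded = [v.hex() for v in values]
decoded = [float.fromhex(s) for s in encoded]

# A six-digit decimal format, by contrast, does not round-trip in general.
lossy = [float(format(v, ".6g")) for v in values]

print(encoded)
print(decoded == values, lossy == values)
```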


Another helpful tool is the math.fsum() function which helps mitigate
loss-of-precision during summation. It tracks “lost digits” as values are
added onto a running total. That can make a difference in overall accuracy
so that the errors do not accumulate to the point where they affect the
final total:


>>> sum([0.1] * 10) == 1.0
False
>>> math.fsum([0.1] * 10) == 1.0
True
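
The effect is easier to see with values of very different magnitude. The sketch below compares math.fsum() with a plain accumulation loop. The loop is used instead of the built-in sum() because sum() applies its own compensation for floats from Python 3.12 onward, so its result depends on the interpreter version. The helper name naive_sum and the sample data are assumptions for illustration:

```python
import math

def naive_sum(values):
    """Left-to-right float accumulation with no error compensation."""
    total = 0.0
    for value in values:
        total += value
    return total

data = [1e16, 1.0, -1e16]

# The 1.0 is lost when added to 1e16, so the loop ends at 0.0.
plain = naive_sum(data)

# fsum keeps track of the lost low-order part and recovers it.
tracked = math.fsum(data)

tenths_plain = naive_sum([0.1] * 10)
tenths_tracked = math.fsum([0.1] * 10)

print(plain, tracked, tenths_plain, tenths_tracked)
```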



15.1. Representation Error


This section explains the “0.1” example in detail, and shows how you can perform
an exact analysis of cases like this yourself. Basic familiarity with binary
floating-point representation is assumed.


Representation error refers to the fact that some (most, actually)
decimal fractions cannot be represented exactly as binary (base 2) fractions.
This is the chief reason why Python (or Perl, C, C++, Java, Fortran, and many
others) often won’t display the exact decimal number you expect.


Why is that? 1/10 is not exactly representable as a binary fraction. Almost all
machines today (November 2000) use IEEE-754 floating point arithmetic, and
almost all platforms map Python floats to IEEE-754 “double precision”. 754
doubles contain 53 bits of precision, so on input the computer strives to
convert 0.1 to the closest fraction it can of the form J/2**N where J is
an integer containing exactly 53 bits. Rewriting


1 / 10 ~= J / (2**N)


as


J ~= 2**N / 10


and recalling that J has exactly 53 bits (is >= 2**52 but < 2**53),
the best value for N is 56:


>>> 2**52 <=  2**56 // 10  < 2**53
True


That is, 56 is the only value for N that leaves J with exactly 53 bits. The
best possible value for J is then that quotient rounded:


>>> q, r = divmod(2**56, 10)
>>> r
6


Since the remainder is more than half of 10, the best approximation is obtained
by rounding up:


>>> q+1
7205759403792794


Therefore the best possible approximation to 1/10 in 754 double precision is:


7205759403792794 / 2 ** 56


Dividing both the numerator and denominator by two reduces the fraction to:


3602879701896397 / 2 ** 55


Note that since we rounded up, this is actually a little bit larger than 1/10;
if we had not rounded up, the quotient would have been a little bit smaller than
1/10. But in no case can it be exactly 1/10!
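
The same analysis can be applied to other fractions. The sketch below generalizes the steps above under stated assumptions: the input is a positive rational below 2**52 in the normal double range, rounding is to nearest with ties to even, and the function name nearest_double_fraction is hypothetical:

```python
from fractions import Fraction

def nearest_double_fraction(numerator, denominator):
    """Return (J, N) such that J / 2**N is the closest 53-bit fraction.

    Assumes a positive value below 2**52 in the normal double range.
    """
    n = 0
    # Find the N for which the quotient has exactly 53 bits (>= 2**52).
    while (numerator << n) // denominator < 2 ** 52:
        n += 1
    q, r = divmod(numerator << n, denominator)
    # Round to nearest, with ties going to the even quotient.
    if 2 * r > denominator or (2 * r == denominator and q % 2 == 1):
        q += 1
    return q, n

j, n = nearest_double_fraction(1, 10)
print(j, n)

# Compare with the values the interpreter actually stores.
checks = [
    Fraction(*_scale) == Fraction(value)
    for _scale, value in (
        ((j, 2 ** n), 0.1),
        ((lambda jn: (jn[0], 2 ** jn[1]))(nearest_double_fraction(3, 10)), 0.3),
        ((lambda jn: (jn[0], 2 ** jn[1]))(nearest_double_fraction(1, 3)), 1 / 3),
    )
]
print(checks)
```

For 1/10 this reproduces the numerator and exponent derived by hand, and the comparisons confirm that the stored floats for 0.1, 0.3 and 1/3 are the fractions the procedure predicts.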


So the computer never “sees” 1/10: what it sees is the exact fraction given
above, the best 754 double approximation it can get:


>>> 0.1 * 2 ** 55
3602879701896397.0


If we multiply that fraction by 10**55, we can see the value out to
55 decimal digits:


>>> 3602879701896397 * 10 ** 55 // 2 ** 55
1000000000000000055511151231257827021181583404541015625


meaning that the exact number stored in the computer is equal to
the decimal value 0.1000000000000000055511151231257827021181583404541015625.
Instead of displaying the full decimal value, many languages (including
older versions of Python) round the result to 17 significant digits:


>>> format(0.1, '.17f')
'0.10000000000000001'


The fractions and decimal modules make these calculations
easy:


>>> from decimal import Decimal
>>> from fractions import Fraction

>>> Fraction.from_float(0.1)
Fraction(3602879701896397, 36028797018963968)

>>> (0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)

>>> Decimal.from_float(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')

>>> format(Decimal.from_float(0.1), '.17')
'0.10000000000000001'








