Introduction
------------

The pprocess module provides elementary support for parallel programming in
Python using a fork-based process creation model in conjunction with a
channel-based communications model implemented using socketpair and poll. On
systems with multiple CPUs or multicore CPUs, processes should take advantage
of as many CPUs or cores as the operating system permits.
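
As an illustration of the mechanisms named above, the following minimal
sketch uses only the Python 3 standard library, not pprocess itself: it forks
a child, gives it one end of a socket pair as a channel, and has the parent
wait on the other end using poll. The names run_in_child and square are
hypothetical, and the sketch is UNIX-only because it relies on os.fork and
select.poll.

```python
import os
import pickle
import select
import socket


def run_in_child(func, *args):
    """Fork a child, run func(*args) there, and return the result."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child process: compute, send the pickled result, and exit.
        status = 1
        try:
            parent_sock.close()
            child_sock.sendall(pickle.dumps(func(*args)))
            child_sock.close()
            status = 0
        finally:
            os._exit(status)  # never continue into the parent's code

    # Parent process: wait on the channel with poll until the child is done.
    child_sock.close()
    poller = select.poll()
    poller.register(parent_sock, select.POLLIN)
    chunks = []
    finished = False
    while not finished:
        for _fd, _event in poller.poll():
            data = parent_sock.recv(4096)
            if data:
                chunks.append(data)
            else:
                finished = True  # empty read: the child closed its end
    parent_sock.close()
    os.waitpid(pid, 0)
    return pickle.loads(b"".join(chunks))


def square(x):
    return x * x


if __name__ == "__main__":
    print(run_in_child(square, 12))  # prints 144
```

The child leaves through os._exit so that it cannot fall through into code
intended for the parent after the fork.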

Tutorial
--------

The tutorial provides some information about the examples described below.
See the docs/tutorial.html file in the distribution for more details.

Reference
---------

A description of the different mechanisms provided by the pprocess module can
be found in the reference document. See the docs/reference.html file in the
distribution for more details.

Quick Start
-----------

Try running the simple examples. For example:

PYTHONPATH=. python examples/simple_create.py

(These examples show in different ways how a limited number of processes can
be used to perform a parallel computation. The simple.py, simple1.py,
simple2.py and simple_map.py programs are sequential versions of the other
programs.)

The following table summarises the features used in the programs:

Program (.py)         pmap  MakeParallel  manage  start  create  Map  Queue  Exchange
-------------         ----  ------------  ------  -----  ------  ---  -----  --------
simple_create_map                                        Yes     Yes
simple_create_queue                                      Yes          Yes
simple_create                                            Yes                 Yes
simple_managed_map          Yes           Yes                    Yes
simple_managed_queue        Yes           Yes                         Yes
simple_managed              Yes           Yes                                Yes
simple_pmap           Yes
simple_start_queue          Yes                   Yes                 Yes
simple_start                                      Yes                        Yes

The simplest parallel program is simple_pmap.py, which employs the pmap
function resembling the built-in map function in Python.
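
To show what a map-like parallel function involves, the following sketch
applies a function to each item in a separate forked process, with at most a
given number of processes running at once. It is a stand-in written against
the Python 3 standard library on UNIX, not the pmap implementation; fork_map,
its limit parameter and square are hypothetical names.

```python
import os
import pickle
from collections import deque


def fork_map(func, items, limit=2):
    """Return [func(item) for item in items], computed in forked children.

    At most `limit` (one or more) children run at once; results keep the
    order of the input.
    """
    results = []
    running = deque()  # (pid, pipe reader) pairs, oldest child first

    def collect_oldest():
        # Read the oldest child's result to the end, then reap the child.
        pid, reader = running.popleft()
        with reader:
            results.append(pickle.load(reader))
        os.waitpid(pid, 0)

    for item in items:
        if len(running) >= limit:
            collect_oldest()
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute one result, write it to the pipe, and exit.
            status = 1
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "wb") as writer:
                    pickle.dump(func(item), writer)
                status = 0
            finally:
                os._exit(status)
        # Parent: keep only the read end and carry on.
        os.close(write_fd)
        running.append((pid, os.fdopen(read_fd, "rb")))

    while running:
        collect_oldest()
    return results


def square(x):
    return x * x


if __name__ == "__main__":
    print(fork_map(square, range(6), limit=3))  # prints [0, 1, 4, 9, 16, 25]
```

Collecting the oldest child first keeps the results in input order at the
cost of some idle time; a poll-based design could collect whichever child
finishes first instead.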

Other simple programs are those employing the Queue class, together with those
using the manage method, which associates functions or callables with Queue or
Exchange objects for convenient invocation of those functions and the
management of their communications.

The most technically involved program is simple_start.py, which uses the
Exchange class together with a calculation function which is aware of the
parallel environment and which communicates over the supplied communications
channel directly to the creating process.

With the exception of simple_start.py, those examples employing calculation
functions (as opposed to doing a calculation inline in a loop body) all use
MakeParallel to make those functions parallel-aware, thus permitting the
conversion of "normal" functions to a form usable in the parallel environment.
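
The idea of making a "normal" function parallel-aware can be sketched as a
wrapper which accepts a channel as its first argument and sends the function's
result over it. This is not the MakeParallel class itself: SendResult and add
are hypothetical names, and a pipe from the Python 3 multiprocessing module
stands in for a pprocess channel.

```python
from multiprocessing import Pipe


class SendResult:
    """Hypothetical stand-in for a MakeParallel-style wrapper."""

    def __init__(self, func):
        self.func = func

    def __call__(self, channel, *args, **kwargs):
        # Run the plain function, then send its result over the channel.
        channel.send(self.func(*args, **kwargs))


def add(a, b):
    return a + b


if __name__ == "__main__":
    receiving_end, sending_end = Pipe(duplex=False)
    parallel_add = SendResult(add)
    # In a real program this call would be made in a created process.
    parallel_add(sending_end, 2, 3)
    print(receiving_end.recv())  # prints 5
```

The wrapped function itself needs no knowledge of channels, which is what
allows existing functions to be used unchanged.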

Reusable Processes
------------------

An additional example not listed above, simple_managed_map_reusable.py,
employs the MakeReusable class instead of MakeParallel in order to demonstrate
reusable processes and channels:

PYTHONPATH=. python examples/simple_managed_map_reusable.py

Persistent Processes
--------------------

A number of persistent variants of some of the above examples employ a
persistent or background process which can be started by one process and
contacted later by another in order to collect the results of a computation.
For example:

PYTHONPATH=. python examples/simple_persistent_managed.py --start
PYTHONPATH=. python examples/simple_persistent_managed.py --reconnect

PYTHONPATH=. python examples/simple_background_queue.py --start
PYTHONPATH=. python examples/simple_background_queue.py --reconnect

PYTHONPATH=. python examples/simple_persistent_queue.py --start
PYTHONPATH=. python examples/simple_persistent_queue.py --reconnect

Parallel Raytracing with PyGmy
------------------------------

The PyGmy raytracer, modified to use pprocess, can be run to investigate the
potential for speed increases in "real world" programs:

cd examples/PyGmy
PYTHONPATH=../..:. python scene.py

(This should produce a file called test.tif - a TIFF file containing a
raytraced scene image.)

Test Programs
-------------

There are some elementary tests:

PYTHONPATH=. python tests/create_loop.py
PYTHONPATH=. python tests/start_loop.py

(Simple loop demonstrations which use two different ways of creating and
starting the parallel processes.)

PYTHONPATH=. python tests/start_indexer.py <directory>

(A text indexing demonstration, where <directory> should be a directory
containing text files to be indexed, although HTML files will also work well
enough. After indexing the files, a prompt will appear at which words or word
fragments can be entered; matching words and their locations will then be
shown. Run the program without arguments to see more information.)

Contact, Copyright and Licence Information
------------------------------------------

The current Web page for pprocess at the time of release is:

http://www.boddie.org.uk/python/pprocess.html

The author can be contacted at the following e-mail address:

paul@boddie.org.uk

Copyright and licence information can be found in the docs directory - see
docs/COPYING.txt, docs/lgpl-3.0.txt and docs/gpl-3.0.txt for more information.

For the PyGmy raytracer example, different copyright and licence information
is provided in the docs directory - see docs/COPYING-PyGmy.txt and
docs/LICENCE-PyGmy.txt for more information.

Dependencies
------------

This software depends on standard library features which are stated as being
available only on "UNIX"; it has only been tested on a GNU/Linux system.

New in pprocess 0.4 (Changes since pprocess 0.3.1)
--------------------------------------------------

* Added support for persistent/background processes.
* Added a utility function to detect and return the number of processor
  cores available.
* Added missing documentation stylesheet.
* Added support for Solaris using pipes instead of socket pairs, since
  the latter apparently do not work properly with poll on Solaris.
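
For comparison, the number of usable cores can also be obtained from the
Python 3 standard library alone. The sketch below is not the pprocess utility
function; usable_cores is a hypothetical name, and os.sched_getaffinity is
only available on some platforms such as Linux.

```python
import os


def usable_cores():
    """Return the number of cores this process may run on (at least 1)."""
    if hasattr(os, "sched_getaffinity"):
        # Where available, this respects any CPU affinity restrictions.
        return len(os.sched_getaffinity(0))
    # os.cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(usable_cores())
```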

New in pprocess 0.3.1 (Changes since pprocess 0.3)
--------------------------------------------------

* Moved the reference material out of the module docstring and into a
  separate document, converting it to XHTML in the process.
* Fixed the project name in the setup script.

New in pprocess 0.3 (Changes since parallel 0.2.5)
--------------------------------------------------

* Added managed callables: wrappers around callables which cause them to be
  automatically managed by the exchange from which they were acquired.
* Added MakeParallel: a wrapper instantiated around a normal function which
  sends the result of that function over the supplied channel when invoked.
* Added MakeReusable: a wrapper like MakeParallel which can be used in
  conjunction with the newly-added reuse capability of the Exchange class in
  order to reuse processes and channels.
* Added a Map class which attempts to emulate the built-in map function,
  along with a pmap function using this class.
* Added a Queue class which provides a simpler iterator-style interface to
  data produced by created processes.
* Added a create method to the Exchange class and an exit convenience
  function to the module.
* Changed the Exchange implementation to not block when attempting to start
  new processes beyond the process limit: such requests are queued and
  performed as running processes are completed. This permits programs using
  the start method to proceed to consumption of results more quickly.
* Extended and updated the examples. Added a tutorial.
* Added Ubuntu Feisty (7.04) package support.
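
The non-blocking behaviour of the start method described in the list above can
be sketched as follows, assuming a simple design in which requests beyond the
limit are held in a queue and started as results are collected. This is not
the Exchange implementation: DeferredStarter, its methods and square are
hypothetical names, and the code uses only the Python 3 standard library on
UNIX.

```python
import os
import pickle
import select
from collections import deque


class DeferredStarter:
    """Hypothetical sketch: start() never blocks; excess requests wait."""

    def __init__(self, limit):
        self.limit = limit
        self.pending = deque()  # requests deferred by the process limit
        self.active = {}        # pipe read descriptor -> child pid
        self.poller = select.poll()

    def start(self, func, *args):
        if len(self.active) < self.limit:
            self._fork(func, args)
        else:
            self.pending.append((func, args))  # deferred, not blocked

    def _fork(self, func, args):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: compute, write the pickled result, and exit.
            status = 1
            try:
                os.close(read_fd)
                with os.fdopen(write_fd, "wb") as writer:
                    pickle.dump(func(*args), writer)
                status = 0
            finally:
                os._exit(status)
        os.close(write_fd)
        self.active[read_fd] = pid
        self.poller.register(read_fd, select.POLLIN)

    def results(self):
        """Yield results as children finish, starting deferred requests."""
        while self.active:
            for fd, _event in self.poller.poll():
                self.poller.unregister(fd)
                pid = self.active.pop(fd)
                with os.fdopen(fd, "rb") as reader:
                    result = pickle.load(reader)
                os.waitpid(pid, 0)
                if self.pending:
                    # A process slot is free: start the oldest request.
                    self._fork(*self.pending.popleft())
                yield result


def square(x):
    return x * x


if __name__ == "__main__":
    starter = DeferredStarter(limit=2)
    for n in range(5):
        starter.start(square, n)  # returns at once, even beyond the limit
    print(sorted(starter.results()))  # prints [0, 1, 4, 9, 16]
```

Results arrive in completion order rather than submission order, which is why
the usage example sorts them before printing.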

New in parallel 0.2.5 (Changes since parallel 0.2.4)
----------------------------------------------------

* Added a start method to the Exchange class for more convenient creation of
  processes.
* Relicensed under the LGPL (version 3 or later) - this also fixes the
  contradictory situation where the GPL was stated in the pprocess module
  (which was not, in fact, the intention) and the LGPL was stated in the
  documentation.

New in parallel 0.2.4 (Changes since parallel 0.2.3)
----------------------------------------------------

* Set buffer sizes to zero for the file object wrappers around sockets: this
  may prevent deadlock issues.
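
The effect of a zero buffer size can be seen with the Python 3 standard
library, where socket.makefile accepts buffering=0 for binary modes only. The
sketch below is an illustration under that assumption and not the pprocess
code; send_line_unbuffered is a hypothetical name.

```python
import socket


def send_line_unbuffered(line):
    """Send one line through unbuffered file objects over a socket pair."""
    left, right = socket.socketpair()
    with left, right:
        writer = left.makefile("wb", buffering=0)
        reader = right.makefile("rb", buffering=0)
        with writer, reader:
            # No flush() is needed: nothing is held back in a buffer, so
            # the reader cannot be left waiting for data that was written.
            writer.write(line)
            return reader.readline()


if __name__ == "__main__":
    print(send_line_unbuffered(b"ready\n"))  # prints b'ready\n'
```

With a buffered writer, the same exchange could stall if the data remained in
the writer's buffer while the other side waited to read it.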

New in parallel 0.2.3 (Changes since parallel 0.2.2)
----------------------------------------------------

* Added convenient message exchanges, offering methods handling common
  situations at the cost of having to define a subclass of Exchange.
* Added a simple example of performing a parallel computation.
* Improved the PyGmy raytracer example to use the newly added functionality.

New in parallel 0.2.2 (Changes since parallel 0.2.1)
----------------------------------------------------

* Changed the status testing in the Exchange class, potentially fixing the
  premature closure of channels before all data was read.
* Fixed the PyGmy raytracer example's process accounting by relying on the
  possibly more reliable Exchange behaviour, whilst also preventing
  erroneous creation of "out of bounds" processes.
* Added a removed attribute on the Exchange to record which channels were
  removed in the last call to the ready method.

New in parallel 0.2.1 (Changes since parallel 0.2)
--------------------------------------------------

* Added a PyGmy raytracer example.
* Updated copyright and licensing details (FSF address, additional works).

New in parallel 0.2 (Changes since parallel 0.1)
------------------------------------------------

* Changed the name of the included module from parallel to pprocess in order
  to avoid naming conflicts with PyParallel.

Release Procedures
------------------

Update the pprocess __version__ attribute.
Change the version number and package filename/directory in the documentation.
Update the release notes (see above).
Check the release information in the PKG-INFO file.
Tag, export.
Archive, upload.
Update PyPI.

Making Packages
---------------

To make Debian-based packages:

1. Create new package directories under packages if necessary.
2. Make a symbolic link in the distribution's root directory to keep the
   Debian tools happy:

   ln -s packages/ubuntu-hoary/python2.4-parallel-pprocess/debian/

   Or:

   ln -s packages/ubuntu-feisty/python-pprocess/debian/

3. Run the package builder:

   dpkg-buildpackage -rfakeroot

4. Locate and tidy up the packages in the parent directory of the
   distribution's root directory.