Introduction
------------

The pprocess module provides elementary support for parallel programming in
Python using a fork-based process creation model in conjunction with a
channel-based communications model implemented using socketpair and poll. On
systems with multiple CPUs or multicore CPUs, processes should take advantage
of as many CPUs or cores as the operating system permits.

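As an illustration of the model described above, the following is a minimal
sketch using only the Python standard library. It is not the pprocess
implementation: the names run_in_child, collect and square are hypothetical,
and the sketch assumes a UNIX-like system where os.fork is available.

```python
import os
import pickle
import select
import socket


def run_in_child(func, *args):
    """Fork a child that runs func(*args) and sends back the pickled result."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child process: compute, send the result and exit immediately.
        parent_sock.close()
        try:
            child_sock.sendall(pickle.dumps(func(*args)))
        finally:
            child_sock.close()
            os._exit(0)
    # Parent process: keep only its own end of the channel.
    child_sock.close()
    return pid, parent_sock


def collect(channels):
    """Use poll to gather one result from each (pid, socket) channel."""
    poller = select.poll()
    pending = {}
    for pid, sock in channels:
        poller.register(sock.fileno(), select.POLLIN)
        pending[sock.fileno()] = (pid, sock, [])
    results = []
    while pending:
        for fd, _event in poller.poll():
            pid, sock, chunks = pending[fd]
            data = sock.recv(4096)
            if data:
                chunks.append(data)
                continue
            # End of stream: the child has closed its end of the channel.
            poller.unregister(fd)
            del pending[fd]
            sock.close()
            os.waitpid(pid, 0)
            results.append(pickle.loads(b"".join(chunks)))
    return results


def square(x):
    return x * x


if __name__ == "__main__":
    channels = [run_in_child(square, n) for n in range(4)]
    print(sorted(collect(channels)))  # prints [0, 1, 4, 9]
```

Results arrive in whatever order the children finish, which is why the usage
example sorts them before printing.
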
Since pprocess distributes work to other processes, code run in those
processes may behave differently from the same code run in a single process.
For example, any mutable objects distributed to other processes can still be
modified, but the modifications will not be visible outside the processes
making them.

Tutorial
--------

The tutorial provides some information about the examples described below.
See the docs/tutorial.html file in the distribution for more details.

Reference
---------

A description of the different mechanisms provided by the pprocess module can
be found in the reference document. See the docs/reference.html file in the
distribution for more details.

Quick Start
-----------

Try running the simple examples. For example:

  PYTHONPATH=. python examples/simple_create.py

(These examples show in different ways how a limited number of processes can
be used to perform a parallel computation. The simple.py, simple1.py,
simple2.py and simple_map.py programs are sequential versions of the other
programs.)

The following table summarises the features used in the programs:

Program (.py)         pmap   MakeParallel manage start create Map Queue Exchange
-------------         ----   ------------ ------ ----- ------ --- ----- --------
simple_create_map                                      Yes    Yes
simple_create_queue                                    Yes        Yes
simple_create                                          Yes              Yes
simple_managed_map           Yes          Yes                 Yes
simple_managed_queue         Yes          Yes                     Yes
simple_managed               Yes          Yes                           Yes
simple_pmap           Yes
simple_pmap_iter      Yes
simple_start_queue           Yes                 Yes              Yes
simple_start                                     Yes                    Yes

The simplest parallel programs are simple_pmap.py and simple_pmap_iter.py,
which employ the pmap function resembling the built-in map function in
Python.

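The resemblance to map can be sketched as follows. This assumes that pmap
accepts a callable, a sequence and an optional limit on the number of
processes, as the tutorial describes; the names calculate and parallel_map
are hypothetical, and the fallback to the built-in map exists only so that
the sketch still runs where pprocess cannot be imported.

```python
def calculate(pair):
    """A deliberately simple calculation on one input."""
    i, j = pair
    return i * 10 + j


try:
    import pprocess  # assumed to be importable, e.g. with PYTHONPATH=.

    def parallel_map(func, sequence, limit=4):
        # Assumption: pmap(callable, sequence, limit=...) spreads the calls
        # over at most `limit` processes and yields results in input order.
        return pprocess.pmap(func, sequence, limit=limit)

except ImportError:

    def parallel_map(func, sequence, limit=4):
        # Sequential fallback so that the sketch runs without pprocess.
        return map(func, sequence)


inputs = [(i, j) for i in range(3) for j in range(3)]
results = list(parallel_map(calculate, inputs))
print(results)
```

Either way, the calling code is written exactly as it would be for map, which
is what makes this the simplest route into the library.
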
Other simple programs are those employing the Queue class, together with those
using the manage method, which associates functions or callables with Queue or
Exchange objects for convenient invocation of those functions and the
management of their communications.

The most technically involved program is simple_start.py, which uses the
Exchange class together with a calculation function which is aware of the
parallel environment and which communicates over the supplied communications
channel directly to the creating process.

With the exception of simple_start.py, those examples employing calculation
functions (as opposed to doing a calculation inline in a loop body) all use
MakeParallel to make those functions parallel-aware, thus permitting the
conversion of "normal" functions to a form usable in the parallel environment.

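The idea of wrapping a "normal" function so that it becomes parallel-aware
can be sketched with the standard library alone. This is an illustration of
the concept, not the MakeParallel class itself: MakeParallelSketch,
start_child and add are hypothetical names, and a multiprocessing Pipe stands
in for a pprocess channel.

```python
import os
from multiprocessing import Pipe


class MakeParallelSketch:
    """Wrap a normal function so that it sends its result over a channel."""

    def __init__(self, func):
        self.func = func

    def __call__(self, channel, *args, **kwargs):
        # The wrapped function knows nothing about channels; the wrapper
        # takes care of sending the result back to the creating process.
        channel.send(self.func(*args, **kwargs))


def start_child(parallel_func, *args):
    """Run a channel-aware callable in a forked child process."""
    parent_end, child_end = Pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: invoke the callable with its end of the channel.
        parent_end.close()
        try:
            parallel_func(child_end, *args)
        finally:
            child_end.close()
            os._exit(0)
    # Parent process: keep only its own end of the channel.
    child_end.close()
    return pid, parent_end


def add(a, b):
    return a + b


if __name__ == "__main__":
    pid, channel = start_child(MakeParallelSketch(add), 2, 3)
    print(channel.recv())  # prints 5
    os.waitpid(pid, 0)
```

A function written like simple_start.py's calculation function would instead
accept the channel as its first argument and send its result itself, needing
no wrapper.
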
Reusable Processes
------------------

An additional example not listed above, simple_managed_map_reusable.py,
employs the MakeReusable class instead of MakeParallel in order to demonstrate
reusable processes and channels:

  PYTHONPATH=. python examples/simple_managed_map_reusable.py

Continuous Process Communications
---------------------------------

Another example not listed above, simple_continuous_queue.py, employs
continuous communications to monitor output from created processes:

  PYTHONPATH=. python examples/simple_continuous_queue.py

Persistent Processes
--------------------

A number of persistent variants of some of the above examples employ a
persistent or background process which can be started by one process and
contacted later by another in order to collect the results of a computation.
For example:

  PYTHONPATH=. python examples/simple_persistent_managed.py --start
  PYTHONPATH=. python examples/simple_persistent_managed.py --reconnect

  PYTHONPATH=. python examples/simple_background_queue.py --start
  PYTHONPATH=. python examples/simple_background_queue.py --reconnect

  PYTHONPATH=. python examples/simple_persistent_queue.py --start
  PYTHONPATH=. python examples/simple_persistent_queue.py --reconnect

Parallel Raytracing with PyGmy
------------------------------

The PyGmy raytracer modified to use pprocess can be run to investigate the
potential for speed increases in "real world" programs:

  cd examples/PyGmy
  PYTHONPATH=../..:. python scene.py

(This should produce a file called test.tif - a TIFF file containing a
raytraced scene image.)

Examples from the Concurrency SIG
---------------------------------

The special interest group (SIG) for concurrency in Python proposed a
particular application as a showcase for concurrency libraries. Two examples
are included which demonstrate pprocess and the use of continuous processes to
implement the application concerned:

  PYTHONPATH=. python examples/concurrency-sig/bottles.py
  PYTHONPATH=. python examples/concurrency-sig/bottles_heartbeat.py

Examples of Modifying Mutable Objects
-------------------------------------

Mutable objects can be modified in processes created by pprocess, but the
modifications will not be visible in the parent process. The following
examples illustrate the problem:

  PYTHONPATH=. python examples/simple_mutation.py
  PYTHONPATH=. python examples/simple_mutation_queue.py

The former, non-parallel program will display the expected result of the
computation, whereas the latter, parallel program will fail to do so. This is
because the latter attempts to modify the input collection in order to use it
as a result collection, but these modifications are not propagated back to the
parent process.

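The effect does not depend on pprocess and can be reproduced with a plain
fork. The following minimal sketch uses only the standard library and assumes
a UNIX-like system; double_in_place and run_in_child are hypothetical names,
and a multiprocessing Pipe stands in for a pprocess channel.

```python
import os
from multiprocessing import Pipe


def double_in_place(values):
    """Modify the given list in place, doubling each element."""
    for index, value in enumerate(values):
        values[index] = value * 2


def run_in_child(values):
    """Mutate values in a forked child and return what the child sends back."""
    parent_end, child_end = Pipe()
    pid = os.fork()
    if pid == 0:
        parent_end.close()
        try:
            double_in_place(values)  # changes the child's private copy only
            child_end.send(values)   # results must be sent back explicitly
        finally:
            child_end.close()
            os._exit(0)
    child_end.close()
    received = parent_end.recv()
    os.waitpid(pid, 0)
    return received


if __name__ == "__main__":
    data = [1, 2, 3]
    from_child = run_in_child(data)
    print(data)        # prints [1, 2, 3]: the parent's list is unchanged
    print(from_child)  # prints [2, 4, 6]: the result sent over the channel
```

The remedy is the one the sketch uses: have the created process return or
send its results, and build the result collection in the parent.
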
Test Programs
-------------

There are some elementary tests:

  PYTHONPATH=. python tests/create_loop.py
  PYTHONPATH=. python tests/start_loop.py

(Simple loop demonstrations which use two different ways of creating and
starting the parallel processes.)

  PYTHONPATH=. python tests/start_indexer.py <directory>

(A text indexing demonstration, where <directory> should be a directory
containing text files to be indexed, although HTML files will also work well
enough. After indexing the files, a prompt will appear; words or word
fragments can then be entered, and matching words and their locations will be
shown. Run the program without arguments to see more information.)

Contact, Copyright and Licence Information
------------------------------------------

The current Web page for pprocess at the time of release is:

http://www.boddie.org.uk/python/pprocess.html

The author can be contacted at the following e-mail address:

paul@boddie.org.uk

Copyright and licence information can be found in the docs directory - see
docs/COPYING.txt, docs/lgpl-3.0.txt and docs/gpl-3.0.txt for more information.

For the PyGmy raytracer example, different copyright and licence information
is provided in the docs directory - see docs/COPYING-PyGmy.txt and
docs/LICENCE-PyGmy.txt for more information.

Dependencies
------------

This software depends on standard library features which are stated as being
available only on "UNIX"; it has only been tested repeatedly on a GNU/Linux
system, and occasionally on systems running OpenSolaris.

New in pprocess 0.5.3 (Changes since pprocess 0.5.2)
----------------------------------------------------

  * Added CPU core counting for Mac OS X, based on feedback from Kai Staats.

New in pprocess 0.5.2 (Changes since pprocess 0.5.1)
----------------------------------------------------

  * Added examples involving mutable objects and the inability of pprocess to
    automatically propagate changes to such objects back to parent processes.
  * Added an explanatory section to the tutorial about data exchange between
    processes and the differences from "normal" Python program behaviour.

New in pprocess 0.5.1 (Changes since pprocess 0.5)
--------------------------------------------------

  * Added IOError handling when processes exit apparently without warning.

New in pprocess 0.5 (Changes since pprocess 0.4)
------------------------------------------------

  * Added proper support in the Exchange class for continuous communications
    between processes, providing examples: simple_continuous_queue.py and the
    concurrency-sig directory.
  * Changed the Map class to permit incremental access to received results
    from completed parts of the sequence of inputs, also adding an iteration
    interface.
  * Added an example, simple_pmap_iter.py, to demonstrate iteration over maps.
  * Fixed the get_number_of_cores function to work with /proc/cpuinfo where
    the "physical id" field is missing.
  * Tidied the Exchange class, adding distinct status methods: unfinished and
    busy.

New in pprocess 0.4 (Changes since pprocess 0.3.1)
--------------------------------------------------

  * Added support for persistent/background processes.
  * Added a utility function to detect and return the number of processor
    cores available.
  * Added missing documentation stylesheet.
  * Added support for Solaris using pipes instead of socket pairs, since
    the latter apparently do not work properly with poll on Solaris.

New in pprocess 0.3.1 (Changes since pprocess 0.3)
--------------------------------------------------

  * Moved the reference material out of the module docstring and into a
    separate document, converting it to XHTML in the process.
  * Fixed the project name in the setup script.

New in pprocess 0.3 (Changes since parallel 0.2.5)
--------------------------------------------------

  * Added managed callables: wrappers around callables which cause them to be
    automatically managed by the exchange from which they were acquired.
  * Added MakeParallel: a wrapper instantiated around a normal function which
    sends the result of that function over the supplied channel when invoked.
  * Added MakeReusable: a wrapper like MakeParallel which can be used in
    conjunction with the newly-added reuse capability of the Exchange class in
    order to reuse processes and channels.
  * Added a Map class which attempts to emulate the built-in map function,
    along with a pmap function using this class.
  * Added a Queue class which provides a simpler iterator-style interface to
    data produced by created processes.
  * Added a create method to the Exchange class and an exit convenience
    function to the module.
  * Changed the Exchange implementation to not block when attempting to start
    new processes beyond the process limit: such requests are queued and
    performed as running processes are completed. This permits programs using
    the start method to proceed to consumption of results more quickly.
  * Extended and updated the examples. Added a tutorial.
  * Added Ubuntu Feisty (7.04) package support.

New in parallel 0.2.5 (Changes since parallel 0.2.4)
----------------------------------------------------

  * Added a start method to the Exchange class for more convenient creation of
    processes.
  * Relicensed under the LGPL (version 3 or later) - this also fixes the
    contradictory situation where the GPL was stated in the pprocess module
    (which was not, in fact, the intention) and the LGPL was stated in the
    documentation.

New in parallel 0.2.4 (Changes since parallel 0.2.3)
----------------------------------------------------

  * Set buffer sizes to zero for the file object wrappers around sockets: this
    may prevent deadlock issues.

New in parallel 0.2.3 (Changes since parallel 0.2.2)
----------------------------------------------------

  * Added convenient message exchanges, offering methods handling common
    situations at the cost of having to define a subclass of Exchange.
  * Added a simple example of performing a parallel computation.
  * Improved the PyGmy raytracer example to use the newly added functionality.

New in parallel 0.2.2 (Changes since parallel 0.2.1)
----------------------------------------------------

  * Changed the status testing in the Exchange class, potentially fixing the
    premature closure of channels before all data was read.
  * Fixed the PyGmy raytracer example's process accounting by relying on the
    possibly more reliable Exchange behaviour, whilst also preventing
    erroneous creation of "out of bounds" processes.
  * Added a removed attribute on the Exchange to record which channels were
    removed in the last call to the ready method.

New in parallel 0.2.1 (Changes since parallel 0.2)
--------------------------------------------------

  * Added a PyGmy raytracer example.
  * Updated copyright and licensing details (FSF address, additional works).

New in parallel 0.2 (Changes since parallel 0.1)
------------------------------------------------

  * Changed the name of the included module from parallel to pprocess in order
    to avoid naming conflicts with PyParallel.

Release Procedures
------------------

Update the pprocess __version__ attribute and the setup.py file version field.
Change the version number and package filename/directory in the documentation.
Update the release notes (see above).
Check the release information in the PKG-INFO file.
Tag, export.
Archive, upload.
Update PyPI.

Making Packages
---------------

To make Debian-based packages:

  1. Create new package directories under packages if necessary.
  2. Make a symbolic link in the distribution's root directory to keep the
     Debian tools happy:

     ln -s packages/ubuntu-hoary/python2.4-parallel-pprocess/debian/

     Or:

     ln -s packages/ubuntu-feisty/python-pprocess/debian/

  3. Run the package builder:

     dpkg-buildpackage -rfakeroot

  4. Locate and tidy up the packages in the parent directory of the
     distribution's root directory.