1.1 --- a/docs/tutorial.html Fri Sep 25 16:02:56 2015 +0200
1.2 +++ b/docs/tutorial.html Fri Sep 25 16:40:39 2015 +0200
1.3 @@ -16,6 +16,7 @@
1.4 use the <code>pmap</code> function.</p>
1.5
1.6 <ul>
1.7 +<li><a href="#note">A Note on Parallel Processes</a></li>
1.8 <li><a href="#pmap">Converting Map-Style Code</a></li>
1.9 <li><a href="#Map">Converting Invocations to Parallel Operations</a></li>
1.10 <li><a href="#Queue">Converting Arbitrarily-Ordered Invocations</a>
1.11 @@ -35,6 +36,81 @@
1.12 <p>For a brief summary of each of the features of <code>pprocess</code>, see
1.13 the <a href="reference.html">reference document</a>.</p>
1.14
1.15 +<h2 id="note">A Note on Parallel Processes</h2>
1.16 +
1.17 +<p>The way <code>pprocess</code> uses multiple processes to perform work in
1.18 +parallel relies on the <code>fork</code> system call, which on modern operating
1.19 +systems has what are known as "copy-on-write" semantics. In plain language,
1.20 +when <code>pprocess</code> creates a new <em>child</em> process to perform work
1.21 +in parallel with other work that needs to be done, this new process will be a
1.22 +near-identical copy of the original <em>parent</em> process, and the running
1.23 +code will be able to access data resident in that parent process.</p>
1.24 +
1.25 +<p>However, when a child process modifies data, it does not change that data
1.26 +in such a way that the parent process can see the modifications: the parent
1.27 +process will, in fact, remain oblivious to such changes. What happens is that
1.28 +as soon as the child process attempts to modify the data, it obtains its own
1.29 +separate copy which is then modified independently of the original data. Thus,
1.30 +a <em>copy</em> of any data is made when an attempt is made to <em>write</em>
1.31 +to such data. Meanwhile, the parent's copy of that data will be left untouched
1.32 +by the activities of the child.</p>
1.33 +
1.34 +<p>It is therefore essential to note that any data distributed to other
1.35 +processes, and which will then be modified by those processes, will not appear
1.36 +to change in the parent process even if the objects employed are mutable. This
1.37 +is rather different to the behaviour of a normal Python program: a function
1.38 +that is passed a list, for example, can mutate that list in such a way that upon
1.39 +returning from that function the modifications will still be present. For
1.40 +example:</p>
1.41 +
1.42 +<pre>
1.43 +def mutator(l):
1.44 +    l.append(3)
1.45 +
1.46 +l = [1, 2]
1.47 +mutator(l) # l is now [1, 2, 3]
1.48 +</pre>
1.49 +
1.50 +<p>In contrast, passing a list to a child process will cause the list to
1.51 +mutate in the child process, but the parent process will not see the list
1.52 +change. For example:</p>
1.53 +
1.54 +<pre>
1.55 +def mutator(l):
1.56 +    l.append(3)
1.57 +
1.58 +results = pprocess.Map()
1.59 +mutator = results.manage(pprocess.MakeParallel(mutator))
1.60 +
1.61 +l = [1, 2]
1.62 +mutator(l) # l is still [1, 2] in the parent process
1.63 +</pre>
1.64 +
1.65 +<p>To communicate changes to data between processes, the modified objects must
1.66 +be explicitly returned from child processes using the mechanisms described in
1.67 +this documentation. For example:</p>
1.68 +
1.69 +<pre>
1.70 +def mutator(l):
1.71 +    l.append(3)
1.72 +    return l # the modified object is explicitly returned
1.73 +
1.74 +results = pprocess.Map()
1.75 +mutator = results.manage(pprocess.MakeParallel(mutator))
1.76 +
1.77 +l = [1, 2]
1.78 +mutator(l)
1.79 +
1.80 +all_l = results[:] # there are potentially many results, not just one
1.81 +l = all_l[0] # l is now [1, 2, 3], taken from the first result
1.82 +</pre>
1.83 +
1.84 +<p>It is perhaps easiest to think of the communications mechanisms as
1.85 +providing a gateway between processes through which information can be passed,
1.86 +with the rest of a program's data being private and hidden from the other
1.87 +processes (even if that data initially resembles what the other processes also
1.88 +see within themselves).</p>
1.89 +
1.90 <h2 id="pmap">Converting Map-Style Code</h2>
1.91
1.92 <p>Consider a program using the built-in <code>map</code> function and a sequence of inputs:</p>