1 Optimisations
2 =============
3
4 Some optimisations around constant objects might be possible; these depend on
5 the following:
6
7 * Reliable tracking of assignments: where assignment operations occur, the
8 target of the assignment should be determined if any hope of optimisation
9 is to be maintained. Where no guarantees can be made about the target of
10 an assignment, no assignment-related information should be written to
11 potential targets.
12
13 * Objects acting as "containers" of attributes must be regarded as "safe":
14 where assignments are recorded as occurring on an attribute, it must be
15 guaranteed that no other unforeseen ways exist to assign to such
16 attributes.
17
18 The discussion below presents certain rules which must be imposed to uphold
19 the above requirements.
20
21 Safe Containers
22 ---------------
23
24 Where attributes of modules, classes and instances are only set once and are
25 effectively constant, it should be possible to circumvent the attribute lookup
26 mechanism and to directly reference the attribute value. This technique may
27 only be considered applicable for the following "container" objects, subject
28 to the noted constraints:
29
30 1. For modules, "safety" is enforced by ensuring that assignments to module
31 attributes are only permitted within the module itself either at the
32 top-level or via names declared as globals. Thus, the following would not
33 be permitted:
34
35 another_module.this_module.attr = value
36
37 In the above, this_module is a reference to the current module.
38
39 Where assignments to module attributes are permitted from outside a
40 module, modules can no longer be relied upon to provide constant attribute
41 values, even if a cursory inspection of module-level code would suggest
42 that some attributes are effectively constant.
43
44 2. For classes, "safety" is enforced by ensuring that assignments to class
45 attributes are only permitted within the class definition, outside
46 methods. This would mean that classes would be "sealed" at definition time
47 (like functions).
48
49 Unlike the property of function locals that they may only sensibly be accessed
50 within the function in which they reside, these cases demand additional
51 controls or assumptions on or about access to the stored data. Meanwhile, it
52 would be difficult to detect eligible attributes on arbitrary instances due to
53 the need for some kind of type inference or abstract execution.
54
55 Constant Attributes
56 -------------------
57
58 When accessed via "safe containers", as described above, any attribute with
59 only one recorded assignment on it can be considered a constant attribute and
60 this eligible for optimisation, the consequence of which would be the
61 replacement of a LoadAttrIndex instruction (which needs to look up an
62 attribute using the run-time details of the "container" and the compile-time
63 details of the attribute) with a LoadAttr instruction.
64
65 However, some restrictions exist on assignment operations which may be
66 regarded to cause only one assignment in the lifetime of a program:
67
68 1. For module attributes, only assignments at the top-level outside loop
69 statements can be reliably assumed to cause only a single assignment.
70
71 2. For class attributes, only assignments at the top-level within class
72 definitions and outside loop statements can be reliably assumed to cause
73 only a single assignment.
74
75 All assignments satisfying the "safe container" requirements, but not the
76 requirements immediately above, should each be recorded as causing at least
77 one assignment.
78
79 Additional Controls
80 -------------------
81
82 For the above cases for "container" objects, the following controls would need
83 to apply:
84
85 1. Modules would need to be immutable after initialisation. However, during
86 initialisation, there remains a possibility of another module attempting
87 to access the original module. For example, if ppp/__init__.py contained
88 the following...
89
90 x = 1
91 import ppp.qqq
92 print x
93
94 ...and if ppp/qqq.py contained the following...
95
96 import ppp
97 ppp.x = 2
98
99 ...then the value 2 would be printed. Since modules are objects which are
100 registered globally in a program, it would be possible to set attributes
101 in the above way.
102
103 2. Classes would need to be immutable after initialisation. However, since
104 classes are objects, any reference to a class after initialisation could
105 be used to set attributes on the class.
106
107 Solutions:
108
109 1. Insist on global scope for module attribute assignments.
110
111 2. Insist on local scope within classes.
112
113 Both of the above measures need to be enforced at run-time, since an arbitrary
114 attribute assignment could be attempted on any kind of object, yet to uphold
115 the properties of "safe containers", attempts to change attributes of such
116 objects should be denied. Since foreseen attribute assignment operations have
117 certain properties detectable at compile-time, it could be appropriate to
118 generate special instructions (or modified instructions) during the
119 initialisation of modules and classes for such foreseen assignments, whilst
120 employing normal attribute assignment operations in all other cases. Indeed,
121 the StoreAttr instruction, which is used to set attributes in "safe
122 containers" would be used exclusively for this purpose; the StoreAttrIndex
123 instruction would be used exclusively for all other attribute assignments.
124
125 To ensure the "sealing" of modules and classes, entries in the attribute
126 lookup table would encode whether a class or module is being accessed, so
127 that the StoreAttrIndex instruction could reject such accesses.
128
129 Constant Attribute Values
130 -------------------------
131
132 Where an attribute value is itself regarded as constant, is a "safe container"
133 and is used in an operation accessing its own attributes, the value can be
134 directly inspected for optimisations or employed in the generated code. For
135 the attribute values themselves, only objects of a constant nature may be
136 considered suitable for this particular optimisation:
137
138 * Classes
139 * Modules
140 * Instances defined as constant literals
141
142 This is because arbitrary objects (such as most instances) have no
143 well-defined form before run-time and cannot be investigated further at
144 compile-time or have a representation inserted into the generated code.
145
146 Class Attributes and Access via Instances
147 -----------------------------------------
148
149 Unlike module attributes, class attributes can be accessed in a number of
150 different ways:
151
152 * Using the class itself:
153
154 C.x = 123
155 cls = C; cls.x = 234
156
157 * Using a subclass of the class (for reading attributes):
158
159 class D(C):
160 pass
161 D.x # setting D.x would populate D, not C
162
163 * Using instances of the class or a subclass of the class (for reading
164 attributes):
165
166 c = C()
167 c.x # setting c.x would populate c, not C
168
169 Since assignments are only achieved using direct references to the class, and
170 since class attributes should be defined only within the class initialisation
171 process, the properties of class attributes should be consistent with those
172 desired.
173
174 Method Access via Instances
175 ---------------------------
176
177 It is desirable to optimise method access, even though most method calls are
178 likely to occur via instances. It is possible, given the properties of methods
179 as class attributes to investigate the kind of instance that the self
180 parameter/local refers to within each method: it should be an instance either
181 of the class in which the method is defined or a compatible class, although
182 situations exist where this might not be the case:
183
184 * Explicit invocation of a method:
185
186 d = D() # D is not related to C
187 C.f(d) # calling f(self) in C
188
189 If blatant usage of incompatible instances were somehow disallowed, it would
190 still be necessary to investigate the properties of an instance's class and
191 its relationship with other classes. Consider the following example:
192
193 class A:
194 def f(self): ...
195
196 class B:
197 def f(self): ...
198 def g(self):
199 self.f()
200
201 class C(A, B):
202 pass
203
204 Here, instances of B passed into the method B.g could be assumed to provide
205 access to B.f when self.f is resolved at compile-time. However, instances of C
206 passed into B.g would instead provide access to A.f when self.f is resolved at
207 compile-time (since the method resolution order is C, A, B instead of just B).
208
209 One solution might be to specialise methods for each instance type, but this
210 could be costly. Another less ambitious solution might only involve the
211 optimisation of such internal method calls if an unambiguous target can be
212 resolved.
213
214 Narrowing Type Candidates using Attribute Names
215 -----------------------------------------------
216
217 Within functions, it might be useful to record attribute accesses on local
218 names and to collect the names in order to help resolve targets in the
219 generated code. For example:
220
221 def f(x, y):
222 x.method(y)
223 if x.something:
224 ...
225 return x.attr
226
227 Here, the object referenced via x should have "method", "something" and "attr"
228 as attributes. We could impose a test for a type providing these attributes at
229 the earliest opportunity and then specialise the attribute accesses, perhaps
230 also invocations, in the generated code. AttributeError occurrences would need
231 to be considered, however, potentially disqualifying certain attributes from
232 any optimisations, and control flow would also need to be considered. The
233 result would resemble the following:
234
235 def f(x, y):
236 if not isinstance(x, C) and x is not C:
237 raise TypeError
238 x.method(y)
239 if x.something:
240 ...
241 return x.attr
242
243 With more specific information about the attributes used with a given name,
244 combinations of attributes can be provided instead of single attributes when
245 recording attribute usage. For example, from the above:
246
247 x uses method -> indicates potential usage of C.method, D1.method, ...
248 x uses something -> indicates potential usage of C.something,
249 D2.something, ...
250 x uses attr -> indicates potential usage of C.attr, D3.attr, ...
251
252 These unspecific declarations of attribute usage can be replaced with the
253 following:
254
255 x uses method, something, attr -> indicates potential usage of C, D4, ...
256 (each providing all of the stated
257 attributes)
258
259 Reducing Object Table Scope
260 ---------------------------
261
262 Where attributes may be used in a program but never accessed via the object
263 table-dependent instructions, such attributes could be omitted from the object
264 table.
265
266 Implemented Optimisation Types
267 ==============================
268
269 Optimisation Prerequisites and Effect
270 ------------ ------------------------
271
272 constant_storage value instruction references a constant;
273 (guidance) storage instruction references a constant;
274 | indicate whether both instructions satisfy the
275 | preconditions and should be removed/omitted
276
277 constant_accessor value instruction references a constant;
278 (guidance) | target name provided (for use in producing an
279 | address access instruction)
280
281 known_target value instruction references a constant;
282 (guidance) | target and context are provided (no instructions
283 | are removed)
284
285 self_access value instruction references "self" in a method;
286 (guidance) specified attribute name always has the same
287 position;
288 | indicate whether an appropriate instruction can
289 | be generated for the access
290
291 temp_storage value instruction is a simple input operation;
292 (elimination) value instruction is the last instruction;
293 (guidance) | remove the value instruction, provide the value
294 | instruction in place of a temporary storage
295 | reference
296
297 no_operations input to the current instruction loads from the
298 (guidance) destination of the current instruction;
299 | indicate that the current instruction should be
300 | omitted
301
302 unused_results value instruction is a simple input operation;
303 (elimination) value instruction is the final instruction of a
304 discarded expression;
305 | remove the value instruction
306
307 unused_objects attribute is defined using a name which is not
308 (inspection) used in an active region of the program or is
309 defined within a class which is not used by
310 the program, object is unambiguously
311 referenced by such attributes
312 | remove such attributes and objects