1 Namespace Definition
2 ====================
3
4 Module attributes are defined either at the module level or by global
5 statements.
6
7 Class attributes are defined only within class statements.
8
9 Instance attributes are defined only by assignments to attributes of self
10 within __init__ methods.
11
12 Potential Restrictions
13 ----------------------
14
15 Names of classes and functions could be restricted to only refer to those
16 objects within the same namespace. If redefinition were to occur, or if
17 multiple possibilities were present, these restrictions could be moderated as
18 follows:
19
20 * Classes assigned to the same name could provide the union of their
21 attributes. This would, however, cause a potential collision of attribute
22 definitions such as methods.
23
24 * Functions, if they share compatible signatures, could share parameter list
25 definitions.
26
27 Data Structures
28 ===============
29
30 The fundamental "value type" is a pair of references: one pointing to the
31 referenced object represented by the interchangeable value; one referring to
32 the context of the referenced object, typically the object through which the
33 referenced object was acquired as an attribute.A
34
35 Value Layout
36 ------------
37
38 0 1
39 object context
40 reference reference
41
42 Acquiring Values
43 ----------------
44
45 Values are acquired through name lookups and attribute access, yielding
46 the appropriate object reference together with a context reference as
47 indicated in the following table:
48
49 Type of Access Context Notes
50 -------------- ------- -----
51
52 Local name Preserved Functions provide no context
53
54 Global name Preserved Modules provide no context
55
56 Class-originating Accessor Methods acquire the context of their
57 attribute -or- accessor if an instance...
58 Preserved or retain the original context if the
59 accessor is a class
60
61 Instance-originating Preserved Methods retain their original context
62 attribute
63
64 There may be some scope for simplifying the above, to the detriment of Python
65 compatibility, since the unbound vs. bound methods situation can be confusing.
66
67 Objects
68 -------
69
70 Since classes, functions and instances are all "objects", each must support
71 certain features and operations in the same way.
72
73 The __class__ Attribute
74 -----------------------
75
76 All objects support the __class__ attribute:
77
78 Class: refers to the type class (type.__class__ also refers to the type class)
79 Function: refers to the function class
80 Instance: refers to the class instantiated to make the object
81
82 Invocation
83 ----------
84
85 The following actions need to be supported:
86
87 Class: create instance, call __init__ with instance, return object
88 Function: call function body, return result
89 Instance: call __call__ method, return result
90
91 Structure Layout
92 ----------------
93
94 A suitable structure layout might be something like this:
95
96 0 1 2 3 4
97 classcode invocation __class__ attribute ...
98 reference reference reference
99
100 Here, the classcode refers to the attribute lookup table for the object. Since
101 classes and instances share the same classcode, they might resemble the
102 following:
103
104 Class C:
105
106 0 1 2 3 4
107 code for C __new__ class type attribute ...
108 reference reference reference
109
110 Instance of C:
111
112 0 1 2 3 4
113 code for C C.__call__ class C attribute ...
114 reference reference reference
115 (if exists)
116
117 The __new__ reference would lead to code consisting of the following
118 instructions:
119
120 create instance for C
121 call C.__init__(instance, ...)
122 return instance
123
124 If C has a __call__ attribute, the invocation "slot" of C instances would
125 refer to the same thing as C.__call__.
126
127 For functions, the same general layout applies:
128
129 Function f:
130
131 0 1 2 3 4
132 code for code class attribute ...
133 function reference function reference
134 reference
135
136 Here, the code reference would lead to code for the function. Note that the
137 function locals are completely distinct from this structure and are not
138 comparable to attributes.
139
140 For modules, there is no meaningful invocation reference:
141
142 Module m:
143
144 0 1 2 3 4
145 code for m (unused) module type attribute ...
146 reference (global)
147 reference
148
149 Both classes and modules have code in their definitions, but this would be
150 generated in order and not referenced externally.
151
152 Invocation Operation
153 --------------------
154
155 Consequently, regardless of the object an invocation is always done as
156 follows:
157
158 get invocation reference (at object+1)
159 jump to reference
160
161 Additional preparation is necessary before the above code: positional
162 arguments must be saved to the parameter stack, and keyword arguments must be
163 resolved and saved to the appropriate position in the parameter stack.
164
165 Attribute Operations
166 --------------------
167
168 Attribute access needs to go through the attribute lookup table. Some
169 optimisations are possible and are described in the appropriate section.
170
171 One important aspect of attribute access is the appropriate setting of the
172 context in the acquired attribute value. From the table describing the
173 acquisition of values, it is clear that the principal exception is that where
174 a class-originating attribute is accessed on an instance. Consequently, the
175 following algorithm could be employed once an attribute has been located:
176
177 1. If the attribute's context is a special value, indicating that it should
178 be replaced upon instance access, then proceed to the next step;
179 otherwise, acquire both the context and the object as they are.
180
181 2. If the accessor is an instance, use that as the value's context, acquiring
182 only the object from the attribute.
183
184 Where accesses can be determined ahead of time (as discussed in the
185 optimisations section), the above algorithm may not necessarily be employed in
186 the generated code for some accesses.
187
188 Instruction Evaluation Model
189 ============================
190
191 Programs use a value stack where evaluated instructions may save their
192 results. A value stack pointer indicates the top of this stack. In addition, a
193 separate stack is used to record the invocation frames. All stack pointers
194 refer to the next address to be used by the stack, not the address of the
195 uppermost element.
196
197 Frame Stack Value Stack
198 ----------- ----------- Address of Callable
199 -------------------
200 previous ...
201 current ------> callable -----> identifier
202 arg1 reference to code
203 arg2
204 arg3
205 local4
206 local5
207 ...
208
209 Loading local names is a matter of performing frame-relative accesses to the
210 value stack.
211
212 Invocations and Argument Evaluation
213 -----------------------------------
214
215 When preparing for an invocation, the caller first sets the invocation frame
216 pointer. Then, positional arguments are added to the stack such that the first
217 argument positions are filled. A number of stack locations for the remaining
218 arguments specified in the program are then reserved. The names of keyword
219 arguments are used (in the form of table index values) to consult the
220 parameter table and to find the location in which such arguments are to be
221 stored.
222
223 fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
224
225 Value Stack
226 -----------
227
228 ... ... ... ...
229 fn fn fn fn
230 a a a a
231 b b b b
232 ___ ___ ___ --> 3
233 ___ --> 1 1 | 1
234 ___ | ___ --> 2 | 2
235 1 ----------- 2 ----------- 3 -----------
236
237 Conceptually, the frame can be considered as a collection of attributes, as
238 seen in other kinds of structures:
239
240 Frame for invocation of fn:
241
242 0 1 2 3 4 5
243 code a b c d e
244 reference
245
246 However, where arguments are specified positionally, such "attributes" are not
247 set using a comparable approach to that employed with other structures.
248 Keyword arguments are set using an attribute-like mechanism, though, where the
249 position of each argument discovered using the parameter table.
250
251 Method invocations incorporate an implicit first argument which is obtained
252 from the context of the method:
253
254 method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
255
256 Value Stack
257 -----------
258
259 ...
260 method
261 context of method
262 a
263 b
264 3
265 1
266 2
267
268 Although it could be possible to permit any object to be provided as the first
269 argument, in order to optimise instance attribute access in methods, we should
270 seek to restrict the object type.
271
272 Verifying Supplied Arguments
273 ----------------------------
274
275 In order to ensure a correct invocation, it is also necessary to check the
276 number of supplied arguments. If the target of the invocation is known at
277 compile-time, no additional instructions need to be emitted; otherwise, the
278 generated code must test for the following situations:
279
280 1. That the number of supplied arguments is equal to the number of expected
281 parameters.
282
283 2. That no keyword argument overwrites an existing positional parameter.
284
285 Tuples, Frames and Allocation
286 -----------------------------
287
288 Using the approach where arguments are treated like attributes in some kind of
289 structure, we could choose to allocate frames in places other than a stack.
290 This would produce something somewhat similar to a plain tuple object.
291
292 Optimisations
293 =============
294
295 Some optimisations around constant objects might be possible; these depend on
296 the following:
297
298 * Reliable tracking of assignments: where assignment operations occur, the
299 target of the assignment should be determined if any hope of optimisation
300 is to be maintained. Where no guarantees can be made about the target of
301 an assignment, no assignment-related information should be written to
302 potential targets.
303
304 * Objects acting as "containers" of attributes must be regarded as "safe":
305 where assignments are recorded as occurring on an attribute, it must be
306 guaranteed that no other unforeseen ways exist to assign to such
307 attributes.
308
309 The discussion below presents certain rules which must be imposed to uphold
310 the above requirements.
311
312 Safe Containers
313 ---------------
314
315 Where attributes of modules, classes and instances are only set once and are
316 effectively constant, it should be possible to circumvent the attribute lookup
317 mechanism and to directly reference the attribute value. This technique may
318 only be considered applicable for the following "container" objects, subject
319 to the noted constraints:
320
321 1. For modules, "safety" is enforced by ensuring that assignments to module
322 attributes are only permitted within the module itself either at the
323 top-level or via names declared as globals. Thus, the following would not
324 be permitted:
325
326 another_module.this_module.attr = value
327
328 In the above, this_module is a reference to the current module.
329
330 2. For classes, "safety" is enforced by ensuring that assignments to class
331 attributes are only permitted within the class definition, outside
332 methods. This would mean that classes would be "sealed" at definition time
333 (like functions).
334
335 Unlike the property of function locals that they may only sensibly be accessed
336 within the function in which they reside, these cases demand additional
337 controls or assumptions on or about access to the stored data. Meanwhile, it
338 would be difficult to detect eligible attributes on arbitrary instances due to
339 the need for some kind of type inference or abstract execution.
340
341 Constant Attributes
342 -------------------
343
344 When accessed via "safe containers", as described above, any attribute with
345 only one recorded assignment on it can be considered a constant attribute and
346 this eligible for optimisation, the consequence of which would be the
347 replacement of a LoadAttrIndex instruction (which needs to look up an
348 attribute using the run-time details of the "container" and the compile-time
349 details of the attribute) with a LoadAttr instruction.
350
351 However, some restrictions exist on assignment operations which may be
352 regarded to cause only one assignment in the lifetime of a program:
353
354 1. For module attributes, only assignments at the top-level outside loop
355 statements can be reliably assumed to cause only a single assignment.
356
357 2. For class attributes, only assignments at the top-level within class
358 definitions and outside loop statements can be reliably assumed to cause
359 only a single assignment.
360
361 All assignments satisfying the "safe container" requirements, but not the
362 requirements immediately above, should each be recorded as causing at least
363 one assignment.
364
365 Additional Controls
366 -------------------
367
368 For the above cases for "container" objects, the following controls would need
369 to apply:
370
371 1. Modules would need to be immutable after initialisation. However, during
372 initialisation, there remains a possibility of another module attempting
373 to access the original module. For example, if ppp/__init__.py contained
374 the following...
375
376 x = 1
377 import ppp.qqq
378 print x
379
380 ...and if ppp/qqq.py contained the following...
381
382 import ppp
383 ppp.x = 2
384
385 ...then the value 2 would be printed. Since modules are objects which are
386 registered globally in a program, it would be possible to set attributes
387 in the above way.
388
389 2. Classes would need to be immutable after initialisation. However, since
390 classes are objects, any reference to a class after initialisation could
391 be used to set attributes on the class.
392
393 Solutions:
394
395 1. Insist on global scope for module attribute assignments.
396
397 2. Insist on local scope within classes.
398
399 Both of the above measures need to be enforced at run-time, since an arbitrary
400 attribute assignment could be attempted on any kind of object, yet to uphold
401 the properties of "safe containers", attempts to change attributes of such
402 objects should be denied. Since foreseen attribute assignment operations have
403 certain properties detectable at compile-time, it could be appropriate to
404 generate special instructions (or modified instructions) during the
405 initialisation of modules and classes for such foreseen assignments, whilst
406 employing normal attribute assignment operations in all other cases. Indeed,
407 the StoreAttr instruction, which is used to set attributes in "safe
408 containers" would be used exclusively for this purpose; the StoreAttrIndex
409 instruction would be used exclusively for all other attribute assignments.
410
411 To ensure the "sealing" of modules and classes, entries in the attribute
412 lookup table would encode whether a class or module is being accessed, so
413 that the StoreAttrIndex instruction could reject such accesses.
414
415 Constant Attribute Values
416 -------------------------
417
418 Where an attribute value is itself regarded as constant, is a "safe container"
419 and is used in an operation accessing its own attributes, the value can be
420 directly inspected for optimisations or employed in the generated code. For
421 the attribute values themselves, only objects of a constant nature may be
422 considered suitable for this particular optimisation:
423
424 * Classes
425 * Modules
426 * Instances defined as constant literals
427
428 This is because arbitrary objects (such as most instances) have no
429 well-defined form before run-time and cannot be investigated further at
430 compile-time or have a representation inserted into the generated code.
431
432 Class Attributes and Access via Instances
433 -----------------------------------------
434
435 Unlike module attributes, class attributes can be accessed in a number of
436 different ways:
437
438 * Using the class itself:
439
440 C.x = 123
441 cls = C; cls.x = 234
442
443 * Using a subclass of the class (for reading attributes):
444
445 class D(C):
446 pass
447 D.x # setting D.x would populate D, not C
448
449 * Using instances of the class or a subclass of the class (for reading
450 attributes):
451
452 c = C()
453 c.x # setting c.x would populate c, not C
454
455 Since assignments are only achieved using direct references to the class, and
456 since class attributes should be defined only within the class initialisation
457 process, the properties of class attributes should be consistent with those
458 desired.
459
460 Method Access via Instances
461 ---------------------------
462
463 It is desirable to optimise method access, even though most method calls are
464 likely to occur via instances. It is possible, given the properties of methods
465 as class attributes to investigate the kind of instance that the self
466 parameter/local refers to within each method: it should be an instance either
467 of the class in which the method is defined or a compatible class, although
468 situations exist where this might not be the case:
469
470 * Explicit invocation of a method:
471
472 d = D() # D is not related to C
473 C.f(d) # calling f(self) in C
474
475 If blatant usage of incompatible instances were somehow disallowed, it would
476 still be necessary to investigate the properties of an instance's class and
477 its relationship with other classes. Consider the following example:
478
479 class A:
480 def f(self): ...
481
482 class B:
483 def f(self): ...
484 def g(self):
485 self.f()
486
487 class C(A, B):
488 pass
489
490 Here, instances of B passed into the method B.g could be assumed to provide
491 access to B.f when self.f is resolved at compile-time. However, instances of C
492 passed into B.g would instead provide access to A.f when self.f is resolved at
493 compile-time (since the method resolution order is C, A, B instead of just B).
494
495 One solution might be to specialise methods for each instance type, but this
496 could be costly. Another less ambitious solution might only involve the
497 optimisation of such internal method calls if an unambiguous target can be
498 resolved.
499
500 Optimising Function Invocations
501 -------------------------------
502
503 Where an attribute value is itself regarded as constant and is a function,
504 knowledge about the parameters of the function can be employed to optimise the
505 preparation of the invocation frame.