1 Concepts
2 ========
3
4 This document describes the underlying concepts employed in micropython.
5
6 * Namespaces and attribute definition
7 * Contexts and values
8 * Tables, attributes and lookups
9 * Objects and structures
10 * Parameters and lookups
11 * Instantiation
12 * Register usage
13
14 Namespaces and Attribute Definition
15 ===================================
16
17 Namespaces are any objects which can retain attributes.
18
19 * Module attributes are defined either at the module level or by global
20 statements.
21 * Class attributes are defined only within class statements.
22 * Instance attributes are defined only by assignments to attributes of self
23 within __init__ methods.
24
25 These restrictions apply because such attributes are thus explicitly declared,
26 permitting the use of tables (described below). Module and class attributes
27 can also be finalised in this way in order to permit certain optimisations.
28
29 See rejected.txt for complicating mechanisms which could be applied to
30 mitigate the effects of these restrictions on optimisations.
31
32 Contexts and Values
33 ===================
34
35 Values are used as the common reference representation in micropython: as
36 stored representations of attributes (of classes, instances, modules, and
37 other objects supporting attribute-like entities) as well as the stored values
38 associated with names in functions and methods.
39
40 Unlike other implementations, micropython does not create things like bound
41 method objects for individual instances. Instead, all objects are referenced
42 using a context, reference pair:
43
44 Value Layout
45 ------------
46
47 0 1
48 context object
49 reference reference
50
51 Specific implementations might reverse this ordering for optimisation
52 purposes.
53
54 Rationale
55 ---------
56
57 To reduce the number of created objects whilst retaining the ability to
58 support bound method invocations. The context indicates the context in which
59 an invocation is performed, typically the owner of the method.
60
61 Usage
62 -----
63
64 The context may be inserted as the first argument when a value is involved in
65 an invocation. This argument may then be omitted from the invocation if its
66 usage is not appropriate.
67
68 See invocation.txt for details.
69
70 Context Value Types
71 -------------------
72
73 The following types of context value exist:
74
75 Type Usage Transformations
76 ---- ----- ---------------
77
78 Replaceable With functions (not methods) May be replaced with an
79 instance or a class when a
80 value is stored on an
81 instance or class
82
83 Placeholder With classes May not be replaced
84
85 Instance With instances (and constants) May not be replaced
86 or functions as methods
87
88 Class With functions as methods May be replaced when a
89 value is loaded from a
90 class attribute via an
91 instance
92
93 Contexts in Acquired Values
94 ---------------------------
95
96 There are four classes of instructions which provide values:
97
98 Instruction Purpose Context Operations
99 ----------- ------- ------------------
100
101 1) LoadConst Load module, constant Use loaded object with itself
102 as context
103
104 2) LoadFunction Load function Combine replaceable context
105 with loaded object
106
107 3) LoadClass Load class Combine placeholder context
108 with loaded object
109
110 4) LoadAddress* Load attribute from Preserve or override stored
111 LoadAttr* class, module, context (as described in
112 instance assignment.txt)
113
114 In order to comply with traditional Python behaviour, contexts may or may not
115 represent the object from which an attribute has been acquired.
116
117 See assignment.txt for details.
118
119 Contexts in Stored Values
120 -------------------------
121
122 There are two classes of instruction for storing values:
123
124 Instruction Purpose Context Operations
125 ----------- ------- ------------------
126
127 1) StoreAddress Store attribute in a Preserve context; note that no
128 known object test for class attribute
129 assignment should be necessary
130 since this instruction should only
131 be generated for module globals
132
133 StoreAttr Store attribute in an Preserve context; note that no
134 instance test for class attribute
135 assignment should be necessary
136 since this instruction should only
137 be generated for self accesses
138
139 StoreAttrIndex Store attribute in an Preserve context; since the index
140 unknown object lookup could yield a class
141 attribute, a test of the nature of
142 the nature of the structure is
143 necessary in order to prevent
144 assignments to classes
145
146 2) StoreAddressContext Store attribute in a Override context if appropriate;
147 known object if the value has a replaceable
148 context, permit the target to
149 take ownership of the value
150
151 See assignment.txt for details.
152
153 Tables, Attributes and Lookups
154 ==============================
155
156 Attribute lookups, where the exact location of an object attribute is deduced,
157 are performed differently in micropython than in other implementations.
158 Instead of providing attribute dictionaries, in which attributes are found,
159 attributes are located at fixed places in object structures (described below)
160 and their locations are stored using a special representation known as a
161 table.
162
163 For a given program, a table can be considered as being like a matrix mapping
164 classes to attribute names. For example:
165
166 class A:
167 # instances have attributes x, y
168
169 class B(A):
170 # introduces attribute z for instances
171
172 class C:
173 # instances have attributes a, b, z
174
175 This would provide the following table, referred to as an object table in the
176 context of classes and instances:
177
178 Class/attr a b x y z
179
180 A 1 2
181 B 1 2 3
182 C 1 2 3
183
184 A limitation of this representation is that instance attributes may not shadow
185 class attributes: if an attribute with a given name is not defined on an
186 instance, an attribute with the same name cannot be provided by the class of
187 the instance or any superclass of the instance's class.
188
189 The table can be compacted using a representation known as a displacement
190 list (referred to as an object list in this context):
191
192 Classes with attribute offsets
193
194 classcode A
195 attrcode a b x y z
196
197 B
198 a b x y z
199
200 C
201 a b x y z
202
203 List . . 1 2 1 2 3 1 2 . . 3
204
205 Here, the classcode refers to the offset in the list at which a class's
206 attributes are defined, whereas the attrcode defines the offset within a
207 region of attributes corresponding to a single attribute of a given name.
208
209 Attribute Locations
210 -------------------
211
212 The locations stored in table/list elements are for instance attributes
213 relative to the location of the instance, whereas those for class attributes
214 and modules are absolute addresses (although these could also be changed to
215 object-relative locations).
216
217 Objects and Structures
218 ======================
219
220 As well as references, micropython needs to have actual objects to refer to.
221 Since classes, functions and instances are all objects, it is desirable that
222 certain common features and operations are supported in the same way for all
223 of these things. To permit this, a common data structure format is used.
224
225 Header.................................................... Attributes.................
226
227 Identifier Identifier Address Identifier Size Object Object ...
228
229 0 1 2 3 4 5 6 7
230 classcode attrcode/ invocation funccode size __class__ attribute ...
231 instance reference reference reference
232 status
233
234 Classcode
235 ---------
236
237 Used in attribute lookup.
238
239 Here, the classcode refers to the attribute lookup table for the object (as
240 described above). Classes and instances share the same classcode, and their
241 structures reflect this. Functions all belong to the same type and thus employ
242 the classcode for the function built-in type, whereas modules have distinct
243 types since they must support different sets of attributes.
244
245 Attrcode
246 --------
247
248 Used to test instances for membership of classes (or descendants of classes).
249
250 Since, in traditional Python, classes are only ever instances of the "type"
251 built-in class, support for testing such a relationship directly has been
252 removed and the attrcode is not specified for classes: the presence of an
253 attrcode indicates that a given object is an instance.
254
255 See the "Testing Instance Compatibility with Classes (Attrcode)" section below
256 for details of attrcodes.
257
258 Invocation Reference
259 --------------------
260
261 Used when an object is called.
262
263 This is the address of the code to be executed when an invocation is performed
264 on the object.
265
266 Funccode
267 --------
268
269 Used to look up argument positions by name.
270
271 The strategy with keyword arguments in micropython is to attempt to position
272 such arguments in the invocation frame as it is being constructed.
273
274 See the "Parameters and Lookups" section for more information.
275
276 Size
277 ----
278
279 Used to indicate the size of an object including attributes.
280
281 Attributes
282 ----------
283
284 For classes, modules and instances, the attributes in the structure correspond
285 to the attributes of each kind of object. For functions, however, the
286 attributes in the structure correspond to the default arguments for each
287 function, if any.
288
289 Structure Types
290 ---------------
291
292 Class C:
293
294 0 1 2 3 4 5 6 7
295 classcode (unused) __new__ funccode size class type attribute ...
296 for C reference for reference reference
297 instantiator
298
299 Instance of C:
300
301 0 1 2 3 4 5 6 7
302 classcode attrcode C.__call__ funccode size class C attribute ...
303 for C for C reference for reference reference
304 (if exists) C.__call__
305
306 Function f:
307
308 0 1 2 3 4 5 6 7
309 classcode attrcode code funccode size class attribute ...
310 for for reference function (default)
311 function function reference reference
312
313 Module m:
314
315 0 1 2 3 4 5 6 7
316 classcode attrcode (unused) (unused) (unused) module type attribute ...
317 for m for m reference (global)
318 reference
319
320 The __class__ Attribute
321 -----------------------
322
323 All objects support the __class__ attribute and this is illustrated above with
324 the first attribute.
325
326 Class: refers to the type class (type.__class__ also refers to the type class)
327 Function: refers to the function class
328 Instance: refers to the class instantiated to make the object
329
330 Lists and Tuples
331 ----------------
332
333 The built-in list and tuple sequences employ variable length structures using
334 the attribute locations to store their elements, where each element is a
335 reference to a separately stored object.
336
337 Testing Instance Compatibility with Classes (Attrcode)
338 ------------------------------------------------------
339
340 Although it would be possible to have a data structure mapping classes to
341 compatible classes, such as a matrix indicating the subclasses (or
342 superclasses) of each class, the need to retain the key to such a data
343 structure for each class might introduce a noticeable overhead.
344
345 Instead of having a separate structure, descendant classes of each class are
346 inserted as special attributes into the object table. This requires an extra
347 key to be retained, since each class must provide its own attribute code such
348 that upon an instance/class compatibility test, the code may be obtained and
349 used in the object table.
350
351 Invocation and Code References
352 ------------------------------
353
354 Modules: there is no meaningful invocation reference since modules cannot be
355 explicitly called.
356
357 Functions: a simple code reference is employed pointing to code implementing
358 the function. Note that the function locals are completely distinct from this
359 structure and are not comparable to attributes. Instead, attributes are
360 reserved for default parameter values, although they do not appear in the
361 object table described above, appearing instead in a separate parameter table
362 described below.
363
364 Classes: given that classes must be invoked in order to create instances, a
365 reference must be provided in class structures. However, this reference does
366 not point directly at the __init__ method of the class. Instead, the
367 referenced code belongs to a special initialiser function, __new__, consisting
368 of the following instructions:
369
370 create instance for C
371 call C.__init__(instance, ...)
372 return instance
373
374 Instances: each instance employs a reference to any __call__ method defined in
375 the class hierarchy for the instance, thus maintaining its callable nature.
376
377 Both classes and modules may contain code in their definitions - the former in
378 the "body" of the class, potentially defining attributes, and the latter as
379 the "top-level" code in the module, potentially defining attributes/globals -
380 but this code is not associated with any invocation target. It is thus
381 generated in order of appearance and is not referenced externally.
382
383 Invocation Operation
384 --------------------
385
386 Consequently, regardless of the object an invocation is always done as
387 follows:
388
389 get invocation reference from the header
390 jump to reference
391
392 Additional preparation is necessary before the above code: positional
393 arguments must be saved in the invocation frame, and keyword arguments must be
394 resolved and saved to the appropriate position in the invocation frame.
395
396 See invocation.txt for details.
397
398 Parameters and Lookups
399 ======================
400
401 Since Python supports keyword arguments when making invocations, it becomes
402 necessary to record the parameter names associated with each function or
403 method. Just as object tables record attributes positions on classes and
404 instances, parameter tables record parameter positions in function or method
405 parameter lists.
406
407 For a given program, a parameter table can be considered as being like a
408 matrix mapping functions/methods to parameter names. For example:
409
410 def f(x, y, z):
411 pass
412
413 def g(a, b, c):
414 pass
415
416 def h(a, x):
417 pass
418
419 This would provide the following table, referred to as a parameter table in
420 the context of functions and methods:
421
422 Function/param a b c x y z
423
424 f 1 2 3
425 g 1 2 3
426 h 1 2
427
428 Confusion can occur when functions are adopted as methods, since the context
429 then occupies the first slot in the invocation frame:
430
431 def f(x, y, z):
432 pass
433
434 f(x=1, y=2, z=3) -> f(<context>, 1, 2, 3)
435 -> f(1, 2, 3)
436
437 class C:
438 f = f
439
440 def g(x, y, z):
441 pass
442
443 c = C()
444
445 c.f(y=2, z=3) -> f(<context>, 2, 3)
446 c.g(y=2, z=3) -> C.g(<context>, 2, 3)
447
448 Just as with parameter tables, a displacement list can be prepared from a
449 parameter table:
450
451 Functions with parameter (attribute) offsets
452
453 funccode f
454 attrcode a b c x y z
455
456 g
457 a b c x y z
458
459 h
460 a b c x y z
461
462 List . . . 1 2 3 1 2 3 1 . . 2 . .
463
464 Here, the funccode refers to the offset in the list at which a function's
465 parameters are defined, whereas the attrcode defines the offset within a
466 region of attributes corresponding to a single parameter of a given name.
467
468 Instantiation
469 =============
470
471 When instantiating classes, memory must be reserved for the header of the
472 resulting instance, along with locations for the attributes of the instance.
473 Since the instance header contains data common to all instances of a class, a
474 template header is copied to the start of the newly reserved memory region.
475
476 Register Usage
477 ==============
478
479 During code generation, much of the evaluation produces results which are
480 implicitly recorded in the "active value" register, and various instructions
481 will consume the active value. In addition, some instructions will consume a
482 separate "active source value" from a register, typically those which are
483 assigning the result of an expression to an assignment target.
484
485 Since values often need to be retained for later use, a set of temporary
486 storage locations are typically employed. However, optimisations may reduce
487 the need to use such temporary storage where instructions which provide the
488 "active value" can be re-executed and will produce the same result.