1 Concepts
2 ========
3
4 This document describes the underlying concepts employed in micropython.
5
6 * Namespaces and attribute definition
7 * Contexts and values
8 * Tables, attributes and lookups
9 * Objects and structures
10 * Parameters and lookups
11 * Instantiation
12 * Register usage
13
14 Namespaces and Attribute Definition
15 ===================================
16
17 Namespaces are any objects which can retain attributes.
18
19 * Module attributes are defined either at the module level or by global
20 statements.
21 * Class attributes are defined only within class statements.
22 * Instance attributes are defined only by assignments to attributes of self
23 within __init__ methods.
24
25 These restrictions apply because such attributes are thus explicitly declared,
26 permitting the use of tables (described below). Module and class attributes
27 can also be finalised in this way in order to permit certain optimisations.
28
29 An additional restriction required for the current implementation of tables
30 (as described below) applies to class definitions: each class must be defined
31 using a unique name; repeated definition of classes having the same name is
32 thus not permitted. This restriction arises from the use of the "full name" of
33 a class as a key to the object table, where the full name is a qualified path
34 via the module hierarchy ending with the name of the class.
35
36 See rejected.txt for complicating mechanisms which could be applied to
37 mitigate the effects of these restrictions on optimisations.
38
39 Contexts and Values
40 ===================
41
42 Values are used as the common reference representation in micropython: as
43 stored representations of attributes (of classes, instances, modules, and
44 other objects supporting attribute-like entities) as well as the stored values
45 associated with names in functions and methods.
46
47 Unlike other implementations, micropython does not create things like bound
48 method objects for individual instances. Instead, all objects are referenced
49 using a context, reference pair:
50
51 Value Layout
52 ------------
53
54 0 1
55 context object
56 reference reference
57
58 Specific implementations might reverse this ordering for optimisation
59 purposes.
60
61 Rationale
62 ---------
63
64 To reduce the number of created objects whilst retaining the ability to
65 support bound method invocations. The context indicates the context in which
66 an invocation is performed, typically the owner of the method.
67
68 Usage
69 -----
70
71 The context may be inserted as the first argument when a value is involved in
72 an invocation. This argument may then be omitted from the invocation if its
73 usage is not appropriate.
74
75 See invocation.txt for details.
76
77 Context Value Types
78 -------------------
79
80 The following types of context value exist:
81
82 Type Usage Transformations
83 ---- ----- ---------------
84
85 Replaceable With functions (not methods) May be replaced with an
86 instance or a class when a
87 value is stored on an
88 instance or class
89
90 Placeholder With classes May not be replaced
91
92 Instance With instances (and constants) May not be replaced
93 or functions as methods
94
95 Class With functions as methods May be replaced when a
96 value is loaded from a
97 class attribute via an
98 instance
99
100 Contexts in Acquired Values
101 ---------------------------
102
103 There are four classes of instructions which provide values:
104
105 Instruction Purpose Context Operations
106 ----------- ------- ------------------
107
108 1) LoadConst Load module, constant Use loaded object with itself
109 as context
110
111 2) LoadFunction Load function Combine replaceable context
112 with loaded object
113
114 3) LoadClass Load class Combine placeholder context
115 with loaded object
116
117 4) LoadAddress* Load attribute from Preserve or override stored
118 LoadAttr* class, module, context (as described in
119 instance assignment.txt)
120
121 In order to comply with traditional Python behaviour, contexts may or may not
122 represent the object from which an attribute has been acquired.
123
124 See assignment.txt for details.
125
126 Contexts in Stored Values
127 -------------------------
128
129 There are two classes of instruction for storing values:
130
131 Instruction Purpose Context Operations
132 ----------- ------- ------------------
133
134 1) StoreAddress Store attribute in a Preserve context; note that no
135 known object test for class attribute
136 assignment should be necessary
137 since this instruction should only
138 be generated for module globals
139
140 StoreAttr Store attribute in an Preserve context; note that no
141 instance test for class attribute
142 assignment should be necessary
143 since this instruction should only
144 be generated for self accesses
145
146 StoreAttrIndex Store attribute in an Preserve context; since the index
147 unknown object lookup could yield a class
148 attribute, a test of the nature of
149 the nature of the structure is
150 necessary in order to prevent
151 assignments to classes
152
153 2) StoreAddressContext Store attribute in a Override context if appropriate;
154 known object if the value has a replaceable
155 context, permit the target to
156 take ownership of the value
157
158 See assignment.txt for details.
159
160 Tables, Attributes and Lookups
161 ==============================
162
163 Attribute lookups, where the exact location of an object attribute is deduced,
164 are performed differently in micropython than in other implementations.
165 Instead of providing attribute dictionaries, in which attributes are found,
166 attributes are located at fixed places in object structures (described below)
167 and their locations are stored using a special representation known as a
168 table.
169
170 For a given program, a table can be considered as being like a matrix mapping
171 classes to attribute names. For example:
172
173 class A:
174 # instances have attributes x, y
175
176 class B(A):
177 # introduces attribute z for instances
178
179 class C:
180 # instances have attributes a, b, z
181
182 This would provide the following table, referred to as an object table in the
183 context of classes and instances:
184
185 Class/attr a b x y z
186
187 A 1 2
188 B 1 2 3
189 C 1 2 3
190
191 A limitation of this representation is that instance attributes may not shadow
192 class attributes: if an attribute with a given name is not defined on an
193 instance, an attribute with the same name cannot be provided by the class of
194 the instance or any superclass of the instance's class.
195
196 The table can be compacted using a representation known as a displacement
197 list (referred to as an object list in this context):
198
199 Classes with attribute offsets
200
201 classcode A
202 attrcode a b x y z
203
204 B
205 a b x y z
206
207 C
208 a b x y z
209
210 List . . 1 2 1 2 3 1 2 . . 3
211
212 Here, the classcode refers to the offset in the list at which a class's
213 attributes are defined, whereas the attrcode defines the offset within a
214 region of attributes corresponding to a single attribute of a given name.
215
216 Attribute Locations
217 -------------------
218
219 The locations stored in table/list elements are for instance attributes
220 relative to the location of the instance, whereas those for class attributes
221 and modules are absolute addresses (although these could also be changed to
222 object-relative locations). Thus, each occupied table cell has the following
223 structure:
224
225 attrcode, uses-absolute-address, address (or location)
226
227 Objects and Structures
228 ======================
229
230 As well as references, micropython needs to have actual objects to refer to.
231 Since classes, functions and instances are all objects, it is desirable that
232 certain common features and operations are supported in the same way for all
233 of these things. To permit this, a common data structure format is used.
234
235 Header.................................................... Attributes.................
236
237 Identifier Identifier Address Identifier Size Object Object ...
238
239 0 1 2 3 4 5 6 7
240 classcode attrcode/ invocation funccode size __class__ attribute ...
241 instance reference reference reference
242 status
243
244 Classcode
245 ---------
246
247 Used in attribute lookup.
248
249 Here, the classcode refers to the attribute lookup table for the object (as
250 described above). Classes and instances share the same classcode, and their
251 structures reflect this. Functions all belong to the same type and thus employ
252 the classcode for the function built-in type, whereas modules have distinct
253 types since they must support different sets of attributes.
254
255 Attrcode
256 --------
257
258 Used to test instances for membership of classes (or descendants of classes).
259
260 Since, in traditional Python, classes are only ever instances of some generic
261 built-in type, support for testing such a relationship directly has been
262 removed and the attrcode is not specified for classes: the presence of an
263 attrcode indicates that a given object is an instance. In addition, support
264 has also been removed for testing modules in the same way, meaning that the
265 attrcode is also not specified for modules.
266
267 See the "Testing Instance Compatibility with Classes (Attrcode)" section below
268 for details of attrcodes.
269
270 Invocation Reference
271 --------------------
272
273 Used when an object is called.
274
275 This is the address of the code to be executed when an invocation is performed
276 on the object.
277
278 Funccode
279 --------
280
281 Used to look up argument positions by name.
282
283 The strategy with keyword arguments in micropython is to attempt to position
284 such arguments in the invocation frame as it is being constructed.
285
286 See the "Parameters and Lookups" section for more information.
287
288 Size
289 ----
290
291 Used to indicate the size of an object including attributes.
292
293 Attributes
294 ----------
295
296 For classes, modules and instances, the attributes in the structure correspond
297 to the attributes of each kind of object. For functions, however, the
298 attributes in the structure correspond to the default arguments for each
299 function, if any.
300
301 Structure Types
302 ---------------
303
304 Class C:
305
306 0 1 2 3 4 5 6 7
307 classcode (unused) __new__ funccode size class type attribute ...
308 for C reference for reference reference
309 instantiator
310
311 Instance of C:
312
313 0 1 2 3 4 5 6 7
314 classcode attrcode C.__call__ funccode size class C attribute ...
315 for C for C reference for reference reference
316 (if exists) C.__call__
317
318 Function f:
319
320 0 1 2 3 4 5 6 7
321 classcode attrcode code funccode size class attribute ...
322 for for reference function (default)
323 function function reference reference
324
325 Module m:
326
327 0 1 2 3 4 5 6 7
328 classcode attrcode (unused) (unused) (unused) module type attribute ...
329 for m for m reference (global)
330 reference
331
332 The __class__ Attribute
333 -----------------------
334
335 All objects support the __class__ attribute and this is illustrated above with
336 the first attribute.
337
338 Class: refers to the type class (type.__class__ also refers to the type class)
339 Function: refers to the function class
340 Instance: refers to the class instantiated to make the object
341
342 Lists and Tuples
343 ----------------
344
345 The built-in list and tuple sequences employ variable length structures using
346 the attribute locations to store their elements, where each element is a
347 reference to a separately stored object.
348
349 Testing Instance Compatibility with Classes (Attrcode)
350 ------------------------------------------------------
351
352 Although it would be possible to have a data structure mapping classes to
353 compatible classes, such as a matrix indicating the subclasses (or
354 superclasses) of each class, the need to retain the key to such a data
355 structure for each class might introduce a noticeable overhead.
356
357 Instead of having a separate structure, descendant classes of each class are
358 inserted as special attributes into the object table. This requires an extra
359 key to be retained, since each class must provide its own attribute code such
360 that upon an instance/class compatibility test, the code may be obtained and
361 used in the object table.
362
363 Invocation and Code References
364 ------------------------------
365
366 Modules: there is no meaningful invocation reference since modules cannot be
367 explicitly called.
368
369 Functions: a simple code reference is employed pointing to code implementing
370 the function. Note that the function locals are completely distinct from this
371 structure and are not comparable to attributes. Instead, attributes are
372 reserved for default parameter values, although they do not appear in the
373 object table described above, appearing instead in a separate parameter table
374 described below.
375
376 Classes: given that classes must be invoked in order to create instances, a
377 reference must be provided in class structures. However, this reference does
378 not point directly at the __init__ method of the class. Instead, the
379 referenced code belongs to a special initialiser function, __new__, consisting
380 of the following instructions:
381
382 create instance for C
383 call C.__init__(instance, ...)
384 return instance
385
386 Instances: each instance employs a reference to any __call__ method defined in
387 the class hierarchy for the instance, thus maintaining its callable nature.
388
389 Both classes and modules may contain code in their definitions - the former in
390 the "body" of the class, potentially defining attributes, and the latter as
391 the "top-level" code in the module, potentially defining attributes/globals -
392 but this code is not associated with any invocation target. It is thus
393 generated in order of appearance and is not referenced externally.
394
395 Invocation Operation
396 --------------------
397
398 Consequently, regardless of the object an invocation is always done as
399 follows:
400
401 get invocation reference from the header
402 jump to reference
403
404 Additional preparation is necessary before the above code: positional
405 arguments must be saved in the invocation frame, and keyword arguments must be
406 resolved and saved to the appropriate position in the invocation frame.
407
408 See invocation.txt for details.
409
410 Parameters and Lookups
411 ======================
412
413 Since Python supports keyword arguments when making invocations, it becomes
414 necessary to record the parameter names associated with each function or
415 method. Just as object tables record attributes positions on classes and
416 instances, parameter tables record parameter positions in function or method
417 parameter lists.
418
419 For a given program, a parameter table can be considered as being like a
420 matrix mapping functions/methods to parameter names. For example:
421
422 def f(x, y, z):
423 pass
424
425 def g(a, b, c):
426 pass
427
428 def h(a, x):
429 pass
430
431 This would provide the following table, referred to as a parameter table in
432 the context of functions and methods:
433
434 Function/param a b c x y z
435
436 f 1 2 3
437 g 1 2 3
438 h 1 2
439
440 Confusion can occur when functions are adopted as methods, since the context
441 then occupies the first slot in the invocation frame:
442
443 def f(x, y, z):
444 pass
445
446 f(x=1, y=2, z=3) -> f(<context>, 1, 2, 3)
447 -> f(1, 2, 3)
448
449 class C:
450 f = f
451
452 def g(x, y, z):
453 pass
454
455 c = C()
456
457 c.f(y=2, z=3) -> f(<context>, 2, 3)
458 c.g(y=2, z=3) -> C.g(<context>, 2, 3)
459
460 Just as with parameter tables, a displacement list can be prepared from a
461 parameter table:
462
463 Functions with parameter (attribute) offsets
464
465 funccode f
466 attrcode a b c x y z
467
468 g
469 a b c x y z
470
471 h
472 a b c x y z
473
474 List . . . 1 2 3 1 2 3 1 . . 2 . .
475
476 Here, the funccode refers to the offset in the list at which a function's
477 parameters are defined, whereas the attrcode defines the offset within a
478 region of attributes corresponding to a single parameter of a given name.
479
480 Instantiation
481 =============
482
483 When instantiating classes, memory must be reserved for the header of the
484 resulting instance, along with locations for the attributes of the instance.
485 Since the instance header contains data common to all instances of a class, a
486 template header is copied to the start of the newly reserved memory region.
487
488 Register Usage
489 ==============
490
491 During code generation, much of the evaluation produces results which are
492 implicitly recorded in the "active value" register, and various instructions
493 will consume the active value. In addition, some instructions will consume a
494 separate "active source value" from a register, typically those which are
495 assigning the result of an expression to an assignment target.
496
497 Since values often need to be retained for later use, a set of temporary
498 storage locations are typically employed. However, optimisations may reduce
499 the need to use such temporary storage where instructions which provide the
500 "active value" can be re-executed and will produce the same result.