micropython

Annotated docs/concepts.txt

206:0a00400508cd
2009-04-27 Paul Boddie Added structure field notes.
paul@199 1
Concepts
paul@199 2
========
paul@199 3
paul@201 4
This document describes the underlying concepts employed in micropython.
paul@201 5
paul@201 6
  * Namespaces and attribute definition
paul@199 7
  * Contexts and values
paul@200 8
  * Tables, attributes and lookups
paul@199 9
  * Objects and structures
paul@200 10
  * Parameters and lookups
paul@200 11
  * Instantiation
paul@199 12
paul@201 13
Namespaces and Attribute Definition
paul@201 14
===================================
paul@201 15
paul@201 16
Namespaces are any objects which can retain attributes.
paul@201 17
paul@201 18
  * Module attributes are defined either at the module level or by global
paul@201 19
    statements.
paul@201 20
  * Class attributes are defined only within class statements.
paul@201 21
  * Instance attributes are defined only by assignments to attributes of self
paul@201 22
    within __init__ methods.
paul@201 23
paul@201 24
These restrictions apply because such attributes are thus explicitly declared,
paul@201 25
permitting the use of tables (described below). Module and class attributes
paul@201 26
can also be finalised in this way in order to permit certain optimisations.
paul@201 27
paul@201 28
See rejected.txt for complicating mechanisms which could be applied to
paul@201 29
mitigate the effects of these restrictions on optimisations.
paul@201 30
paul@199 31
Contexts and Values
paul@199 32
===================
paul@199 33
paul@199 34
Values are used as the common reference representation in micropython: as
paul@199 35
stored representations of attributes (of classes, instances, modules, and
paul@199 36
other objects supporting attribute-like entities) as well as the stored values
paul@199 37
associated with names in functions and methods.
paul@199 38
paul@199 39
Unlike other implementations, micropython does not create things like bound
paul@199 40
method objects for individual instances. Instead, all objects are referenced
paul@199 41
using a context, reference pair:
paul@199 42
paul@199 43
Value Layout
paul@199 44
------------
paul@199 45
paul@199 46
    0           1
paul@199 47
    context     object
paul@199 48
    reference   reference
paul@199 49
paul@199 50
Specific implementations might reverse this ordering for optimisation
paul@199 51
purposes.
paul@199 52
paul@199 53
Rationale
paul@199 54
---------
paul@199 55
paul@199 56
To reduce the number of created objects whilst retaining the ability to
paul@199 57
support bound method invocations. The context indicates the context in which
paul@199 58
an invocation is performed, typically the owner of the method.
paul@199 59
paul@199 60
Usage
paul@199 61
-----
paul@199 62
paul@199 63
The context may be inserted as the first argument when a value is involved in
paul@199 64
an invocation. This argument may then be omitted from the invocation if its
paul@199 65
usage is not appropriate.
paul@199 66
paul@199 67
See invocation.txt for details.
paul@199 68
paul@199 69
Contexts in Acquired Values
paul@199 70
---------------------------
paul@199 71
paul@199 72
There are two classes of instructions which provide values:
paul@199 73
paul@199 74
    Instruction         Purpose                 Context Operations
paul@199 75
    -----------         -------                 ------------------
paul@199 76
paul@199 77
    LoadConst           Load class, function,   Combine null context with
paul@199 78
                        module, constant        loaded object
paul@199 79
paul@201 80
    LoadAddress*        Load attribute from     Preserve or override stored
paul@201 81
    LoadAttr*           class, module,          context (as described in
paul@201 82
                        instance                assignment.txt)
paul@199 83
paul@199 84
In order to comply with traditional Python behaviour, contexts may or may not
paul@199 85
represent the object from which an attribute has been acquired.
paul@199 86
paul@199 87
See assignment.txt for details.
paul@199 88
paul@199 89
Contexts in Stored Values
paul@199 90
-------------------------
paul@199 91
paul@199 92
There is only one class of instruction for storing values:
paul@199 93
paul@199 94
    Instruction     Purpose                 Context Operations
paul@199 95
    -----------     -------                 ------------------
paul@199 96
paul@199 97
    StoreAddress    Store attribute in a    Preserve context; note that no
paul@199 98
                    known object            test for class attribute
paul@199 99
                                            assignment should be necessary
paul@199 100
                                            since this instruction should only
paul@199 101
                                            be generated for module globals
paul@199 102
paul@199 103
    StoreAttr       Store attribute in an   Preserve context; note that no
paul@199 104
                    instance                test for class attribute
paul@199 105
                                            assignment should be necessary
paul@199 106
                                            since this instruction should only
paul@199 107
                                            be generated for self accesses
paul@199 108
paul@199 109
    StoreAttrIndex  Store attribute in an   Preserve context; since the index
paul@199 110
                    unknown object          lookup could yield a class
paul@199 111
                                            attribute, a test of the nature of
paul@199 112
                                            the nature of the structure is
paul@199 113
                                            necessary in order to prevent
paul@199 114
                                            assignments to classes
paul@199 115
paul@199 116
Note that contexts are never changed in the stored value: they are preserved.
paul@199 117
paul@199 118
See assignment.txt for details.
paul@199 119
paul@200 120
Tables, Attributes and Lookups
paul@200 121
==============================
paul@199 122
paul@199 123
Attribute lookups, where the exact location of an object attribute is deduced,
paul@199 124
are performed differently in micropython than in other implementations.
paul@199 125
Instead of providing attribute dictionaries, in which attributes are found,
paul@199 126
attributes are located at fixed places in object structures (described below)
paul@199 127
and their locations are stored using a special representation known as a
paul@199 128
table.
paul@199 129
paul@199 130
For a given program, a table can be considered as being like a matrix mapping
paul@199 131
classes to attribute names. For example:
paul@199 132
paul@199 133
    class A:
paul@200 134
        # instances have attributes x, y
paul@199 135
paul@199 136
    class B(A):
paul@200 137
        # introduces attribute z for instances
paul@199 138
paul@199 139
    class C:
paul@200 140
        # instances have attributes a, b, z
paul@199 141
paul@200 142
This would provide the following table, referred to as an object table in the
paul@200 143
context of classes and instances:
paul@199 144
paul@199 145
    Class/attr      a   b   x   y   z
paul@199 146
paul@199 147
    A                       1   2
paul@199 148
    B                       1   2   3
paul@199 149
    C               1   2           3
paul@199 150
paul@199 151
A limitation of this representation is that instance attributes may not shadow
paul@199 152
class attributes: if an attribute with a given name is not defined on an
paul@199 153
instance, an attribute with the same name cannot be provided by the class of
paul@199 154
the instance or any superclass of the instance's class.
paul@199 155
paul@199 156
The table can be compacted using a representation known as a displacement
paul@200 157
list (referred to as an object list in this context):
paul@199 158
paul@199 159
                Classes with attribute offsets
paul@199 160
paul@199 161
    classcode   A
paul@199 162
    attrcode    a   b   x   y   z
paul@199 163
paul@199 164
                        B
paul@199 165
                        a   b   x   y   z
paul@199 166
paul@199 167
                                            C
paul@199 168
                                            a   b   x   y   z
paul@199 169
paul@199 170
    List        .   .   1   2   1   2   3   1   2   .   .   3
paul@199 171
paul@199 172
Here, the classcode refers to the offset in the list at which a class's
paul@199 173
attributes are defined, whereas the attrcode defines the offset within a
paul@199 174
region of attributes corresponding to a single attribute of a given name.
paul@199 175
paul@200 176
Attribute Locations
paul@200 177
-------------------
paul@200 178
paul@200 179
The locations stored in table/list elements are for instance attributes
paul@200 180
relative to the location of the instance, whereas those for class attributes
paul@200 181
and modules are absolute addresses (although these could also be changed to
paul@200 182
object-relative locations).
paul@200 183
paul@199 184
Objects and Structures 
paul@199 185
======================
paul@199 186
paul@199 187
As well as references, micropython needs to have actual objects to refer to.
paul@199 188
Since classes, functions and instances are all objects, it is desirable that
paul@199 189
certain common features and operations are supported in the same way for all
paul@199 190
of these things. To permit this, a common data structure format is used.
paul@199 191
paul@203 192
    Header............................................................................  Attributes.................
paul@200 193
paul@203 194
    Identifier  Identifier  Address     Details     Flag        Identifier  Size        Object      Object      ...
paul@199 195
paul@203 196
    0           1           2           3           4           5           6           7           8           9
paul@203 197
    classcode   attrcode    invocation  invocation  instance    funccode    size        __class__   attribute   ...
paul@203 198
                            reference   #args,      status                              reference   reference
paul@199 199
                                        defaults
paul@199 200
                                        reference
paul@199 201
paul@206 202
Classcode
paul@206 203
---------
paul@206 204
paul@206 205
Used in attribute lookup.
paul@206 206
paul@199 207
Here, the classcode refers to the attribute lookup table for the object (as
paul@200 208
described above). Classes and instances share the same classcode, and their
paul@200 209
structures reflect this. Functions all belong to the same type and thus employ
paul@200 210
the classcode for the function built-in type, whereas modules have distinct
paul@200 211
types since they must support different sets of attributes.
paul@199 212
paul@206 213
Attrcode
paul@206 214
--------
paul@206 215
paul@206 216
Used to test instances for membership of classes (or descendants of classes).
paul@206 217
paul@206 218
Since, in traditional Python, classes are only ever instances of the "type"
paul@206 219
built-in class, support for testing such a relationship could be removed and
paul@206 220
the attrcode eliminated for classes, such that the presence of an attrcode
paul@206 221
would indicate that a given object is an instance.
paul@206 222
paul@206 223
Structure Types
paul@206 224
---------------
paul@206 225
paul@199 226
Class C:
paul@199 227
paul@203 228
    0           1           2           3           4           5           6           7           8           9
paul@203 229
    classcode   attrcode    __new__     __new__     false                   size        class type  attribute   ...
paul@203 230
    for C       for C       reference   #args,                                          reference   reference
paul@199 231
                                        defaults
paul@199 232
                                        reference
paul@199 233
paul@199 234
Instance of C:
paul@199 235
paul@203 236
    0           1           2           3           4           5           6           7           8           9
paul@203 237
    classcode   attrcode    C.__call__  C.__call__  true                    size        class C     attribute   ...
paul@203 238
    for C       for C       reference   #args,                                          reference   reference
paul@199 239
                            (if exists) defaults
paul@199 240
                                        reference
paul@199 241
paul@200 242
Function f:
paul@199 243
paul@203 244
    0           1           2           3           4           5           6           7           8           9
paul@203 245
    classcode   attrcode    code        code        true        funccode    size        class       attribute   ...
paul@203 246
    for         for         reference   #args,                                          function    (default)
paul@203 247
    function    function                defaults                                        reference   reference
paul@200 248
                                        reference
paul@200 249
paul@200 250
Module m:
paul@200 251
paul@203 252
    0           1           2           3           4           5           6           7           8           9
paul@203 253
    classcode   attrcode    (unused)    (unused)    true                                module type attribute   ...
paul@203 254
    for m       for m                                                                   reference   (global)
paul@203 255
                                                                                                    reference
paul@200 256
paul@200 257
The __class__ Attribute
paul@200 258
-----------------------
paul@200 259
paul@200 260
All objects support the __class__ attribute and this is illustrated above with
paul@200 261
the first attribute.
paul@200 262
paul@200 263
Class: refers to the type class (type.__class__ also refers to the type class)
paul@200 264
Function: refers to the function class
paul@200 265
Instance: refers to the class instantiated to make the object
paul@200 266
paul@203 267
Lists and Tuples
paul@203 268
----------------
paul@203 269
paul@203 270
The built-in list and tuple sequences employ variable length structures using
paul@203 271
the attribute locations to store their elements, where each element is a
paul@203 272
reference to a separately stored object.
paul@203 273
paul@200 274
Testing Instance Compatibility with Classes (attrcode)
paul@200 275
------------------------------------------------------
paul@200 276
paul@200 277
Although it would be possible to have a data structure mapping classes to
paul@200 278
compatible classes, such as a matrix indicating the subclasses (or
paul@200 279
superclasses) of each class, the need to retain the key to such a data
paul@200 280
structure for each class might introduce a noticeable overhead.
paul@200 281
paul@200 282
Instead of having a separate structure, descendant classes of each class are
paul@200 283
inserted as special attributes into the object table. This requires an extra
paul@200 284
key to be retained, since each class must provide its own attribute code such
paul@200 285
that upon an instance/class compatibility test, the code may be obtained and
paul@200 286
used in the object table.
paul@200 287
paul@200 288
Invocation and Code References
paul@200 289
------------------------------
paul@200 290
paul@200 291
Modules: there is no meaningful invocation reference since modules cannot be
paul@200 292
explicitly called.
paul@200 293
paul@200 294
Functions: a simple code reference is employed pointing to code implementing
paul@200 295
the function. Note that the function locals are completely distinct from this
paul@200 296
structure and are not comparable to attributes. Instead, attributes are
paul@200 297
reserved for default parameter values, although they do not appear in the
paul@200 298
object table described above, appearing instead in a separate parameter table
paul@200 299
described below.
paul@200 300
paul@200 301
Classes: given that classes must be invoked in order to create instances, a
paul@200 302
reference must be provided in class structures. However, this reference does
paul@200 303
not point directly at the __init__ method of the class. Instead, the
paul@200 304
referenced code belongs to a special initialiser function, __new__, consisting
paul@200 305
of the following instructions:
paul@200 306
paul@200 307
    create instance for C
paul@200 308
    call C.__init__(instance, ...)
paul@200 309
    return instance
paul@200 310
paul@200 311
Instances: each instance employs a reference to any __call__ method defined in
paul@200 312
the class hierarchy for the instance, thus maintaining its callable nature.
paul@200 313
paul@200 314
Both classes and modules may contain code in their definitions - the former in
paul@200 315
the "body" of the class, potentially defining attributes, and the latter as
paul@200 316
the "top-level" code in the module, potentially defining attributes/globals -
paul@200 317
but this code is not associated with any invocation target. It is thus
paul@200 318
generated in order of appearance and is not referenced externally.
paul@200 319
paul@200 320
Invocation Operation
paul@200 321
--------------------
paul@200 322
paul@200 323
Consequently, regardless of the object an invocation is always done as
paul@200 324
follows:
paul@200 325
paul@200 326
    get invocation reference from the header
paul@200 327
    jump to reference
paul@200 328
paul@200 329
Additional preparation is necessary before the above code: positional
paul@200 330
arguments must be saved in the invocation frame, and keyword arguments must be
paul@200 331
resolved and saved to the appropriate position in the invocation frame.
paul@200 332
paul@200 333
See invocation.txt for details.
paul@200 334
paul@200 335
Parameters and Lookups 
paul@200 336
======================
paul@200 337
paul@200 338
Since Python supports keyword arguments when making invocations, it becomes
paul@200 339
necessary to record the parameter names associated with each function or
paul@200 340
method. Just as object tables record attributes positions on classes and
paul@200 341
instances, parameter tables record parameter positions in function or method
paul@200 342
parameter lists.
paul@200 343
paul@200 344
For a given program, a parameter table can be considered as being like a
paul@200 345
matrix mapping functions/methods to parameter names. For example:
paul@200 346
paul@200 347
    def f(x, y, z):
paul@200 348
        pass
paul@200 349
paul@200 350
    def g(a, b, c):
paul@200 351
        pass
paul@200 352
paul@200 353
    def h(a, x):
paul@200 354
        pass
paul@200 355
paul@200 356
This would provide the following table, referred to as a parameter table in
paul@200 357
the context of functions and methods:
paul@200 358
paul@200 359
    Function/param  a   b   c   x   y   z
paul@200 360
paul@200 361
    f                           1   2   3
paul@200 362
    g               1   2   3
paul@200 363
    h               1           2
paul@200 364
paul@200 365
Just as with parameter tables, a displacement list can be prepared from a
paul@200 366
parameter table:
paul@200 367
paul@200 368
                Functions with parameter (attribute) offsets
paul@200 369
paul@200 370
    funccode    f
paul@200 371
    attrcode    a   b   c   x   y   z
paul@200 372
paul@200 373
                                        g
paul@200 374
                                        a   b   c   x   y   z
paul@200 375
paul@200 376
                                                    h
paul@200 377
                                                    a   b   c   x   y   z
paul@200 378
paul@200 379
    List        .   .   .   1   2   3   1   2   3   1   .   .   2   .   .
paul@200 380
paul@200 381
Here, the funccode refers to the offset in the list at which a function's
paul@200 382
parameters are defined, whereas the attrcode defines the offset within a
paul@200 383
region of attributes corresponding to a single parameter of a given name.
paul@200 384
paul@200 385
Instantiation
paul@200 386
=============
paul@200 387
paul@200 388
When instantiating classes, memory must be reserved for the header of the
paul@200 389
resulting instance, along with locations for the attributes of the instance.
paul@200 390
Since the instance header contains data common to all instances of a class, a
paul@200 391
template header is copied to the start of the newly reserved memory region.