micropython

docs/concepts.txt

394:9c6deab845e0
2011-01-29 Paul Boddie Introduced a degree of support for classes and instances having separate __class__ attributes that are accessed in a class/instance-relative fashion. Added the __class__ attribute as a special case of a non-static attribute. This changes code generation such that a constant accessor (a class) may yield an attribute which should not be accessed via a fixed location. (Although this may be cautious and also limited by the behaviour of the object table, if class attribute assignment is to be permitted in future, more distinctions between apparently static attributes may need to be enforced.) Made the type class a special case which is created in advance so that all classes may refer to it via their __class__ attributes, and whose details are populated when the class is actually encountered in the __builtins__ module. Made extra space in the instance templates for the __class__ attribute, adding support for the copying of this data when new instances are created. Introduced type-specific vacuum support in order to correctly perform vacuuming. Changed various class references in the micropython module. Tidied up to "to do" document, putting items into separate sections. Added various tests of the new features.
     1 Concepts
     2 ========
     3 
     4 This document describes the underlying concepts employed in micropython.
     5 
     6   * Namespaces and attribute definition
     7   * Contexts and values
     8   * Tables, attributes and lookups
     9   * Objects and structures
    10   * Parameters and lookups
    11   * Instantiation
    12   * Register usage
    13   * List and tuple representations
    14 
    15 Namespaces and Attribute Definition
    16 ===================================
    17 
    18 Namespaces are any objects which can retain attributes.
    19 
    20   * Module attributes are defined either at the module level or by global
    21     statements.
    22   * Class attributes are defined only within class statements.
    23   * Instance attributes are defined only by assignments to attributes of self
    24     within __init__ methods.
    25 
    26 These restrictions apply because such attributes are thus explicitly declared,
    27 permitting the use of tables (described below). Module and class attributes
    28 can also be finalised in this way in order to permit certain optimisations.
    29 
    30 An additional restriction required for the current implementation of tables
    31 (as described below) applies to class definitions: each class must be defined
    32 using a unique name; repeated definition of classes having the same name is
    33 thus not permitted. This restriction arises from the use of the "full name" of
    34 a class as a key to the object table, where the full name is a qualified path
    35 via the module hierarchy ending with the name of the class.
    36 
    37 See rejected.txt for complicating mechanisms which could be applied to
    38 mitigate the effects of these restrictions on optimisations.
    39 
    40 Contexts and Values
    41 ===================
    42 
    43 Values are used as the common reference representation in micropython: as
    44 stored representations of attributes (of classes, instances, modules, and
    45 other objects supporting attribute-like entities) as well as the stored values
    46 associated with names in functions and methods.
    47 
    48 Unlike other implementations, micropython does not create things like bound
    49 method objects for individual instances. Instead, all objects are referenced
    50 using a context, reference pair:
    51 
    52 Value Layout
    53 ------------
    54 
    55     0           1
    56     context     object
    57     reference   reference
    58 
    59 Specific implementations might reverse this ordering for optimisation
    60 purposes.
    61 
    62 Rationale
    63 ---------
    64 
    65 To reduce the number of created objects whilst retaining the ability to
    66 support bound method invocations. The context indicates the context in which
    67 an invocation is performed, typically the owner of the method.
    68 
    69 Usage
    70 -----
    71 
    72 The context may be inserted as the first argument when a value is involved in
    73 an invocation. This argument may then be omitted from the invocation if its
    74 usage is not appropriate.
    75 
    76 See invocation.txt for details.
    77 
    78 Context Value Types
    79 -------------------
    80 
    81 The following types of context value exist:
    82 
    83     Type            Usage                           Transformations
    84     ----            -----                           ---------------
    85 
    86     Replaceable     With functions (not methods)    May be replaced with an
    87                                                     instance or a class when a
    88                                                     value is stored on an
    89                                                     instance or class
    90 
    91     Placeholder     With classes                    May not be replaced
    92 
    93     Instance        With instances (and constants)  May not be replaced
    94                     or functions as methods
    95 
    96     Class           With functions as methods       May be replaced when a
    97                                                     value is loaded from a
    98                                                     class attribute via an
    99                                                     instance
   100 
   101 Contexts in Acquired Values
   102 ---------------------------
   103 
   104 There are four classes of instructions which provide values:
   105 
   106     Instruction         Purpose                 Context Operations
   107     -----------         -------                 ------------------
   108 
   109 1)  LoadConst           Load module, constant   Use loaded object with itself
   110                                                 as context
   111 
   112 2)  LoadFunction        Load function           Combine replaceable context
   113                                                 with loaded object
   114 
   115 3)  LoadClass           Load class              Combine placeholder context
   116                                                 with loaded object
   117 
   118 4)  LoadAddress*        Load attribute from     Preserve or override stored
   119     LoadAttr*           class, module,          context (as described in
   120                         instance                assignment.txt)
   121 
   122 In order to comply with traditional Python behaviour, contexts may or may not
   123 represent the object from which an attribute has been acquired.
   124 
   125 See assignment.txt for details.
   126 
   127 Contexts in Stored Values
   128 -------------------------
   129 
   130 There are two classes of instruction for storing values:
   131 
   132     Instruction         Purpose                 Context Operations
   133     -----------         -------                 ------------------
   134 
   135 1)  StoreAddress        Store attribute in a    Preserve context; note that no
   136                         known object            test for class attribute
   137                                                 assignment should be necessary
   138                                                 since this instruction should only
   139                                                 be generated for module globals
   140 
   141     StoreAttr           Store attribute in an   Preserve context; note that no
   142                         instance                test for class attribute
   143                                                 assignment should be necessary
   144                                                 since this instruction should only
   145                                                 be generated for self accesses
   146 
   147     StoreAttrIndex      Store attribute in an   Preserve context; since the index
   148                         unknown object          lookup could yield a class
   149                                                 attribute, a test of the nature of
   150                                                 the nature of the structure is
   151                                                 necessary in order to prevent
   152                                                 assignments to classes
   153 
   154 2)  StoreAddressContext Store attribute in a    Override context if appropriate;
   155                         known object            if the value has a replaceable
   156                                                 context, permit the target to
   157                                                 take ownership of the value
   158 
   159 See assignment.txt for details.
   160 
   161 Tables, Attributes and Lookups
   162 ==============================
   163 
   164 Attribute lookups, where the exact location of an object attribute is deduced,
   165 are performed differently in micropython than in other implementations.
   166 Instead of providing attribute dictionaries, in which attributes are found,
   167 attributes are located at fixed places in object structures (described below)
   168 and their locations are stored using a special representation known as a
   169 table.
   170 
   171 For a given program, a table can be considered as being like a matrix mapping
   172 classes to attribute names. For example:
   173 
   174     class A:
   175         # instances have attributes x, y
   176 
   177     class B(A):
   178         # introduces attribute z for instances
   179 
   180     class C:
   181         # instances have attributes a, b, z
   182 
   183 This would provide the following table, referred to as an object table in the
   184 context of classes and instances:
   185 
   186     Class/attr      a   b   x   y   z
   187 
   188     A                       1   2
   189     B                       1   2   3
   190     C               1   2           3
   191 
   192 A limitation of this representation is that instance attributes may not shadow
   193 class attributes: if an attribute with a given name is not defined on an
   194 instance, an attribute with the same name cannot be provided by the class of
   195 the instance or any superclass of the instance's class.
   196 
   197 The table can be compacted using a representation known as a displacement
   198 list (referred to as an object list in this context):
   199 
   200                 Classes with attribute offsets
   201 
   202     classcode   A
   203     attrcode    a   b   x   y   z
   204 
   205                         B
   206                         a   b   x   y   z
   207 
   208                                             C
   209                                             a   b   x   y   z
   210 
   211     List        .   .   1   2   1   2   3   1   2   .   .   3
   212 
   213 Here, the classcode refers to the offset in the list at which a class's
   214 attributes are defined, whereas the attrcode defines the offset within a
   215 region of attributes corresponding to a single attribute of a given name.
   216 
   217 Attribute Locations
   218 -------------------
   219 
   220 The locations stored in table/list elements are generally for instance
   221 attributes relative to the location of the instance, whereas those for class
   222 attributes and module attributes are generally absolute addresses. Thus, each
   223 occupied table cell has the following structure:
   224 
   225     attrcode, uses-absolute-address, address (or location)
   226 
   227 This could be given instead as follows:
   228 
   229     attrcode, is-class-or-module, location
   230 
   231 Since uses-absolute-address corresponds to is-class-or-module, and since there
   232 is a need to test for classes and modules to prevent assignment to attributes
   233 of such objects, this particular information is always required.
   234 
   235 The __class__ Attribute
   236 -----------------------
   237 
   238 The exception to the above general rules about relative locations and absolute
   239 addresses involves the __class__ attribute which is defined differently for
   240 each class and its instances. Since the table elements can only refer to a
   241 single absolute address, thus providing only a single value, such absolute
   242 references which are sufficient for most class attributes would not be
   243 appropriate for the __class__ attribute. However, using an object-relative
   244 location would require both classes and instances to retain an attribute
   245 location specifically to hold the value appropriate for each object type.
   246 
   247 Comparing Tables as Matrices with Displacement Lists
   248 ----------------------------------------------------
   249 
   250 Although displacement lists can provide reasonable levels of compaction for
   251 attribute data, the element size is larger than that required for a simple
   252 matrix: the attribute code (attrcode) need not be stored since each element
   253 unambiguously refers to the availability of an attribute for a particular
   254 class or instance of that class, and so the data at a given element need not
   255 be tested for relevance to a given attribute access operation.
   256 
   257 Given a program with 20 object types and 100 attribute types, a matrix would
   258 occupy the following amount of space:
   259 
   260     number of object types * number of attribute types * element size
   261   = 20 * 100 * 1 (assuming that a single location is sufficient for an element)
   262   = 2000
   263 
   264 In contrast, given a compaction to 40% of the matrix size (without considering
   265 element size) in a displacement list, the amount of space would be as follows:
   266 
   267     number of elements * element size
   268   = 40% * (20 * 100) * 2 (assuming that one additional location is required)
   269   = 1600
   270 
   271 Consequently, the principal overhead of using a displacement list is likely to
   272 be in the need to check element relevance when retrieving values from such a
   273 list.
   274 
   275 Objects and Structures 
   276 ======================
   277 
   278 As well as references, micropython needs to have actual objects to refer to.
   279 Since classes, functions and instances are all objects, it is desirable that
   280 certain common features and operations are supported in the same way for all
   281 of these things. To permit this, a common data structure format is used.
   282 
   283     Header....................................................  Attributes.................
   284 
   285     Identifier  Identifier  Address     Identifier  Size        Object      Object      ...
   286 
   287     0           1           2           3           4           5           6           7
   288     classcode   attrcode/   invocation  funccode    size        __class__   attribute   ...
   289                 instance    reference                           reference   reference
   290                 status
   291 
   292 Classcode
   293 ---------
   294 
   295 Used in attribute lookup.
   296 
   297 Here, the classcode refers to the attribute lookup table for the object (as
   298 described above). Classes and instances share the same classcode, and their
   299 structures reflect this. Functions all belong to the same type and thus employ
   300 the classcode for the function built-in type, whereas modules have distinct
   301 types since they must support different sets of attributes.
   302 
   303 Attrcode
   304 --------
   305 
   306 Used to test instances for membership of classes (or descendants of classes).
   307 
   308 Since, in traditional Python, classes are only ever instances of some generic
   309 built-in type, support for testing such a relationship directly has been
   310 removed and the attrcode is not specified for classes: the presence of an
   311 attrcode indicates that a given object is an instance. In addition, support
   312 has also been removed for testing modules in the same way, meaning that the
   313 attrcode is also not specified for modules.
   314 
   315 See the "Testing Instance Compatibility with Classes (Attrcode)" section below
   316 for details of attrcodes.
   317 
   318 Invocation Reference
   319 --------------------
   320 
   321 Used when an object is called.
   322 
   323 This is the address of the code to be executed when an invocation is performed
   324 on the object.
   325 
   326 Funccode
   327 --------
   328 
   329 Used to look up argument positions by name.
   330 
   331 The strategy with keyword arguments in micropython is to attempt to position
   332 such arguments in the invocation frame as it is being constructed.
   333 
   334 See the "Parameters and Lookups" section for more information.
   335 
   336 Size
   337 ----
   338 
   339 Used to indicate the size of an object including attributes.
   340 
   341 Attributes
   342 ----------
   343 
   344 For classes, modules and instances, the attributes in the structure correspond
   345 to the attributes of each kind of object. For functions, however, the
   346 attributes in the structure correspond to the default arguments for each
   347 function, if any.
   348 
   349 Structure Types
   350 ---------------
   351 
   352 Class C:
   353 
   354     0           1           2           3           4           5           6           7
   355     classcode   (unused)    __new__     funccode    size        class type  attribute   ...
   356     for C                   reference   for                     reference   reference
   357                                         instantiator
   358 
   359 Instance of C:
   360 
   361     0           1           2           3           4           5           6           7
   362     classcode   attrcode    C.__call__  funccode    size        class C     attribute   ...
   363     for C       for C       reference   for                     reference   reference
   364                             (if exists) C.__call__
   365 
   366 Function f:
   367 
   368     0           1           2           3           4           5           6           7
   369     classcode   attrcode    code        funccode    size        class       attribute   ...
   370     for         for         reference                           function    (default)
   371     function    function                                        reference   reference
   372 
   373 Module m:
   374 
   375     0           1           2           3           4           5           6           7
   376     classcode   attrcode    (unused)    (unused)    (unused)    module type attribute   ...
   377     for m       for m                                           reference   (global)
   378                                                                             reference
   379 
   380 The __class__ Attribute
   381 -----------------------
   382 
   383 All objects support the __class__ attribute and this is illustrated above with
   384 the first attribute.
   385 
   386 Class: refers to the type class (type.__class__ also refers to the type class)
   387 Function: refers to the function class
   388 Instance: refers to the class instantiated to make the object
   389 
   390 Lists and Tuples
   391 ----------------
   392 
   393 The built-in list and tuple sequences employ variable length structures using
   394 the attribute locations to store their elements, where each element is a
   395 reference to a separately stored object.
   396 
   397 Testing Instance Compatibility with Classes (Attrcode)
   398 ------------------------------------------------------
   399 
   400 Although it would be possible to have a data structure mapping classes to
   401 compatible classes, such as a matrix indicating the subclasses (or
   402 superclasses) of each class, the need to retain the key to such a data
   403 structure for each class might introduce a noticeable overhead.
   404 
   405 Instead of having a separate structure, descendant classes of each class are
   406 inserted as special attributes into the object table. This requires an extra
   407 key to be retained, since each class must provide its own attribute code such
   408 that upon an instance/class compatibility test, the code may be obtained and
   409 used in the object table.
   410 
   411 Invocation and Code References
   412 ------------------------------
   413 
   414 Modules: there is no meaningful invocation reference since modules cannot be
   415 explicitly called.
   416 
   417 Functions: a simple code reference is employed pointing to code implementing
   418 the function. Note that the function locals are completely distinct from this
   419 structure and are not comparable to attributes. Instead, attributes are
   420 reserved for default parameter values, although they do not appear in the
   421 object table described above, appearing instead in a separate parameter table
   422 described below.
   423 
   424 Classes: given that classes must be invoked in order to create instances, a
   425 reference must be provided in class structures. However, this reference does
   426 not point directly at the __init__ method of the class. Instead, the
   427 referenced code belongs to a special initialiser function, __new__, consisting
   428 of the following instructions:
   429 
   430     create instance for C
   431     call C.__init__(instance, ...)
   432     return instance
   433 
   434 Instances: each instance employs a reference to any __call__ method defined in
   435 the class hierarchy for the instance, thus maintaining its callable nature.
   436 
   437 Both classes and modules may contain code in their definitions - the former in
   438 the "body" of the class, potentially defining attributes, and the latter as
   439 the "top-level" code in the module, potentially defining attributes/globals -
   440 but this code is not associated with any invocation target. It is thus
   441 generated in order of appearance and is not referenced externally.
   442 
   443 Invocation Operation
   444 --------------------
   445 
   446 Consequently, regardless of the object an invocation is always done as
   447 follows:
   448 
   449     get invocation reference from the header
   450     jump to reference
   451 
   452 Additional preparation is necessary before the above code: positional
   453 arguments must be saved in the invocation frame, and keyword arguments must be
   454 resolved and saved to the appropriate position in the invocation frame.
   455 
   456 See invocation.txt for details.
   457 
   458 Parameters and Lookups 
   459 ======================
   460 
   461 Since Python supports keyword arguments when making invocations, it becomes
   462 necessary to record the parameter names associated with each function or
   463 method. Just as object tables record attributes positions on classes and
   464 instances, parameter tables record parameter positions in function or method
   465 parameter lists.
   466 
   467 For a given program, a parameter table can be considered as being like a
   468 matrix mapping functions/methods to parameter names. For example:
   469 
   470     def f(x, y, z):
   471         pass
   472 
   473     def g(a, b, c):
   474         pass
   475 
   476     def h(a, x):
   477         pass
   478 
   479 This would provide the following table, referred to as a parameter table in
   480 the context of functions and methods:
   481 
   482     Function/param  a   b   c   x   y   z
   483 
   484     f                           1   2   3
   485     g               1   2   3
   486     h               1           2
   487 
   488 Confusion can occur when functions are adopted as methods, since the context
   489 then occupies the first slot in the invocation frame:
   490 
   491     def f(x, y, z):
   492         pass
   493 
   494     f(x=1, y=2, z=3) -> f(<context>, 1, 2, 3)
   495                      -> f(1, 2, 3)
   496 
   497     class C:
   498         f = f
   499 
   500         def g(x, y, z):
   501             pass
   502 
   503     c = C()
   504 
   505     c.f(y=2, z=3) -> f(<context>, 2, 3)
   506     c.g(y=2, z=3) -> C.g(<context>, 2, 3)
   507 
   508 Just as with parameter tables, a displacement list can be prepared from a
   509 parameter table:
   510 
   511                 Functions with parameter (attribute) offsets
   512 
   513     funccode    f
   514     attrcode    a   b   c   x   y   z
   515 
   516                                         g
   517                                         a   b   c   x   y   z
   518 
   519                                                     h
   520                                                     a   b   c   x   y   z
   521 
   522     List        .   .   .   1   2   3   1   2   3   1   .   .   2   .   .
   523 
   524 Here, the funccode refers to the offset in the list at which a function's
   525 parameters are defined, whereas the attrcode defines the offset within a
   526 region of attributes corresponding to a single parameter of a given name.
   527 
   528 Instantiation
   529 =============
   530 
   531 When instantiating classes, memory must be reserved for the header of the
   532 resulting instance, along with locations for the attributes of the instance.
   533 Since the instance header contains data common to all instances of a class, a
   534 template header is copied to the start of the newly reserved memory region.
   535 
   536 Register Usage
   537 ==============
   538 
   539 During code generation, much of the evaluation produces results which are
   540 implicitly recorded in the "active value" register, and various instructions
   541 will consume the active value. In addition, some instructions will consume a
   542 separate "active source value" from a register, typically those which are
   543 assigning the result of an expression to an assignment target.
   544 
   545 Since values often need to be retained for later use, a set of temporary
   546 storage locations are typically employed. However, optimisations may reduce
   547 the need to use such temporary storage where instructions which provide the
   548 "active value" can be re-executed and will produce the same result.
   549 
   550 List and Tuple Representations
   551 ==============================
   552 
   553 Since tuples have a fixed size, the representation of a tuple instance is
   554 merely a header describing the size of the entire object, together with a
   555 sequence of references to the object "stored" at each position in the
   556 structure. Such references consist of the usual context and reference pair.
   557 
   558 Lists, however, have a variable size and must be accessible via an unchanging
   559 location even as more memory is allocated elsewhere to accommodate the
   560 contents of the list. Consequently, the representation must resemble the
   561 following:
   562 
   563     Structure header for list (size == header plus special attribute)
   564     Special attribute referencing the underlying sequence
   565 
   566 The underlying sequence has a fixed size, like a tuple, but may contain fewer
   567 elements than the size of the sequence permits:
   568 
   569     Special header indicating the current size and allocated size
   570     Element
   571     ...             <-- current size
   572     (Unused space)
   573     ...             <-- allocated size
   574 
   575 This representation permits the allocation of a new sequence when space is
   576 exhausted in an existing sequence, with the new sequence address stored in the
   577 main list structure. Since access to the contents of the list must go through
   578 the main list structure, underlying allocation activities may take place
   579 without the users of a list having to be aware of such activities.