micropython

README.txt

137:f660fe1aac5c
2008-09-01 Paul Boddie Added notes about calling initialisers and instantiators, adopting a strategy where instantiation detected at compile-time is performed using an initialiser directly, whereas that detected at run-time is done using an instantiator whose code is now generated in the image. Added a finalise method to the Importer in order to set attribute locations before code generation, since some code (use of initialisers) requires details of a different program unit's locals (although this is actually unnecessary, but done because Attr instances are employed in the generated code). Changed class invocation at compile-time to acquire the new object reference from the frame of an already invoked initialiser just before dropping the frame. Added some support for raw image encoding of classes and functions. Changed JumpWithFrame usage to involve the current callable, not the current value. Added RecoverFrame and AdjustFrame instructions. Improved the tests around instantiation.
     1 Namespace Definition
     2 ====================
     3 
     4 Module attributes are defined either at the module level or by global
     5 statements.
     6 
     7 Class attributes are defined only within class statements.
     8 
     9 Instance attributes are defined only by assignments to attributes of self
    10 within __init__ methods.
    11 
    12 (These restrictions apply because such attributes are thus explicitly
    13 declared. Module and class attributes can also be finalised in this way in
    14 order to permit certain optimisations.)
    15 
    16 Potential Restrictions
    17 ----------------------
    18 
    19 Names of classes and functions could be restricted to only refer to those
    20 objects within the same namespace. If redefinition were to occur, or if
    21 multiple possibilities were present, these restrictions could be moderated as
    22 follows:
    23 
    24   * Classes assigned to the same name could provide the union of their
    25     attributes. This would, however, cause a potential collision of attribute
    26     definitions such as methods.
    27 
    28   * Functions, if they share compatible signatures, could share parameter list
    29     definitions.
    30 
    31 Instruction Evaluation Model
    32 ============================
    33 
    34 Programs use a value stack where evaluated instructions may save their
    35 results. A value stack pointer indicates the top of this stack. In addition, a
    36 separate stack is used to record the invocation frames. All stack pointers
    37 refer to the next address to be used by the stack, not the address of the
    38 uppermost element.
    39 
    40     Frame Stack     Value Stack
    41     -----------     -----------     Address of Callable
    42                                     -------------------
    43     previous        ...
    44     current ------> callable -----> identifier
    45                     arg1            reference to code
    46                     arg2
    47                     arg3
    48                     local4
    49                     local5
    50                     ...
    51 
    52 Loading local names is a matter of performing frame-relative accesses to the
    53 value stack.
    54 
    55 Invocations and Argument Evaluation
    56 -----------------------------------
    57 
    58 When preparing for an invocation, the caller first sets the invocation frame
    59 pointer. Then, positional arguments are added to the stack such that the first
    60 argument positions are filled. A number of stack locations for the remaining
    61 arguments specified in the program are then reserved. The names of keyword
    62 arguments are used (in the form of table index values) to consult the
    63 parameter table and to find the location in which such arguments are to be
    64 stored.
    65 
    66     fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
    67 
    68     Value Stack
    69     -----------
    70 
    71     ...             ...             ...             ...
    72     fn              fn              fn              fn
    73     a               a               a               a
    74     b               b               b               b
    75     ___             ___             ___         --> 3
    76     ___         --> 1               1           |   1
    77     ___         |   ___         --> 2           |   2
    78     1 -----------   2 -----------   3 -----------
    79 
    80 Conceptually, the frame can be considered as a collection of attributes, as
    81 seen in other kinds of structures:
    82 
    83 Frame for invocation of fn:
    84 
    85     0           1           2           3           4           5
    86     code        a           b           c           d           e
    87     reference
    88 
    89 However, where arguments are specified positionally, such "attributes" are not
    90 set using a comparable approach to that employed with other structures.
    91 Keyword arguments are set using an attribute-like mechanism, though, where the
    92 position of each argument discovered using the parameter table.
    93 
    94 Where the target of the invocation is known, the above procedure can be
    95 optimised slightly by attempting to add keyword argument values directly to
    96 the stack:
    97 
    98     Value Stack
    99     -----------
   100 
   101     ...             ...             ...             ...             ...
   102     fn              fn              fn              fn              fn
   103     a               a               a               a               a
   104     b               b               b               b               b
   105                     ___             ___             ___         --> 3
   106                     1               1               1           |   1
   107                                     2               1           |   2
   108                                                     3 -----------
   109 
   110     (reserve for    (add in-place)  (add in-place)  (evaluate)      (store by
   111     parameter c)                                                    index)
   112 
   113 Method Invocations
   114 ------------------
   115 
   116 Method invocations incorporate an implicit first argument which is obtained
   117 from the context of the method:
   118 
   119     method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
   120 
   121     Value Stack
   122     -----------
   123 
   124     ...
   125     method
   126     context of method
   127     a
   128     b
   129     3
   130     1
   131     2
   132 
   133 Although it could be possible to permit any object to be provided as the first
   134 argument, in order to optimise instance attribute access in methods, we should
   135 seek to restrict the object type.
   136 
   137 Verifying Supplied Arguments
   138 ----------------------------
   139 
   140 In order to ensure a correct invocation, it is also necessary to check the
   141 number of supplied arguments. If the target of the invocation is known at
   142 compile-time, no additional instructions need to be emitted; otherwise, the
   143 generated code must test for the following situations:
   144 
   145  1. That the number of supplied arguments is equal to the number of expected
   146     parameters.
   147 
   148  2. That no keyword argument overwrites an existing positional parameter.
   149 
   150 Default Arguments
   151 -----------------
   152 
   153 Some arguments may have default values which are used if no value is provided
   154 in an invocation. Such defaults are initialised when the function itself is
   155 initialised, and are used to fill in any invocation frames that are known at
   156 compile-time.
   157 
   158 Tuples, Frames and Allocation
   159 -----------------------------
   160 
   161 Using the approach where arguments are treated like attributes in some kind of
   162 structure, we could choose to allocate frames in places other than a stack.
   163 This would produce something somewhat similar to a plain tuple object.
   164 
   165 Optimisations
   166 =============
   167 
   168 Some optimisations around constant objects might be possible; these depend on
   169 the following:
   170 
   171   * Reliable tracking of assignments: where assignment operations occur, the
   172     target of the assignment should be determined if any hope of optimisation
   173     is to be maintained. Where no guarantees can be made about the target of
   174     an assignment, no assignment-related information should be written to
   175     potential targets.
   176 
   177   * Objects acting as "containers" of attributes must be regarded as "safe":
   178     where assignments are recorded as occurring on an attribute, it must be
   179     guaranteed that no other unforeseen ways exist to assign to such
   180     attributes.
   181 
   182 The discussion below presents certain rules which must be imposed to uphold
   183 the above requirements.
   184 
   185 Safe Containers
   186 ---------------
   187 
   188 Where attributes of modules, classes and instances are only set once and are
   189 effectively constant, it should be possible to circumvent the attribute lookup
   190 mechanism and to directly reference the attribute value. This technique may
   191 only be considered applicable for the following "container" objects, subject
   192 to the noted constraints:
   193 
   194  1. For modules, "safety" is enforced by ensuring that assignments to module
   195     attributes are only permitted within the module itself either at the
   196     top-level or via names declared as globals. Thus, the following would not
   197     be permitted:
   198 
   199     another_module.this_module.attr = value
   200 
   201     In the above, this_module is a reference to the current module.
   202 
   203  2. For classes, "safety" is enforced by ensuring that assignments to class
   204     attributes are only permitted within the class definition, outside
   205     methods. This would mean that classes would be "sealed" at definition time
   206     (like functions).
   207 
   208 Unlike the property of function locals that they may only sensibly be accessed
   209 within the function in which they reside, these cases demand additional
   210 controls or assumptions on or about access to the stored data. Meanwhile, it
   211 would be difficult to detect eligible attributes on arbitrary instances due to
   212 the need for some kind of type inference or abstract execution.
   213 
   214 Constant Attributes
   215 -------------------
   216 
   217 When accessed via "safe containers", as described above, any attribute with
   218 only one recorded assignment on it can be considered a constant attribute and
   219 this eligible for optimisation, the consequence of which would be the
   220 replacement of a LoadAttrIndex instruction (which needs to look up an
   221 attribute using the run-time details of the "container" and the compile-time
   222 details of the attribute) with a LoadAttr instruction.
   223 
   224 However, some restrictions exist on assignment operations which may be
   225 regarded to cause only one assignment in the lifetime of a program:
   226 
   227  1. For module attributes, only assignments at the top-level outside loop
   228     statements can be reliably assumed to cause only a single assignment.
   229 
   230  2. For class attributes, only assignments at the top-level within class
   231     definitions and outside loop statements can be reliably assumed to cause
   232     only a single assignment.
   233 
   234 All assignments satisfying the "safe container" requirements, but not the
   235 requirements immediately above, should each be recorded as causing at least
   236 one assignment.
   237 
   238 Additional Controls
   239 -------------------
   240 
   241 For the above cases for "container" objects, the following controls would need
   242 to apply:
   243 
   244  1. Modules would need to be immutable after initialisation. However, during
   245     initialisation, there remains a possibility of another module attempting
   246     to access the original module. For example, if ppp/__init__.py contained
   247     the following...
   248 
   249     x = 1
   250     import ppp.qqq
   251     print x
   252 
   253     ...and if ppp/qqq.py contained the following...
   254 
   255     import ppp
   256     ppp.x = 2
   257 
   258     ...then the value 2 would be printed. Since modules are objects which are
   259     registered globally in a program, it would be possible to set attributes
   260     in the above way.
   261 
   262  2. Classes would need to be immutable after initialisation. However, since
   263     classes are objects, any reference to a class after initialisation could
   264     be used to set attributes on the class.
   265 
   266 Solutions:
   267 
   268  1. Insist on global scope for module attribute assignments.
   269 
   270  2. Insist on local scope within classes.
   271 
   272 Both of the above measures need to be enforced at run-time, since an arbitrary
   273 attribute assignment could be attempted on any kind of object, yet to uphold
   274 the properties of "safe containers", attempts to change attributes of such
   275 objects should be denied. Since foreseen attribute assignment operations have
   276 certain properties detectable at compile-time, it could be appropriate to
   277 generate special instructions (or modified instructions) during the
   278 initialisation of modules and classes for such foreseen assignments, whilst
   279 employing normal attribute assignment operations in all other cases. Indeed,
   280 the StoreAttr instruction, which is used to set attributes in "safe
   281 containers" would be used exclusively for this purpose; the StoreAttrIndex
   282 instruction would be used exclusively for all other attribute assignments.
   283 
   284 To ensure the "sealing" of modules and classes, entries in the attribute
   285 lookup table would encode whether a class or module is being accessed, so
   286 that the StoreAttrIndex instruction could reject such accesses.
   287 
   288 Constant Attribute Values
   289 -------------------------
   290 
   291 Where an attribute value is itself regarded as constant, is a "safe container"
   292 and is used in an operation accessing its own attributes, the value can be
   293 directly inspected for optimisations or employed in the generated code. For
   294 the attribute values themselves, only objects of a constant nature may be
   295 considered suitable for this particular optimisation:
   296 
   297   * Classes
   298   * Modules
   299   * Instances defined as constant literals
   300 
   301 This is because arbitrary objects (such as most instances) have no
   302 well-defined form before run-time and cannot be investigated further at
   303 compile-time or have a representation inserted into the generated code.
   304 
   305 Class Attributes and Access via Instances
   306 -----------------------------------------
   307 
   308 Unlike module attributes, class attributes can be accessed in a number of
   309 different ways:
   310 
   311   * Using the class itself:
   312 
   313     C.x = 123
   314     cls = C; cls.x = 234
   315 
   316   * Using a subclass of the class (for reading attributes):
   317 
   318     class D(C):
   319         pass
   320     D.x # setting D.x would populate D, not C
   321 
   322   * Using instances of the class or a subclass of the class (for reading
   323     attributes):
   324 
   325     c = C()
   326     c.x # setting c.x would populate c, not C
   327 
   328 Since assignments are only achieved using direct references to the class, and
   329 since class attributes should be defined only within the class initialisation
   330 process, the properties of class attributes should be consistent with those
   331 desired.
   332 
   333 Method Access via Instances
   334 ---------------------------
   335 
   336 It is desirable to optimise method access, even though most method calls are
   337 likely to occur via instances. It is possible, given the properties of methods
   338 as class attributes to investigate the kind of instance that the self
   339 parameter/local refers to within each method: it should be an instance either
   340 of the class in which the method is defined or a compatible class, although
   341 situations exist where this might not be the case:
   342 
   343   * Explicit invocation of a method:
   344 
   345     d = D() # D is not related to C
   346     C.f(d) # calling f(self) in C
   347 
   348 If blatant usage of incompatible instances were somehow disallowed, it would
   349 still be necessary to investigate the properties of an instance's class and
   350 its relationship with other classes. Consider the following example:
   351 
   352     class A:
   353         def f(self): ...
   354 
   355     class B:
   356         def f(self): ...
   357         def g(self):
   358             self.f()
   359 
   360     class C(A, B):
   361         pass
   362 
   363 Here, instances of B passed into the method B.g could be assumed to provide
   364 access to B.f when self.f is resolved at compile-time. However, instances of C
   365 passed into B.g would instead provide access to A.f when self.f is resolved at
   366 compile-time (since the method resolution order is C, A, B instead of just B).
   367 
   368 One solution might be to specialise methods for each instance type, but this
   369 could be costly. Another less ambitious solution might only involve the
   370 optimisation of such internal method calls if an unambiguous target can be
   371 resolved.
   372 
   373 Optimising Function Invocations
   374 -------------------------------
   375 
   376 Where an attribute value is itself regarded as constant and is a function,
   377 knowledge about the parameters of the function can be employed to optimise the
   378 preparation of the invocation frame.