paul@766 | 1 | Low-level Implementation Details
|
paul@766 | 2 | ================================
|
paul@766 | 3 |
|
paul@766 | 4 | Although micropython delegates the generation of low-level program code and
|
paul@766 | 5 | data to syspython, various considerations of how an eventual program might be
|
paul@766 | 6 | structured have been used to inform the way micropython represents the details
|
paul@766 | 7 | of a program. This document describes these considerations and indicates how
|
paul@766 | 8 | syspython or other technologies might represent a working program.
|
paul@766 | 9 |
|
paul@768 | 10 | * Objects and structures
|
paul@768 | 11 | * Instantiation
|
paul@768 | 12 | * List and tuple representations
|
paul@768 | 13 | * Instruction evaluation model
|
paul@769 | 14 | * Exception handling
|
paul@768 | 15 |
|
paul@766 | 16 | Objects and Structures
|
paul@766 | 17 | ======================
|
paul@766 | 18 |
|
paul@766 | 19 | As well as references, micropython needs to have actual objects to refer to.
|
paul@766 | 20 | Since classes, functions and instances are all objects, it is desirable that
|
paul@766 | 21 | certain common features and operations are supported in the same way for all
|
paul@766 | 22 | of these things. To permit this, a common data structure format is used.
|
paul@766 | 23 |
|
paul@766 | 24 | Header.................................................... Attributes.................
|
paul@766 | 25 |
|
paul@766 | 26 | Identifier Identifier Address Identifier Size Object ...
|
paul@766 | 27 |
|
paul@766 | 28 | 0 1 2 3 4 5 6
|
paul@766 | 29 | classcode attrcode/ invocation funccode size attribute ...
|
paul@766 | 30 | instance reference reference
|
paul@766 | 31 | status
|
paul@766 | 32 |
|
paul@766 | 33 | Classcode
|
paul@766 | 34 | ---------
|
paul@766 | 35 |
|
paul@766 | 36 | Used in attribute lookup.
|
paul@766 | 37 |
|
paul@766 | 38 | Here, the classcode refers to the attribute lookup table for the object (as
|
paul@766 | 39 | described in concepts.txt). Classes and instances share the same classcode,
|
paul@766 | 40 | and their structures reflect this. Functions all belong to the same type and
|
paul@766 | 41 | thus employ the classcode for the function built-in type, whereas modules have
|
paul@766 | 42 | distinct types since they must support different sets of attributes.
|
paul@766 | 43 |
|
paul@766 | 44 | Attrcode
|
paul@766 | 45 | --------
|
paul@766 | 46 |
|
paul@766 | 47 | Used to test instances for membership of classes (or descendants of classes).
|
paul@766 | 48 |
|
paul@766 | 49 | Since, in traditional Python, classes are only ever instances of some generic
|
paul@766 | 50 | built-in type, support for testing such a relationship directly has been
|
paul@766 | 51 | removed and the attrcode is not specified for classes: the presence of an
|
paul@766 | 52 | attrcode indicates that a given object is an instance. In addition, support
|
paul@766 | 53 | has also been removed for testing modules in the same way, meaning that the
|
paul@766 | 54 | attrcode is also not specified for modules.
|
paul@766 | 55 |
|
paul@766 | 56 | See the "Testing Instance Compatibility with Classes (Attrcode)" section below
|
paul@766 | 57 | for details of attrcodes.
|
paul@766 | 58 |
|
paul@766 | 59 | Invocation Reference
|
paul@766 | 60 | --------------------
|
paul@766 | 61 |
|
paul@766 | 62 | Used when an object is called.
|
paul@766 | 63 |
|
paul@766 | 64 | This is the address of the code to be executed when an invocation is performed
|
paul@766 | 65 | on the object.
|
paul@766 | 66 |
|
paul@766 | 67 | Funccode
|
paul@766 | 68 | --------
|
paul@766 | 69 |
|
paul@766 | 70 | Used to look up argument positions by name.
|
paul@766 | 71 |
|
paul@766 | 72 | The strategy with keyword arguments in micropython is to attempt to position
|
paul@766 | 73 | such arguments in the invocation frame as it is being constructed.
|
paul@766 | 74 |
|
paul@766 | 75 | See the "Parameters and Lookups" section for more information.
|
paul@766 | 76 |
|
paul@766 | 77 | Size
|
paul@766 | 78 | ----
|
paul@766 | 79 |
|
paul@766 | 80 | Used to indicate the size of an object including attributes.
|
paul@766 | 81 |
|
paul@766 | 82 | Attributes
|
paul@766 | 83 | ----------
|
paul@766 | 84 |
|
paul@766 | 85 | For classes, modules and instances, the attributes in the structure correspond
|
paul@766 | 86 | to the attributes of each kind of object. For functions, however, the
|
paul@766 | 87 | attributes in the structure correspond to the default arguments for each
|
paul@766 | 88 | function, if any.
|
paul@766 | 89 |
|
paul@766 | 90 | Structure Types
|
paul@766 | 91 | ---------------
|
paul@766 | 92 |
|
paul@766 | 93 | Class C:
|
paul@766 | 94 |
|
paul@766 | 95 | 0 1 2 3 4 5 6
|
paul@766 | 96 | classcode (unused) __new__ funccode size attribute ...
|
paul@766 | 97 | for C reference for reference
|
paul@766 | 98 | instantiator
|
paul@766 | 99 |
|
paul@766 | 100 | Instance of C:
|
paul@766 | 101 |
|
paul@766 | 102 | 0 1 2 3 4 5 6
|
paul@766 | 103 | classcode attrcode C.__call__ funccode size attribute ...
|
paul@766 | 104 | for C for C reference for reference
|
paul@766 | 105 | (if exists) C.__call__
|
paul@766 | 106 |
|
paul@766 | 107 | Function f:
|
paul@766 | 108 |
|
paul@766 | 109 | 0 1 2 3 4 5 6
|
paul@766 | 110 | classcode attrcode code funccode size attribute ...
|
paul@766 | 111 | for for reference (default)
|
paul@766 | 112 | function function reference
|
paul@766 | 113 |
|
paul@766 | 114 | Module m:
|
paul@766 | 115 |
|
paul@766 | 116 | 0 1 2 3 4 5 6
|
paul@766 | 117 | classcode attrcode (unused) (unused) (unused) attribute ...
|
paul@766 | 118 | for m for m (global)
|
paul@766 | 119 | reference
|
paul@766 | 120 |
|
paul@766 | 121 | The __class__ Attribute
|
paul@766 | 122 | -----------------------
|
paul@766 | 123 |
|
paul@766 | 124 | All objects should support the __class__ attribute, and in most cases this is
|
paul@766 | 125 | done using the object table, yielding a common address for all instances of a
|
paul@766 | 126 | given class.
|
paul@766 | 127 |
|
paul@766 | 128 | Function: refers to the function class
|
paul@766 | 129 | Instance: refers to the class instantiated to make the object
|
paul@766 | 130 |
|
paul@766 | 131 | The object table cannot support two definitions simultaneously for both
|
paul@766 | 132 | instances and their classes. Consequently, __class__ access on classes must be
|
paul@766 | 133 | tested for and a special result returned.
|
paul@766 | 134 |
|
paul@766 | 135 | Class: refers to the type class (type.__class__ also refers to the type class)
|
paul@766 | 136 |
|
paul@766 | 137 | For convenience, the first attribute of a class will be the common __class__
|
paul@766 | 138 | attribute for all its instances. As noted above, direct access to this
|
paul@766 | 139 | attribute will not be possible for classes, and a constant result will be
|
paul@766 | 140 | returned instead.
|
paul@766 | 141 |
|
paul@766 | 142 | Lists and Tuples
|
paul@766 | 143 | ----------------
|
paul@766 | 144 |
|
paul@766 | 145 | The built-in list and tuple sequences employ variable length structures using
|
paul@766 | 146 | the attribute locations to store their elements, where each element is a
|
paul@766 | 147 | reference to a separately stored object.
|
paul@766 | 148 |
|
paul@766 | 149 | Testing Instance Compatibility with Classes (Attrcode)
|
paul@766 | 150 | ------------------------------------------------------
|
paul@766 | 151 |
|
paul@766 | 152 | Although it would be possible to have a data structure mapping classes to
|
paul@766 | 153 | compatible classes, such as a matrix indicating the subclasses (or
|
paul@766 | 154 | superclasses) of each class, the need to retain the key to such a data
|
paul@766 | 155 | structure for each class might introduce a noticeable overhead.
|
paul@766 | 156 |
|
paul@766 | 157 | Instead of having a separate structure, descendant classes of each class are
|
paul@766 | 158 | inserted as special attributes into the object table. This requires an extra
|
paul@766 | 159 | key to be retained, since each class must provide its own attribute code such
|
paul@766 | 160 | that upon an instance/class compatibility test, the code may be obtained and
|
paul@766 | 161 | used in the object table.
|
paul@766 | 162 |
|
paul@766 | 163 | Invocation and Code References
|
paul@766 | 164 | ------------------------------
|
paul@766 | 165 |
|
paul@766 | 166 | Modules: there is no meaningful invocation reference since modules cannot be
|
paul@766 | 167 | explicitly called.
|
paul@766 | 168 |
|
paul@766 | 169 | Functions: a simple code reference is employed pointing to code implementing
|
paul@766 | 170 | the function. Note that the function locals are completely distinct from this
|
paul@766 | 171 | structure and are not comparable to attributes. Instead, attributes are
|
paul@766 | 172 | reserved for default parameter values, although they do not appear in the
|
paul@766 | 173 | object table described above, appearing instead in a separate parameter table
|
paul@766 | 174 | described in concepts.txt.
|
paul@766 | 175 |
|
paul@766 | 176 | Classes: given that classes must be invoked in order to create instances, a
|
paul@766 | 177 | reference must be provided in class structures. However, this reference does
|
paul@766 | 178 | not point directly at the __init__ method of the class. Instead, the
|
paul@766 | 179 | referenced code belongs to a special initialiser function, __new__, consisting
|
paul@766 | 180 | of the following instructions:
|
paul@766 | 181 |
|
paul@766 | 182 | create instance for C
|
paul@766 | 183 | call C.__init__(instance, ...)
|
paul@766 | 184 | return instance
|
paul@766 | 185 |
|
paul@766 | 186 | Instances: each instance employs a reference to any __call__ method defined in
|
paul@766 | 187 | the class hierarchy for the instance, thus maintaining its callable nature.
|
paul@766 | 188 |
|
paul@766 | 189 | Both classes and modules may contain code in their definitions - the former in
|
paul@766 | 190 | the "body" of the class, potentially defining attributes, and the latter as
|
paul@766 | 191 | the "top-level" code in the module, potentially defining attributes/globals -
|
paul@766 | 192 | but this code is not associated with any invocation target. It is thus
|
paul@766 | 193 | generated in order of appearance and is not referenced externally.
|
paul@766 | 194 |
|
paul@766 | 195 | Invocation Operation
|
paul@766 | 196 | --------------------
|
paul@766 | 197 |
|
paul@766 | 198 | Consequently, regardless of the object an invocation is always done as
|
paul@766 | 199 | follows:
|
paul@766 | 200 |
|
paul@766 | 201 | get invocation reference from the header
|
paul@766 | 202 | jump to reference
|
paul@766 | 203 |
|
paul@766 | 204 | Additional preparation is necessary before the above code: positional
|
paul@766 | 205 | arguments must be saved in the invocation frame, and keyword arguments must be
|
paul@766 | 206 | resolved and saved to the appropriate position in the invocation frame.
|
paul@766 | 207 |
|
paul@766 | 208 | See invocation.txt for details.
|
paul@766 | 209 |
|
paul@766 | 210 | Instantiation
|
paul@766 | 211 | =============
|
paul@766 | 212 |
|
paul@766 | 213 | When instantiating classes, memory must be reserved for the header of the
|
paul@766 | 214 | resulting instance, along with locations for the attributes of the instance.
|
paul@766 | 215 | Since the instance header contains data common to all instances of a class, a
|
paul@766 | 216 | template header is copied to the start of the newly reserved memory region.
|
paul@766 | 217 |
|
paul@766 | 218 | List and Tuple Representations
|
paul@766 | 219 | ==============================
|
paul@766 | 220 |
|
paul@766 | 221 | Since tuples have a fixed size, the representation of a tuple instance is
|
paul@766 | 222 | merely a header describing the size of the entire object, together with a
|
paul@766 | 223 | sequence of references to the object "stored" at each position in the
|
paul@766 | 224 | structure. Such references consist of the usual context and reference pair.
|
paul@766 | 225 |
|
paul@766 | 226 | Lists, however, have a variable size and must be accessible via an unchanging
|
paul@766 | 227 | location even as more memory is allocated elsewhere to accommodate the
|
paul@766 | 228 | contents of the list. Consequently, the representation must resemble the
|
paul@766 | 229 | following:
|
paul@766 | 230 |
|
paul@766 | 231 | Structure header for list (size == header plus special attribute)
|
paul@766 | 232 | Special attribute referencing the underlying sequence
|
paul@766 | 233 |
|
paul@766 | 234 | The underlying sequence has a fixed size, like a tuple, but may contain fewer
|
paul@766 | 235 | elements than the size of the sequence permits:
|
paul@766 | 236 |
|
paul@766 | 237 | Special header indicating the current size and allocated size
|
paul@766 | 238 | Element
|
paul@766 | 239 | ... <-- current size
|
paul@766 | 240 | (Unused space)
|
paul@766 | 241 | ... <-- allocated size
|
paul@766 | 242 |
|
paul@766 | 243 | This representation permits the allocation of a new sequence when space is
|
paul@766 | 244 | exhausted in an existing sequence, with the new sequence address stored in the
|
paul@766 | 245 | main list structure. Since access to the contents of the list must go through
|
paul@766 | 246 | the main list structure, underlying allocation activities may take place
|
paul@766 | 247 | without the users of a list having to be aware of such activities.
|
paul@768 | 248 |
|
paul@768 | 249 | Instruction Evaluation Model
|
paul@768 | 250 | ============================
|
paul@768 | 251 |
|
paul@768 | 252 | Programs use a value stack containing local and temporary storage. A value
|
paul@768 | 253 | stack pointer indicates the top of this stack. In addition, a separate stack
|
paul@768 | 254 | is used to record the invocation frames. All stack pointers refer to the next
|
paul@768 | 255 | address to be used by the stack, not the address of the uppermost element.
|
paul@768 | 256 |
|
paul@768 | 257 | Frame Stack Value Stack
|
paul@768 | 258 | ----------- ----------- Address of Callable
|
paul@768 | 259 | -------------------
|
paul@768 | 260 | previous ...
|
paul@768 | 261 | current ------> callable -----> identifier
|
paul@768 | 262 | arg1 reference to code
|
paul@768 | 263 | arg2
|
paul@768 | 264 | arg3
|
paul@768 | 265 | local4
|
paul@768 | 266 | local5
|
paul@768 | 267 | ...
|
paul@768 | 268 |
|
paul@768 | 269 | Unlike the CPython virtual machine, programs do not use a value stack
|
paul@768 | 270 | containing the results of expression evaluations. Instead, temporary storage
|
paul@768 | 271 | is statically allocated and employed by instructions.
|
paul@768 | 272 |
|
paul@768 | 273 | Loading local names and temporary values is a matter of performing
|
paul@768 | 274 | frame-relative accesses to the value stack.
|
paul@768 | 275 |
|
paul@768 | 276 | Invocations and Argument Evaluation
|
paul@768 | 277 | -----------------------------------
|
paul@768 | 278 |
|
paul@768 | 279 | When preparing for an invocation, the caller first sets the invocation frame
|
paul@768 | 280 | pointer. A number of locations for the arguments required by the invocation
|
paul@768 | 281 | are then reserved. With a frame to prepare, positional arguments are added to
|
paul@768 | 282 | the frame in order such that the first argument positions are filled. The
|
paul@768 | 283 | names of keyword arguments are used (in the form of table index values) to
|
paul@768 | 284 | consult the parameter table and to find the frame location in which such
|
paul@768 | 285 | arguments are to be stored.
|
paul@768 | 286 |
|
paul@768 | 287 | fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
|
paul@768 | 288 |
|
paul@768 | 289 | Frame
|
paul@768 | 290 | -----
|
paul@768 | 291 |
|
paul@768 | 292 | a a a a
|
paul@768 | 293 | b b b b
|
paul@768 | 294 | ___ ___ ___ --> 3
|
paul@768 | 295 | ___ --> 1 1 | 1
|
paul@768 | 296 | ___ | ___ --> 2 | 2
|
paul@768 | 297 | | | |
|
paul@768 | 298 | 1 ----------- 2 ----------- 3 -----------
|
paul@768 | 299 |
|
paul@768 | 300 | Conceptually, the frame can be considered as a collection of attributes, as
|
paul@768 | 301 | seen in other kinds of structures:
|
paul@768 | 302 |
|
paul@768 | 303 | Frame for invocation of fn:
|
paul@768 | 304 |
|
paul@768 | 305 | 0 1 2 3 4 5
|
paul@768 | 306 | code a b c d e
|
paul@768 | 307 | reference
|
paul@768 | 308 |
|
paul@768 | 309 | Where arguments are specified positionally, such "attributes" are set in
|
paul@768 | 310 | order. Keyword arguments are set using a name-to-position attribute-like
|
paul@768 | 311 | mapping, where the position of each argument is discovered using the parameter
|
paul@768 | 312 | table.
|
paul@768 | 313 |
|
paul@768 | 314 | Method Invocations
|
paul@768 | 315 | ------------------
|
paul@768 | 316 |
|
paul@768 | 317 | Method invocations incorporate an implicit first argument which is obtained
|
paul@768 | 318 | from the context of the method:
|
paul@768 | 319 |
|
paul@768 | 320 | method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
|
paul@768 | 321 |
|
paul@768 | 322 | Frame
|
paul@768 | 323 | -----
|
paul@768 | 324 |
|
paul@768 | 325 | context of method
|
paul@768 | 326 | a
|
paul@768 | 327 | b
|
paul@768 | 328 | 3
|
paul@768 | 329 | 1
|
paul@768 | 330 | 2
|
paul@768 | 331 |
|
paul@768 | 332 | Although it could be possible to permit any object to be provided as the first
|
paul@768 | 333 | argument, in order to optimise instance attribute access in methods, we should
|
paul@768 | 334 | seek to restrict the object type.
|
paul@768 | 335 |
|
paul@768 | 336 | Verifying Supplied Arguments
|
paul@768 | 337 | ----------------------------
|
paul@768 | 338 |
|
paul@768 | 339 | In order to ensure a correct invocation, it is also necessary to check the
|
paul@768 | 340 | number of supplied arguments. If the target of the invocation is known at
|
paul@768 | 341 | compile-time, no additional instructions need to be emitted; otherwise, the
|
paul@768 | 342 | generated code must test for the following situations:
|
paul@768 | 343 |
|
paul@768 | 344 | 1. That the number of supplied arguments is equal to the number of expected
|
paul@768 | 345 | parameters.
|
paul@768 | 346 |
|
paul@768 | 347 | 2. That no keyword argument overwrites an existing positional parameter.
|
paul@768 | 348 |
|
paul@768 | 349 | Default Arguments
|
paul@768 | 350 | -----------------
|
paul@768 | 351 |
|
paul@768 | 352 | Some arguments may have default values which are used if no value is provided
|
paul@768 | 353 | in an invocation. Such defaults are initialised when the function itself is
|
paul@768 | 354 | initialised, and are used to fill in any invocation frames that are known at
|
paul@768 | 355 | compile-time.
|
paul@768 | 356 |
|
paul@768 | 357 | To obtain the default values, a reference to the function object is required.
|
paul@768 | 358 | This can be provided either in an additional frame location or in a special
|
paul@768 | 359 | register.
|
paul@768 | 360 |
|
paul@768 | 361 | Tuples, Frames and Allocation
|
paul@768 | 362 | -----------------------------
|
paul@768 | 363 |
|
paul@768 | 364 | Using the approach where arguments are treated like attributes in some kind of
|
paul@768 | 365 | structure, we could choose to allocate frames in places other than a stack.
|
paul@768 | 366 | This would produce something somewhat similar to a plain tuple object.
|
paul@769 | 367 |
|
paul@769 | 368 | Exception Handling
|
paul@769 | 369 | ==================
|
paul@769 | 370 |
|
paul@769 | 371 | Active Exceptions
|
paul@769 | 372 | -----------------
|
paul@769 | 373 |
|
paul@769 | 374 | When an exception is raised, an exception instance is defined as the active
|
paul@769 | 375 | exception. This active exception remains in force until either a new exception
|
paul@769 | 376 | is raised in any handling of the exception, or upon exit of a handler where
|
paul@769 | 377 | the active exception has been caught (thus resetting the active exception).
|
paul@769 | 378 |
|
paul@769 | 379 | The following operations can be employed to achieve this:
|
paul@769 | 380 |
|
paul@769 | 381 | StoreException records the current value reference using the exception
|
paul@769 | 382 | register.
|
paul@769 | 383 |
|
paul@769 | 384 | LoadException obtains the current exception and puts it in the value register.
|
paul@769 | 385 |
|
paul@769 | 386 | CheckException checks the current exception against a class referenced by the
|
paul@769 | 387 | current value.
|
paul@769 | 388 |
|
paul@769 | 389 | ClearException resets the exception register.
|
paul@769 | 390 |
|
paul@769 | 391 | Handlers
|
paul@769 | 392 | --------
|
paul@769 | 393 |
|
paul@769 | 394 | An exception handler stack is defined such that when a try...except or
|
paul@769 | 395 | try...finally block is entered, a new handler is defined.
|
paul@769 | 396 |
|
paul@769 | 397 | When an exception is raised, the program jumps to the most recently defined
|
paul@769 | 398 | handler. Inside the handler, the stack entry for the handler will be removed.
|
paul@769 | 399 |
|
paul@769 | 400 | Depending on the nature of the handler and whether the exception is handled,
|
paul@769 | 401 | the program may jump to the next most recent handler, and so on.
|
paul@769 | 402 |
|
paul@769 | 403 | If no handler is defined when an exception is raised or re-raised, the program
|
paul@769 | 404 | should terminate. This might be done by having a "handler #0" which explicitly
|
paul@769 | 405 | terminates the program.
|
paul@769 | 406 |
|
paul@769 | 407 | The following operations can be employed to achieve this:
|
paul@769 | 408 |
|
paul@769 | 409 | PushHandler(block) defines an active handler at the location indicated by the
|
paul@769 | 410 | given block.
|
paul@769 | 411 |
|
paul@769 | 412 | PopHandler(n) removes the n topmost active handlers. A single handler is
|
paul@769 | 413 | typically removed when leaving a try block, but potentially more handlers are
|
paul@769 | 414 | removed when such a block is exited using a return statement.
|