# HG changeset patch
# User Paul Boddie
# Date 1386520297 -3600
# Node ID 14b0c2538e87b4cc16842390b64bede12eeac6b7
# Parent c20f8d7fe5a4d0fd34b06079a2d93a47607527d4
Split the concepts document into two, making a low-level concepts document that describes things that are not directly relevant to micropython itself.

diff -r c20f8d7fe5a4 -r 14b0c2538e87 docs/concepts.txt
--- a/docs/concepts.txt Sun Dec 08 16:56:40 2013 +0100
+++ b/docs/concepts.txt Sun Dec 08 17:31:37 2013 +0100
@@ -8,10 +8,7 @@
   * Imports and circular import detection
   * Contexts and values
   * Tables, attributes and lookups
-  * Objects and structures
   * Parameters and lookups
-  * Instantiation
-  * Register usage
   * List and tuple representations
 
 Namespaces and Attribute Definition
@@ -254,9 +251,9 @@
 Attribute lookups, where the exact location of an object attribute is deduced,
 are performed differently in micropython than in other implementations.
 Instead of providing attribute dictionaries, in which attributes are found,
-attributes are located at fixed places in object structures (described below)
-and their locations are stored using a special representation known as a
-table.
+attributes are located at fixed places in object structures (described in
+lowlevel.txt) and their locations are stored using a special representation
+known as a table.
 
 For a given program, a table can be considered as being like a matrix mapping
 classes to attribute names. For example:
@@ -379,200 +376,6 @@
 be in the need to check element relevance when retrieving values from such a
 list.
 
-Objects and Structures
-======================
-
-As well as references, micropython needs to have actual objects to refer to.
-Since classes, functions and instances are all objects, it is desirable that
-certain common features and operations are supported in the same way for all
-of these things. To permit this, a common data structure format is used.
-
-    Header....................................................
Attributes.................
-
-    Identifier  Identifier  Address     Identifier  Size        Object      ...
-
-    0           1           2           3           4           5           6
-    classcode   attrcode/   invocation  funccode    size        attribute   ...
-                instance    reference                           reference
-                status
-
-Classcode
----------
-
-Used in attribute lookup.
-
-Here, the classcode refers to the attribute lookup table for the object (as
-described above). Classes and instances share the same classcode, and their
-structures reflect this. Functions all belong to the same type and thus employ
-the classcode for the function built-in type, whereas modules have distinct
-types since they must support different sets of attributes.
-
-Attrcode
---------
-
-Used to test instances for membership of classes (or descendants of classes).
-
-Since, in traditional Python, classes are only ever instances of some generic
-built-in type, support for testing such a relationship directly has been
-removed and the attrcode is not specified for classes: the presence of an
-attrcode indicates that a given object is an instance. In addition, support
-has also been removed for testing modules in the same way, meaning that the
-attrcode is also not specified for modules.
-
-See the "Testing Instance Compatibility with Classes (Attrcode)" section below
-for details of attrcodes.
-
-Invocation Reference
---------------------
-
-Used when an object is called.
-
-This is the address of the code to be executed when an invocation is performed
-on the object.
-
-Funccode
---------
-
-Used to look up argument positions by name.
-
-The strategy with keyword arguments in micropython is to attempt to position
-such arguments in the invocation frame as it is being constructed.
-
-See the "Parameters and Lookups" section for more information.
-
-Size
-----
-
-Used to indicate the size of an object including attributes.
-
-Attributes
-----------
-
-For classes, modules and instances, the attributes in the structure correspond
-to the attributes of each kind of object.
For functions, however, the
-attributes in the structure correspond to the default arguments for each
-function, if any.
-
-Structure Types
----------------
-
-Class C:
-
-    0           1           2           3           4           5           6
-    classcode   (unused)    __new__     funccode    size        attribute   ...
-    for C                   reference   for                     reference
-                                        instantiator
-
-Instance of C:
-
-    0           1           2           3           4           5           6
-    classcode   attrcode    C.__call__  funccode    size        attribute   ...
-    for C       for C       reference   for                     reference
-                            (if exists) C.__call__
-
-Function f:
-
-    0           1           2           3           4           5           6
-    classcode   attrcode    code        funccode    size        attribute   ...
-    for         for         reference                           (default)
-    function    function                                        reference
-
-Module m:
-
-    0           1           2           3           4           5           6
-    classcode   attrcode    (unused)    (unused)    (unused)    attribute   ...
-    for m       for m                                           (global)
-                                                                reference
-
-The __class__ Attribute
------------------------
-
-All objects should support the __class__ attribute, and in most cases this is
-done using the object table, yielding a common address for all instances of a
-given class.
-
-Function: refers to the function class
-Instance: refers to the class instantiated to make the object
-
-The object table cannot support two definitions simultaneously for both
-instances and their classes. Consequently, __class__ access on classes must be
-tested for and a special result returned.
-
-Class: refers to the type class (type.__class__ also refers to the type class)
-
-For convenience, the first attribute of a class will be the common __class__
-attribute for all its instances. As noted above, direct access to this
-attribute will not be possible for classes, and a constant result will be
-returned instead.
-
-Lists and Tuples
-----------------
-
-The built-in list and tuple sequences employ variable length structures using
-the attribute locations to store their elements, where each element is a
-reference to a separately stored object.
-
-Testing Instance Compatibility with Classes (Attrcode)
-------------------------------------------------------
-
-Although it would be possible to have a data structure mapping classes to
-compatible classes, such as a matrix indicating the subclasses (or
-superclasses) of each class, the need to retain the key to such a data
-structure for each class might introduce a noticeable overhead.
-
-Instead of having a separate structure, descendant classes of each class are
-inserted as special attributes into the object table. This requires an extra
-key to be retained, since each class must provide its own attribute code such
-that upon an instance/class compatibility test, the code may be obtained and
-used in the object table.
-
-Invocation and Code References
-------------------------------
-
-Modules: there is no meaningful invocation reference since modules cannot be
-explicitly called.
-
-Functions: a simple code reference is employed pointing to code implementing
-the function. Note that the function locals are completely distinct from this
-structure and are not comparable to attributes. Instead, attributes are
-reserved for default parameter values, although they do not appear in the
-object table described above, appearing instead in a separate parameter table
-described below.
-
-Classes: given that classes must be invoked in order to create instances, a
-reference must be provided in class structures. However, this reference does
-not point directly at the __init__ method of the class. Instead, the
-referenced code belongs to a special initialiser function, __new__, consisting
-of the following instructions:
-
-    create instance for C
-    call C.__init__(instance, ...)
-    return instance
-
-Instances: each instance employs a reference to any __call__ method defined in
-the class hierarchy for the instance, thus maintaining its callable nature.
-
-Both classes and modules may contain code in their definitions - the former in
-the "body" of the class, potentially defining attributes, and the latter as
-the "top-level" code in the module, potentially defining attributes/globals -
-but this code is not associated with any invocation target. It is thus
-generated in order of appearance and is not referenced externally.
-
-Invocation Operation
---------------------
-
-Consequently, regardless of the object an invocation is always done as
-follows:
-
-    get invocation reference from the header
-    jump to reference
-
-Additional preparation is necessary before the above code: positional
-arguments must be saved in the invocation frame, and keyword arguments must be
-resolved and saved to the appropriate position in the invocation frame.
-
-See invocation.txt for details.
-
 Parameters and Lookups
 ======================
 
@@ -642,67 +445,3 @@
 Here, the funccode refers to the offset in the list at which a function's
 parameters are defined, whereas the attrcode defines the offset within a
 region of attributes corresponding to a single parameter of a given name.
-
-Instantiation
-=============
-
-When instantiating classes, memory must be reserved for the header of the
-resulting instance, along with locations for the attributes of the instance.
-Since the instance header contains data common to all instances of a class, a
-template header is copied to the start of the newly reserved memory region.
-
-Register Usage
-==============
-
-During code generation, much of the evaluation produces results which are
-implicitly recorded in the "active value" or "working" register, and various
-instructions will consume this active value. In addition, some instructions
-will consume a separate "active source value" from a register, typically those
-which are assigning the result of an expression to an assignment target.
-
-Since values often need to be retained for later use, a set of temporary
-storage locations are typically employed. However, optimisations may reduce
-the need to use such temporary storage where instructions which provide the
-"active value" can be re-executed and will produce the same result. Whether
-re-use of instructions is possible or not, values still need to be loaded into
-a "source" register to be accessed by an assignment instruction.
-
-RSVP instructions generally have the notion of working, target and source
-registers (see registers.txt). These register "roles" are independent from the
-actual registers defined for the RSVP machine itself, even though the naming
-is similar. Generally, instructions do regard the RSVP "working" registers as
-the class of register in the "working" role, although some occurrences of
-instruction usage may employ other registers (such as the "result" registers)
-and thus take advantage of the generality of the RSVP implementation of such
-instructions.
-
-List and Tuple Representations
-==============================
-
-Since tuples have a fixed size, the representation of a tuple instance is
-merely a header describing the size of the entire object, together with a
-sequence of references to the object "stored" at each position in the
-structure. Such references consist of the usual context and reference pair.
-
-Lists, however, have a variable size and must be accessible via an unchanging
-location even as more memory is allocated elsewhere to accommodate the
-contents of the list. Consequently, the representation must resemble the
-following:
-
-    Structure header for list (size == header plus special attribute)
-    Special attribute referencing the underlying sequence
-
-The underlying sequence has a fixed size, like a tuple, but may contain fewer
-elements than the size of the sequence permits:
-
-    Special header indicating the current size and allocated size
-    Element
-    ...             <-- current size
-    (Unused space)
-    ...
<-- allocated size
-
-This representation permits the allocation of a new sequence when space is
-exhausted in an existing sequence, with the new sequence address stored in the
-main list structure. Since access to the contents of the list must go through
-the main list structure, underlying allocation activities may take place
-without the users of a list having to be aware of such activities.
diff -r c20f8d7fe5a4 -r 14b0c2538e87 docs/invocation.txt
--- a/docs/invocation.txt Sun Dec 08 16:56:40 2013 +0100
+++ b/docs/invocation.txt Sun Dec 08 17:31:37 2013 +0100
@@ -64,7 +64,10 @@
   5. Jump to target.
 
     The target is needed for dynamic functions and methods, but is external to
-    any notion of arguments or locals.
+    any notion of arguments or locals. In a low-level representation of a
+    program, the target will be held in either a register or some other
+    convenient storage (perhaps in the frame itself) so that defaults and other
+    state-related information can be accessed inside the invoked function.
 
 Methods vs. functions:
 
diff -r c20f8d7fe5a4 -r 14b0c2538e87 docs/lowlevel.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/lowlevel.txt Sun Dec 08 17:31:37 2013 +0100
@@ -0,0 +1,241 @@
+Low-level Implementation Details
+================================
+
+Although micropython delegates the generation of low-level program code and
+data to syspython, various considerations of how an eventual program might be
+structured have been used to inform the way micropython represents the details
+of a program. This document describes these considerations and indicates how
+syspython or other technologies might represent a working program.
+
+Objects and Structures
+======================
+
+As well as references, micropython needs to have actual objects to refer to.
+Since classes, functions and instances are all objects, it is desirable that
+certain common features and operations are supported in the same way for all
+of these things.
To permit this, a common data structure format is used.
+
+    Header....................................................  Attributes.................
+
+    Identifier  Identifier  Address     Identifier  Size        Object      ...
+
+    0           1           2           3           4           5           6
+    classcode   attrcode/   invocation  funccode    size        attribute   ...
+                instance    reference                           reference
+                status
+
+Classcode
+---------
+
+Used in attribute lookup.
+
+Here, the classcode refers to the attribute lookup table for the object (as
+described in concepts.txt). Classes and instances share the same classcode,
+and their structures reflect this. Functions all belong to the same type and
+thus employ the classcode for the function built-in type, whereas modules have
+distinct types since they must support different sets of attributes.
+
+Attrcode
+--------
+
+Used to test instances for membership of classes (or descendants of classes).
+
+Since, in traditional Python, classes are only ever instances of some generic
+built-in type, support for testing such a relationship directly has been
+removed and the attrcode is not specified for classes: the presence of an
+attrcode indicates that a given object is an instance. In addition, support
+has also been removed for testing modules in the same way, meaning that the
+attrcode is also not specified for modules.
+
+See the "Testing Instance Compatibility with Classes (Attrcode)" section below
+for details of attrcodes.
+
+Invocation Reference
+--------------------
+
+Used when an object is called.
+
+This is the address of the code to be executed when an invocation is performed
+on the object.
+
+Funccode
+--------
+
+Used to look up argument positions by name.
+
+The strategy with keyword arguments in micropython is to attempt to position
+such arguments in the invocation frame as it is being constructed.
+
+See the "Parameters and Lookups" section for more information.
+
+Size
+----
+
+Used to indicate the size of an object including attributes.
+
+Attributes
+----------
+
+For classes, modules and instances, the attributes in the structure correspond
+to the attributes of each kind of object. For functions, however, the
+attributes in the structure correspond to the default arguments for each
+function, if any.
+
+Structure Types
+---------------
+
+Class C:
+
+    0           1           2           3           4           5           6
+    classcode   (unused)    __new__     funccode    size        attribute   ...
+    for C                   reference   for                     reference
+                                        instantiator
+
+Instance of C:
+
+    0           1           2           3           4           5           6
+    classcode   attrcode    C.__call__  funccode    size        attribute   ...
+    for C       for C       reference   for                     reference
+                            (if exists) C.__call__
+
+Function f:
+
+    0           1           2           3           4           5           6
+    classcode   attrcode    code        funccode    size        attribute   ...
+    for         for         reference                           (default)
+    function    function                                        reference
+
+Module m:
+
+    0           1           2           3           4           5           6
+    classcode   attrcode    (unused)    (unused)    (unused)    attribute   ...
+    for m       for m                                           (global)
+                                                                reference
+
+The __class__ Attribute
+-----------------------
+
+All objects should support the __class__ attribute, and in most cases this is
+done using the object table, yielding a common address for all instances of a
+given class.
+
+Function: refers to the function class
+Instance: refers to the class instantiated to make the object
+
+The object table cannot support two definitions simultaneously for both
+instances and their classes. Consequently, __class__ access on classes must be
+tested for and a special result returned.
+
+Class: refers to the type class (type.__class__ also refers to the type class)
+
+For convenience, the first attribute of a class will be the common __class__
+attribute for all its instances. As noted above, direct access to this
+attribute will not be possible for classes, and a constant result will be
+returned instead.
+
+Lists and Tuples
+----------------
+
+The built-in list and tuple sequences employ variable length structures using
+the attribute locations to store their elements, where each element is a
+reference to a separately stored object.
+
+Testing Instance Compatibility with Classes (Attrcode)
+------------------------------------------------------
+
+Although it would be possible to have a data structure mapping classes to
+compatible classes, such as a matrix indicating the subclasses (or
+superclasses) of each class, the need to retain the key to such a data
+structure for each class might introduce a noticeable overhead.
+
+Instead of having a separate structure, descendant classes of each class are
+inserted as special attributes into the object table. This requires an extra
+key to be retained, since each class must provide its own attribute code such
+that upon an instance/class compatibility test, the code may be obtained and
+used in the object table.
+
+Invocation and Code References
+------------------------------
+
+Modules: there is no meaningful invocation reference since modules cannot be
+explicitly called.
+
+Functions: a simple code reference is employed pointing to code implementing
+the function. Note that the function locals are completely distinct from this
+structure and are not comparable to attributes. Instead, attributes are
+reserved for default parameter values, although they do not appear in the
+object table described above, appearing instead in a separate parameter table
+described in concepts.txt.
+
+Classes: given that classes must be invoked in order to create instances, a
+reference must be provided in class structures. However, this reference does
+not point directly at the __init__ method of the class. Instead, the
+referenced code belongs to a special initialiser function, __new__, consisting
+of the following instructions:
+
+    create instance for C
+    call C.__init__(instance, ...)
+    return instance
+
+Instances: each instance employs a reference to any __call__ method defined in
+the class hierarchy for the instance, thus maintaining its callable nature.
+
+Both classes and modules may contain code in their definitions - the former in
+the "body" of the class, potentially defining attributes, and the latter as
+the "top-level" code in the module, potentially defining attributes/globals -
+but this code is not associated with any invocation target. It is thus
+generated in order of appearance and is not referenced externally.
+
+Invocation Operation
+--------------------
+
+Consequently, regardless of the object an invocation is always done as
+follows:
+
+    get invocation reference from the header
+    jump to reference
+
+Additional preparation is necessary before the above code: positional
+arguments must be saved in the invocation frame, and keyword arguments must be
+resolved and saved to the appropriate position in the invocation frame.
+
+See invocation.txt for details.
+
+Instantiation
+=============
+
+When instantiating classes, memory must be reserved for the header of the
+resulting instance, along with locations for the attributes of the instance.
+Since the instance header contains data common to all instances of a class, a
+template header is copied to the start of the newly reserved memory region.
+
+List and Tuple Representations
+==============================
+
+Since tuples have a fixed size, the representation of a tuple instance is
+merely a header describing the size of the entire object, together with a
+sequence of references to the object "stored" at each position in the
+structure. Such references consist of the usual context and reference pair.
+
+Lists, however, have a variable size and must be accessible via an unchanging
+location even as more memory is allocated elsewhere to accommodate the
+contents of the list.
Consequently, the representation must resemble the
+following:
+
+    Structure header for list (size == header plus special attribute)
+    Special attribute referencing the underlying sequence
+
+The underlying sequence has a fixed size, like a tuple, but may contain fewer
+elements than the size of the sequence permits:
+
+    Special header indicating the current size and allocated size
+    Element
+    ...             <-- current size
+    (Unused space)
+    ...             <-- allocated size
+
+This representation permits the allocation of a new sequence when space is
+exhausted in an existing sequence, with the new sequence address stored in the
+main list structure. Since access to the contents of the list must go through
+the main list structure, underlying allocation activities may take place
+without the users of a list having to be aware of such activities.
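
Note on the list representation added to lowlevel.txt: the indirection it
describes (a fixed main structure holding one reference to a replaceable,
fixed-size sequence) can be modelled in a few lines of Python. This is only an
illustrative sketch, not code from micropython or syspython. The names
Sequence, ListStructure, current_size and allocated_size are hypothetical, and
the doubling growth policy is an assumption, since the document does not say
how large a replacement sequence should be.

```python
class Sequence:
    """Fixed-size underlying sequence: a header plus element slots."""

    def __init__(self, allocated_size):
        # "Special header indicating the current size and allocated size"
        self.current_size = 0
        self.allocated_size = allocated_size
        # Slots beyond current_size model the "(Unused space)" region.
        self.elements = [None] * allocated_size


class ListStructure:
    """Main list structure: its location never changes for its users."""

    def __init__(self, allocated_size=4):
        # "Special attribute referencing the underlying sequence"
        self.sequence = Sequence(allocated_size)

    def append(self, value):
        seq = self.sequence
        if seq.current_size == seq.allocated_size:
            # Space exhausted: allocate a new sequence, copy the elements
            # and store the new sequence in the main structure.
            # Doubling is an assumed policy, not taken from the document.
            new_seq = Sequence(max(1, seq.allocated_size * 2))
            new_seq.elements[:seq.current_size] = seq.elements[:seq.current_size]
            new_seq.current_size = seq.current_size
            self.sequence = seq = new_seq
        seq.elements[seq.current_size] = value
        seq.current_size += 1

    def __len__(self):
        return self.sequence.current_size

    def __getitem__(self, index):
        # All access goes through the main structure to the current sequence.
        seq = self.sequence
        if not 0 <= index < seq.current_size:
            raise IndexError(index)
        return seq.elements[index]


lst = ListStructure(allocated_size=2)
first_sequence = lst.sequence
for i in range(5):
    lst.append(i)
print(len(lst), lst.sequence is first_sequence)
```

Because callers only ever hold the ListStructure, replacing the underlying
Sequence during append is invisible to them, which is the property the final
paragraph of the new document relies on.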