paul@15 | 1 | Namespace Definition
|
paul@15 | 2 | ====================
|
paul@15 | 3 |
|
paul@15 | 4 | Module attributes are defined either at the module level or by global
|
paul@15 | 5 | statements.
|
paul@15 | 6 |
|
paul@15 | 7 | Class attributes are defined only within class statements.
|
paul@15 | 8 |
|
paul@15 | 9 | Instance attributes are defined only by assignments to attributes of self
|
paul@15 | 10 | within __init__ methods.
|
paul@15 | 11 |
|
paul@42 | 12 | Potential Restrictions
|
paul@42 | 13 | ----------------------
|
paul@42 | 14 |
|
paul@42 | 15 | Names of classes and functions could be restricted to only refer to those
|
paul@42 | 16 | objects within the same namespace. If redefinition were to occur, or if
|
paul@42 | 17 | multiple possibilities were present, these restrictions could be moderated as
|
paul@42 | 18 | follows:
|
paul@42 | 19 |
|
paul@42 | 20 | * Classes assigned to the same name could provide the union of their
|
paul@42 | 21 | attributes. This would, however, cause a potential collision of attribute
|
paul@42 | 22 | definitions such as methods.
|
paul@42 | 23 |
|
paul@42 | 24 | * Functions, if they share compatible signatures, could share parameter list
|
paul@42 | 25 | definitions.
|
paul@42 | 26 |
|
paul@11 | 27 | Data Structures
|
paul@11 | 28 | ===============
|
paul@11 | 29 |
|
paul@45 | 30 | The fundamental "value type" is a pair of references: one pointing to the
|
paul@45 | 31 | referenced object represented by the interchangeable value; one referring to
|
paul@45 | 32 | the context of the referenced object, typically the object through which the
|
paul@45 | 33 | referenced object was acquired as an attribute.
|
paul@45 | 34 |
|
paul@45 | 35 | Value Layout
|
paul@45 | 36 | ------------
|
paul@45 | 37 |
|
paul@45 | 38 | 0           1
|
paul@45 | 39 | object      context
|
paul@45 | 40 | reference   reference
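The pairing described above can be sketched in Python. This is only an illustration: the names `Value` and `load_attr`, and the use of a dict to stand in for an object's attributes, are assumptions and not part of the design.

```python
from collections import namedtuple

# Hypothetical model of the two-slot value: slot 0 holds the object
# reference, slot 1 holds the context reference.
Value = namedtuple("Value", ["ref", "context"])

def load_attr(holder, attrs, name):
    """Fetch an attribute and record the holder as its context.

    'attrs' is an illustrative dict standing in for the holder's attributes.
    """
    return Value(ref=attrs[name], context=holder)

def method_f(self):
    return self

instance = object()
value = load_attr(instance, {"f": method_f}, "f")
```

Here `value.ref` is the function and `value.context` is the instance through which it was acquired, which is the information a later method invocation would need.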
|
paul@45 | 41 |
|
paul@45 | 42 | Objects
|
paul@45 | 43 | -------
|
paul@45 | 44 |
|
paul@11 | 45 | Since classes, functions and instances are all "objects", each must support
|
paul@11 | 46 | certain features and operations in the same way.
|
paul@11 | 47 |
|
paul@11 | 48 | The __class__ Attribute
|
paul@11 | 49 | -----------------------
|
paul@11 | 50 |
|
paul@11 | 51 | All objects support the __class__ attribute:
|
paul@11 | 52 |
|
paul@11 | 53 | Class: refers to the type class (type.__class__ also refers to the type class)
|
paul@11 | 54 | Function: refers to the function class
|
paul@11 | 55 | Instance: refers to the class instantiated to make the object
|
paul@11 | 56 |
|
paul@11 | 57 | Invocation
|
paul@11 | 58 | ----------
|
paul@11 | 59 |
|
paul@11 | 60 | The following actions need to be supported:
|
paul@11 | 61 |
|
paul@11 | 62 | Class: create instance, call __init__ with instance, return object
|
paul@11 | 63 | Function: call function body, return result
|
paul@11 | 64 | Instance: call __call__ method, return result
|
paul@11 | 65 |
|
paul@11 | 66 | Structure Layout
|
paul@11 | 67 | ----------------
|
paul@11 | 68 |
|
paul@11 | 69 | A suitable structure layout might be something like this:
|
paul@11 | 70 |
|
paul@11 | 71 | 0           1           2           3           4
|
paul@11 | 72 | classcode   invocation  __class__   attribute   ...
|
paul@11 | 73 |             reference   reference   reference
|
paul@11 | 74 |
|
paul@11 | 75 | Here, the classcode refers to the attribute lookup table for the object. Since
|
paul@11 | 76 | classes and instances share the same classcode, they might resemble the
|
paul@11 | 77 | following:
|
paul@11 | 78 |
|
paul@11 | 79 | Class C:
|
paul@11 | 80 |
|
paul@11 | 81 | 0           1           2           3           4
|
paul@11 | 82 | code for C  __new__     class type  attribute   ...
|
paul@11 | 83 |             reference   reference   reference
|
paul@11 | 84 |
|
paul@11 | 85 | Instance of C:
|
paul@11 | 86 |
|
paul@11 | 87 | 0           1           2           3           4
|
paul@11 | 88 | code for C  C.__call__  class C     attribute   ...
|
paul@11 | 89 |             reference   reference   reference
|
paul@11 | 90 |             (if exists)
|
paul@11 | 91 |
|
paul@11 | 92 | The __new__ reference would lead to code consisting of the following
|
paul@11 | 93 | instructions:
|
paul@11 | 94 |
|
paul@11 | 95 | create instance for C
|
paul@11 | 96 | call C.__init__(instance, ...)
|
paul@11 | 97 | return instance
|
paul@11 | 98 |
|
paul@11 | 99 | If C has a __call__ attribute, the invocation "slot" of C instances would
|
paul@11 | 100 | refer to the same thing as C.__call__.
|
paul@11 | 101 |
|
paul@11 | 102 | For functions, the same general layout applies:
|
paul@11 | 103 |
|
paul@11 | 104 | Function f:
|
paul@11 | 105 |
|
paul@11 | 106 | 0           1           2           3           4
|
paul@11 | 107 | code for    code        class       attribute   ...
|
paul@11 | 108 | function    reference   function    reference
|
paul@11 | 109 |                         reference
|
paul@11 | 110 |
|
paul@37 | 111 | Here, the code reference would lead to code for the function. Note that the
|
paul@37 | 112 | function locals are completely distinct from this structure and are not
|
paul@37 | 113 | comparable to attributes.
|
paul@37 | 114 |
|
paul@38 | 115 | For modules, there is no meaningful invocation reference:
|
paul@37 | 116 |
|
paul@37 | 117 | Module m:
|
paul@37 | 118 |
|
paul@37 | 119 | 0           1           2           3           4
|
paul@38 | 120 | code for m  (unused)    module type attribute   ...
|
paul@38 | 121 |                         reference   (global)
|
paul@37 | 122 |                                     reference
|
paul@11 | 123 |
|
paul@38 | 124 | Both classes and modules have code in their definitions, but this would be
|
paul@38 | 125 | generated in order and not referenced externally.
|
paul@38 | 126 |
|
paul@11 | 127 | Invocation Operation
|
paul@11 | 128 | --------------------
|
paul@11 | 129 |
|
paul@11 | 130 | Consequently, regardless of the object, an invocation is always done as
|
paul@11 | 131 | follows:
|
paul@11 | 132 |
|
paul@11 | 133 | get invocation reference (at object+1)
|
paul@11 | 134 | jump to reference
|
paul@11 | 135 |
|
paul@11 | 136 | Additional preparation is necessary before the above code: positional
|
paul@11 | 137 | arguments must be saved to the parameter stack, and keyword arguments must be
|
paul@11 | 138 | resolved and saved to the appropriate position in the parameter stack.
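A minimal sketch of this uniform invocation, in Python. The slot positions follow the structure layout above; the list-based structures, the dict placed in slot 3, and the helper names (`invoke`, `make_function`, `make_class`) are assumptions made purely for illustration.

```python
# Slot positions follow the structure layout; everything else here
# (list-based structures, helper names) is an illustrative assumption.
CLASSCODE, INVOCATION, CLASS, ATTRS = 0, 1, 2, 3

def invoke(obj, *args):
    """Get the invocation reference (at object+1) and jump to it."""
    return obj[INVOCATION](*args)

def make_function(body):
    return ["code for function", body, "function", None]

def make_class(name, init, call=None):
    cls = ["code for " + name, None, "type", None]

    def new(*args):
        # create instance, call __init__ with instance, return instance
        instance = ["code for " + name, None, cls, {}]
        if call is not None:
            # the instance's invocation slot leads to C.__call__
            instance[INVOCATION] = lambda *a: call(instance, *a)
        init(instance, *args)
        return instance

    cls[INVOCATION] = new
    return cls

def c_init(self, x):
    self[ATTRS]["x"] = x

def c_call(self, y):
    return self[ATTRS]["x"] + y

C = make_class("C", c_init, c_call)
f = make_function(lambda a, b: a * b)
c = invoke(C, 10)
```

The same `invoke` helper serves the class, the function and the instance, since each keeps its invocation target in the same slot. Argument preparation on a parameter stack is left out of this sketch.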
|
paul@11 | 139 |
|
paul@11 | 140 | Attribute Operations
|
paul@11 | 141 | --------------------
|
paul@11 | 142 |
|
paul@11 | 143 | Attribute access needs to go through the attribute lookup table.
|
paul@21 | 144 |
|
paul@21 | 145 | Instruction Evaluation Model
|
paul@21 | 146 | ============================
|
paul@21 | 147 |
|
paul@21 | 148 | Programs use a value stack where evaluated instructions may save their
|
paul@21 | 149 | results. A value stack pointer indicates the top of this stack. In addition, a
|
paul@21 | 150 | separate stack is used to record the invocation frames. All stack pointers
|
paul@21 | 151 | refer to the next address to be used by the stack, not the address of the
|
paul@21 | 152 | uppermost element.
|
paul@21 | 153 |
|
paul@21 | 154 | Frame Stack     Value Stack
|
paul@21 | 155 | -----------     -----------     Address of Callable
|
paul@21 | 156 |                                 -------------------
|
paul@21 | 157 | previous        ...
|
paul@21 | 158 | current ------> callable -----> identifier
|
paul@21 | 159 |                 arg1            reference to code
|
paul@21 | 160 |                 arg2
|
paul@21 | 161 |                 arg3
|
paul@21 | 162 |                 local4
|
paul@21 | 163 |                 local5
|
paul@21 | 164 |                 ...
|
paul@21 | 165 |
|
paul@21 | 166 | Loading local names is a matter of performing frame-relative accesses to the
|
paul@21 | 167 | value stack.
|
paul@21 | 168 |
|
paul@21 | 169 | Invocations and Argument Evaluation
|
paul@21 | 170 | -----------------------------------
|
paul@21 | 171 |
|
paul@21 | 172 | When preparing for an invocation, the caller first sets the invocation frame
|
paul@21 | 173 | pointer. Then, positional arguments are added to the stack such that the first
|
paul@21 | 174 | argument positions are filled. A number of stack locations for the remaining
|
paul@21 | 175 | arguments specified in the program are then reserved. The names of keyword
|
paul@21 | 176 | arguments are used (in the form of table index values) to consult the
|
paul@21 | 177 | parameter table and to find the location in which such arguments are to be
|
paul@21 | 178 | stored.
|
paul@21 | 179 |
|
paul@21 | 180 | fn(a, b, d=1, e=2, c=3) -> fn(a, b, c, d, e)
|
paul@21 | 181 |
|
paul@21 | 182 | Value Stack
|
paul@21 | 183 | -----------
|
paul@21 | 184 |
|
paul@21 | 185 | ...             ...             ...             ...
|
paul@21 | 186 | fn              fn              fn              fn
|
paul@21 | 187 | a               a               a               a
|
paul@21 | 188 | b               b               b               b
|
paul@21 | 189 | ___             ___             ___ -->         3
|
paul@21 | 190 | ___ -->         1               1    |          1
|
paul@21 | 191 | ___  |          ___ -->         2    |          2
|
paul@21 | 192 | 1 -----------   2 -----------   3 -----------
|
paul@21 | 193 |
|
paul@21 | 194 | Conceptually, the frame can be considered as a collection of attributes, as
|
paul@21 | 195 | seen in other kinds of structures:
|
paul@21 | 196 |
|
paul@21 | 197 | Frame for invocation of fn:
|
paul@21 | 198 |
|
paul@21 | 199 | 0           1           2           3           4           5
|
paul@21 | 200 | code        a           b           c           d           e
|
paul@21 | 201 | reference
|
paul@21 | 202 |
|
paul@21 | 203 | However, where arguments are specified positionally, such "attributes" are not
|
paul@21 | 204 | set using a comparable approach to that employed with other structures.
|
paul@21 | 205 | Keyword arguments are set using an attribute-like mechanism, though, where the
|
paul@21 | 206 | position of each argument is discovered using the parameter table.
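The frame preparation for fn(a, b, d=1, e=2, c=3) can be sketched as follows. The parameter table contents, the `EMPTY` marker and the function name `prepare_frame` are hypothetical; the design refers to table index values, whereas this sketch uses plain names as keys for readability.

```python
# Hypothetical parameter table: maps each parameter name of fn to its
# position in the frame (counted after the callable itself).
PARAMETER_TABLE = {"fn": {"a": 0, "b": 1, "c": 2, "d": 3, "e": 4}}

EMPTY = object()  # marks a reserved but unfilled stack location

def prepare_frame(callable_name, positional, keywords):
    table = PARAMETER_TABLE[callable_name]
    # Positional arguments fill the first argument positions...
    frame = list(positional)
    # ...then locations for the remaining supplied arguments are reserved.
    frame.extend([EMPTY] * len(keywords))
    # Each keyword argument is stored where the parameter table says.
    for name, value in keywords:
        position = table[name]
        if frame[position] is not EMPTY:
            raise TypeError("multiple values for argument %r" % name)
        frame[position] = value
    return frame

frame = prepare_frame("fn", ["a", "b"], [("d", 1), ("e", 2), ("c", 3)])
```

The resulting frame holds the arguments in the order a, b, 3, 1, 2, matching the final column of the value stack diagram. Defaults and missing arguments are not handled in this sketch.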
|
paul@21 | 207 |
|
paul@45 | 208 | Method invocations incorporate an implicit first argument which is obtained
|
paul@45 | 209 | from the context of the method:
|
paul@45 | 210 |
|
paul@45 | 211 | method(a, b, d=1, e=2, c=3) -> method(self, a, b, c, d, e)
|
paul@45 | 212 |
|
paul@45 | 213 | Value Stack
|
paul@45 | 214 | -----------
|
paul@45 | 215 |
|
paul@45 | 216 | ...
|
paul@45 | 217 | method
|
paul@45 | 218 | context of method
|
paul@45 | 219 | a
|
paul@45 | 220 | b
|
paul@45 | 221 | 3
|
paul@45 | 222 | 1
|
paul@45 | 223 | 2
|
paul@45 | 224 |
|
paul@45 | 225 | Although it could be possible to permit any object to be provided as the first
|
paul@45 | 226 | argument, in order to optimise instance attribute access in methods, we should
|
paul@45 | 227 | seek to restrict the object type.
|
paul@45 | 228 |
|
paul@21 | 229 | Tuples, Frames and Allocation
|
paul@21 | 230 | -----------------------------
|
paul@21 | 231 |
|
paul@21 | 232 | Using the approach where arguments are treated like attributes in some kind of
|
paul@21 | 233 | structure, we could choose to allocate frames in places other than a stack.
|
paul@21 | 234 | This would produce something somewhat similar to a plain tuple object.
|
paul@23 | 235 |
|
paul@23 | 236 | Optimisations
|
paul@23 | 237 | =============
|
paul@23 | 238 |
|
paul@29 | 239 | Some optimisations around constant objects might be possible; these depend on
|
paul@29 | 240 | the following:
|
paul@29 | 241 |
|
paul@29 | 242 | * Reliable tracking of assignments: where assignment operations occur, the
|
paul@29 | 243 | target of the assignment should be determined if any hope of optimisation
|
paul@29 | 244 | is to be maintained. Where no guarantees can be made about the target of
|
paul@29 | 245 | an assignment, no assignment-related information should be written to
|
paul@29 | 246 | potential targets.
|
paul@29 | 247 |
|
paul@29 | 248 | * Objects acting as "containers" of attributes must be regarded as "safe":
|
paul@29 | 249 | where assignments are recorded as occurring on an attribute, it must be
|
paul@29 | 250 | guaranteed that no other unforeseen ways exist to assign to such
|
paul@29 | 251 | attributes.
|
paul@29 | 252 |
|
paul@29 | 253 | The discussion below presents certain rules which must be imposed to uphold
|
paul@29 | 254 | the above requirements.
|
paul@29 | 255 |
|
paul@30 | 256 | Safe Containers
|
paul@30 | 257 | ---------------
|
paul@28 | 258 |
|
paul@23 | 259 | Where attributes of modules, classes and instances are only set once and are
|
paul@23 | 260 | effectively constant, it should be possible to circumvent the attribute lookup
|
paul@28 | 261 | mechanism and to directly reference the attribute value. This technique may
|
paul@30 | 262 | only be considered applicable for the following "container" objects, subject
|
paul@30 | 263 | to the noted constraints:
|
paul@28 | 264 |
|
paul@30 | 265 | 1. For modules, "safety" is enforced by ensuring that assignments to module
|
paul@30 | 266 | attributes are only permitted within the module itself either at the
|
paul@30 | 267 | top-level or via names declared as globals. Thus, the following would not
|
paul@30 | 268 | be permitted:
|
paul@28 | 269 |
|
paul@28 | 270 | another_module.this_module.attr = value
|
paul@28 | 271 |
|
paul@29 | 272 | In the above, this_module is a reference to the current module.
|
paul@28 | 273 |
|
paul@30 | 274 | 2. For classes, "safety" is enforced by ensuring that assignments to class
|
paul@30 | 275 | attributes are only permitted within the class definition, outside
|
paul@30 | 276 | methods. This would mean that classes would be "sealed" at definition time
|
paul@30 | 277 | (like functions).
|
paul@28 | 278 |
|
paul@28 | 279 | Unlike the property of function locals that they may only sensibly be accessed
|
paul@28 | 280 | within the function in which they reside, these cases demand additional
|
paul@28 | 281 | controls or assumptions on or about access to the stored data. Meanwhile, it
|
paul@28 | 282 | would be difficult to detect eligible attributes on arbitrary instances due to
|
paul@28 | 283 | the need for some kind of type inference or abstract execution.
|
paul@28 | 284 |
|
paul@30 | 285 | Constant Attributes
|
paul@30 | 286 | -------------------
|
paul@30 | 287 |
|
paul@30 | 288 | When accessed via "safe containers", as described above, any attribute with
|
paul@30 | 289 | only one recorded assignment on it can be considered a constant attribute and
|
paul@30 | 290 | thus eligible for optimisation, the consequence of which would be the
|
paul@30 | 291 | replacement of a LoadAttrIndex instruction (which needs to look up an
|
paul@30 | 292 | attribute using the run-time details of the "container" and the compile-time
|
paul@30 | 293 | details of the attribute) with a LoadAttr instruction.
|
paul@30 | 294 |
|
paul@30 | 295 | However, some restrictions exist on assignment operations which may be
|
paul@30 | 296 | regarded to cause only one assignment in the lifetime of a program:
|
paul@30 | 297 |
|
paul@30 | 298 | 1. For module attributes, only assignments at the top-level outside loop
|
paul@30 | 299 | statements can be reliably assumed to cause only a single assignment.
|
paul@30 | 300 |
|
paul@30 | 301 | 2. For class attributes, only assignments at the top-level within class
|
paul@30 | 302 | definitions and outside loop statements can be reliably assumed to cause
|
paul@30 | 303 | only a single assignment.
|
paul@30 | 304 |
|
paul@30 | 305 | All assignments satisfying the "safe container" requirements, but not the
|
paul@30 | 306 | requirements immediately above, should each be recorded as causing at least
|
paul@30 | 307 | one assignment.
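One way to picture this bookkeeping is sketched below. The class `AssignmentTracker` and its methods are hypothetical. In particular, modelling "at least one assignment" by adding two to the count is just a convenient trick so that such attributes can never appear to have exactly one assignment.

```python
from collections import Counter

class AssignmentTracker:
    """Counts recorded assignments per (container, attribute) pair."""

    def __init__(self):
        self.counts = Counter()

    def record(self, container, name, top_level, in_loop):
        # Only top-level assignments outside loops are known to happen once.
        # Anything else counts as "at least one", modelled here by adding
        # two so that the attribute can never look constant.
        single = top_level and not in_loop
        self.counts[(container, name)] += 1 if single else 2

    def is_constant(self, container, name):
        return self.counts[(container, name)] == 1

tracker = AssignmentTracker()
tracker.record("module m", "x", top_level=True, in_loop=False)
tracker.record("module m", "y", top_level=True, in_loop=True)
tracker.record("module m", "z", top_level=True, in_loop=False)
tracker.record("module m", "z", top_level=True, in_loop=False)
```

Under these assumptions only `x` would qualify for replacement of LoadAttrIndex with LoadAttr: `y` is assigned in a loop and `z` is assigned twice.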
|
paul@28 | 308 |
|
paul@29 | 309 | Additional Controls
|
paul@29 | 310 | -------------------
|
paul@29 | 311 |
|
paul@29 | 312 | For the above cases for "container" objects, the following controls would need
|
paul@29 | 313 | to apply:
|
paul@29 | 314 |
|
paul@29 | 315 | 1. Modules would need to be immutable after initialisation. However, during
|
paul@29 | 316 | initialisation, there remains a possibility of another module attempting
|
paul@29 | 317 | to access the original module. For example, if ppp/__init__.py contained
|
paul@29 | 318 | the following...
|
paul@29 | 319 |
|
paul@29 | 320 | x = 1
|
paul@29 | 321 | import ppp.qqq
|
paul@29 | 322 | print x
|
paul@29 | 323 |
|
paul@29 | 324 | ...and if ppp/qqq.py contained the following...
|
paul@29 | 325 |
|
paul@29 | 326 | import ppp
|
paul@29 | 327 | ppp.x = 2
|
paul@29 | 328 |
|
paul@29 | 329 | ...then the value 2 would be printed. Since modules are objects which are
|
paul@29 | 330 | registered globally in a program, it would be possible to set attributes
|
paul@29 | 331 | in the above way.
|
paul@29 | 332 |
|
paul@29 | 333 | 2. Classes would need to be immutable after initialisation. However, since
|
paul@29 | 334 | classes are objects, any reference to a class after initialisation could
|
paul@29 | 335 | be used to set attributes on the class.
|
paul@29 | 336 |
|
paul@29 | 337 | Solutions:
|
paul@29 | 338 |
|
paul@29 | 339 | 1. Insist on global scope for module attribute assignments.
|
paul@29 | 340 |
|
paul@29 | 341 | 2. Insist on local scope within classes.
|
paul@29 | 342 |
|
paul@29 | 343 | Both of the above measures need to be enforced at run-time, since an arbitrary
|
paul@29 | 344 | attribute assignment could be attempted on any kind of object, yet to uphold
|
paul@29 | 345 | the properties of "safe containers", attempts to change attributes of such
|
paul@29 | 346 | objects should be denied. Since foreseen attribute assignment operations have
|
paul@29 | 347 | certain properties detectable at compile-time, it could be appropriate to
|
paul@29 | 348 | generate special instructions (or modified instructions) during the
|
paul@29 | 349 | initialisation of modules and classes for such foreseen assignments, whilst
|
paul@29 | 350 | employing normal attribute assignment operations in all other cases. Indeed,
|
paul@29 | 351 | the StoreAttr instruction, which is used to set attributes in "safe
|
paul@29 | 352 | containers" would be used exclusively for this purpose; the StoreAttrIndex
|
paul@29 | 353 | instruction would be used exclusively for all other attribute assignments.
|
paul@29 | 354 |
|
paul@43 | 355 | To ensure the "sealing" of modules and classes, entries in the attribute
|
paul@43 | 356 | lookup table would encode whether a class or module is being accessed, so
|
paul@43 | 357 | that the StoreAttrIndex instruction could reject such accesses.
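The division of labour between the two instructions might look like the sketch below. The `Container` class, its `kind` field (standing in for the information encoded in the attribute lookup table) and the `SealedError` exception are assumptions for illustration only.

```python
class SealedError(Exception):
    """Raised when a general assignment targets a sealed container."""

class Container:
    def __init__(self, kind):
        self.kind = kind      # "module", "class" or "instance"
        self.attrs = {}

def store_attr(container, name, value):
    """Foreseen assignment, generated for module/class initialisation."""
    container.attrs[name] = value

def store_attr_index(container, name, value):
    """General assignment: rejected for classes and modules."""
    if container.kind in ("module", "class"):
        raise SealedError("cannot set %r on a %s" % (name, container.kind))
    container.attrs[name] = value

module = Container("module")
instance = Container("instance")
store_attr(module, "x", 1)            # top-level assignment in the module
store_attr_index(instance, "y", 2)    # ordinary instance attribute
try:
    store_attr_index(module, "x", 2)  # e.g. ppp.x = 2 from another module
    rejected = False
except SealedError:
    rejected = True
```

With this split, the assignment attempted from ppp/qqq.py in the earlier example would be denied at run-time, leaving the module attribute at its original value.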
|
paul@43 | 358 |
|
paul@28 | 359 | Constant Attribute Values
|
paul@28 | 360 | -------------------------
|
paul@28 | 361 |
|
paul@29 | 362 | Where an attribute value is itself regarded as constant, is a "safe container"
|
paul@29 | 363 | and is used in an operation accessing its own attributes, the value can be
|
paul@29 | 364 | directly inspected for optimisations or employed in the generated code. For
|
paul@29 | 365 | the attribute values themselves, only objects of a constant nature may be
|
paul@28 | 366 | considered suitable for this particular optimisation:
|
paul@28 | 367 |
|
paul@28 | 368 | * Classes
|
paul@28 | 369 | * Modules
|
paul@28 | 370 | * Instances defined as constant literals
|
paul@28 | 371 |
|
paul@28 | 372 | This is because arbitrary objects (such as most instances) have no
|
paul@28 | 373 | well-defined form before run-time and cannot be investigated further at
|
paul@28 | 374 | compile-time or have a representation inserted into the generated code.
|
paul@29 | 375 |
|
paul@29 | 376 | Class Attributes and Access via Instances
|
paul@29 | 377 | -----------------------------------------
|
paul@29 | 378 |
|
paul@29 | 379 | Unlike module attributes, class attributes can be accessed in a number of
|
paul@29 | 380 | different ways:
|
paul@29 | 381 |
|
paul@29 | 382 | * Using the class itself:
|
paul@29 | 383 |
|
paul@29 | 384 | C.x = 123
|
paul@29 | 385 | cls = C; cls.x = 234
|
paul@29 | 386 |
|
paul@29 | 387 | * Using a subclass of the class (for reading attributes):
|
paul@29 | 388 |
|
paul@29 | 389 | class D(C):
|
paul@29 | 390 |     pass
|
paul@29 | 391 | D.x # setting D.x would populate D, not C
|
paul@29 | 392 |
|
paul@29 | 393 | * Using instances of the class or a subclass of the class (for reading
|
paul@29 | 394 | attributes):
|
paul@29 | 395 |
|
paul@29 | 396 | c = C()
|
paul@29 | 397 | c.x # setting c.x would populate c, not C
|
paul@29 | 398 |
|
paul@29 | 399 | Since assignments are only achieved using direct references to the class, and
|
paul@29 | 400 | since class attributes should be defined only within the class initialisation
|
paul@29 | 401 | process, the properties of class attributes should be consistent with those
|
paul@29 | 402 | desired.
|
paul@29 | 403 |
|
paul@29 | 404 | Method Access via Instances
|
paul@29 | 405 | ---------------------------
|
paul@29 | 406 |
|
paul@29 | 407 | It is desirable to optimise method access, even though most method calls are
|
paul@29 | 408 | likely to occur via instances. It is possible, given the properties of methods
|
paul@29 | 409 | as class attributes, to investigate the kind of instance that the self
|
paul@29 | 410 | parameter/local refers to within each method: it should be an instance either
|
paul@29 | 411 | of the class in which the method is defined or a compatible class, although
|
paul@29 | 412 | situations exist where this might not be the case:
|
paul@29 | 413 |
|
paul@29 | 414 | * Explicit invocation of a method:
|
paul@29 | 415 |
|
paul@29 | 416 | d = D() # D is not related to C
|
paul@29 | 417 | C.f(d) # calling f(self) in C
|
paul@29 | 418 |
|
paul@30 | 419 | If blatant usage of incompatible instances were somehow disallowed, it would
|
paul@30 | 420 | still be necessary to investigate the properties of an instance's class and
|
paul@30 | 421 | its relationship with other classes. Consider the following example:
|
paul@30 | 422 |
|
paul@30 | 423 | class A:
|
paul@30 | 424 |     def f(self): ...
|
paul@30 | 425 |
|
paul@30 | 426 | class B:
|
paul@30 | 427 |     def f(self): ...
|
paul@30 | 428 |     def g(self):
|
paul@30 | 429 |         self.f()
|
paul@30 | 430 |
|
paul@30 | 431 | class C(A, B):
|
paul@30 | 432 |     pass
|
paul@30 | 433 |
|
paul@30 | 434 | Here, instances of B passed into the method B.g could be assumed to provide
|
paul@30 | 435 | access to B.f when self.f is resolved at compile-time. However, instances of C
|
paul@30 | 436 | passed into B.g would instead provide access to A.f when self.f is resolved at
|
paul@30 | 437 | compile-time (since the method resolution order is C, A, B instead of just B).
|
paul@30 | 438 |
|
paul@30 | 439 | One solution might be to specialise methods for each instance type, but this
|
paul@30 | 440 | could be costly. Another less ambitious solution might only involve the
|
paul@30 | 441 | optimisation of such internal method calls if an unambiguous target can be
|
paul@30 | 442 | resolved.
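The ambiguity can be observed directly in standard Python 3, which is used here as a stand-in for the language being described. The helper `unambiguous_target` is hypothetical: it sketches the less ambitious approach by reporting a target only when a class and all its known subclasses agree on it.

```python
class A:
    def f(self):
        return "A.f"

class B:
    def f(self):
        return "B.f"
    def g(self):
        return self.f()

class C(A, B):
    pass

# Python 3 appends the implicit base class 'object' to the order.
mro_names = [cls.__name__ for cls in C.__mro__]

def unambiguous_target(cls, name, known_classes):
    """Return the single implementation of 'name' seen by cls and its
    known subclasses, or None if they disagree."""
    targets = {getattr(k, name) for k in known_classes if issubclass(k, cls)}
    return targets.pop() if len(targets) == 1 else None

result_for_b = B().g()
result_for_c = C().g()
target_in_b = unambiguous_target(B, "f", [A, B, C])
target_in_a = unambiguous_target(A, "f", [A, B, C])
```

Calling g on an instance of B reaches B.f, while calling it on an instance of C reaches A.f, so self.f inside B.g has no single target and `target_in_b` is None. Within A, by contrast, every known class resolves f to A.f, so a call there could be optimised.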
|
paul@30 | 443 |
|
paul@29 | 444 | Optimising Function Invocations
|
paul@29 | 445 | -------------------------------
|
paul@29 | 446 |
|
paul@29 | 447 | Where an attribute value is itself regarded as constant and is a function,
|
paul@29 | 448 | knowledge about the parameters of the function can be employed to optimise the
|
paul@29 | 449 | preparation of the invocation frame.
|