Lichen

docs/wiki/Toolchain

861:f745d151a441
2018-08-17 Paul Boddie Wrapped text in the documentation for more convenient "offline" editing.
= Toolchain =

The toolchain implements the process of analysing Lichen source files,
compiling information about the structures and routines expressed in each
program, and generating output for further processing that can produce an
executable program.

<<TableOfContents(2,3)>>

== Compiling Programs ==

The principal interface to the toolchain is the `lplc` command, run on source
files as in the following example:

{{{
lplc tests/unicode.py
}}}

There is no need to specify all the files that might be required by the
complete program. Instead, the toolchain identifies files in the program by
searching its module search path. This can be configured using the
`LICHENPATH` environment variable and the `-E` option.

Various [[../Prerequisites|prerequisites]] are needed for the toolchain to
work properly. When the `-c` option is specified, the program will be
translated to a C programming language representation but not built, which
avoids the need for some development tools to be installed.

The default output file from a successful compilation is a file called
`_main`, but this can be overridden using the `-o` option.
For example:

{{{
lplc -o unicode tests/unicode.py
}}}

The complete set of options can be viewed by specifying the `--help` option,
and a manual page is also provided in the `docs` directory of the source
distribution:

{{{
man -l docs/lplc.1
}}}

This page may already be installed if the software was provided as a package
as part of an operating system distribution:

{{{
man lplc
}}}

== Toolchain Implementation ==

The toolchain itself is currently written in Python, but it is envisaged that
it will eventually be written in the Lichen language, hopefully needing only
minor modifications so that it may be able to accept its own source files as
input and ultimately produce a representation of itself as an executable
program. Since the Lichen language is based on Python, it is convenient to use
existing Python implementations to access libraries that support the parsing
of Python source files into useful representations.

The Python standard library provides two particularly useful modules or
packages of relevance: the `compiler` package and the `parser` module;
`parser` is employed by `compiler` to decode source text, whereas `compiler`
takes the concrete syntax tree representation from `parser` and produces an
abstract syntax tree (AST) which is particularly helpful to software of the
nature described here. (Contrary to impressions that
[[http://eli.thegreenplace.net/2009/11/28/python-internals-working-with-python-asts/|some articles]]
might give, the `ast` module available in Python 2.5 and later was
not the first module to offer AST representations of Python programs in
Python, nor was it even the first such module in the standard library.)
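
To illustrate the kind of abstract syntax tree referred to above, the
following minimal sketch uses the `ast` module from the Python 3 standard
library as a stand-in. This is an assumption made purely for illustration: the
toolchain relies on the `compiler` package, whose node classes and interfaces
differ, and the function name `summarise_nodes` is hypothetical.

```python
import ast

def summarise_nodes(source):
    """Return the names of the AST node types found in the given source text.

    Assumption: Python 3's ast module stands in for the compiler package
    discussed in the text; the node names produced here are those of ast.
    """
    tree = ast.parse(source)
    # ast.walk yields the root node first, followed by all of its descendants.
    return [type(node).__name__ for node in ast.walk(tree)]

if __name__ == "__main__":
    for name in summarise_nodes("x = 1 + 2"):
        print(name)
```

Each statement and expression in the source text appears as a distinct node
type, which is what makes such a representation convenient for later analysis.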

However, it is not desirable to have a dependency on a Python implementation,
which the `parser` module effectively is (as would the `ast` module also be if
it were used here), with it typically being implemented as an extension module
in a non-Python language (in C for CPython, in Java for Jython, and so on).
Fortunately, the !PyPy project implemented their own parsing module,
`pyparser`, that is intended to be used within the !PyPy environment together
with their own `ast` equivalent, but it has been possible to rework `pyparser`
to produce representations that are compatible with the `compiler` package,
itself being modified in various ways to achieve compatibility (and also to
provide various other conveniences).

== Program Analysis ==

With the means of inspecting source files available through a `compiler`
package producing a usable representation of each file, it becomes possible to
identify the different elements in each file and to collect information that
may be put to use later. But before any files are inspected, it must be
determined ''which'' files are to be inspected, these comprising the complete
program to be analysed.

Both Lichen and Python support the notion of a main source file (sometimes
called the "script" file or the main module or `__main__`) and of imported
modules and packages. The complete set of modules employed in a program is
defined as those imported by the main module, then those imported by those
modules, and so on. Thus, the complete set is not known without inspecting
part of the program, and this set must be built incrementally until no new
modules are encountered.

Where Lichen and Python differ is in the handling of [[../Imports|imports]]
themselves.
Python [[https://docs.python.org/3/reference/import.html|employs]]
an intricate mechanism that searches for modules and packages, loading modules
encountered when descending into packages to retrieve specific modules. In
contrast, Lichen only imports the modules that are explicitly mentioned in
programs. Thus, a Lichen program will not accumulate potentially large numbers
of superfluous modules.

With a given module identified as being part of a program, the module will
then be [[../Inspection|inspected]] for the purposes of gathering useful
information. Since the primary objectives are to characterise the structure of
the objects in a program and to determine how such objects are used, certain
kinds of program constructs will be inspected more closely than others. Note
that this initial inspection activity is not concerned with the translation of
program operations to other forms: such [[../Translation|translation]] will
occur later; this initial inspection is purely concerned with obtaining enough
information to inform such later activities, with the original program being
revisited to provide the necessary detail required to translate it.
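
The incremental discovery of modules described in this section can be sketched
as a worklist algorithm. This is a minimal illustration under stated
assumptions and not the toolchain's own code: it uses the `ast` module of
Python 3 instead of the `compiler` package, it considers only absolute
imports, it silently skips modules that cannot be found, and the names
`find_module`, `imported_names` and `discover_modules` are hypothetical. In
keeping with the approach described for Lichen, only modules named explicitly
in import statements are added, and parent packages are not loaded implicitly.

```python
import ast
import os

def find_module(name, search_path):
    """Locate the source file for a dotted module name, or return None."""
    parts = name.split(".")
    for directory in search_path:
        base = os.path.join(directory, *parts)
        # A module may be a plain file or a package with an __init__.py file.
        for candidate in (base + ".py", os.path.join(base, "__init__.py")):
            if os.path.isfile(candidate):
                return candidate
    return None

def imported_names(filename):
    """Return the module names explicitly mentioned in import statements."""
    with open(filename) as f:
        tree = ast.parse(f.read(), filename)
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Simplification: only the module named after "from" is recorded.
            names.add(node.module)
    return names

def discover_modules(main_filename, search_path):
    """Build the set of program modules until no new modules are encountered."""
    found = {"__main__": main_filename}
    pending = [main_filename]
    while pending:
        for name in sorted(imported_names(pending.pop())):
            if name in found:
                continue  # already known, which also guards against cycles
            filename = find_module(name, search_path)
            if filename is not None:
                found[name] = filename
                pending.append(filename)
    return found
```

Calling `discover_modules` with the path of a main source file and a list of
search directories returns a mapping from module names to filenames. Modules
present in the search path but never imported do not appear in the result.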