= Toolchain =

The toolchain implements the process of analysing Lichen source files,
compiling information about the structures and routines expressed in each
program, and generating output for further processing that can produce an
executable program.

<<TableOfContents(2,3)>>

== Compiling Programs ==

The principal interface to the toolchain is the `lplc` command, run on source
files as in the following example:

{{{
lplc tests/unicode.py
}}}

There is no need to specify all the files that might be required by the
complete program. Instead, the toolchain identifies files in the program by
searching its module search path. This can be configured using the
`LICHENPATH` environment variable and the `-E` option.

Various [[../Prerequisites|prerequisites]] are needed for the toolchain to
work properly. If the `-c` option is specified, the program will be translated
to a C programming language representation but not built, which avoids the
need for some development tools to be installed.

The default output file from a successful compilation is a file called
`_main`, but this can be overridden using the `-o` option.
For example:

{{{
lplc -o unicode tests/unicode.py
}}}

The complete set of options can be viewed by specifying the `--help` option,
and a manual page is also provided in the `docs` directory of the source
distribution:

{{{
man -l docs/lplc.1
}}}

This page may already be installed if the software was provided as a package
as part of an operating system distribution:

{{{
man lplc
}}}

== Toolchain Implementation ==

The toolchain itself is currently written in Python, but it is envisaged that
it will eventually be written in the Lichen language, hopefully needing only
minor modifications so that it may be able to accept its own source files as
input and ultimately produce a representation of itself as an executable
program. Since the Lichen language is based on Python, it is convenient to use
existing Python implementations to access libraries that support the parsing
of Python source files into useful representations.

The Python standard library provides two particularly useful modules or
packages of relevance: the `compiler` package and the `parser` module;
`parser` is employed by `compiler` to decode source text, whereas `compiler`
takes the concrete syntax tree representation from `parser` and produces an
abstract syntax tree (AST) which is particularly helpful to software of the
nature described here. (Contrary to impressions that
[[http://eli.thegreenplace.net/2009/11/28/python-internals-working-with-python-asts/|some articles]]
might give, the `ast` module available in Python 2.5 and later was
not the first module to offer AST representations of Python programs in
Python, nor was it even the first such module in the standard library.)
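
To illustrate the kind of information an abstract syntax tree exposes, the following minimal sketch lists the classes and functions found in a small program. It uses the `ast` module of current Python versions purely as a stand-in, since the `compiler` package is not available in Python 3; it is not how the toolchain itself obtains its representation, and the names `SOURCE` and `summarise` are hypothetical.

```python
import ast

# Hypothetical example program, held in a string for convenience.
SOURCE = """
class C:
    def f(self, x):
        return x + 1
"""

def summarise(source):
    """Return (node type, name) pairs for classes and functions in source."""
    tree = ast.parse(source)
    found = []
    # ast.walk visits every node in the tree, starting from the module node.
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef)):
            found.append((type(node).__name__, node.name))
    return found

print(summarise(SOURCE))
```

Each node in the tree corresponds to a program construct, which is what makes such a representation convenient for software that needs to identify structures and routines.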

However, it is not desirable to have a dependency on a Python implementation,
which the `parser` module effectively is (as would the `ast` module also be if
it were used here), with it typically being implemented as an extension module
in a non-Python language (in C for CPython, in Java for Jython, and so on).
Fortunately, the !PyPy project implemented their own parsing module,
`pyparser`, that is intended to be used within the !PyPy environment together
with their own `ast` equivalent, but it has been possible to rework `pyparser`
to produce representations that are compatible with the `compiler` package,
itself being modified in various ways to achieve compatibility (and also to
provide various other conveniences).

== Program Analysis ==

With the means of inspecting source files available through a `compiler`
package producing a usable representation of each file, it becomes possible to
identify the different elements in each file and to collect information that
may be put to use later. But before any files are inspected, it must be
determined ''which'' files are to be inspected, these comprising the complete
program to be analysed.

Both Lichen and Python support the notion of a main source file (sometimes
called the "script" file or the main module or `__main__`) and of imported
modules and packages. The complete set of modules employed in a program is
defined as those imported by the main module, then those imported by those
modules, and so on. Thus, the complete set is not known without inspecting
part of the program, and this set must be built incrementally until no new
modules are encountered.

Where Lichen and Python differ is in the handling of [[../Imports|imports]]
themselves.
Python [[https://docs.python.org/3/reference/import.html|employs]]
an intricate mechanism that searches for modules and packages, loading modules
encountered when descending into packages to retrieve specific modules. In
contrast, Lichen only imports the modules that are explicitly mentioned in
programs. Thus, a Lichen program will not accumulate potentially large numbers
of superfluous modules.

With a given module identified as being part of a program, the module will
then be [[../Inspection|inspected]] for the purposes of gathering useful
information. Since the primary objectives are to characterise the structure of
the objects in a program and to determine how such objects are used, certain
kinds of program constructs will be inspected more closely than others. Note
that this initial inspection activity is not concerned with the translation of
program operations to other forms: such [[../Translation|translation]] will
occur later; this initial inspection is purely concerned with obtaining enough
information to inform such later activities, with the original program being
revisited to provide the necessary detail required to translate it.
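
The incremental construction of the module set described earlier in this section can be sketched as a simple worklist loop: start with the main module, record each explicitly mentioned module not seen before, and stop when nothing new turns up. This is only an illustration under stated assumptions, not the toolchain's own code: modules are held in a hypothetical in-memory dictionary (`SOURCES`) rather than located on a search path, the `ast` module of current Python versions is used merely as a convenient way of finding import statements, and the function names are invented for the example.

```python
import ast

# Hypothetical stand-in for files on the module search path:
# module name -> source text.
SOURCES = {
    "__main__": "import a\nfrom b import thing\n",
    "a": "import c\n",
    "b": "import a\n",
    "c": "",
    "unused": "import a\n",
}

def imported_names(source):
    """Return the module names explicitly mentioned by import statements."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Relative imports without a module name are skipped here.
            names.append(node.module)
    return names

def discover(main, sources):
    """Collect the modules reachable from main, in order of discovery."""
    found = [main]
    queue = [main]
    while queue:
        current = queue.pop(0)
        for name in imported_names(sources[current]):
            # Names with no known source are ignored in this sketch.
            if name in sources and name not in found:
                found.append(name)
                queue.append(name)
    return found

print(discover("__main__", SOURCES))
```

In this example, `unused` never becomes part of the program because no reachable module mentions it, and the loop terminates once every queued module has been examined without yielding new names.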