1.1 --- a/docs/wiki/Toolchain Sat Jul 21 23:19:26 2018 +0200
1.2 +++ b/docs/wiki/Toolchain Fri Aug 17 11:41:28 2018 +0200
1.3 @@ -1,34 +1,48 @@
1.4 = Toolchain =
1.5
1.6 -The toolchain implements the process of analysing Lichen source files, compiling information about the structures and routines expressed in each program, and generating output for further processing that can produce an executable program.
1.7 +The toolchain implements the process of analysing Lichen source files,
1.8 +compiling information about the structures and routines expressed in each
1.9 +program, and generating output for further processing that can produce an
1.10 +executable program.
1.11
1.12 <<TableOfContents(2,3)>>
1.13
1.14 == Compiling Programs ==
1.15
1.16 -The principal interface to the toolchain is the `lplc` command, run on source files as in the following example:
1.17 +The principal interface to the toolchain is the `lplc` command, run on source
1.18 +files as in the following example:
1.19
1.20 {{{
1.21 lplc tests/unicode.py
1.22 }}}
1.23
1.24 -There is no need to specify all the files that might be required by the complete program. Instead, the toolchain identifies files in the program by searching its module search path. This can be configured using the `LICHENPATH` environment variable and the `-E` option.
1.25 +There is no need to specify all the files that might be required by the
1.26 +complete program. Instead, the toolchain identifies files in the program by
1.27 +searching its module search path. This can be configured using the
1.28 +`LICHENPATH` environment variable and the `-E` option.
1.29
1.30 -Various [[../Prerequisites|prerequisites]] are needed for the toolchain to work properly. By specifying the `-c` option, the specified program will be translated to a C programming language representation but not built, avoiding the need for some development tools to be installed if this is desirable.
1.31 +Various [[../Prerequisites|prerequisites]] are needed for the toolchain to
1.32 +work properly. When the `-c` option is specified, the program is translated
1.33 +to a C programming language representation but not built, which avoids the
1.34 +need for some development tools to be installed.
1.35
1.36 -The default output file from a successful compilation is a file called `_main`, but this can be overridden using the `-o` option. For example:
1.37 +The default output file from a successful compilation is a file called
1.38 +`_main`, but this can be overridden using the `-o` option. For example:
1.39
1.40 {{{
1.41 lplc -o unicode tests/unicode.py
1.42 }}}
1.43
1.44 -The complete set of options can be viewed by specifying the `--help` option, and a manual page is also provided in the `docs` directory of the source distribution:
1.45 +The complete set of options can be viewed by specifying the `--help` option,
1.46 +and a manual page is also provided in the `docs` directory of the source
1.47 +distribution:
1.48
1.49 {{{
1.50 man -l docs/lplc.1
1.51 }}}
1.52
1.53 -This page may already be installed if the software was provided as a package as part of an operating system distribution:
1.54 +This page may already be installed if the software was provided as a package
1.55 +as part of an operating system distribution:
1.56
1.57 {{{
1.58 man lplc
1.59 @@ -36,18 +50,68 @@
1.60
1.61 == Toolchain Implementation ==
1.62
1.63 -The toolchain itself is currently written in Python, but it is envisaged that it will eventually be written in the Lichen language, hopefully needing only minor modifications so that it may be able to accept its own source files as input and ultimately produce a representation of itself as an executable program. Since the Lichen language is based on Python, it is convenient to use existing Python implementations to access libraries that support the parsing of Python source files into useful representations.
1.64 +The toolchain itself is currently written in Python, but it is envisaged that
1.65 +it will eventually be written in the Lichen language, hopefully needing only
1.66 +minor modifications so that it may be able to accept its own source files as
1.67 +input and ultimately produce a representation of itself as an executable
1.68 +program. Since the Lichen language is based on Python, it is convenient to use
1.69 +existing Python implementations to access libraries that support the parsing
1.70 +of Python source files into useful representations.
1.71
1.72 -The Python standard library provides two particularly useful modules or packages of relevance: the `compiler` package and the `parser` module; `parser` is employed by `compiler` to decode source text, whereas `compiler` takes the concrete syntax tree representation from `parser` and produces an abstract syntax tree (AST) which is particularly helpful to software of the nature described here. (Contrary to impressions that [[http://eli.thegreenplace.net/2009/11/28/python-internals-working-with-python-asts/|some articles]] might give, the `ast` module available in Python 2.5 and later was not the first module to offer AST representations of Python programs in Python, nor was it even the first such module in the standard library.)
1.73 +The Python standard library provides two particularly useful modules or
1.74 +packages of relevance: the `compiler` package and the `parser` module;
1.75 +`parser` is employed by `compiler` to decode source text, whereas `compiler`
1.76 +takes the concrete syntax tree representation from `parser` and produces an
1.77 +abstract syntax tree (AST) which is particularly helpful to software of the
1.78 +nature described here. (Contrary to impressions that
1.79 +[[http://eli.thegreenplace.net/2009/11/28/python-internals-working-with-python-asts/|some articles]]
1.80 +might give, the `ast` module available in Python 2.5 and later was
1.81 +not the first module to offer AST representations of Python programs in
1.82 +Python, nor was it even the first such module in the standard library.)
1.83
1.84 -However, it is not desirable to have a dependency on a Python implementation, which the `parser` module effectively is (as would the `ast` module also be if it were used here), with it typically being implemented as an extension module in a non-Python language (in C for CPython, in Java for Jython, and so on). Fortunately, the !PyPy project implemented their own parsing module, `pyparser`, that is intended to be used within the !PyPy environment together with their own `ast` equivalent, but it has been possible to rework `pyparser` to produce representations that are compatible with the `compiler` package, itself being modified in various ways to achieve compatibility (and also to provide various other conveniences).
1.85 +However, it is not desirable to depend on a particular Python implementation,
1.86 +which is effectively what using the `parser` module entails (as would using
1.87 +the `ast` module), since it is typically implemented as an extension module
1.88 +in a non-Python language (in C for CPython, in Java for Jython, and so on).
1.89 +Fortunately, the !PyPy project implemented their own parsing module,
1.90 +`pyparser`, that is intended to be used within the !PyPy environment together
1.91 +with their own `ast` equivalent, but it has been possible to rework `pyparser`
1.92 +to produce representations that are compatible with the `compiler` package,
1.93 +which has itself been modified in various ways to achieve compatibility (and
1.94 +also to provide various other conveniences).
1.95
1.96 == Program Analysis ==
1.97
1.98 -With the means of inspecting source files available through a `compiler` package producing a usable representation of each file, it becomes possible to identify the different elements in each file and to collect information that may be put to use later. But before any files are inspected, it must be determined ''which'' files are to be inspected, these comprising the complete program to be analysed.
1.99 +With the means of inspecting source files available through a `compiler`
1.100 +package producing a usable representation of each file, it becomes possible to
1.101 +identify the different elements in each file and to collect information that
1.102 +may be put to use later. But before any files are inspected, it must be
1.103 +determined ''which'' files are to be inspected, these comprising the complete
1.104 +program to be analysed.
1.105
1.106 -Both Lichen and Python support the notion of a main source file (sometimes called the "script" file or the main module or `__main__`) and of imported modules and packages. The complete set of modules employed in a program is defined as those imported by the main module, then those imported by those modules, and so on. Thus, the complete set is not known without inspecting part of the program, and this set must be built incrementally until no new modules are encountered.
1.107 +Both Lichen and Python support the notion of a main source file (sometimes
1.108 +called the "script" file or the main module or `__main__`) and of imported
1.109 +modules and packages. The complete set of modules employed in a program is
1.110 +defined as those imported by the main module, then those imported by those
1.111 +modules, and so on. Thus, the complete set is not known without inspecting
1.112 +part of the program, and this set must be built incrementally until no new
1.113 +modules are encountered.
1.114
1.115 -Where Lichen and Python differ is in the handling of [[../Imports|imports]] themselves. Python [[https://docs.python.org/3/reference/import.html|employs]] an intricate mechanism that searches for modules and packages, loading modules encountered when descending into packages to retrieve specific modules. In contrast, Lichen only imports the modules that are explicitly mentioned in programs. Thus, a Lichen program will not accumulate potentially large numbers of superfluous modules.
1.116 +Where Lichen and Python differ is in the handling of [[../Imports|imports]]
1.117 +themselves. Python [[https://docs.python.org/3/reference/import.html|employs]]
1.118 +an intricate mechanism that searches for modules and packages, loading modules
1.119 +encountered when descending into packages to retrieve specific modules. In
1.120 +contrast, Lichen only imports the modules that are explicitly mentioned in
1.121 +programs. Thus, a Lichen program will not accumulate potentially large numbers
1.122 +of superfluous modules.
1.123
1.124 -With a given module identified as being part of a program, the module will then be [[../Inspection|inspected]] for the purposes of gathering useful information. Since the primary objectives are to characterise the structure of the objects in a program and to determine how such objects are used, certain kinds of program constructs will be inspected more closely than others. Note that this initial inspection activity is not concerned with the translation of program operations to other forms: such [[../Translation|translation]] will occur later; this initial inspection is purely concerned with obtaining enough information to inform such later activities, with the original program being revisited to provide the necessary detail required to translate it.
1.125 +With a given module identified as being part of a program, the module will
1.126 +then be [[../Inspection|inspected]] for the purposes of gathering useful
1.127 +information. Since the primary objectives are to characterise the structure of
1.128 +the objects in a program and to determine how such objects are used, certain
1.129 +kinds of program constructs will be inspected more closely than others. Note
1.130 +that this initial inspection activity is not concerned with the translation of
1.131 +program operations to other forms: such [[../Translation|translation]] will
1.132 +occur later; this initial inspection is purely concerned with obtaining enough
1.133 +information to inform such later activities, with the original program being
1.134 +revisited to provide the necessary detail required to translate it.