Context Navigation


Changeset 174


Ignore:
Timestamp:
Jan 23, 2008, 12:50:49 AM (18 years ago)
Author:
neil.c.c.brown
Message:

Added sections on code organisation and monads

File:
1 edited

Legend:

Unmodified
Added
Removed
  • docs/trunk/hacking-guide/tock-intro.tex

    r173 r174
    2929\section{Finding the Right Place}
    3030
    31%TODO explain code organisation
    31Tock's modules are currently arranged into four directories. They are:
    32
    33\begin{enumerate}
    34\item ``common'' -- All modules that are used by many/most parts of the program.
    35\item ``frontends'' -- All modules relating to lexing, parsing, preprocessing occam and Rain, as well
    36as early steps in compilation like resolving names, checking types, etc.
    37\item ``transformations'' -- All modules relating to transforming the tree either for simplicity,
    38efficiency or simply to remove elements (e.g. parallel assignment) not supported by the backends.
    39\item ``backends'' -- All modules related to the final step of turning AST into actual code.
    40\end{enumerate}
    41
    42Tests are in the same directory as the thing they test. The separation is by no means hard-and-fast,
    43or perfect, but it's better than nothing.
    44
    45The directories should provide a quick idea of where to find what you are interested in. Data types
    46and functions common to the whole compiler, such as the AST definition and type helper functions
    47are in ``common''. The other parts of the compiler are in the obvious order (frontends, transformations,
    48backends).
    49
    50The \lstinline|Main| module in the main tock directory is the actual module for the tock executable.
    51It merely deals with the command-line options and strings together the various passes according to
    52the options given.
    53
    54If you want to add a new frontend or backend, then add a new command-line option (look in the modules
    55\lstinline|Main| and \lstinline|CompState|) for it and handle the option accordingly in the \lstinline|Main|
    56module. To add a new pass, add it to the appropriate place in the list in the \lstinline|PassList|
    57module, or add it to the appropriate pass-item already listed there (e.g. \lstinline|simplifyTypes|).
    3258
    3359\section{Understand the Existing Code}
    3460
    35%TODO give a brief explanation of the monadic code
    61There are (unfortunately, but realistically) several barriers to understanding the existing Tock code:
    62
    63\begin{enumerate}
    64\item It's written in Haskell.
    65\item It makes heavy use of monads.
    66\item It uses generics.
    67\item You need to understand the AST well.
    68\item The C and C++ backends are quite dense and tricky.
    69\end{enumerate}
    70
    71The last point is somewhat unavoidable, without an inspired re-write. Knowledge of occam will help
    72a lot with understanding the AST, except perhaps for the \lstinline|Structured| item (see below).
    73Haskell knowledge can be solved with a book or two (or other web resources); monads are also covered
    74in a section below.
    75
    76\subsection{A.Structured}
    77
    78%TODO
    79
    80\subsection{Monadic Code}
    81
    82A monad is a particular sort of type in Haskell. The most common (and in a way, the most complex) Haskell
    83monad is \lstinline|IO|. As an example, here is the type of the \lstinline|readFile| function:
    84
    85\begin{lstlisting}
    86readFile :: FilePath -> IO String
    87\end{lstlisting}
    88
    89The return value here is \lstinline|IO String|. You should read this as ``an IO action that yields a value
    90of type String''. One mistake that newcomers to monads often make is to immediately try to work out
    91how to get the String value out of the IO monad. This is actually the wrong way of looking at it; you
    92should instead look to lift all your other code into the IO monad.
    93
    94Note also that the function is \textit{not} of type: \lstinline|IO FilePath -> IO String|. This would be
    95read as taking an IO action that yields a FilePath, and giving back an IO action that yields a String.
    96This type of function is rarely seen; usually all the arguments will be non-monadic, and only the final
    97(return) value will be monadic.
    98
    99Unlike typical functional programming, monadic actions must be explicitly ordered. The easiest way to do
    100this is to use a \lstinline|do| block.
    101
    102We will now look at a real example from the current Tock codebase. This is the \lstinline|doProcess| function
    103inside the \lstinline|removeParAssign| function in the \lstinline|SimplifyProcs| module. Its purpose
    104is to turn parallel assignments into multiple (sequential) single assignments.
    105
    106\begin{lstlisting}
    107 doProcess :: A.Process -> PassM A.Process
    108 doProcess (A.Assign m vs@(_:_:_) (A.ExpressionList _ es))
    109 = do ts <- mapM typeOfVariable vs
    110 specs <- sequence [makeNonceVariable "assign_temp" m t A.VariableName A.Original | t <- ts]
    111 let temps = [A.Variable m n | A.Specification _ n _ <- specs]
    112 let first = [A.Assign m [v] (A.ExpressionList m [e]) | (v, e) <- zip temps es]
    113 let second = [A.Assign m [v] (A.ExpressionList m [A.ExprVariable m v']) | (v, v') <- zip vs temps]
    114 return $ A.Seq m $ foldl (\s spec -> A.Spec m spec s) (A.Several m (map (A.OnlyP m) (first ++ second))) specs
    115 doProcess p = doGeneric p
    116\end{lstlisting}
    117
    118The type of this function is to take a \lstinline|Process| (the \lstinline |A.| is simply because it's an AST
    119fragment; the AST module is always imported as A) and returns a monadic action in the PassM monad that will yield
    120a Process.
    121
    122The function has a standard (i.e. unrelated to monads) Haskell pattern-match as its header. The direct value of
    123the function is a \lstinline|do| block; a \lstinline|do| block is technically of type `monadic action'.
    124
    125The indentation rule of the do block is fairly simple; each item in the do block should have the same level as
    126indentation, and finishes when something is found with less indentation (fairly standard Haskell rules).
    127
    128The first line (\lstinline|ts <- mapM typeOfVariable vs|) is already somewhat complex. Firstly, the type of
    129\lstinline|typeOfVariable| is:
    130
    131\begin{lstlisting}
    132typeOfVariable :: (CSM m, Die m) => A.Variable -> m A.Type
    133\end{lstlisting}
    134
    135CSM and Die are two typeclasses to which PassM belongs. So in our case, \lstinline|m| can be \lstinline|PassM|.
    136Therefore the effective type for us is \lstinline|A.Variable -> PassM A.Type|. \lstinline|mapM| is a monadic
    137version of \lstinline|map|:
    138
    139\begin{lstlisting}
    140mapM :: Monad m => (a -> m b) -> [a] -> m [b]
    141\end{lstlisting}
    142
    143It basically takes a monadic function, and applies it to each element of the given list, returning a monadic
    144action that will yield the mapped elements.
    145
    146So in our case, \lstinline|mapM typeOfVariable| will have type \lstinline|[A.Variable] -> PassM [A.Type].
    147The argument is then \lstinline|vs|. The notation \lstinline|ts <-| means that the value yielded
    148by the monadic action is labelled as \lstinline|ts|. You may think of it as the monadic version of the
    149\lstinline|let| notation in normal Haskell. Note that \lstinline|ts| is of type \lstinline|[A.Type]|; it
    150is not monadic. The action is actually performed by this statement, and the result is put into \lstinline|ts|.
    151
    152The second line is again interesting. The list is a standard Haskell list comprehension. The type of
    153\lstinline|makeNonceVariable| is:
    154
    155\begin{lstlisting}
    156makeNonceVariable :: CSM m => String -> Meta -> A.Type -> A.NameType -> A.AbbrevMode -> m A.Specification
    157\end{lstlisting}
    158
    159The list comprehension is therefore of type \lstinline|[PassM A.Specification]|; that is, it is a list
    160of monadic actions, each of which will yield a Specification. This is then given to the sequence function
    161which has this type:
    162
    163\begin{lstlisting}
    164sequence :: Monad m => [m a] -> m [a]
    165\end{lstlisting}
    166
    167In other words, sequence takes a list of monadic actions, performs each of them (in sequence!) and returns
    168the resulting list of elements (inside the monad, of course). So sequence performs all our actions and gives
    169us back a list of Specifications. This list is then labelled as \lstinline|specs|.
    170
    171The next three lines in our code fragment begin with \lstinline|let|. These are all non-monadic lines, and
    172are just plain Haskell. Note that there is a slight difference between the \lstinline|let| notation inside
    173a \lstinline|do| block to the normal \lstinline|let|..\lstinline|in| notation; there is no \lstinline|in|
    174keyword in the version in the \lstinline|do| block. This is a technicality, but one that can trip you up;
    175if you add the \lstinline|in| keyword you'll get a not-very-helpful parser error.
    176
    177Finally, the last line of our function features \lstinline|return|. \lstinline|return| is a standard monadic
    178function that is used very frequently. Its type is:
    179
    180\begin{lstlisting}
    181return :: Monad m => a -> m a
    182\end{lstlisting}
    183
    184All it does is turn a plain (non-monadic) value into a simple monadic action that yields the value. More
    185simply, it lifts the value into the monad. In our function we have a plain value, but we need to lift it
    186inside the monad to satisfy the types. This is very complex in relation to the types, so is perhaps best
    187viewed as an analogue to C or Java's return statement.
    188
    189Note that you there is nor really an opposite of the return function (something of type \lstinline|m a -> a|).
    190Values can never really be `freed' from the monad; everything around them just gets lifted inside.
    191
    192\subsection{The PassM monad}
    193
    194It has been mentioned above that the PassM monad is the most common monad. Here is it's actual type:
    195
    196\begin{lstlisting}
    197type PassM = ErrorT ErrorReport (StateT CompState IO)
    198\end{lstlisting}
    199
    200The PassM monad is a mixture of three monads; an error monad, a state monad, and the IO monad. The error
    201monad allows for exception-like mechanisms. In Tock, we throw an error whenever we can proceed no further
    202with the compilation. Examples include parser errors, type errors and parallel safety problems. The
    203state in question is the CompState type (see the module of the same name), which holds things like the
    204name-type dictionary (aka symbol table). The IO monad is included for various small reasons, such as
    205being able to read in files.
    206
    207I have mentioned numerous times that you should put things inside the monad, not look to sneak them out.
    208The way it really works is that you combine all your actions inside the PassM monad into one giant
    209PassM monadic action. Then this action is run. For a simple example we will look at the testPass
    210function:
    211
    212\begin{lstlisting}
    213runPass ::
    214 PassM b -- ^ The actual pass.
    215 -> CompState -- ^ The state to use to run the pass.
    216 -> IO (CompState, Either ErrorReport b) -- ^ The resultant state, and either an error or the successful outcome of the pass.
    217runPass actualPass startState = (liftM (\(x,y) -> (y,x))) (runStateT (runErrorT actualPass) startState)
    218\end{lstlisting}
    219
    220The first part of the function is just to reverse the order of the pair (this is real code). Perhaps
    221the most instructive part is the type. Given a PassM action and a starting state, the function will give
    222you back (still inside the IO monad, admittedly) the resulting state, and either an error (if there
    223was an error during the action) or the final result of the monadic action.
    224
    225This is where the monad is unwrapped; the monadic action finally yields a plain value. But in Tock,
    226the plain value is only used in the \lstinline|Main| module. The whole of the rest of the program,
    227obviously or not, runs inside the PassM monad.
    228
    229\subsection{Generics}
    230
    231%TODO
    36232
    37233\section{Add Your Own Code}
Note: See TracChangeset for help on using the changeset viewer.

AltStyle によって変換されたページ (->オリジナル) /