A Higher Order Reconstruction of Stepwise Enhancement

Lee Naish and Leon Sterling
Department of Computer Science
University of Melbourne,
Parkville, Vic. 3052, Australia
{lee,leon}@cs.mu.oz.au

Motivation

In the last couple of years, there has been renewed interest in systematic methods for the construction of Prolog programs, for example (Gegg-Harrison, 1995), (Kirschenbaum et al., 1996), (Sterling and Yalcinalp, 1996), and (Vasconcelos and Fuchs, 1995). This paper loosely characterises the progression of approaches that have been offered for systematic construction of logic (usually Prolog) programs. There is a trade-off between what is accessible for practical programmers and what is clearly explained in theory.

We claim that both audiences can be addressed through stepwise enhancement. The method can be explained directly in term of programming techniques applied to simple programs, and also given a more theoretical basis in terms of higher order functions. We review stepwise enhancement, sketch how to capture an enhancement as a higher order function adapted from foldr, and then sketch how individual enhancements are specialisations of the particular foldr predicate. We show how this is closely related to the new ideas on shape and polytypism being discussed in the functional programming community (Jay and Cockett, 1994), (Jay, 1995), (Belle et al., 1996) (Jeuring and Jansson, 1996), (Jansson and Jeuring, 1997). We go on to generalise this work in several ways, utilising key features of logic programming such as nondeterminism and flexible modes, and show how foldl can also be adapted.

What are the approaches to developing Prolog programs?

The early approach, characterised by 'Programming in Prolog' (Clocksin and Mellish, 1981) and 'How to Solve it in Prolog' (Coelho et al., 1982) was completely ad hoc. Prolog was a declarative language and it was easier/fun/exciting for many researchers to code in a declarative style. The approach did not lead to systematic Prolog training nor encourage high productivity.
Another approach is based on structural induction. Anyone who programs in a logic language notices that code manipulating lists typically has a similar, recursive structure. It is well known that recursion in programming languages is related to proofs by induction in mathematics. This observation can lead to a systematic approach where logic programs are developed along with structural induction proofs of correctness. A coherent account of structural induction as the basis for systematic logic program construction is given in (Deville, 1990). It is not clear how well Deville's approach scales nor how easy it is for the non-mathematically inclined to master.
An approach advocated for teaching Prolog has been to use templates or schemas. Students are invited to 'fill in the blanks'. The blanks can be thought of as parameters which are predicates. Advocates of schemas are O'Keefe (1990) and Gegg-Harrison (1991). Schemas have not been widely adapted, due partly to the fact that those who grasped Prolog programming didn't need them, and those that didn't had an extra level of complexity to learn, the language that the schemas were expressed in.
The recent work by Fuchs and colleagues (Fuchs and Fromherz, 1991), (Vasconcelos and Fuchs, 1995) attempts to combine the previous two approaches. A logical specification is (somehow) arrived at, and then schemas are used to guide the transformation of the specification to a program.
Stepwise enhancement (Lakhotia, 1989), (Sterling and Kirschenbaum, 1993), and (Sterling and Shapiro, 1994) was introduced by Sterling and colleagues in an attempt to simplify the teaching of complicated Prolog programs. Rather than have to explain higher order concepts, programmers and novices were taught programming techniques which at all times manipulated concrete Prolog programs.
Naish (1996) advocated a higher order style of programming, very similar to that used in functional programming languages. It was shown how some operations of stepwise enhancement, such as applying a technique to a skeleton and composition, could be elegantly reproduced in a higher order framework.

This paper continues the discussion between the authors on how best to characterise stepwise enhancement. The current work was sparked by the challenge to use the higher order approach to explain a 'complicated' program, the rule interpreter described in Section 17.4 of the second edition of 'The Art of Prolog' (Sterling and Shapiro, 1994). In explaining the program, a new method was formulated: the final program is built around an output type rather than an input type. In this paper we give another example of this technique, using a different interpreter. What has emerged is a better understanding of how types can drive program development, and how Naish and Sterling's views of systematic program construction via stepwise enhancement are complementary. The work relates to recent, exciting work in the functional programming community concerning shape, which sets our views on program development in a broader context and also suggests that logic program development is more general. This paper presents the dual view of stepwise enhancement.

Stepwise enhancement for Prolog program construction

The method of stepwise enhancement (Lakhotia, 1989) was originally conceived as an adaptation of stepwise refinement to Prolog. It was advocated as a way to systematically construct Prolog programs which exploits Prolog's high-level features. The key idea underlying stepwise enhancement is to visualise a program or solution in terms of its central control flow, or skeleton, and techniques which perform computations while the control flow of the skeleton is followed. Techniques can be developed independently and combined automatically using the method of composition.

The most common data structure for logic programs is the list, and many programs are based on skeletons for traversing lists. A tutorial example of using stepwise enhancement to develop a simple program is given in Chapter 13 of (Sterling and Shapiro, 1994). In this section we give the basic list processing program as Program 1 for reference, and a (slightly) more elaborate example with binary trees.

is_list([]).
is_list([X|Xs]) :- is_list(Xs).

Program 1  A skeleton for traversing a list (or definition of type list)

Programs 2a and 2b are skeleton programs for traversing binary trees with values only at leaf nodes. Program 2a, the left-hand program, does a complete traversal of the tree, while Program 2b, the right-hand program, traverses a single branch of the tree. Note that Program 2a can be viewed as a type definition of trees.

is_tree(leaf(X)).                       branch(leaf(X)).
is_tree(tree(L,R)) :-                   branch(tree(L,R)) :- branch(L).
    is_tree(L),                         branch(tree(L,R)) :- branch(R).
    is_tree(R).

Programs 2a and 2b  Skeletons for traversing a tree

Techniques capture basic Prolog programming practices, such as building a data structure or performing calculations in recursive code. Informally, a programming technique interleaves some additional computation around the control flow of a skeleton program. The additional computation might calculate a value or produce a side effect such as screen output. Syntactically, techniques may rename predicates, add arguments to predicates, add goals to clauses, and/or add clauses to programs. Unlike skeletons, techniques are not programs but can be conceived as a family of operations that can be applied to a program to produce a program.

A technique applied to a skeleton yields an enhancement. An enhancement which preserves the computational behaviour of the skeleton is called an extension.

We give examples of techniques. The two most commonly used techniques are the calculate and build techniques. They both compute something, a value or a data structure, while following the control flow of the skeleton. An extra argument is added to the defining predicate in the skeleton, and an extra goal is added to the body of each recursive clause. In the case of the calculate technique, the added goal is an arithmetic calculation; in the case of the build technique, the goal builds a data structure. In both cases, the added goal relates the extra argument in the head of the clause to the extra argument(s) in the body of the clause.

Two typical examples of the application of the calculate technique are given as Programs 3a and 3b. Both are extensions of Program 2a which traverses a binary tree with values at its leaves. The left-hand program (3a) computes the product of the value of the leaves of the trees. The extra argument in the base case is the value of the leaf node. In the recursive case, the extra goal says that the product of a tree is the product of its left subtree and its right subtree. The predicate is_tree has been renamed to prod_leaves. The right-hand program (3b), which computes the sum of the leaves, is very similar, the only difference being choice of names and the extra goal.

prod_leaves(leaf(X),X).                 sum_leaves(leaf(X),X).
prod_leaves(tree(L,R),Prod) :-          sum_leaves(tree(L,R),Sum) :-
     prod_leaves(L,LProd),                  sum_leaves(L,LSum),
     prod_leaves(R,RProd),                  sum_leaves(R,RSum),
     Prod is LProd*RProd.                   Sum is LSum+RSum.

Programs 3a and 3b  Extensions of Program 2a using calculate

Two enhancements of the same skeleton share computational behaviour. They can be combined into a single program which combines the functionality of each separate enhancement. Techniques can be developed independently and subsequently combined automatically. The (syntactic) operation for combining enhancements is called composition. This is similar in intent to function composition where the functionality of separate functions are combined into a single function. Program 4 is the result of the composition of Programs 3a and 3b.

prod_sum_leaves(leaf(X),X,X).
prod_sum_leaves(tree(L,R),Prod,Sum) :-
     prod_sum_leaves(L,LProd,LSum),
     prod_sum_leaves(R,RProd,RSum),
     Prod is LProd*RProd,
     Sum is LSum+RSum.

Program 4  The composition of two extensions

A different programming technique uses accumulators. The accumulator-calculate technique adds two arguments to the defining predicate in the skeleton. The first argument is used to record the current value of the variable in question and the second contains the final result of the computation. The base case relates the input and output arguments, usually via unification. One difference between calculate and accumulate-calculate is in the need to add an auxiliary predicate. Another is that goals and initial values need to be placed differently.

Program 5 shows the result of applying the accumulate-calculate technique to the tree traversal program, Program 2a. It computes the sum of the leaves of a binary tree and is comparable to Program 3b. In general, programs written with accumulator techniques will run more efficiently than the equivalent program written with calculate and build techniques, due to the way tail recursion is implemented in Prolog.

sum_leaves(Tree,Sum) :- accum_sum_leaves(Tree,0,Sum).

accum_sum_leaves(leaf(X),Accum,Sum) :-
    Sum is Accum + X.
accum_sum_leaves(tree(L,R),Accum,Sum) :-
    accum_sum_leaves(L,Accum,Accum1),
    accum_sum_leaves(R,Accum1,Sum).

Program 5  Extension of Program 2a using accumulate-calculate

Program 6 is an example of the application of the accumulate-build technique, also applied to Program 2a. It builds an inorder traversal of the leaves of the tree. There is no explicit arithmetic calculation, rather lists built by unification in the base clause. There is one trick here. Accumulators build structures in reverse order and hence the right subtree is traversed before the left subtree in order to have the final list in the correct order.

traversal(Tree,Xs) :- accum_leaves(Tree,[],Sum).

accum_leaves(leaf(X),Accum,[X|Accum]).
accum_leaves(tree(L,R),Accum,Xs) :-
    accum_leaves(R,Accum,Accum1),
    accum_leaves(L,Accum1,Sum),

Program 6  Extension of Program 2a using accumulate-build

The skeletons and techniques presented in this paper are all taken from Prolog, but stepwise enhancement is equally applicable to other logic programming languages, as discussed in Kirschenbaum, Michaylov and Sterling (1996). They claim that skeletons and techniques should be identified when a language is first used, in order to encourage systematic, effective program development. This learning approach should be stressed during teaching. They show that the skeletons and techniques for Prolog can be extended to constraint logic programming languages, notably CLP(R), concurrent logic programming languages such as Flat Concurrent Prolog and Strand, and higher order logic program languages, in particular Lambda-Prolog (Nadathur and Miller, 1988).

A higher order view of programming techniques

Naish (1996) argued for a higher order approach to programming in Prolog, based on similar techniques which are widely used in functional programming. One of the key steps in this approach is to develop suitable higher order predicates which can be used for a whole class of computations over a particular data structure. Modern functional languages have certain data types and higher order functions built in. For example, the polymorphic type list(T) and higher order function foldr which generalises the common simple recursion used to compute a value from a list. Program 7 demonstrates the use of foldr using Prolog syntax in the style of (Naish, 1996).

:- type list(T) ---> [] ; [T|list(T)].

foldr(F, B, [], B).
foldr(F, B, [A|As], R) :-
    foldr(F, B, As, R1),
    call(F, A, R1, R).

sum(As, S) :- foldr(plus, 0, As, S).
product(As, P) :- foldr(times, 1, As, P).
length(As, L) :- foldr(add1, 0, As, L).
add1(_, TailLen, Len) :- Len is TailLen + 1.

Program 7  Using foldr

In addition to the input list and result, foldr has two other arguments. One is the base case: what to return when the end of the list is reached. The other is a function - a predicate in the Prolog context. The predicate takes the head of a list and the result of folding the tail of a list to give the result of folding the whole list. The call/N predicates are available as builtins or library predicates in several Prolog systems. The first argument (a predicate) is called with the additional arguments added. For example, call(plus(A),R1,R) is equivalent to plus(A,R1,R), which is true if A+R1=R. In (Naish, 1996) an alternative higher order primitive, apply/3, is recommended due to its greater flexibility. In this paper we simply use call/N as it is more widely known.

Examples in (Naish, 1996) show how foldr can be used to compute both the sum and product in a single pass by using a pair of numbers for the base case, intermediate results and final answer. These higher order definitions can be optimised very effectively using a partial evaluator such as Mixtus (Sahlin, 1993). Further examples are given to show how predicates which are analogous to foldr can be constructed.

Incorporating shape

Recent work on shape (Jay and Cockett, 1994), (Jay, 1995), (Belle et al., 1996) and polytypism (Jeuring and Jansson, 1996), (Jansson and Jeuring, 1997) has formalised how many data types have certain higher order functions naturally associated with them. For example, map takes a list and produces another list of the same length. The shape of the output, the list structure, is the same as the shape of the input and the elements of the lists are related by the function map applies. The idea of map can be applied to any algebraic type such as lists and trees, and also arrays and matrices. A generic version of map applied to a binary tree will produce a binary tree of the same shape where the elements of the trees are related by the function map applies.

Similarly, foldr can be generalised to any algebraic type. For lists, a call to foldr specifies two things: what should be returned for the empty list, and what should be returned for a non-empty list, given the head and the result of folding the tail. For a general algebraic type we need to specify what should be returned for each constructor in the type, given the arguments of the constructor corresponding to type parameters and the result of folding the arguments which correspond to a concrete type (generally the type being defined recursively).

Consider the prod_leaves example given earlier as Program 3a. The overall operation is to fold a tree into a single number. We need to define the results of folding terms of the form leaf(X) and tree(L,R), given the folded versions of L and R.

Reconstructing the predicate is_tree as a definition of the type bt(T) and using the approach of (Naish, 1996) we arrive at Program 8: the definition of foldr for this tree type, and corresponding definitions of prod_leaves and sum_leaves. In (Naish, 1996) it was assumed that foldrbt would be written by a programmer who has the required degree of insight. It is now clear that this predicate can be generated automatically from a definition of the type.

:- type bt(T) ---> leaf(T) ; tree(bt(T),bt(B)).

foldrbt(TreeP, LeafP, leaf(X), Folded) :-
    call(LeafP, X, Folded).
foldrbt(TreeP, LeafP, tree(L, R), Folded) :-
    foldrbt(TreeP, LeafP, L, FoldedL),
    foldrbt(TreeP, LeafP, R, FoldedR),
    call(TreeP, FoldedL, FoldedR, Folded).

prod_leaves(T, P) :- foldrbt(times, =, T, P).
sum_leaves(T, P) :- foldrbt(plus, =, T, P).

Program 8  Extensions of Program 2a using foldr

'You'll take the high road and I'll take the low road'

The previous work of the authors sketched above can be seen as taking two different roads for program development starting from the same place (a type definition) and arriving at the same destination (the enhanced program), as shown in Figure 1.

Sterling suggested the low road in Figure 1, going from the type to the enhancement via the skeleton. To explain how to travel the low road is simple and does not require very abstract thinking or complex software tools. Such an approach to systematic program construction may be favoured by many programmers.

Naish suggested the high road in Figure 1, writing a version of foldr for the given type, using an instance of this foldr (a particular call to it) and then optimising, for example using a partial evaluator. The additional abstraction and concentration on semantics rather than syntax may be favoured by more experienced programmers using more advanced programming environments. The work on shape allows us to automatically obtain the foldr definition from the definition of the type.

                            high road
    type ---------> foldr -------------> foldr instance
      ^                                      |
      |                                      |
      |                                      | optimise
      |                                      |
      v                                      v
    skeleton ---------------------------> enhancement
             low road

Figure 1: Two roads to enhancement

There is a very simple mapping between algebraic types and a class of logic programs called RUL programs (Yardeni and Shapiro, 1990). RUL programs only have unary predicates. The argument of each clause head has the top level functor instantiated but all sub-terms are unique variables. Clause heads are mutually non-unifiable (thus predicates are deterministic when their arguments are instantiated). The arguments of all calls are variables which occur in the head and nowhere else in the body. Examples of RUL programs are is_list (Program 1) and is_tree (Program 2a).

The high road taken by functional functional programmers is equivalent to starting with a skeleton which is a RUL program and enhancing it with predicates which behave as functions, that is, they are deterministic and always succeed. The theory of functional programming can be used to prove results concerning composition of enhancements, for example, and generally give a theoretical justification for the idea of enhancement.

Generalising both approaches

Having found a class of programs for which the high and low roads are equivalent, we can see several ways to generalise both. The low road can have improved handling of non-recursive clauses and more flexible skeletons. The high road can be made more flexible by eliminating the type, mode and determinism restrictions inherited from functional programming.

Goals in base cases

The first (very simple) observation is that enhancements should allow general goals to be added to the base cases of enhancements. For pedagogical reasons, the original presentation classified enhancements into different categories according to the kind of goal added to recursive clauses and only allowed additional unifications to be added to facts. The limited abstraction simplifies learning but also restricts the scope of enhancements. For example, to find the sum of the squares of the leaves of a tree using foldrbt we could use foldrbt(plus, square, Tree, SumXX), where square takes a number and returns its square. The optimised program (or enhanced skeleton) would have a call to square in the base case.

Non-deterministic skeletons

Algebraic types correspond to RUL programs, in which are predicates are deterministic and only have single arguments. The stepwise enhancement paradigm has no such restriction: nondeterministic skeletons such as branch (Program 2b) and connected (Program 9) can be used.

connected(A, A).
connected(A0, A) :-
    edge(A0, A1),
    connected(A1, A).

Program 9  Transitive closure of the edge/2 relation

As noted in (Naish, 1996), higher order logic programs can also be nondeterministic and nondeterministic analogues of foldr can be constructed. A version of foldr for paths in a graph was written (using considerable intellectual effort) based on the simple transitive closure procedure connected, above. The close relationship between 'shape' and stepwise enhancement we have uncovered can be used to generalise the transformation from algebraic types (or RUL programs) to foldr functions. From an arbitrary skeleton (not necessarily a RUL program), we can generate an appropriate version of foldr as follows.

A procedure with A arguments and C clauses leads to a higher order procedure with C+A+1 arguments. It has C 'higher order' arguments and one additional 'result' argument. The recursive calls in the clause bodies have the same higher order arguments as the head and new variables for their results. Each clause also has an additional call/N with the higher order argument for that clause, the variables in the head which did not appear in any recursive body calls, result arguments of the body calls and the result argument of the head. If this call has only two arguments then call/2 is replaced by =/2 (the 'higher order' argument is simply a term which is returned for the base case). Mutual recursion can be treated in the same way (read recursive as mutually recursive), where C is the number of clauses in the set of mutually recursive procedures.

For list/1 and tree/1 the results are the foldr/4 and foldrbt/4 definitions given in programs 7 and 8. For branch and connected the results are in Program 10. The foldrcon procedure here is actually more general than the manually constructed version (which had a base case of V=FB instead of call/3) and can be used in the applications described in (Naish, 1996).

foldrb(FL, FR, FB, leaf(X), V) :-       foldrcon(FE, FB, A, A, V) :-
    call(FB, X, V).                         call(FB, A, V).
foldrb(FL, FR, FB, t(L,R), V) :-        foldrcon(FE, FB, A0, A, V) :-
    foldrb(FL, FR, FB, L, V1),              edge(A0, A1),
    call(FL, R, V1, V).                     foldrcon(FE, FB, A1, A, V1),
foldrb(FL, FR, FB, t(L,R), V) :-            call(FE, A0, V1, V).
    foldrb(FL, FR, FB, R, V2),
    call(FR, L, V2, V).

Program 10  Nondeterministic versions of foldr for branch and connected

Polymorphic types and higher order skeletons

Monomorphic types such as list correspond to first order skeletons (RUL programs, as we have seen). The work on shape and polytypism uses polymorphic types such as list(T), where T is a type parameter. Polymorphic types correspond to higher order skeletons with additional arguments. A type t(T1,T2) can be mapped to a predicate t(T1,T2,X) which succeeds if X is of type t(T1,T2). If the definition of type t contains the constructor c(E1,E2) (where E1 and E2 are type expressions) then t/3 will have the clause

t(T1, T2, c(X, Y)) :-
    call(

, X),
    call(

E2, Y).

Instances of call/N can be specialised if their first argument is a nonvariable. For example, the type list(T) leads to the predicate list/2 in Program 11. The type rtree(T), an M-way tree consisting of a term rt(X,Y) where X is of type T and Y is of type list(rtree(T)) can be defined using the predicate rtree/2.

list(T, []).                            % type rtree(T)---> rt(T,list(rtree(T))).
list(T, [X|Xs]) :-                      rtree(T, rt(X, RTs)) :-
    call(T, X), list(T, Xs).                call(T, X), list(rtree(T), RTs).

Program 11  Higher order skeletons for polymorphic types list(T) and rtree(T)

Higher order skeletons go against the spirit of simplicity embodied in stepwise enhancement and the control flow of the program above (mutual recursion through call/N) would certainly be confusing for a novice programmer. The advantage is that it saves having multiple copies of similar code. Rather than have a separate skeletons for simple lists, lists of lists, lists of rtrees et cetera, a single higher order definition can be given. A specialised definition of a type such as rtree(any) can be obtained by partial evaluation (eliminating all instances of call/N) and a version of foldr can be derived as described above. For rtree, the result is Program 12.

rtree_any(rt(X, RTs)) :-                foldrrt(FR, FC, B, rt(X, RTs), V) :-
    list_rtree_any(RTs).                    foldrlrt(FR, FC, B, RTs, V1),
                                            call(FR, X, V1, V).

list_rtree_any([]).                     foldrlrt(FR, FC, B, [], V) :-
list_rtree_any(RT.RTs) :-                   B = V.
    rtree_any(RT),                      foldrlrt(FR, FC, B, RT.RTs, V) :-
    list_rtree_any(RTs).                    foldrrt(FR, FC, B, RT, V1),
                                            foldrlrt(FR, FC, B, RTs, V2),
                                            call(FC, V1, V2, V).

Program 12  Specialised skeleton and version of foldr for rtree

Flexible modes

As well as allowing flexibility with types and nondeterminism, logic programs allow flexibility with modes. Rather than having fixed inputs and one output, as in functional programs, logic programs can potentially be run 'backwards' - computing what would normally be considered the input from a given 'output'. This flexibility can extend to higher order predicates, including those generated automatically from skeletons.

As an example, we will construct a meta interpreter for Prolog by using foldrrt backwards. A Prolog proof tree can be represented by an rtree, where each node contains (the representation of) a Prolog atom which succeeded. The foldrrt procedure can be used to check that an rtree of atoms is a valid proof tree for a particular program and goal. A proof tree is valid if the atom in the root is the goal and for each node in the tree containing atom A and children B1,B2,..., there is a program clause instance A:-B1,B2,.... The proof_of procedure in Program 13 represents clauses as a head plus a list of body atoms (procedure lclause) and can check that an rtree is a valid proof tree and return the atom which has been proved.

% Checks Proof is a valid proof tree and returns proved Atom;
% run backwards its a meta interpreter returning a proof tree
proof_of(Proof, Atom) :-
    foldrrt(lclause2, cons, [], Proof, Atom).

% checks H :- B is a clause instance; returns H
lclause2(H, B, H) :- lclause(H, B).

% clause/2 where clause bodies are lists
lclause(append([],A,A), []).
lclause(append(A.B,C,A.D), [append(B,C,D)]).
lclause(append3(A,B,C,D), [append(A,B,E),append(E,C,D)]).
...

cons(H, T, H.T).

Program 13  Interpreter constructed using rtree

With a suitable evaluation order, the code can also be run backwards. Given an atom, foldrrt acts as a meta interpreter, (nondeterministically) returning a proof tree for (a computed instance of) the atom. This is an example of constructing a program based on the type of its output, as discussed earlier. By utilising the natural association between a type and foldr and the flexible modes of logic programming, much of the process can be automated.

Foldl

In many cases, the higher order function foldl is preferably to foldr since it is tail recursive rather than left recursive (thus more efficient, at least for strict evaluation). It is not immediately obvious how to adapt foldl to general tree types rather than just lists. One possibility, suggested by Barry Jay is to perform a breadth first traversal (foldr uses a depth first traversal). This can be coded in a tail recursive fashion and is a familiar programming technique.

Another possibility, which we pursued initially and is used in (Belleannie et al 1994), is to use foldr with more complex data flow, using logic variables. The 'result' argument of foldr can be a pair of terms, one of which can be used as an input, and the accumulator style of programming can be used. If the accumulator is a list, we can think of foldr returning a 'difference list' (Sterling and Shapiro, 1994) instead of a list. With this style of programming, the data dependencies are such that the instances of call/N in the foldr definitions can be executed before the recursive call(s), allowing tail recursion.

However, we believe the most elegant and natural generalisation of foldl is evident in the stepwise enhancement paradigm. We adapted stepwise enhancement to produce higher order foldr procedures using a generalisation of the calculate and build techniques. By using accumulator techniques we can produce a foldl procedure for any skeleton. Accumulators are used much more widely than breadth first traversals and the code produced has simple data flow and can be translated into a functional language if the initial skeleton corresponds to an algebraic type.

The transformation is similar to the one described for foldr. The same number of higher order arguments are used and there is one 'output' argument, as before, but there is also an extra 'accumulator' argument. The call/N is the leftmost atom in the body and the accumulator and output arguments are 'threaded' through this and the recursive calls in the clause body in the familiar way (Sterling and Shapiro, 1994). The accumulator and output arguments can be made implicit by using the standard Definite Clause Grammar notation. The resulting version of foldl for lists is as follows.

% Explicit accumulator                  % DCG (implicit accumulator) version
foldl(FC, FB, [], A0, A) :-             foldl(FC, FB, []) -->
    call(FB, A0, A).                        call(FB).
foldl(FC, FB, X.Xs, A0, A) :-           foldl(FC, FB, X.Xs) -->
    call(FC, X, A0, A1),                    call(FC, X),
    foldl(FC, FB, Xs, A1, A).               foldl(FC, FB, Xs).

Program 14  Automatically derived foldl for lists

There are two differences between this version of foldl and the standard foldl for lists. The first is the argument order for the call to the FC 'function' is swapped. This is is not essential but allows the accumulator and output arguments to be implicit using the DCG notation. It is also consistent with foldr. The second difference is the use of a function called in the base case. The standard version of foldl simply returns the accumulator when the end of the list is reached. This is equivalent to our version of foldl with the identity function (=/2 in Prolog) as the function for the base case.

For 'linear' data structure such as lists, calling a function when the base case is reached adds no real power. The function can always be called at the top level after foldl has returned, with the same effect. However, for tree structures, a function application at the base case is often essential. Below are the versions of foldl for the bt type and connected procedure. Note that for prod_leaves (sum_leaves) the multiplication (addition) is done at the leaves, as in Program 5.

foldlbt(F, B, leaf(X)) -->              foldlcon(F, B, A, A) -->
    call(B, X).                             call(B, A).
foldlbt(F, B, t(L,R)) -->               foldlcon(F, B, A0, A) -->
    call(F),                                call(F, A0),
    foldlbt(F, B, L),                       {edge(A0, A1)},
    foldlbt(F, B, R).                       foldlcon(F, B, A1, A).

prod_leaves(T, P) :-                    % non-looping connected; returns path
    foldlbt(=, times, T, 1, P).         con_no_loop(A0, A, As) :-
                                            foldlcon(cons_nm, cons, A0, A, [], As).
sum_leaves(T, P) :-                     cons_nm(A0, As, A0.As) :-
    foldlbt(=, plus, T, 0, P).              not member(A0, As).

rev_traverse(Tree, Xs) :-
    foldlbt(=, cons, Tree, [], Xs).

Program 15  Versions of foldl for is_tree and connected and examples of use

For foldlcon, the call to edge is not recursive, so accumulator arguments are not added (braces are used to indicate this in the DCG notation). From foldlcon it is simple to code con_no_loop which finds connected nodes but avoids cycles. The accumulator is the list of nodes visited so far, in reverse order. The procedure which adds a new node to the accumulator, cons_nm, fails if the node is already on the path. The path is also be returned at the top level.

Since the skeleton is_tree is a RUL program and hence equivalent to an algebraic type, foldlbt is deterministic and behaves as a higher order function over that type. The threading of the accumulator and result arguments in the body of a clause is equivalent to nesting of functional expressions. For completeness, we give the equivalent Haskell code in Program 16.

>data Bt a = Leaf a | Tree (Bt a) (Bt a)

>foldlbt :: (a->a)->(b->a->a)->(Bt b)->a->a
>foldlbt f b (Leaf x) a = b x a
>foldlbt f b (Tree l r) a =
>    foldlbt f b r (foldlbt f b l (f a))

>sum_leaves t = foldlbt (id) (+) t 0

Program 16  Haskell version of foldl for is_tree/type bt

There are actually two possible versions of foldlbt, depending on the order in which the two subtrees are visited. By swapping the two recursive calls in the DCG version, the argument threading is also changed, leading to a logically different procedure. The procedure rev_traverse in Program 15 returns the reverse of the traversal returned by Program 6. Using the other version of foldlbt would result in the same traversal order. The choice of traversal orders and additional argument in foldl are consistent with the intuition that programming with accumulators or foldl is more complicated than using simple recursion or foldr.

Further work

A category theoretic reconstruction of our method for deriving versions of foldl (restricted to RUL programs) may produce some deeper insights and should extend the understanding of shape and polytypism for functional languages. A more theoretical treatment of the higher order logic programs we derive may also be worthwhile. For example, our approach can be adapted to logic programming languages such as Lambda-Prolog (Nadathur and Miller, 1988) which have higher order constructs with well defined semantics.

Further generalisations of foldr and foldl could also be devised (Belleannie et al 1994). For example, we could add higher order calls to the start and end of each clause body, or even between each call as well. Other 'shapely' operations such as zip2 (which takes two lists and returns a list of pairs) could also be generalised, as suggested by (Jay, 1995). We note that more expressive higher order predicates are not necessarily better in practice. There is no benefit in using a generalised foldr which is applicable in five percent more situations if each use is ten percent more complicated than foldr. The ideal situation is to have a collection of higher order predicates or functions with a good tradeoff between applicability and complexity. Such sets can be developed over time, based on coding patterns which occur in practice.

Conclusions

Research into systematic program construction has the important aim of elevating coding from the realm of arts and entertainment to science and engineering. In this paper we have built a bridge between the pragmatic syntax-based approach of stepwise enhancement and the very theoretical semantic approach of shape and polytypism. Despite the gulf between the research methodologies behind these two approaches, there is a very close relationship between them. This is pleasing in itself but also allows us to see ways in which both approaches can be generalised.

From the work on shape and polytypism in functional languages we have the generality of arbitrary functions as parameters, polymorphic types and the automatic synthesis of certain higher order functions from algebraic types. From the work on stepwise enhancement in logic programming we have the generality of nondeterminism, additional arguments, flexible modes and use of accumulators. By combining the advantages of both approaches, we have shown how more code in both functional and logic programming languages can be constructed in a systematic and partially automated way.

References

Belleannie, C., Brisset, P., Ridoux O., A Pragmatic Reconstruction of Lambda-Prolog, Publication Interne IRISA no. 877, October 1994 (revised 1997)

Belle, G., Jay, C. B. and Moggi, E., Functorial ML, Proc. PLILP '96, Springer LNCS 1140, pp. 32-46, 1996

Clocksin, W. and Mellish, C. Programming in Prolog, Springer-Verlag, 1981

Coelho, H., Cotta, J. and Pereira, L.M. How to Solve it with Prolog, Laboratorio Nacional de Engenharia Civil, Portugal, 1982

Deville, Y. Logic Programming: Systematic Program Development, Addison Wesley, 1990

Fuchs, N. and Fromherz, M. Schema-based Transformations of Logic Programs, Proc. 5th International Workshop on Logic Program Synthesis and Transformation, Proietti, M. (ed.), pp. 111-125, Springer-Verlag, 1991.

Gegg-Harrison, T. Learning Prolog in a Schema-Based Environment, Instructional Science, 20:173-192, 1991.

Gegg-Harrison, T. Representing Logic Program Schemata in Lambda-Prolog, Proc. 12th International Logic Programming Conference (ed. L. Sterling), pp. 467-481, MIT Press, 1995

P. Jansson and J. Jeuring. PolyP - a polytypic programming language extension. In Conference Record of POPL '97: The 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 470--482, 1997

Jay, B., A semantics for shape, Science of Computer Programming, 25, pages 251-283, 1995

J. Jeuring and P. Jansson. Polytypic programming. In J. Launchbury, E. Meijer and T. Sheard Advanced Functional Programming, LNCS 1129, pages 68--114, Springer-Verlag, 1996.

Jay, C.B. and Cockett, J.R.B. Shapely Types and Shape Polymorphism, Proc. Programming Languages and Systems - ESOP '94: 5th European Symposium on Programming, (ed. D. Sannella), Springer LNCS, pp. 302-316, Edinburgh, U.K., April 1994

Jay, C.B., A semantics for shape, Science of Computer Programming, 25, 1995

Kirschenbaum, M., Michaylov, S. and Sterling, L.S. Skeletons and Techniques as a Normative Approach to Program Development in Logic-Based Languages, Proc. ACSC'96, Australian Computer Science Communications, 18(1), pp. 516-524, 1996

Lakhotia, A. A Workbench for Developing Logic Programs by Stepwise Enhancement, Ph.D. Thesis, Case Western Reserve University, 1989.

Nadathur, G., Miller D., An Overview of Lambda-Prolog, Proceedings of ICLP/SLP, pages 810-827, MIT Press, 1988

Naish, L. Higher Order Logic Programming in Prolog, Proc. Workshop on Multi-Paradigm Logic Programming, JICSLP'96, Bonn, 1996 (Also available as Tech. Report 96/2, Dept. Computer Science, University of Melbourne, 1996.)

O'Keefe, R. The Craft of Prolog, MIT Press, 1990

Sahlin, D. Mixtus: An Automatic Partial Evaluator for Full Prolog, New Generation Computing, 12(1), pp. 7-51, 1993

Sterling, L. and Kirschenbaum, M. Applying Techniques to Skeletons, in Constructing Logic Programs, (ed. J.M. Jacquet), pp. 127-140, Wiley, 1993.

Sterling, L.S. and Shapiro, E.Y. The Art of Prolog, 2nd edition, MIT Press, 1994.

Sterling, L.S. and Yalcinalp, U. Logic Programming and Software Engineering - Implications for Software Design, Knowledge Engineering Review, 11(4), pp. 333-345, 1996

Vasconcelos, W. and Fuchs, N.E. An Opportunistic Approach for Logic Program Analysis and Optimisation using Enhanced Schema-based Transformations, Proc. LOPSTR'95, (ed. M. Proietti), Springer LNCS, pp. 174-188, 1995

Yardeni, E. and Shapiro E.Y., A Type System for Logic Programs, Journal of Logic Programming, 10(2), pp125-154, 1990