Foldl

Next: Tools for program development Up: Generalising Both Approaches Previous: Flexible Modes

Foldl

In many cases, the higher order function foldl is preferable to foldr since it is tail recursive rather than left recursive (and thus may be more efficient, at least for strict evaluation). Note that the complexity of foldl can actually be worse in some cases (depending on the operation) and if the operation is not associative the result of using foldl will generally be different from that using foldr. It is not immediately obvious how to adapt foldl to general tree types rather than just lists. One possibility, suggested by Barry Jay is to perform a breadth first traversal (foldr uses a depth first traversal). This can be coded in a tail recursive fashion and is a familiar programming technique.

Another possibility, which we pursued initially and is used in (Belleannie et al [BBR97]), is to use foldr with more complex data flow, using logic variables. The result argument of foldr can be a pair of terms, one of which can be used as an input, and the accumulator style of programming can be used. If the accumulator is a list, we can think of foldr returning a difference list (Sterling and Shapiro, [SS94]) instead of a list. With this style of programming, the data dependencies are such that the instances of call/N in the foldr definitions can be executed before the recursive call(s), allowing tail recursion.

However, we believe the most elegant and natural generalisation of foldl is evident in the stepwise enhancement paradigm. We adapted stepwise enhancement to produce higher order foldr procedures using a generalisation of the calculate and build techniques. By using accumulator techniques we can produce a foldl procedure for any skeleton. Expert Prolog programmers use accumulators much more often than breadth first traversals and the code produced has simple data flow and can be translated into a functional language if the initial skeleton corresponds to an algebraic type.

=1pc0 Definition 6.2.(foldl transformation) The transformation is similar to the one described for foldr. The same number of higher order arguments are used and there is one output argument, as before, but there is also an extra accumulator argument. The call/N is the leftmost atom in the body and the accumulator and output arguments are ``threaded'' through this and the recursive calls in the clause body in the familiar way (Sterling and Shapiro, [SS94]).

The accumulator and output arguments can be made implicit by using the standard Definite Clause Grammar notation. The resulting version of foldl for lists is as follows.

% Explicit accumulator version
foldl(FC, FB, [], A0, A) :-            
    call(FB, A0, A).                  
foldl(FC, FB, [X|Xs], A0, A) :-        
    call(FC, X, A0, A1),            
    foldl(FC, FB, Xs, A1, A).      

% DCG (implicit accumulator) version
foldl(FC, FB, []) -->
    call(FB).
foldl(FC, FB, [X|Xs]) -->
    call(FC, X),
    foldl(FC, FB, Xs).

Program 14: Automatically derived foldl for lists

There are two differences between this version of foldl and the standard foldl for lists defined in functional programming languages. The first is the argument order for the call to the FC ``function'' is swapped. This is not essential but allows the accumulator and output arguments to be implicit using the DCG notation. It is also consistent with foldr. The second difference is the use of a function called in the base case. The standard version of foldl simply returns the accumulator when the end of the list is reached. This is equivalent to our version of foldl with the identity function (=/2 in Prolog) as the function for the base case.

For data structure such as lists which are ``linear'' and have no data in the (only) leaf, calling a function when the base case is reached adds no real power. The function can always be called at the top level after foldl has returned, with the same effect. However, for tree structures or linear structures with data in the leaf, a function application at the base case is often essential. Below are the versions of foldl for the bt type and connected/2 procedure. Note prod_leaves/2 (sum_leaves/2) has the multiplication (addition) at the leaves, as in Program 5. These predicates are equivalent to the previous versions, in Program 8, if plus/3 and times/3 are associative.

foldlbt(F, B, leaf(X)) -->              
    call(B, X).                        
foldlbt(F, B, t(L,R)) -->             
    call(F),                         
    foldlbt(F, B, L),               
    foldlbt(F, B, R).              

prod_leaves(T, P) :-              
    foldlbt(=, times, T, 1, P).  
                                
sum_leaves(T, P) :-            
    foldlbt(=, plus, T, 0, P).

rev_traverse(Tree, Xs) :-
    foldlbt(=, cons, Tree, [], Xs).

foldlcon(F, B, A, A) -->
    call(B, A).
foldlcon(F, B, A0, A) -->
    call(F, A0),
    {edge(A0, A1)},
    foldlcon(F, B, A1, A).

% non-looping connected; returns path
con_no_loop(A0, A, As) :-
    foldlcon(cons_nm, cons, A0, A, [], As).

cons_nm(A0, As, [A0|As]) :-
    not member(A0, As).

Program 15: Versions of foldl for is_tree/1 and connected/2

For foldlcon/6, the call to edge/2 is not recursive, so accumulator arguments are not added (braces are used to indicate this in the DCG notation). From foldlcon/2 it is simple to code con_no_loop/3 which finds connected nodes but avoids cycles. The accumulator is the list of nodes visited so far, in reverse order. The procedure which adds a new node to the accumulator, cons_nm/3, fails if the node is already on the path. The path is also be returned at the top level.

Since the skeleton is_tree/1 is a RUL program and hence equivalent to an algebraic type, foldlbt/4 is deterministic and behaves as a higher order function over that type. The threading of the accumulator and result arguments in the body of a clause is equivalent to nesting of functional expressions. For comparison, we give the equivalent Haskell code in Program 16.

>data Bt a = Leaf a | Tree (Bt a) (Bt a)
>foldlbt :: (a->a)->(b->a->a)->(Bt b)->a->a
>foldlbt f b (Leaf x) a = b x a
>foldlbt f b (Tree l r) a =
>    foldlbt f b r (foldlbt f b l (f a))

>sum_leaves t = foldlbt (id) (+) t 0

Program 16: Haskell version of foldl for is_tree/type bt

There are actually two possible versions of foldlbt/5, depending on the order in which the two subtrees are visited. By swapping the two recursive calls in the DCG version, the argument threading is also changed, leading to a logically different procedure. The procedure rev_traverse/2 in Program 15 returns the reverse of the traversal returned by Program 6. Using the other version of foldlbt/5 would result in the same traversal order. The choice of traversal orders and additional argument in foldl are consistent with the intuition that programming with accumulators or foldl is more complicated than using simple recursion or foldr.

Next: Tools for program development Up: Generalising Both Approaches Previous: Flexible Modes

Lee Naish
2000-07-26