|
Wil Baden 2000-07-02
Donald E. Knuth, "Structured Programming with goto
Statements" (1974), reprinted in Literate Programming
(1992), discusses situations where gotos are appropriate. In
this page I will transcribe his examples from pseudo Algol
to ortho Forth.
Knuth's examples use arrays extensively. I define an array with the pattern:
CREATE Array Size CELLS ALLOT
The address of the i'th element _&Array[i]_ is:
i 'TH Array
The value of the i'th element Array[i] is --
i 'TH Array @
'TH is defined for 32-bit arithmetic as --
: 'TH ( i "arrayname" -- addr )
S" 2 LSHIFT " EVALUATE
BL WORD COUNT EVALUATE
S" + " EVALUATE
; IMMEDIATE
For 16-bit arithmetic, replace 2 LSHIFT with 1 LSHIFT or
2*.
A macro is used here and in other definitions for efficiency.
With minimum peep-hole optimization, I expect 2 'TH A to
become A+8.
The definition of OFF hopes to be optimized.
: OFF ( addr -- ) S" 0 SWAP ! " EVALUATE ; IMMEDIATE
++ increments the contents of an address.
: ++ ( addr -- ) S" 1 SWAP +! " EVALUATE ; IMMEDIATE
If you already have efficient versions, use them.
TEXTKnuth's description
Let's suppose that we want to search a table A[0] ... A[m-1] of distinct values, in order to find where a given value x appears; if it is not present in the table, we want to insert it as an additional entry. Let's suppose further that there is another array B, where B[i] equals the number of times we have searched for the value A[i].
I have changed A[1] ... A[m] to A[0] ... A[m-1] here and similarly with other examples.
: Example-1 ( x -- i )
0 BEGIN ( x i)
DUP m @ <
WHILE
2DUP 'TH A @ = IFGOTO found
1+
REPEAT
\ Not found.
m ++
2DUP 'TH A !
DUP 'TH B OFF
LABEL found
NIP ( i)
DUP 'TH B ++ ;
In 1971 Knuth and Robert W. Floyd gave that as an "example
of a typical program for which the ordinary capabilities of
while and if statements are inadequate."
To remove the goto, Knuth uses "short-circuit and". This and "short-circuit or" are defined here.
: ANDIF ( x -- ) S" DUP IF DROP " EVALUATE ; IMMEDIATE
: ORIF ( x -- ) S" DUP 0= IF DROP " EVALUATE ; IMMEDIATE
: Example-1a ( x -- i )
0 BEGIN ( x i)
DUP m @ <
ANDIF
2DUP 'TH A @ = NOT
THEN
WHILE 1+ REPEAT
DUP m @ = IF
m ++
2DUP 'TH A !
DUP 'TH B OFF
THEN
NIP ( i)
DUP 'TH B ++ ;
He also introduces a test and normal branch at the end of the function. (The branch takes place except at the first occurrence of each x.)
In Example-1 for every A[i] that is passed over in the search, there are two tests and one branch. In Example-1a for every A[i] that is passed over, there are two tests, a little extra processing, and a branch.
Example-1 was important for stimulating the relaxing of structured programming restraints. In strictly structured programming ("strictured programming") a loop may have a single exit at the beginning or end, and a function may have a single exit at the end. Most of recent profane languages now allow multiple exits.
In Classical Forth and Standard Forth a BEGIN-loop may have more than one exit. A goto-free slightly improved equivalent of Example-1 can be written --
: Example-1.2nd ( x -- i )
-1 BEGIN 1+ ( x i)
DUP m @ <
WHILE
2DUP 'TH A @ =
UNTIL
\ Found.
NIP ( i)
DUP 'TH B ++
EXIT
THEN ( x i)
\ Not found.
m ++
2DUP 'TH A !
NIP ( i)
1 OVER 'TH B ! ;
Knuth points out that the technique in all versions of Example 1 is almost never a good way to search an array for x. Example 2 beats Example 1 because it makes the inner loop considerably faster.
: Example-2 ( x -- i )
DUP m @ 'TH A !
-1 BEGIN 1+ ( x i)
2DUP 'TH A @ =
UNTIL
NIP ( i)
DUP m @ = IF
m ++
DUP 'TH B OFF
THEN
DUP 'TH B ++ ;
For every A[i] passed over in the search, Example 1 does two tests and one branch. Example 2 does one test and one branch.
In Example-2a, Knuth uses gotos and labels to double up the testing within Example 2. Once again in Forth, the gotos and labels are not needed.
In Example-2a, for every two A[i] passed over in the search, there are two tests and one branch. Knuth estimates a 12 percent saving. My tests showed 15 percent faster.
: NOT ( x -- flag ) S" 0= " EVALUATE ; IMMEDIATE
: Example-2a ( x -- i )
DUP m @ 'TH A !
-2 BEGIN 2 + ( x i)
2DUP 'TH A @ = NOT
WHILE
2DUP 1+ 'TH A @ =
UNTIL
1+
THEN
NIP ( i)
DUP m @ = IF
m ++
DUP 'TH B OFF
THEN
DUP 'TH B ++ ;
...
In Example 3, Knuth solves the problem of Example 1 and 2 using hash coding. As in Example 1 he uses a goto and a label. Once again these are not needed with the multiple exit features of Forth.
Here x is never zero, and the A array is initially erased.
#Array-Size is somewhat larger than what the number of
elements will be. x Hash returns a non-negative number
less than #Array-Size. Variable m disappears.
: Example-3 ( x -- i )
DUP Hash ( x i)
BEGIN
DUP 'TH A @
WHILE
2DUP 'TH A @ = NOT
WHILE
ORIF #Array-Size THEN
1-
REPEAT
NIP ( i)
DUP 'TH B ++
EXIT
THEN ( x i)
2DUP 'TH A !
NIP ( i)
1 OVER 'TH B ! ;
Knuth improves the efficiency of Example-3 by testing for a match first. Forth still does not need goto.
: Example-3a ( x -- i )
DUP Hash ( x i)
BEGIN
2DUP 'TH A @ = NOT
WHILE
DUP 'TH A @
WHILE
ORIF #Array-Size THEN
1-
REPEAT
2DUP 'TH A !
DUP 'TH B OFF
THEN
NIP ( i)
DUP 'TH B ++ ;
Knuth improves the efficiency of Example-3 even more by eliminating the testing of the index for zero when the non-matching element is non-zero. This keeps A[0] equal to 0.
: Example-3b ( x -- i )
DUP Hash ( x i)
BEGIN
2DUP 'TH A @ = NOT
WHILE
DUP 'TH A @ IF
1-
ELSE
DUP 0= IF DROP
#Array-Size 1-
ELSE
2DUP 'TH A !
NIP ( i)
1 OVER 'TH B !
EXIT
THEN THEN
REPEAT ( x i)
NIP ( i)
DUP 'TH B ++ ;
Example-3b is not easy to read, nor is Example-3. I'll keep them in my garage until I need them.
"Structured Programming with goto Statements" was first
published 26 years ago (1974). I'm sure that it was
instrumental in extending the collection of politically
correct control structures.
It's ironic that this first set of examples can all be done in Forth without goto and as efficiently as the goto versions.
In today's languages, these examples are not candidates for GOTO.
It may be different in Knuth's next set of examples.
Knuth's description --
Suppose we are processing a stream of text; and that we want to read and print the next character from the input; however, if that character is a slash ("/") we want to "tabulate" instead (i.e., to advance in the output to the next tab-stop position on the current line ); however, two consecutive slashes means a "carriage return" (i.e., to advance in the output to the beginning of the next line). After printing a period (".") we also want to insert an additional space in the output.
: CASE? ( x y -- true | x false )
S" OVER = DUP IF NIP THEN " EVALUATE ; IMMEDIATE
: Example-4 ( -- )
Read-Char ( c)
[CHAR] / CASE? IF ( )
Read-Char ( c)
[CHAR] / CASE? IF ( )
CR
GOTO done
THEN ( c)
Tabulate
THEN
DUP EMIT
[CHAR] . = IF SPACE THEN ( )
LABEL done ;
Once again Forth does without goto.
: Example-4 ( -- )
Read-Char ( c)
[CHAR] / CASE? IF ( )
Read-Char ( c)
[CHAR] / CASE? IF ( )
CR
EXIT
THEN ( c)
Tabulate
THEN
DUP EMIT
[CHAR] . = IF SPACE THEN ( )
;
Knuth comments
In practice we occasionally run into situations where a sequence of decisions is made via nested
if ... then ... else ...s, and then two or more of the branches merge into one. We can manage such decision-table tasks withoutgotos by copying the common code into each place, or by defining it as a procedure, but this does not seem conceptually simpler than to make such casesgotoa common part of the program.
...
Knuth's description
This is part of the well-known "tree search and insertion" scheme, where a binary search tree is being represented by three arrays. A[i] denotes the information stored at node number i, and L[i], R[i] are the respective node numbers for the roots of that node's left and right subtrees; empty subtrees are represented by zero. The program searches down the tree until finding an empty subtree where is can be inserted. ... For convenience I have assumed in this example that x is not already present in the search tree.
: Example-5 ( x -- j )
0 \ Head of Tree. ( x i)
LABEL compare:left LABEL compare:right
2DUP 'TH A @ >
DUP 'TH L @ IF
'TH L @
GOTO compare:left
ELSE
Next-Node @ OVER 'TH L !
GOTO insert
THEN
ELSE
DUP 'TH R @ IF
'TH R @
GOTO compare:right
ELSE
Next-Node @ OVER 'TH R !
GOTO insert
THEN
THEN
LABEL insert DROP ( x)
Next-Node @ ( x j)
2DUP 'TH A !
NIP ( j)
DUP 'TH L OFF
DUP 'TH R OFF
Next-Node ++ ;
Knuth first eliminates goto by using a local variable. I'll
use the return stack to hold it. This of course takes more
processing, but I think it is a typical Forth approach.
: Example-5a ( x -- j )
TRUE >R
0 BEGIN ( x i)
R@
WHILE
2DUP 'TH A @ > IF
DUP 'TH L @ IF
'TH L @
ELSE
Next-Node @ OVER 'TH L !
FALSE R> DROP >R
THEN
ELSE
DUP 'TH R @ IF
'TH R @
ELSE
Next-Node @ OVER 'TH R !
FALSE R> DROP >R
THEN
THEN
REPEAT R> DROP DROP ( x)
Next-Node @ ( x j)
2DUP 'TH A !
NIP ( j)
DUP 'TH L OFF
DUP 'TH R OFF
Next-Node ++ ;
Knuth introduces C. T. Zahn's situation indicator as a new
form of control structure. This can be emulated with GOTO.
Here is an illustration.
: Example-5b ( x -- j )
0 BEGIN ( x i)
2DUP 'TH A @ > IF
DUP 'TH L @ 0= IFGOTO left-leaf-hit
'TH L @
ELSE
DUP 'TH R @ 0= IFGOTO right-leaf-hit
'TH R @
THEN
AGAIN
LABEL left-leaf-hit
Next-Node @ OVER 'TH L !
GOTO insert
LABEL right-leaf-hit
Next-Node @ OVER 'TH R !
LABEL insert DROP ( x)
Next-Node @ ( x j)
2DUP 'TH A !
NIP ( j)
DUP 'TH L OFF
DUP 'TH R OFF
Next-Node ++ ;
So far Knuth's examples have been concerned with removing gotos. Forth's multiple exits take care of most of his problems. Example-5b is a good candidate for GOTO.
This is a good place to stop in Knuth's article. The rest of the examples are not conducive to being transcribed. The challenge to the reader is to give a goto-less version of Example-5b that's as easy to understand.
Here is a direct transcription of Example-5b into Forth
logic, using function call for gotos, and bottom-up
logic replacing top-down logic.
: Insert-Node ( x i -- j)
DROP ( x)
Next-Node @ ( x j)
2DUP 'TH A !
NIP ( j)
DUP 'TH L OFF
DUP 'TH R OFF
Next-Node ++ ;
: Left-Leaf-Hit ( x i -- same )
Next-Node @ OVER 'TH L !,
Insert-Node ;
: Right-Leaf-Hit ( x i -- same )
Next-Node @ OVER 'TH R !
Insert-Node ;
: Example-5b ( x -- j )
0 BEGIN ( x i)
2DUP 'TH A @ > IF
DUP 'TH L @ 0=
IF Left-Leaf-Hit EXIT THEN
'TH L @
ELSE
DUP 'TH R @ 0=
IF Right-Leaf-Hit EXIT THEN
'TH R @
THEN
AGAIN ;
A final note from Knuth.
There is one remaining use of
gotofor which I have never seen a good replacement, and in fact it's a situation where I still thinkgotois the right idea. This situation typically occurs after a program has made a multiway branch to a rather large number of different but related cases. A little computation often suffices to reduce one case to another; and when we've reduced one problem to a simpler one, the most natural thing is for our program togotothe routine that solves the simpler problem.