Forth to the past - qsort

Question 1

To make a long story short, I must revive my Forth. It is rusty; last time I did anything serious in Forth was 30 years ago (man, I am old). As an exercise, I translated an STL-like implementation of qsort. It works (if you want to test it, install gforth). It performs well comparing to SDCC-compiled native C (I don't have Keil license). I know where my bottlenecks are.

I am mostly interested in how readable the code is; how much did I sin against Forth spirit; if there are modern Forth coding conventions, how does this code fare.

-1 cells constant -cell
: cell- -cell + ;
: xchg ( a0 a1 -- ) 2dup @ >r @ swap ! r> swap ! ;
: unguarded_linear_insert ( last val -- )
 >r
 begin cell- dup @ dup r@ > while over cell+ ! repeat
 drop r> swap cell+ !
;
: unguarded_insertion_sort ( first last -- )
 >r
 begin dup r@ <> while dup dup @ unguarded_linear_insert cell+ repeat
 drop rdrop
;
: linear_insert ( first last val -- )
 >r over @ r@ swap <
 if
 begin 2dup <> while cell- dup @ over cell+ ! repeat
 drop r> swap !
 else
 r> unguarded_linear_insert drop
 then
;
: insertion_sort ( first last -- )
 2dup <>
 if
 >r
 dup begin cell+ dup r@ <> while 2dup dup @ linear_insert repeat
 rdrop
 then
 2drop
;
: unguarded_partition ( l f p -- cut )
 >r swap cell-
 begin dup @ r@ > while cell- repeat swap
 begin dup @ r@ < while cell+ repeat swap
 begin 2dup < while
 2dup xchg cell- swap cell+ swap
 begin dup @ r@ > while cell- repeat swap
 begin dup @ r@ < while cell+ repeat swap
 repeat
 swap drop begin dup @ r@ < while cell+ repeat
 rdrop
;
: median_of_3 ( n0 n1 n2 -- n )
 >r 2dup > if swap then
 r> 2dup > if swap then drop
 2dup < if swap then drop
;
: pivot ( l f len -- l f p )
 2 / -cell and over + ( l f m )
 @ >r over cell- @ over @ r>
 median_of_3
;
: quicksort_loop ( l f t -- )
 begin dup >r -rot 2dup - dup r@ > while
 pivot
 >r 2dup r> unguarded_partition
 >r 2dup + r@ swap r> 2 * < if
 rot over r> recurse swap else
 dup -rot swap r> recurse then
 rot
 repeat
 rdrop 2drop 2drop
;
: quicksort ( l f )
 2dup 2 cells quicksort_loop
 dup 2 cells + dup -rot insertion_sort
 swap unguarded_insertion_sort
;
\ Testing and benchmarking
include random.fs
0 random
: shuffle ( n a -- )
 >r dup
 begin ?dup while 1- 2dup cells r@ + swap rnd swap mod cells r@ + xchg repeat
 rdrop 2drop
;
: fill-i ( n a -- )
 >r
 begin ?dup while 1- dup dup cells r@ + ! repeat
 rdrop
;
: fill-ir ( n a )
 begin 2dup ! cell+ swap 1- ?dup while swap repeat drop
;
1024 1024 * constant total-size
variable data total-size cells allocate throw data !
total-size data @ fill-i
total-size data @ shuffle
variable ssrt total-size cells allocate throw ssrt !
: exetime utime 2>r quicksort execute utime 2r> d- ." dtime " . cr ;
variable size 16 size !
variable logs 4 logs !
: run ( fp )
 >r
 begin total-size size @ >= while
 data @ ssrt @ size @ cells move
 ssrt @ size @ cells + ssrt @
 size @ . cr
 .s cr
 r@ hex. cr
 r@ exetime
 size @ . size @ m*/
 size @ 2 * size !
 repeat
 rdrop
;
' quicksort run
bye

Question 2

I see some things that could be improved with this code.

Fix `exetime`

The exetime word is defined like this:

: exetime utime 2>r quicksort execute utime 2r> d- ." dtime " . cr ;

There are a number of problems with that. First, the formatting is not very good. I'd prefer to see a stack comment, for example. Second, this should probably take a parameter so that any word can be timed (this appears to be the intent from run). Third, that last . should be d. to display the entire delta time rather than just half of it.

: exetime ( fp -- ) 
 utime 2>r execute utime 2r> d- ." dtime " d. cr 
;

Use smaller words

Again using exetime as an example, there are actually two things done by the word as defined. The first is that it calculates a delta time, and the second is that it prints that time. I'd make those separate words.

Follow Forth convention

It's common to use prefixes to simplify code. For example, everywhere ssrt is used, it's followed by @. For that reason, I'd define and use word like these:

: @ssrt ssrt @ ;
: @size size @ ;
: !size size ! ;

Define and use common idioms

A number of cases in the code there is a sequence like this: size @ . but there is a common idiom for that and it's built in to many Forth implementations. If it's not, it's easy to define:

: ? ( a -- ) @ . ;

It would be used like this: size ?. Alternatively, one could use the prefix idiom which is associated with a particular value:

: ?size ( -- ) size @ . ;

Eliminate unused variables

The logs variable is defined but never used. This needlessly clutters the code.

Simplify control structure

Instead of using begin .. while .. repeat, it's often the case that one can use begin .. until instead and simplify the code. Using the refactored smaller words as suggested above, here's what run looks like now:

: run ( fp )
 >r
 begin 
 copy-array
 array-limits r@ exetime
 ." size " ?size ." dtime " d. cr
 size2x total-size > until
 rdrop
;

These are the refactored words:

: exetime ( fp -- ) utime 2>r execute utime 2r> d- ;
: copy-array ( -- ) data @ ssrt @ size @ cells move ;
: array-limits ( -- alo ahi ) ssrt @ dup size @ cells + swap ;
: @size ( -- n ) size @ ;
: ?size ( -- ) @size . ;
: size2x ( -- n ) size @ 2 * dup size ! ;

Make sure comments don't lie

The shuffle word starts like this:

: shuffle ( n a -- )

However, that's not correct. It should instead be:

: shuffle ( n1 n2 a -- )

Use refactoring to improve speed

The code currently contains this:

: median_of_3 ( n0 n1 n2 -- n )
 >r 2dup > if swap then
 r> 2dup > if swap then drop
 2dup < if swap then drop
;

However, in addition to being somewhat opaque, it's not as fast as it could be. Refactoring into smaller chunks improves both readability and speed:

\ arrange top two stack values to assure n0 <= n1
: lohi ( n0 n1 -- n0 n1 ) 2dup > if swap then ;
\ arrange top three stack values to assure n0 <= n1 <= n2
: 3sort ( n0 n1 n2 -- n0 n1 n2 ) lohi >r lohi r> lohi ;
\ extract median value from top 3 items on stack
: median_of_3 ( n0 n1 n2 -- n1 ) 3sort drop nip ;

Be aware of non-standard extensions

The code uses the non-standard extensions:

 rdrop -rot utime

It is not hard to write replacements for the first two if needed:

: rdrop r> drop ;
: -rot rot rot ;

Since utime is only used in the test code, perhaps it's not as critical. There is no standard replacement.

Edward Edward 67.2k4 gold badges120 silver badges284 bronze badges · Answer 1 · 2018-09-23 23:21:36Z

I see some things that could be improved with this code.

Fix `exetime`

The exetime word is defined like this:

: exetime utime 2>r quicksort execute utime 2r> d- ." dtime " . cr ;

There are a number of problems with that. First, the formatting is not very good. I'd prefer to see a stack comment, for example. Second, this should probably take a parameter so that any word can be timed (this appears to be the intent from run). Third, that last . should be d. to display the entire delta time rather than just half of it.

: exetime ( fp -- ) 
 utime 2>r execute utime 2r> d- ." dtime " d. cr 
;

Use smaller words

Again using exetime as an example, there are actually two things done by the word as defined. The first is that it calculates a delta time, and the second is that it prints that time. I'd make those separate words.

Follow Forth convention

It's common to use prefixes to simplify code. For example, everywhere ssrt is used, it's followed by @. For that reason, I'd define and use word like these:

: @ssrt ssrt @ ;
: @size size @ ;
: !size size ! ;

Define and use common idioms

A number of cases in the code there is a sequence like this: size @ . but there is a common idiom for that and it's built in to many Forth implementations. If it's not, it's easy to define:

: ? ( a -- ) @ . ;

It would be used like this: size ?. Alternatively, one could use the prefix idiom which is associated with a particular value:

: ?size ( -- ) size @ . ;

Eliminate unused variables

The logs variable is defined but never used. This needlessly clutters the code.

Simplify control structure

Instead of using begin .. while .. repeat, it's often the case that one can use begin .. until instead and simplify the code. Using the refactored smaller words as suggested above, here's what run looks like now:

: run ( fp )
 >r
 begin 
 copy-array
 array-limits r@ exetime
 ." size " ?size ." dtime " d. cr
 size2x total-size > until
 rdrop
;

These are the refactored words:

: exetime ( fp -- ) utime 2>r execute utime 2r> d- ;
: copy-array ( -- ) data @ ssrt @ size @ cells move ;
: array-limits ( -- alo ahi ) ssrt @ dup size @ cells + swap ;
: @size ( -- n ) size @ ;
: ?size ( -- ) @size . ;
: size2x ( -- n ) size @ 2 * dup size ! ;

Make sure comments don't lie

The shuffle word starts like this:

: shuffle ( n a -- )

However, that's not correct. It should instead be:

: shuffle ( n1 n2 a -- )

Use refactoring to improve speed

The code currently contains this:

: median_of_3 ( n0 n1 n2 -- n )
 >r 2dup > if swap then
 r> 2dup > if swap then drop
 2dup < if swap then drop
;

However, in addition to being somewhat opaque, it's not as fast as it could be. Refactoring into smaller chunks improves both readability and speed:

\ arrange top two stack values to assure n0 <= n1
: lohi ( n0 n1 -- n0 n1 ) 2dup > if swap then ;
\ arrange top three stack values to assure n0 <= n1 <= n2
: 3sort ( n0 n1 n2 -- n0 n1 n2 ) lohi >r lohi r> lohi ;
\ extract median value from top 3 items on stack
: median_of_3 ( n0 n1 n2 -- n1 ) 3sort drop nip ;

Be aware of non-standard extensions

The code uses the non-standard extensions:

 rdrop -rot utime

It is not hard to write replacements for the first two if needed:

: rdrop r> drop ;
: -rot rot rot ;

Since utime is only used in the test code, perhaps it's not as critical. There is no standard replacement.

Stack Exchange Network

Forth to the past - qsort

1 Answer 1

Fix `exetime`

Use smaller words

Follow Forth convention

Define and use common idioms

Eliminate unused variables

Simplify control structure

Make sure comments don't lie

Use refactoring to improve speed

Be aware of non-standard extensions

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Forth to the past - qsort

1 Answer 1

Fix exetime

Use smaller words

Follow Forth convention

Define and use common idioms

Eliminate unused variables

Simplify control structure

Make sure comments don't lie

Use refactoring to improve speed

Be aware of non-standard extensions

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Fix `exetime`