Moore Forth
Chuck Moore's Comments on Forth
Quotes Compiled by Jeff Fox
From the 1993 interview
From Color Forth 7/26/97
Good Forth minimizes the number of conditional statements. The
minimum is zero.
OKAD is written in assembler. It was converted from object code
only. A chunk of OKAD is Forth. I have a symbol table in OKAD
and it is used by the Forth. The compiler compiles code for the
Pentium and it is subroutine threaded. + AND OR 2* 2/ are inlined
other things are subroutine calls.
It runs fast. It compiles fast.
You don't see it compiling at all on my small programs. You load
a block and the definitions are available before your finger leaves
the key. They don't need to know it is a compiler, just that it
has a stack and colon definitions.
It should take five to ten minutes for anyone to learn this language.
ANSI Forth
I had reservations about ANSI. I worried that it would be a
disaster and not merely a dubious advantage. All of my fears
of the standard and none of the advantages of the standard
have come to pass. Any spirit of innovation has been thoroughly
squelched. Underground Forths are still needed. I said I
thought the standard should be a publication standard but
they wanted an execution standard. ---
I am utterly frustrated with the software I have to deal with.
Windows is beyond comprehension! UNIX is no better. DOS is no
better. There is no reason for an OS. It is a non-thing. Maybe
it was needed at one time.
I detest Netscape. I switched to Internet Explorer even though
I detest Microsoft more than I detest Netscape. I detest MASM.
I discovered MASM clobbers my reserved memory. It says it will
respect it but it doesn't. I've tried Word, WordPad, and Edit
and they are unusable.
I would like a floppy, one meg is enough. It would have a 10K
program and the rest is data. When it boots it gets into memory
and is ready for operation while the rest of memory is loaded.
It should start in a second or two. You don't want to do this on
your hard disk and turn it into a dedicated machine.
The OS companies have not changed with the world. It is not needed
but they are doing very well.
There is an opposition to standard Forth. The word WORD is an
anathema, it shouldn't exist. The word WORDS is just as bad.
It just confuses a beginner to see a long list of words that they
don't understand. ---
11 LOAD is the command to load screen 11. Then
you see
I don't smudge and unsmudge words so you must redefine
words with a new name. If you redefine a word it would
call itself recursively. This is the simplest way to
do it.
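A minimal sketch of the difference in standard Forth, where the name being defined is hidden (smudged) until the closing semicolon, so a reference inside the new definition finds the old one; GREET is an invented name:
: GREET ( -- )  ." hello " ;
: GREET ( -- )  GREET ." again " ;   \ standard Forth: this inner GREET is the OLD definition
GREET CR                             \ prints: hello again  (gforth will warn that GREET is redefined)
\ In Chuck's Forth there is no smudge, so the inner GREET would call the
\ definition being compiled (recursion); to extend a word you give it a new name.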
It is a problem that Forth is thought of as a command line
interpreter. This is an obsolete concept.
User Interfaces
I think this is a great example of conserving
the valuable resource of the pixels on the screen. (laughter) It's a resource
that nobody cares about. I saw an ad in one of the internet magazines.
I don't remember what browser it was, but it was a browser frame that you
put up on your screen and customized with all these buttons and dials, and
that seems to be the pace at which this world is going. They're willing
to give away the edges of their screen (motioning to show that the useful
browsing area was surrounded by all these custom buttons and gauges in
this ad) I saw another example. It was some application I was working on.
It had something to do with word processing that had three layers of frames
and in the middle in about 1/8th of the area on the screen available for
text was part of the text that I wanted to see. It's ludicrous that they're
willing to give away most of their 1024x768 pixels and I'm not.
Here's an application that builds a table. It's
a table of temperature to the three halves power. This is the code to build
it. And the thing to notice is that the word FILL is referenced inside
of itself as a jump back to the beginning of the definition. This is relevant
to the zealous debate that's been raging on the standards committee. It
used to be called SMUDGE and I guess the debate is about what to call it.
But anyway I've given up on that. It's just too complicated.
Color Forth
Color Forth is brutally simple and it will become even more brutally simple. This construction
of a jump back to the beginning is very convenient and it saves a lot of
BEGINs and SWAPs and confusion. I think the control flow is clearer here.
(I asked Chuck to do a presentation on the improvements that he
had made to Forth in the last fifteen years and on quality Forth.)
I can say  : 5X X X X X X ;   : 20X 5X 5X 5X 5X ;
This is just as good as a loop. When running through memory
the code should compare an address to terminate rather than
use a loop count. If I need conditional code I would like
to use Wil Baden's Flow Diagrams.
: G0 C0 ON G G G G G G G G G G PRINT ;
Ten "G"s
is simpler than a loop.
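A runnable sketch of both points in standard Forth; X, SHOW, and NUMS are stand-in names invented for the example, and the loop terminates by comparing an address rather than counting:
: X ( -- )  [CHAR] x EMIT ;                 \ placeholder for whatever X does
: 5X  X X X X X ;   : 20X  5X 5X 5X 5X ;    \ unrolled, as good as a loop
20X CR                                      \ prints twenty x's
\ Running through memory: terminate by comparing an address, not a count.
: SHOW ( addr limit -- )  SWAP BEGIN 2DUP U> WHILE  DUP @ .  CELL+  REPEAT 2DROP ;
CREATE NUMS 1 , 2 , 3 , 4 ,
NUMS  NUMS 4 CELLS +  SHOW                  \ prints: 1 2 3 4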
From Fireside Chat 1998
From 1xforth 4/13/99
So that is Forth. And there is a requirement to support definitions.
What is a definition? Well classically a definition was colon something, and words, and end of definition somewhere.
: some ~~~ ;
I always tried to explain this in the sense that it is an abbreviation: whatever string of words you use frequently, you give it a name and you can use it more conveniently. But it's not exactly an abbreviation because it can have a parameter perhaps, or two. And that is a problem with programmers, perhaps a problem with all programmers; too many input parameters to a routine. Look at some 'C' programs and it gets ludicrous. Everything in the program is passed through the calling sequence and that is dumb.
A Forth word should not have more than one or two arguments. This stack which people have so much trouble manipulating should never be more than three or four deep.
Our current incarnation of our word (:) is to make it red. That way you don't even use colon. This not only reduces the amount of text you have to store in your source file but it vastly clarifies what is going on. The red word is being defined,
some ~~~
the definition is green and it might have a semicolon in the definition which means return but it does not mean end of definition. It can have more than one return, and you can have more than one entry point in here if you want. Without this semicolon this definition would fall through into the next definition and return at that point, but still there is no state of being in compile mode versus execute mode. You're either running green or you're running white background, black. Black means execute, green means compile, red means define.
This to me is simpler and clearer. It is brand new so it hasn't gotten any acceptance but we will see.
But as to stack parameters, the stacks should be shallow. On the i21 we have an on-chip stack 18 deep. This size was chosen as a number effectively infinite.
The words that manipulate that stack are DUP, DROP and OVER period.
There's no, well SWAP is very convenient and you want it, but it isn't
a machine instruction.
But no PICK no ROLL, none of the complex operators to
let you index down into the stack. This is the only part of the
stack, these first two elements, that you have any business worrying about.
Of course on a chip those are the two inputs to the ALU so those
are what are relevant to the hardware.
The others are on the stack because you put them there and you are
going to use them later after the stack falls back to their position.
They are not there because you're using them now. You don't want too many
of those things on the stack because you are going to forget what they are.
So people who draw stack diagrams or pictures of things on the
stack should immediately realize that they are doing something wrong.
Even the little parameter pictures that are so popular. You know if you are defining
a word and then you put in a comment showing what the stack effects are
and it indicates F and x and y
F ( x - y )
I used to appreciate this back in the days when I let my stacks get too complicated, but no more. We don't need this kind of information. It should be obvious from the source code or be documented somewhere else.
So the stack operations that I use are very limited. Likewise the conditionals. In Classic Forth we used
IF ELSE THEN
And I have eliminated ELSE.
I don't see that ELSE is as useful as the complexity it introduces would justify. You can see this in my code. I will have IF with a semicolon and then I will exit the definition at that point or continue.
IF ~~~ ; THEN
I have the two way branch but using the new feature of a semicolon which does not end a definition. Likewise with loops, there were a lot of loop constructs. The ones I originally used were taken out of existing languages. I guess that is the way things evolve.
There was
DO LOOP there was
FOR NEXT and there was
BEGIN UNTIL
DO LOOP was from FORTRAN, FOR NEXT was from BASIC, BEGIN UNTIL was from ALGOL.
What one do we pick for Forth? This (DO LOOP) has two loop control parameters and it is just too complicated. This (FOR NEXT) has one loop control parameter and is good with a hardware implementation and is simple enough to have a hardware implementation. And this one (BEGIN) has variable number of parameters. ---
I've got a new looping construct that I am using in Color Forth and that I find superior to all the others. That is that if I have a WORD I can have in here some kind of a conditional with a reference to WORD. And this is my loop.
WORD ~~~ IF ~~~ WORD ;
THEN ~~~ ;
I loop back to the beginning of the current definition. And that is the only construct that I have at the moment and it seems to be adequate and convenient. It has a couple of side effects. One is that it requires a recursive version of Forth. This word must refer to the current definition and not some previous definition. This eliminates the need for the SMUDGE/UNSMUDGE concept which ANS is talking about giving a new name. But the net result is that it is simpler.
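A sketch of the pattern in standard Forth, which has no jump-to-start; RECURSE (a call rather than a jump) and EXIT stand in for the reuse of the word's own name and the mid-definition semicolon, and COUNTDOWN is an invented example:
: COUNTDOWN ( n -- )
   DUP 0= IF DROP EXIT THEN   \ EXIT plays the role of the mid-definition ;
   DUP . 1- RECURSE ;         \ the tail reference loops back to the start
10 COUNTDOWN CR               \ prints: 10 9 8 7 6 5 4 3 2 1
\ In Chuck's Forth the trailing reference is compiled as a jump, so the loop
\ consumes no return stack; RECURSE here really is a nested call.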
It would of course not be convenient to nest loops but nested loops are a very dicey concept anyway. You may as well have nested definitions. We've talked over the last fifteen years about such things. Should you have conditional execution of a word or should you have something like IF THEN? Here is an example where I think it pays well in clarity: the only loop you have is to repeat the current word.
WORD ~~~ IF ~~~ WORD ;
THEN ~~~ ;
You can always do that and it leads to more intense factoring. And that is in my mind one of the keystones of Forth, you factor and you factor and you factor until most of your definitions are one or two lines long.
(Jeff) You might point out that your semicolon after WORD results in tail recursion and converting the call in WORD to a jump and that is how it functions.
(Chuck) So there is no reason to make that a call since you are never going to go anywhere afterwards so you just make that jump. In fact in all my latest Forths semicolon kind of meant either return or jump depending on the context and it's optimized in the compiler to do that. It's a very simple look back optimization that actually saves a very important resource, the return stack. ---
Well one thing is to say is that Forth is what Forth programmers do. I would like to think of it as what Forth programmers ought to do. Because I have found that teaching someone Forth does not mean that he is going to be a good Forth programmer. There is something more than the formalism and syntax of Forth that has got to be embedded in your brain before you're going to be effective at what you do.
My contention is that every application that I have seen that I didn't code has ten times as much code in it as it needs. And I see Forth programmers writing applications with ten times as much code as is necessary.
The concern that I have, the problem that I have been pondering for the last few years is:
How can I persuade these people to write good Forth?
How can I persuade them that it's possible to write good Forth?
Why would anyone want to write ten times as much as they would
need to write?
Microsoft does this, I'm sure you're all aware, but they almost
have an excuse for doing it because they are trying to be compatible
with everything they have ever done in the past.
If it is impossible for you to start with a clean piece of
paper then you will have to write more code.
But ten times as much code? That seems excessive. ---
Program Size
About a thousand instructions seems about right to me to do about anything. To paraphrase the old legend that any program with a thousand instructions can be written in one less. All programs should be a thousand instructions long.
How do you get there? What is the magic? How can you make applications small? Well you can do several things that are prudent to do in any case and in any language.
No Hooks
One is No Hooks. Don't leave openings in which you are going to insert code at some future date when the problem changes because inevitably the problem will change in a way that you didn't anticipate. Whatever the cost it's wasted. Don't anticipate, solve the problem you've got.
Don't Complexify
Simplify the problem you've got or rather don't complexify it. I've done it myself, it's fun to do. You have a boring problem and hiding behind it is a much more interesting problem. So you code the more interesting problem and the one you've got is a subset of it and it falls out trivial. But of course you wrote ten times as much code as you needed to solve the problem that you actually had.
Ten times the code means ten times the cost; the cost of writing it, the cost of documenting it, the cost of storing it in memory, the cost of storing it on disk, the cost of compiling it, the cost of loading it, everything you do will be ten times as expensive as it needed to be. Actually worse than that because complexity increases exponentially.
10x the Code
10x the Cost
10x the Bugs
10x the Maintenance
10x the bugs!
And 10x the difficulty of doing maintenance on the code. ---
This is why we are still running programs which are ten or twenty years old and why people can't afford to update, understand, and rewrite these programs because they are significantly more complex, ten times more complex than they should be.
So how do you avoid falling into this trap? How do you write one times programs?
One times, 1x. That would make a good name for a web page.
You factor. You factor, you factor, you factor and you throw away everything that isn't being used, that isn't justified.
The whole point of Forth was that you didn't write programs in Forth you wrote vocabularies in Forth. When you devised an application you wrote a hundred words or so that discussed the application and you used those hundred words to write a one line definition to solve the application. It is not easy to find those hundred words, but they exist, they always exist.
Let me give you an example of an application in which not only can you reduce the amount of code required by 90%, but you can reduce it by 100%, and it is a topic that is dear to our hearts: it's called FILES. If you have files in your application, in your Forth system, then you have words like
OPEN
CLOSE
READ
WRITE
REWIND whatever
and they are arguably not going to be such short words. They are going to be words like OPEN-FILE because of all kinds of things that you want to be opening and closing, like windows.
If you can realize that this is all unnecessary you save one hundred percent of the code that went into writing the file system. Files are not a big part of any typical application but they are a singularly useless part. Identify those aspects of what you are trying to do and say we don't need to do that. We don't need checksums on top of checksums. We don't need encryption because we aren't transmitting anything that we don't need. You can eliminate all sorts of things.
Now that's the general solution to a problem that all the programmers in the world are out there inventing for you, the general solution, and nobody has the general problem.
I wish I knew what to tell you that would lead you to write
good Forth. I can demonstrate. I have demonstrated in the past,
ad nauseam, applications where I can reduce the amount of code by 90%
and in some cases 99%. It can be done, but on a case-by-case basis.
The general principle still eludes me. ---
Machine Forth
Jeff has reminded me of a couple of other concepts in Machine Forth. Machine Forth is what I like to call using the Forth primitives built into the computer instead of interpreted versions of those or defining macros that do those. One of those is IF.
Classically IF drops the thing that's on the stack and this was inconvenient to do on i21, so IF leaves its argument on the stack and very often you are obliged to write constructs like IF DROP. But not always. It seems that about as often as it is inconvenient to have IF leave an argument on the stack it is convenient to have that happen. It avoids using a DUP IF or a ?DUP. So with this convention the need for ?DUP has gone away. And ?DUP is a nasty word because it leaves a variable number of things on the stack and that is not a wise thing to do.
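A sketch of the idiom in standard Forth, whose IF consumes its flag; .NONZERO is an invented example, and the Machine Forth behavior is described only in the comments:
\ Standard Forth IF consumes the value, so to test a value and still use it
\ you write DUP IF, or ?DUP IF when a leftover zero would be a nuisance.
: .NONZERO ( n -- )  ?DUP IF . THEN ;     \ print n only if it is nonzero
3 .NONZERO  0 .NONZERO  CR                \ prints just: 3
\ With Machine Forth's non-destructive IF the tested value is still on the
\ stack inside the IF, so ?DUP disappears; only an occasional DROP is needed.
---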
The world has changed in the last twenty years in ways that maybe are obvious but I think no one would have anticipated. When I first started in this business, in fifty seven, computers were used for calculating, computing. And the applications in those days usually involved large long complex algebraic expressions. The factorization that we did was to factor out things that would not have to be recomputed so that the whole thing would go faster, and that was the whole point of FORTRAN. That tradition has stuck with us even today.
I don't know the statistics but I would guess that most computers don't compute, they move bytes around. If you have a browser your browser is not calculating anything except maybe the limit of what will fit on the screen at one time. ---
Fetch-plus (@+) helps with that. You put an address in the address register, which is called A, and it stays there a while. And if you execute this operator (@+) A gets incremented and you can fetch a sequence of things in order. Likewise you can store a sequence of things. And of course you have fetch (@) and store (!) without increment for when you need them.
These operators are not in classic Forth. I don't think they are even mentioned in the standard. They lead to a completely different style of programming.
In the case of the DO LOOP the efficient thing to do was actually to put the address as your loop control parameter and then refer to it as I and do an I fetch (I @) inside the loop. And the DO LOOP is working with the addresses. If you don't do that, if you have the fetch-plus (@+) operator, you don't need the DO LOOP, you don't need the I; you use the fetch-plus (@+) inside the loop to fetch the thing that's different each time. It is different but equivalent. You can map one to the other. In addition there is the notion of fetching something whose address is conveniently stored in the A register.
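A sketch in standard Forth of the two equivalent styles; @+ is not a standard word, so it is defined here for illustration, and the data and word names are invented:
: @+ ( addr -- addr' x )  DUP CELL+ SWAP @ ;   \ fetch and advance the address
CREATE DATA 3 , 5 , 7 ,
\ Counted-loop style: the loop index is really an address, read with I @ .
: SUM1 ( -- n )  0  DATA 3 CELLS + DATA DO  I @ +  CELL +LOOP ;
\ Fetch-and-increment style: no index at all, the address advances itself.
: SUM2 ( -- n )  0 DATA  BEGIN DUP DATA 3 CELLS + U< WHILE  @+ ROT + SWAP  REPEAT DROP ;
SUM1 . SUM2 .    \ both print 15
---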
But such registers raise the question of local variables. There is a lot of discussion about local variables. That is another aspect of your application where you can save 100% of the code. I remain adamant that local variables are not only useless, they are harmful.
If you are writing code that needs them you are writing non-optimal code. Don't use local variables. Don't come up with new syntaxes for describing them and new schemes for implementing them. You can make local variables very efficient especially if you have local registers to store them in, but don't. It's bad. It's wrong. ---
From Dispelling the User Illusion 5/22/99
From Fireside Chat 1999
I do agree of course that the standard is not necessarily a good thing. My latest thinking on that subject is that perhaps there should be two standards. One for 32 bit machines and one for 8 bit machines. Trying to meld them together into the same package is difficult, makes them both look more complicated, and in fact there is no advantage. They are not working on the same applications in those two environments. But it's not going to happen.
I've been worrying a lot about where we are and where we are going and this is the time at the new millennium. It's kind of pathetic actually that we are all sitting here talking about Forth. It is not the wave of the future. It's never been the wave of the future. It's not within our power to make it the wave of the future. It is a delightful tool for doing the things that I want to do. At least Color Forth is, which I'll come to in a minute. It doesn't have to be. If nothing else reading about a thousand people and all their stupid crusades reminds me that you can have a thousand points of view. We don't all have to agree on everything.
The fact that we are all trying to agree on a personal computer platform is wrong. It's not necessary. We've got a communication standard. We have enough standards in place that I can do something in my context, independently of you doing it in your context and we can communicate, and we can share, and we can co-operate. We don't all need to do it the same way.
The danger of course of us all doing it the same way is that we will all end up doing it wrong. And wrong gets locked in. Forever? If not now, when? If not us, who? ---
Let me remind you what Color Forth is. Color Forth is putting information in the white space of your source listing. I am increasingly charmed by this notion.
When I say color I don't really mean color. What I mean is the following word will be displayed in a way that makes its function clear. Typeface, color, or perhaps all red words are the first word on a line (layout), or perhaps these words are in all caps and these words are in all lower case. Many ways of distinguishing a function. But I am saying the function of a word should be indicated by the preceding space. ---
Color Forth is not a destination it is a path. In wandering around in this universe I am trying to minimize something, I'm not sure what. Total complexity. It really isn't fair to expect people to learn to use a chorded keyboard to use Color Forth. It is mixing concepts. ---
Alright, so what I am really saying here is that
Forth is really word oriented not character oriented and we should
realize that and take advantage of it.
I want a more compact representation. That's nice. I can get 60% the density of ASCII stuff. I want a faster parser. I want the ability to take advantage of the fact that I've got these shift bytes, if you will, that let me use five bit characters. Not that I expect anyone to use this. It's not going to become a universal standard. I'll convert everything to ASCII going out over wires. I'll convert everything to pictures on the screen. It's an internal representation. It's compact, it's efficient, it's easy to work with, and it's fun. Among other things it's colorful! ---
(Question) What is the formula for color blind people?
(Chuck)
Italics, upper case, bold face, ... ---
I would accuse Forth, the Forths that I see, of being old-fashioned, of not taking proper advantage of huge amounts of memory and disk.
I characterize myself as the guru of Color Forth. I have total control over this exciting development. I don't have to negotiate, I don't have to be polite about it. Nobody's interested, there is no conceivable profit motive.
Anyone is welcome to borrow these ideas and use them as you feel. Collaboration is wonderful. I don't expect you will. I understand completely that you are all doing your own thing. And you're happy doing it and that's ok. But I'll keep making noise to the end of the millennium. ---
From Fireside Chat 2000
But nobody realized what was going on with this kind of threat; it's insidious. You don't realize that it is a serious "conquer the world" threat. Well it occurred to me that we are subject to that kind of attack. That somehow someone slipped into the software industry and has been taking over the world. And everybody has been going along with it, accepting that this is the way things are in software and that forty mega lines of code is needed to do anything interesting.
Maybe its time to say no. This isn't the way it has to be. We know it doesn't have to be that way. Yet we are virtually the only people in the world who do.
A thought occurs to me, if I were to retire what would I do? One thing I could do is go out to someone who has a million lines of code and say, "Hey, I'll rewrite this for you. You pay me a reasonable amount and a bonus upon completion and I'll shrink it by two orders of magnitude."
(You should charge by the lines eliminated.) (laughter)
How could I do this? The problem is that a million lines of code is about twenty volumes, an encyclopedia set. I couldn't read that much. With ten people at a hundred thousand lines maybe you could read that much.
How can you rewrite legacy code? Now this Banco do Brazil, that's very brave. If each person could write say a hundred thousand lines they must have had, what, eight hundred people working on the project. Those would have to be really good programmers too.
(Imagine debugging that.)
That's one reason that you can't afford to rewrite legacy code. Not that you can't rewrite it, you can't afford to check it out. It's been checked out over a period of twenty years.
(Do you know if there was automated translation involved in that project in Brazil?)
What would that gain? How would that gain anything? You're just compounding the confusion.
(You start up the automated translation and at the end you have maybe a few hundred exceptions.)
I couldn't have reduced this by two orders of magnitude with anything automatic. What I have to do, and again I tried to show you this this morning with my approach to Bluetooth: you have to really understand the problem. You have to go in there and read it all, and then think about it, and say what they are really trying to do is this, and then do that. Which is much much simpler than what they were doing. ---
I reminded myself a little bit about how people write 'C' code and perhaps why there are sixty million lines of code in many applications. And it is almost understandable. I can excuse a lot here. There were a lot of programmable options that they had to test for. There were a lot of bits you might want to set or not set in a control word. And they had long names for those bits and they would smush them together and put them in a control word. I don't do things that way. To me it's an oxymoron. Those long names don't mean anything to me. I had to find a place in the documentation to find out what that long name meant, what the consequences were of setting it or not setting it, in order to understand what was going on.
Now having gone to all that work I can just set that bit in a hex word. I don't need the names and the conditions. I only have to do it once in one context. So one setting is good for all. That helped shrink the 'C' a little bit.
Then they had comments, comments ad nauseam. They would say "let us initialize the CCE engine" then there would be a routine called "CCEinit." That's not helpful. In fact it was expensive because I had to read it in case there was something there. But then one of my hot triggers about 'C' is that "CCEinit" would have left paren right paren to show that there weren't any arguments. Alright, that helps bulk up your program.
Other things were disturbing. I was converting into assembler so it's really not fair. They would have some code that was going to copy something from one part of memory to another part of memory and they wouldn't just use a move string instruction which does it all in one instruction after a little register setup. Instead they would have a loop where they would pick things up and move things and they would have plus plusses attached to pointers and such.
In one place it was particularly bad. I was actually surprised by this, but they would pick up bytes one at a time. They would pick up a byte and then pick up the next byte and then shift it left eight and then pick up the next byte and shift it left twelve and build up a word which was little endian and then ship it out. They guaranteed that the word was little endian although it was running on a machine which happened to be little endian and this was all nonsense. In other places they had cared about little endian and they hadn't bothered to do this. So it was bad code.
One of the things that happened, I guess twenty years ago, was that we had the concept of automatic code generation, otherwise known as a compiler. And this did not let anyone write code that could not have been written before, but it did let less skilled people write code than before. So we introduced into the profession a very very large number of very very unskilled coders. You can hardly call them programmers.
Now this may not have just hurt a little bit, it may have destroyed our civilization. The same thing is happening in VLSI design. The silicon compilers are attempting to do the same thing. Let lesser skilled people do what a small pool of experienced expensive people used to do. It isn't cheaper. It isn't better. It's just different.
This is what the bureaucrats, the executives want. They want industrial scale workforces. Ten times as many people earning one tenth as much. The only justification is redundancy. It is certainly easier to manage a large group of dumb people than a small group of smart people. You can't replace the smart people, but you can replace the dumb people.
What our culture is trying to do is build a reliable system from a large number of unreliable parts. And to do that you need a large number of unreliable parts. ---
We address a problem that we recognize even if no one else recognizes it. So I leave you with that and I will switch topics and talk about Forth for a minute.
Someone asked this morning, George, "What is Forth?" Forth has stacks. Forth has definitions. Yes definitions are just like subroutines but in my mind definitions are very small and subroutines tend to be very large.
What else is it about Forth that is distinctive? And two things occurred to me that perhaps are being underestimated. One is that Forth is syntax free. This lets you define whatever syntax you want and still be within the rubrics of Forth. ---
There are a lot of clever things that you can do and I value that cleverness. It makes it nice to write a program. To write a program elegantly, a program that is readable, a program that is pretty, is a very satisfying thing to do.
It is a large component of the fun that Forth is and I have yet to find a person that says that 'C' programming is fun. You get the satisfaction of doing things in a way that is elegant.
Another aspect of Forth is analogous to Ziv (Lempel-Ziv) compression.
You scan your problem, you find a string which appears in
several places and you factor it out. You factor out the largest
string you can and then smaller strings. And eventually you get
whatever it was, the text file, compressed to arguably what is about
as tightly compressed as it can be.
And in the factoring you do to a Forth
problem you are doing exactly the same thing. You are
discovering the concepts which, if you factor them out, leave the
problem with the simplest description. If you do that
recursively you end up with, I claim, arguably, the most compact
representation of that problem that you can achieve.
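An illustrative sketch of that factoring in Forth; the repeated phrase and all the names are invented for the example:
VARIABLE HITS   VARIABLE MISSES
\ Before factoring, the phrase  DUP @ 1+ SWAP !  would be repeated in every
\ word that bumps a counter. Factor it out once and each use becomes one word:
: BUMP ( addr -- )  DUP @ 1+ SWAP ! ;
: HIT  ( -- )  HITS BUMP ;
: MISS ( -- )  MISSES BUMP ;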
This gets tricky because Gödel demonstrated that you cannot prove the completeness of a system within the system, or even the consistency. So I don't think that will ever be provable. But it's demonstrable in any particular case. ---
We all agree that Forth has logarithmic growth with complexity but none of these words have been defined adequately let alone quantified. So you can't prove it. I think other people either are not aware of that argument or find it completely unconvincing. That is the nature of the factorization.
The problem is that you can do a better job in other languages than people typically do in those languages. You could write small subroutines in 'C'. You could do anything in 'C' that you do in Forth but the only argument is that people typically don't.
You are also confounded by the fact that you have got good programmers and bad programmers. To do a really good job in Forth you have to be a really good programmer. If you are a really good programmer you can probably do a really good job in other languages too. But people don't. ---
The Pentium is basically a register machine. It has a stack and you use it as the return stack and you have to do something to get a data stack. Looking at these various architectures it is clear to me that the stack is essential, you must have a stack. Registers are optional. If you had some registers you might be able to optimize the code for cases where doing it on a stack is clumsy or difficult.
But to program a register only machine is much more difficult than to program a stack only machine. I think stacks are the important concept. Registers are a nice thing to have if you can afford them. Which is exactly the opposite attitude that everyone who designs computers has.