John Carmack discusses the art and science of software engineering | Bits and Behavior

Original source (blogs.uw.edu)
Tags: johncarmack software-engineering id blogs.uw.edu
Clipped on: 2012-08-26

John Carmack discusses the art and science of software engineering

Posted on August 22, 2012

I’m not really a hard core gamer anymore, but my fascination with programming did begin with video games (and specifically, rendering algorithms). So when I saw John Carmack’s 2012 QuakeCon keynote show up in my feed, I thought I’d listen to a bit of it and learn a bit about the state of game design and development.

What I heard instead was a hacker’s hacker talk about his recent realization that software engineering is actually a social science. Across 10 minutes, he covers many human aspects of developer mistakes, programming language design, static analysis, code reviews, developer training, and cost/benefit analyses. The emphasis throughout is mine (and I also transcribed this, so I apologize for any mistakes).

In trying to make the games faster, which has to be our priority going forward, we’ve made a lot of mistakes already with Doom 4, a lot of it is water under the bridge, but prioritizing that can help us get the games done faster, just has to be where we go. Because we just can’t do this going, you know, six more years, whatever, between games.

On the software development side, you know there was an interesting thing at E3, one of the interviews I gave, I had mentioned something about how, you know, I’ve been learning a whole lot, and I’m a better programmer now than I was a year ago and the interviewer expressed a lot of surprise at that, you know after 20 years and going through all of this that you’d have it all figured out by now, but I actually have been learning quite a bit about software development, both on the personal craftsman level but also paying more attention to what it means on the team dynamics side of things. And this is something I probably avoided looking at squarely for years because, it’s nice to think of myself as a scientist engineer sort, dealing in these things that are abstract or provable or objective on there and there.

In reality in computer science, just about the only thing that’s really science is when you’re talking about algorithms. And optimization is engineering. But those don’t actually occupy that much of the total time spent programming. You know, we have a few programmers that spend a lot of time on optimizing and some of the selecting of algorithms on there, but 90% of the programmers are doing programming work to make things happen. And when I start to look at what’s really happening in all of these, there really is no science and engineering and objectivity to most of these tasks. You know, one of the programmers actually says that he does a lot of monkey programming, you know, beating on things and making stuff happen. And I, you know we like to think that we can be smart engineers about this, that there are objective ways to make good software, but as I’ve been looking at this more and more, it’s been striking to me how much that really isn’t the case.

Aside from these that we can measure, that we can measure and reproduce, which is the essence of science to be able to measure something, reproduce it, make an estimation and test that, and we get that on optimization and algorithms there, but everything else that we do, really has nothing to do with that. It’s about social interactions between the programmers or even between yourself spread over time. And it’s nice to think where, you know we talk about functional programming and lambda calculus and monads and this sounds all nice and sciency, but it really doesn’t affect what you do in software engineering there, these are all best practices, and these are things that have shown to be helpful in the past, but really are only helpful when people are making certain classes of mistakes. Anything that I can do in a pure functional language, you know you take your most restrictive scientific oriented code base on there, in the end of course it all comes down to assembly language, but you could do exactly the same thing in BASIC or any other language that you wanted to.

One of the things that’s also fed into that is my older son’s starting to learn how to program now. I actually tossed around the thought of should I maybe have him try to learn Haskell as a 7 year old or something and I decided not to, that I, you know, I don’t think that I’m a good enough Haskell programmer to want to instruct anybody in anything, but as I start thinking about how somebody learns programming from really ground zero, it was opening my eyes a little bit to how much we take for granted in the software engineering community, really is just layers of artifice on top of a core fundamental thing. Even when you go back to structured programming, whether it’s while loops and for loops and stuff, at the bottom when I’m sitting thinking how do you explain programming, what does a computer do, it’s really all the way back to flow charts. You do this, if this you do that, if not you do that. And, even trying to explain why do you do a for loop or what’s this while loop on here, these are all conventions that help software engineering in the large when you’re dealing with mistakes that people make. But they’re not fundamental about what the computer’s doing. All of these are things that are just trying to help people not make mistakes that they’re commonly making.
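
As an illustration of the point about loops being conventions (this is not from the keynote), here is a minimal C++ sketch of the same computation written twice: once with a for loop and once as explicit tests and jumps, the way a flow chart would draw it. The function names are hypothetical and chosen only for this example.

```cpp
#include <cassert>

// Structured version: the for loop packages the init / test / step pattern
// into one place, which makes it harder to forget a piece of it.
int SumStructured(const int* values, int count) {
    assert(count >= 0);
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// "Flow chart" version: the same control flow spelled out with explicit
// tests and jumps. It computes the same result, but nothing stops a
// programmer from forgetting the increment or jumping to the wrong label.
int SumFlowChart(const int* values, int count) {
    assert(count >= 0);
    int total = 0;
    int i = 0;
check:
    if (i >= count) goto done;
    total += values[i];
    ++i;
    goto check;
done:
    return total;
}
```

Both functions return the same value for the same input; the only difference is how much of the bookkeeping the language convention handles for the programmer.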

One of the things that’s been driven home extremely hard is that programmers are making mistakes all the time and constantly. I talked a lot last year about the work that we’ve done with static analysis and trying to run all of our code through static analysis and get it to run squeaky clean through all of these things and it turns up hundreds and hundreds, even thousands of issues. Now it’s great when you wind up with something that says, now clearly this is a bug, you made a mistake here, this is a bug, and you can point that out to everyone. And everyone will agree, okay, I won’t do that next time. But the problem is that the best of intentions really don’t matter. If something can syntactically be entered incorrectly, it eventually will be. And that’s one of the reasons why I’ve gotten very big on the static analysis, I would like to be able to enable even more restrictive subsets of languages and restrict programmers even more because we make mistakes constantly.
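
To make "syntactically valid but wrong" concrete (this example is not from the keynote), here is a small C++ sketch of a copy-paste slip. The type and function names are hypothetical. The buggy version is legal C++, so good intentions are the only thing standing between it and a check-in unless a tool or a reviewer notices the pattern; whether a particular static analyzer flags this exact case is an assumption and depends on the tool.

```cpp
#include <cassert>

// Hypothetical type for illustration.
struct Point3 {
    int x, y, z;
};

// Buggy: this is valid C++, but the last comparison was copy-pasted and
// compares a.z against b.y instead of b.z.
bool EqualBuggy(const Point3& a, const Point3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.y;
}

// Fixed: every component is compared against its counterpart.
bool EqualFixed(const Point3& a, const Point3& b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

// Small self-check showing where the two versions disagree.
void CheckEqualityExamples() {
    const Point3 a{1, 2, 2};
    const Point3 b{1, 2, 3};
    assert(EqualBuggy(a, b));    // wrongly reports the points as equal
    assert(!EqualFixed(a, b));   // correctly reports them as different
}
```

The two points differ only in z, which is exactly the component the buggy comparison never looks at correctly.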

One of the things that I started doing relatively recently is actually doing a daily code review where I look through the checkins and just try to find something educational to talk about to the team. And I annotate a little bit of code and say, well actually this is a bug discovered from code review, but a lot of it is just, favor doing it this way because it’s going to be clearer, it will cause less problems in other cases, and it ruffled, there were a few people that got ruffled feathers early on about that with the kind of broadcast nature of it, but I think that everybody is appreciating the process on that now. That’s one of those scalability issues where there’s clearly no way I can do individual code reviews with everyone all the time, it takes a lot of time to even just scan through what everyone is doing. Being able to point out something that somebody else did and say well, everybody should pay attention to this, that has some real value in it. And as long as the team is agreeable to that, I think that’s been a very positive thing.

But what happens in some cases, where you’re arguing a point where let’s say we should put const on your function parameters or something, that’s hard to make an objective call on, where lots of stuff we can say, this indirection is a cache miss, that’s going to cost us, it’s objective, you can measure it, there’s really no arguing with it, but so many of these other things are sort of style issues, where I can say, you know, over the years, I’ve seen this cause a lot of problems, but a lot of people will just say, I’ve never seen that problem. That’s not a problem for me, or I don’t make those mistakes. So it has been really good to be able to point out commonly on here, this is the mistake caused by this.
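
For readers unfamiliar with the const-on-parameters debate (this example is not from the keynote), here is a minimal C++ sketch of what the convention buys. The function name and its parameters are hypothetical. The commented-out lines show the kind of slip that a const parameter turns into a compile error.

```cpp
#include <cassert>

// Hypothetical helper for illustration. Both parameters are const, so the
// body cannot reassign them by accident.
int ClampDamage(const int damage, const int maxDamage) {
    assert(maxDamage >= 0);
    // With const parameters, a slip such as
    //     if (damage = maxDamage) { ... }   // '=' typed instead of '=='
    // or
    //     damage = 0;                       // overwrites the input partway through
    // is rejected by the compiler instead of silently changing behavior.
    if (damage > maxDamage) {
        return maxDamage;
    }
    return damage;
}
```

For parameters passed by value, the const only constrains the function body; callers see the same signature either way, which is part of why the convention reads as a style question rather than something measurable.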

But as I’ve been doing this more and more and thinking about it, that sense that this isn’t science, this is just trying to deal with all of our human frailties on it, and I wish there were better ways to do this. You know we all want to become better developers and it will help us make better products, do a better job with whatever we’re doing, but the fact that it’s coming down to training dozens of people to do things in a consistent way, knowing that we have programmer turnover as people come and go, new people coming and looking at the code base and not understanding the conventions, and there are clearly better and worse ways of doing things but it’s frustratingly difficult to quantify.

That’s something that I’m spending more and more time looking at. I read NASA’s software engineering laboratory reports and I can’t seem to get any real value out of a lot of those things. The things that have been valuable have been automated things, things that don’t require a human to have some analysis, have some evaluation of it, but just say, enforced or not enforced. And I think that that’s really where things need to go as larger and larger software gets developed. And it is striking the scale of what we’re doing now. If you look back at the NASA reports and the scale of things and they considered large code bases to be things with three or four hundred thousand lines of code. And we have far more than that in our game engines now. It’s kind of fun to think that the game engines, things that we’re playing games on, have more sophisticated software than certainly the things that launch people to the moon and back and flew the shuttle, ran Skylab, run the space station, all of these massive projects on there are really outdone in complexity by any number of major game engine projects.

And the answer, as far as I can tell, really isn’t out there. With the NASA style development process, they can deliver very very low bug rates, but it’s at a very very low productivity rate. And one of the things that you wind up doing in so many cases is cost benefit analyses, where you have to say, well we could be perfect, but then we’ll have the wrong product and it will be too late. Or we can be really fast and loose, we can go ahead and just be sloppy but we’ll get something really cool happening soon. And this is one of those areas where there’s clearly right tools for the right job, but what happens is you make something really cool really fast and then you live with it for years and you suffer over and over with that. And that’s something that I still don’t think that we do the best job at.

We know our code is living for, realistically, we’re looking at a decade. I tell people that there’s a good chance that whatever you’re writing here, if it’s not extremely game specific, may well exist a decade from now and it will have hundreds of programmers, looking at the code, using it, interacting with it in some way, and that’s quite a burden. I do think that it’s just and right to impose pretty severe restrictions on what we’ll let past analysis and what we’ll let into it, but there are large scale issues at the software API design levels and figuring out things there, that are artistic, that are craftsman like on there. And I wish that there were more quantifiable things to say about that. And I am spending a lot of time on this as we go forward.

This entry was posted in Uncategorized by ajko.