Ideas about a new programming language for games, Annotated: Part 4

By Shamus
on Wednesday Mar 4, 2015
Filed under:
Programming

 
 

We finally get to the good stuff. The discussion below contains a lot of the reasons I wanted to do this write-up in the first place.


Link (YouTube)

1:00:00 Let’s Make an Array Type.

Yeeeessss.

This might seem trivial to a lot of people. This is basically just a fix for the std::vector<int> name; problem I discussed earlier in this series. We use arrays everywhere in our code. We’re constantly building lists of data. C++ offers tons of ways to do this, and somehow they all suck.

The lazy way:

int    things[10];

That gives us space to store 10 things, which sucks if the program suddenly needs 11. The array is of fixed size. So let’s do it the “right” way:

int*   things;
 
things = (int*)malloc (sizeof (int)*10);
// A bunch of stuff happens. And then a long time later...
free (things);

We allocate the memory ourselves. If we need more, we can allocate more. But we need to ask for the memory ourselves, we can’t use more than we asked for, and we can’t forget to free it. It’s prone to mistakes and takes an unreasonable level of work and care for what should be a simple, low-stress task.
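
To make the "allocate more" part concrete, here is a rough sketch of the bookkeeping you end up writing by hand. The names IntList, int_list_push and int_list_free are made up for this illustration, and the starting size of 10 and the doubling strategy are assumptions, not anything from the talk.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical hand-rolled growable list of ints (illustration only).
struct IntList {
    int*        items    = nullptr;  // heap block we own
    std::size_t count    = 0;        // slots in use
    std::size_t capacity = 0;        // slots allocated
};

// Appends a value, growing the block when it is full.
// Returns false if the allocation fails; the old block stays valid.
bool int_list_push(IntList& list, int value) {
    assert(list.count <= list.capacity);
    if (list.count == list.capacity) {
        // Assumed policy: start at 10 slots, then double.
        std::size_t new_capacity = list.capacity ? list.capacity * 2 : 10;
        void* grown = std::realloc(list.items, new_capacity * sizeof(int));
        if (!grown) return false;
        list.items    = static_cast<int*>(grown);
        list.capacity = new_capacity;
    }
    list.items[list.count++] = value;
    return true;
}

// Releases the block. Forgetting to call this is the classic leak.
void int_list_free(IntList& list) {
    std::free(list.items);
    list = IntList{};
}
```

That is three pieces of state and two functions just to store "some ints", and every caller still has to remember to call int_list_free.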

And the best[1] way to do it:

std::vector<int>   things;

That’s actually the least awful solution so far. This makes a list that can grow or shrink or be re-ordered whenever you like, and the vector will worry about all that stupid memory management crap for you. But because this feature was bolted onto the language[2] so late in the lifespan of C/C++, the syntax is strange, verbose, and ugly. If you make a mistake it will sometimes spew pages of gibberish error messages at you over a simple misplaced symbol.
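
For comparison, a small sketch of the same kind of list built on std::vector. The function name first_squares is invented for the example; push_back and size are the real standard library calls.

```cpp
#include <cassert>
#include <vector>

// Builds a list of the first n squares. The vector grows on its own as
// items are pushed, so there is no size to guess up front.
std::vector<int> first_squares(int n) {
    assert(n >= 0);
    std::vector<int> things;
    for (int i = 0; i < n; ++i) {
        things.push_back(i * i);
    }
    return things;  // memory is released automatically when the vector dies
}
```

Asking for 11 items works exactly like asking for 10. Nobody calls malloc or free, which is the whole appeal.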

(Nitpick shield: There are a LOT more ways to store data. Linked lists, hash tables, whatever. But they’re kind of specialized and have other tradeoffs. These three are enough for us to consider right now.)

Blow is proposing we take this simple idea and express it in a simple way. That’s it.

I’m not totally sold on the thing where he moves the brackets around and adds exclamation marks four minutes later. I can’t tell if I don’t like it because there’s something wrong with it or if I don’t like it because it’s unfamiliar. But whatever. The important point is that a language could do better than we’re doing now. (And lots of modern languages do.)

1:14:00 NO HEADER FILES

C programs are nothing more than text files. But you generally don’t want to put your whole program into one text file. It would be gigantic and you’d never be able to find anything. So for the sake of organization, we put related ideas into the same file. All the code for handling space marines goes in SpaceMarine.cpp and the code for rockets goes in Projectiles.cpp and the code for sound effects goes in Audio.cpp. Or whatever. You can organize it however you like.

So the functions in these files need to talk to each other. Maybe a rocket, when it detects it’s hit a space marine, needs to make them explode. So it will call SpaceMarineExplode (). So the compiler is working its way through Projectiles.cpp and it sees the use of SpaceMarineExplode (), but it doesn’t know what to do with it. It hasn’t read the space marine file[3]. It doesn’t know that SpaceMarineExplode () exists. Maybe I just broke up with my girlfriend Marnie, and in a fit of distress, teen angst, and Freudian slippage I’ve accidentally typed SpaceMarnieExplode (). How is the compiler supposed to know that one is correct and the other is wrong? What about when the phone rings, I hope it’s her, then realize it can’t be her, and won’t ever be her ever again, and I go face-down on the keyboard and as I’m sobbing snotty tears I manage to type kbggggggytyutgiiiiiiiii? How does the compiler know that SpaceMarineExplode () is a valid function and kbggggggytyutgiiiiiiiii isn’t?

Well, in C we have “prototypes”. It’s just a single line of text[4] that says, “SpaceMarineExplode () is a thing. It’s not in THIS text file, but I’ll define it in one of the other text files.” It’s a promise to the compiler that the thing will show up later.
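
Here is roughly what that promise looks like, squeezed into one file so it can be read top to bottom. The SpaceMarine struct and the RocketHit function are invented for the illustration. In a real project the three pieces would live in separate files, as the comments suggest.

```cpp
#include <cassert>

// Hypothetical game type, just enough to have something to blow up.
struct SpaceMarine {
    int  health;
    bool exploded;
};

// The prototype: "this function is a thing, the body is somewhere else."
void SpaceMarineExplode(SpaceMarine& marine);

// Stand-in for Projectiles.cpp. It can call the function having seen
// only the prototype above.
void RocketHit(SpaceMarine& marine) {
    SpaceMarineExplode(marine);
    // SpaceMarnieExplode(marine);  // would not compile: nothing declares it
}

// Stand-in for SpaceMarine.cpp. This definition keeps the promise.
void SpaceMarineExplode(SpaceMarine& marine) {
    assert(marine.health >= 0);
    marine.health   = 0;
    marine.exploded = true;
}
```

If the definition never shows up, the compiler stays happy and the linker is the one that complains, which is the "You HOPE" part.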

I don’t know enough about the history of the language, but I imagine this was done because C was invented around 1972 and computers had very little memory, they were very slow, and disk access was glacial. Compiling would take a long time. The last thing you wanted was for the compiler to have to read ALL THOSE TEXT FILES THERE MUST BE NEARLY 100 KILOBYTES OF THEM ARE YOU MAD and create an inventory of everything, and then go through them all AGAIN to do the actual compiling.

To avoid this, we put all those prototypes in another text file called the header file. Like SpaceMarine.h. Then Projectiles.cpp can #include "SpaceMarine.h" which will bring in all those prototypes. Basically, we’re manually doing that messy “inventory” step manually. By typing things. Instead of letting a computer do it.

This is a monumentally horrible way of doing things, and has been for decades. It gets even more fun when one header file needs to read another. So SpaceMarine.h will #include Weapons.h and that file will include Projectiles.h, which turns around and includes SpaceMarine.h, which is the file it’s already trying to compile. So the compiler goes crazy, locks up, or (if you’re using a modern compiler because you’re not a savage) gives you a cryptic error that lets you know about the circular reference without telling you what it is. There are ways to protect against this, which involve stupid boilerplate code to the effect of “If you’ve read this file before, don’t read it again”. That seems like exactly the kind of busywork the compiler could sort out on its own if this paradigm hadn’t been designed around the time The Beatles broke up.
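
The usual boilerplate is an "include guard". The sketch below fakes two inclusions of a hypothetical SpaceMarine.h by pasting its contents twice into one file. The macro name SPACE_MARINE_H and the MarineHealthAfterHit function are assumptions made up for the example.

```cpp
#include <cassert>

// ---- first "inclusion" of the hypothetical SpaceMarine.h ----
#ifndef SPACE_MARINE_H
#define SPACE_MARINE_H
struct SpaceMarine { int health; };
#endif // SPACE_MARINE_H

// ---- second "inclusion": the macro is already defined, so the
// ---- preprocessor skips everything up to the #endif
#ifndef SPACE_MARINE_H
#define SPACE_MARINE_H
struct SpaceMarine { int health; };  // without the guard: redefinition error
#endif // SPACE_MARINE_H

// Ordinary code that uses the type, to show it was defined exactly once.
int MarineHealthAfterHit(SpaceMarine marine, int damage) {
    assert(damage >= 0);
    return marine.health > damage ? marine.health - damage : 0;
}
```

Delete the two guard lines from either copy and the file stops compiling. That is the entire job of those three lines, repeated in every header you ever write.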

So header files suck. They dump onto programmers really tedious busywork that some sort of compiler could easily take off our hands. Other languages don’t waste your time with this.

So that’s the presentation. Hope you enjoyed it. Or found it educational. Or whatever. I have to go back to complaining about using C++ to get work done.

Footnotes:

[1] Not the best in all cases, but certainly the best for simple lists of unknown size.

[2] Does the standard library count as “part of the language”? I honestly don’t know.

[3] IT actually reads each file in isolation. It’s… not a great system.

[4] You HOPE.


 
 
Comments (104)

  1. Daemian Lucifer says:

    Basically, we're manually doing that messy “inventory” step manually.

    In that sentence, I think you wrote manually twice by mistake in that sentence.

  2. Daemian Lucifer says:

    Well after reading all this, I now agree with what James Schend said in part 2: What Blow wants is C#. Only not made by Microsoft, I guess.

    And I know you complained about autocomplete on your Twitter, but it’s actually pretty useful for avoiding your ex-gf Marnie messing up your code. With autocomplete, you can easily spot if you mistyped a function or a class name.

    • Bloodsquirrel says:

      No, he’s very explicitly not looking for C#. He doesn’t want garbage collection. 90% of his ideas are incompatible with C#, seeing as how C# is managed and doesn’t need stuff like “ownership” for memory.

    • Volfram says:

      “C# not made by Microsoft?” That sounds familiar.

      You can even turn off the garbage collector or tell it not to run in certain areas. (Though you shouldn’t use dynamic arrays with the garbage collector off.)

    • 4th Dimension says:

      In Visual Studio the autocomplete is called Intellisense. Visual Studio is smart enough to figure out what things you might be trying to type that start with SpaceMa, and will helpfully offer them in a nice popup list that gets further filtered as you type. I don’t know how it works in C++ where things can get a bit freeform, but in C# if I start typing marine.SpaceMar it might offer me a list of methods, properties and fields of marine that start with SpaceMar.

      Also as Bloodsquirrel says if he doesn’t want GC he really can not use C#. While there are ways to work with pointers in C# using unsafe keyword, they are there mostly as hacks and workarounds necessary for you to call procedures written and compiled in other languages.

      • swenson says:

        Even better yet, if I’m using marine.SpaceMarineExplode() a bunch of times in a row, it will remember this and the next time I type “marine.” it will automatically select “SpaceMarineExplode()” for me! Intellisense is the most beautiful thing.

  3. Da Mage says:

    I actually really like header files as I basically use them to provide my object documentation. Since all the functions are defined there, I can put a decent sized comment describing what it does and then all my comments are neatly one under the other and I can figure out what the entire object can do without needing to have a huge file full of code. That’s just me though.

    Having the header separate also cuts down on the amount of cyclical dependency you end up with: if you are only using an object from another header within functions, then it can just be included in the source file instead of the header. I don’t know enough about C/C++ compiling, but your theory on memory size wouldn’t surprise me.

    • koriantor says:

      I agree. I find it’s much easier to figure out what’s going on in the code if I have a header file that acts as a program outline or a map. It’s the perfect spot for documentation. You have a basic list of functions that the code is aware of, which (in theory) have documentation that describe those functions. No need to hunt down the function’s definition. No need to read code to figure it out (that is if you’re writing proper documentation… which I know is an ideal…)

      C++ also introduced the “#pragma once” command a few years back which takes out the need for the if-endif boilerplate. I don’t remember the specifics of what the preprocessor sees, but it accomplishes the same thing as if-endif, and it’s a lot easier to type. Basically you just need to understand that you should use it and then just type a little bitty line of code for every file you make. (Maybe you shouldn’t need to even write that line of code and the compiler should just understand to not repeat itself, but that’s not an easy fix for the language and I think #pragma once is a pretty good solution considering)

      • DrMcCoy says:

        Keep in mind though that #pragma once is not standard C++. The #pragma mechanism itself is in the standard, but what any particular pragma does is up to the compiler. Yes, most compilers support it, but it’s not an official feature of the language. If you want to stay strictly portable, it’s better not to use it.

        Me, I don’t see the problem of consistently writing

        #ifndef SUBSYSTEM_FOOBAR_H
        #define SUBSYSTEM_FOOBAR_H
        […]
        #endif // SUBSYSTEM_FOOBAR_H

        in subsystem/foobar.h (and analogously for other files). I also have boilerplate license information and a Doxygen blurb in all my files. It takes maybe a minute copy-pasting this for new files and changing the name; I don’t even think about it anymore. *shrug*

        • Kian says:

          I agree that you can get used to it. I’m used to it too, and it’s never been a source of issues for me. But I can see that it’s a problem too. Whenever you need to do something repetitive, that’s a problem. It’s noise that doesn’t add to the program. That you can do it with your eyes closed is evidence of it. If it’s obvious, I shouldn’t have to write it.

          Most IDEs will offer some kind of shortcut to help, too. Not sure what VS’s is, I should check.

        • nm says:

          Objective C (which is actually a strict superset of C, unlike C++) added a new keyword “import” to do inclusion without reinclusion. It’s not a hard problem.

          • Stuart Hacking says:

            It’s not a hard problem in C++ either, providing you choose a compiler that supports the #pragma directive. However, if you want to write a library that will still work for the lowest common denominator, then you choose the approach that will work for the guy who’s stuck using an older compiler. Objective-C had the benefit of hindsight here.

            The general hard problem is that modern solutions need to be backwards compatible and not everyone immediately updates their tools, for whatever reason. So you can choose to use the new hotness, at the expense of raising the minimum required tool version, or you can opt for wider support, depending on your particular needs. (In my opinion, the #ifdef scenario is very little effort for the benefit of basically universal compiler support.)

      • Zukhramm says:

        As far as I know #pragma once is non-standard. Also, while it might be easier to write, it’s harder to read, to not much benefit (two lines per header, that’s negligible). With #ifndef the words tell you exactly what the directive is doing, #pragma once is just this mysterious spell you have to learn.

        • Blake says:

          On the other hand I’ve seen places where someone has written
          #ifndef FILE_NAME_H
          #define SOME_OTHER_FILE_NAME_H

          And had that sitting around in the code for a long time.
          I think it was the Clang compiler we use on one platform that came out and gave us a warning about it.

    • karln says:

      I guess non-headery languages tend to deal with that by providing a docstring format or similar, whereby comments with certain formatting or positioning can be extracted by a tool into a set of documents (HTML or whatever), which gives you the comments and prototypes in a browsable format.

      Adds an extra step to your workflow, but it can be automated, and you don’t have to manually update the prototypes in two places when you change them.

      • Hankelhankel says:

        Yeah, .NET for example uses “smart” XML comments. If you have the right tags in a comment block that immediately precedes a method (Visual Studio will auto-create those tags for you when you type /// in C# or ''' in VB), that information about the method, its arguments and its returns will appear in the IntelliSense tooltip whenever you call that method.

        Probably the best thing about it is that it makes it REALLY hard for your coworkers to miss your critical information on how to use that method you just wrote.

    • The Snide Sniper says:

      I agree. A header file serves two practical purposes:
      – A standard place for short, well-documented references to black-box code. (Severely reducing clutter)
      – A reference to pre-compiled code, or to a block of code that can be compiled separately. (the .o files your compiler generates)

      If it weren’t for those (heck, if it weren’t for the first one), I’d agree with Blow. If you’re not using header files for documentation and black-box abstraction, you’re just wasting time writing code twice.

      Unfortunately, as you start using more of C++ (templates in particular), you end up needing to put more code into header files, which starts defeating purpose #1. There’s a workaround for this (which is creating a separate header file and #include-ing that in your documentation header), but it’s not… elegant.

      • CJ Kerr says:

        A standard place for short, well-documented references to black-box code. (Severely reducing clutter)

        This is good practice in C, but that doesn’t mean it’s the best way.

        Much nicer is to use a tool like Doxygen to extract comments into nicely formatted documentation automatically (and put it in your build process!). Now the comments can be written once where they belong (with the code) but read by other people who are using your library like a black box.

        Doxygen itself is kind of clunky, because it supports multiple languages with different syntaxes and doesn’t want to lock itself into a single idiomatic style. For a better example of doing this in a modern language, look at Godoc: http://blog.golang.org/godoc-documenting-go-code

  4. Henson says:

    But what happens if your ex-gf marnie becomes a marine in maine with her sister marie? And what if she’s armenian? You may as well just give up programming, then.

  5. DrMcCoy says:

    In C++, you actually want to use new[] instead of malloc(), especially if you’re allocating class instances: new[] calls the constructors, malloc() doesn’t.

    And yes, of course the STL is part of the language. The C++ standard, even since the first C++98, covers what the STL contains, its features and constraints. You don’t have to use it (and on certain (embedded) systems it’s better not to, for various reasons), but it is a part of the language all the same.

    I’ve mostly refrained from commenting on these posts of yours, because I’m one of those people that like C++. I like manual memory management, I like headers, I like exceptions, I dislike garbage collection. Sure, there’s a few things I like to change (even in C++11), but for the most part I’m pretty happy with C++. I basically disagree with everything Blow says.

    But I also quite like the GNU autotools, so I’m kind of an outlier.

    • Kian says:

      I’m a fan of C++ as well. I’ve actually enjoyed this series because it lets me write about C++ in the comments :P However, I’m not a big fan of the syntax.

      I guess I’d like to see “C++ without C”, all the goodness of C++ without the burden of having to support C’s syntax. C is a good language, but the way it does things runs counter to C++’s paradigms. Such as Shamus’ use of malloc() instead of new().

      • Bloodsquirrel says:

        I love C++ too, but I agree almost entirely with Blow. There are definitely opportunities to clean things up, and his ideas add ways to manage memory more conveniently without preventing you from doing things the C++ way when you need to.

        I mean, I get why header files are handy, but couldn’t we just generate those automatically from the main class file instead of requiring me to manually edit two class signatures every time I change or add a function? Forcing somebody else’s documentation scheme on everyone who uses a language is a failure of design.

      • Chris says:

        Agreed. I suspect that a substantial portion of complaints about C++ are really complaints about C that C++ has already solved.

    • Blake says:

      Are you a games programmer or other applications?
      Every games programmer I’ve talked to about the talks (either at work or elsewhere) tends to agree with basically everything Blow is saying, and most of the people I’ve seen disagree aren’t games programmers by trade.

      In my case I like being able to read header files to quickly see the interface to a class, but actually writing the extra files, adding them to the solution, updating the files, and updating the files again at a later date does tend to add a lot of busywork for no real reason.

      The main thing I like with header files (getting a class description) could be easily generated by an IDE or language feature, and if you want comments on the functions I’m sure you could easily add them too.

      Basically I don’t think it’s the job of the language to force that extra work on people when they don’t want it.

    • Wide And Nerdy says:

      All I know is, I thought arrays were enough of a chore in javascript. But then I’m still at the point where I have to remind myself that arrays are zero-indexed so

      myArray[myArray.length] //will always be out of bounds (it gives you undefined). If you want the last element you need to do either

      myArray[myArray.length - 1] //do you need parentheses? I can’t remember OR

      var last = myArray.pop() //if you want to access the last value and remove it from the array.

      And the “last in first out” approach implied by pop() is also not intuitive to me yet which I’m sure betrays my immaturity as a front end web design/developer. But then, the most performance intensive thing I’ve ever tried to pull off in a browser is smooth parallax scrolling (and after seeing how easy it is to get that wrong, I have a lot more respect for what game developers do, very educational).

      Point is, if I’d started out my life as a developer working in a language where you also have to manually manage memory allocation for arrays, I’m not sure I would have ever stuck with it long enough to learn anything useful. And I know plenty of people on Stack Exchange who would say that’s a good thing and I should quit.

      Hard to stay motivated when your entire field feels like it’s designed to make you feel stupid.

      • Kian says:

        I don’t know about Javascript, but in C++ you have different structures, and the name of the structure tends to help understand its behavior. Vector in particular is terrible in that its name is more of a historical artifact than a good representation. A stricter name might be “variable length array”, but then that is too wordy.

        Arrays in C++ don’t need to have their memory managed because they are of fixed length, so no reallocation required, thus no memory management.

        However, what you talk about “last in first out” is what an algorithms and data structures course would have taught you to call a stack (as in, stack of cards). In C++, and most other languages I believe, you would have a Stack class, built using an array, that gives you that behavior if you want it.

        Adding that method to arrays is a defect of the language, and “not getting it” is not a failure on your part as a programmer. You could have a similarly named pop() method in a queue, for example, which is a “first in, first out” container, and you’d get the exact opposite behavior. What makes it make sense is what kind of container it is. It makes sense in a stack to pop the thing at the top, and it makes sense in a queue to pop the thing that’s been waiting the longest.

        Arrays don’t imply either of those behaviors, you can in fact construct either a queue or a stack on top of an array.

        • Wide And Nerdy says:

          True. I haven’t had much use for removing things from arrays after creating them (I tend to pack up an array and send it to an API). But you’re reviving dim memories of strongly typed languages from when I tried to learn C# (well, I did learn C# but I didn’t understand what I needed at the time to go from that to a web page).

          Strangely, I think I agree with you that strongly typed languages end up being easier to understand than weakly typed ones like javascript. Sure it’s a little more work up front but it’s also more immediately obvious what I did wrong when I fail.

          I declare everything as a var in javascript (ints, strings, arrays, functions, objects) and can never remember if I stored something as a string or int (and since the symbols for addition and concatenation are the same, that’s . . . fun)

        • WJS says:

          Javascript arrays have methods to add or remove things from either end, so yes, you can do both with them. Why is being versatile like that a problem?

  6. Kian says:

    The bit about headers touches on something that really annoys me about compiling C++: There is no standard build mechanism.

    For those that don’t know, the way you compile a C++ program is like this:

    You tell the compiler to compile each of your code (.cpp) files, with the flags and options required. So if you have a hundred files in your project, you have to tell the compiler a hundred times “compile aaa.cpp in debug mode”, “compile aab.cpp in debug mode”, “compile aac.cpp in debug mode”, etc.

    This builds your object files, which are basically the machine translation of your code, with placeholders for where you told the compiler “I’ll tell you what SpaceMarineExplode() means later, it’s in another file”.

    You then tell the linker to grab those one hundred object files, plus any external libraries you want to include in your project, and fit it all together. It’s a bit like assembling a jigsaw puzzle, with the placeholders representing where the different files slot together.

    If you did everything fine, your program compiles and the linker spits out an executable.

    Obviously, no one does all this by hand. Otherwise it would take forever to get anything done. No, instead, we have a hundred different incompatible systems to define how a particular project should be compiled. If you use an IDE, like Visual Studio, you have the IDE’s project files describe what all the files needed are, where they are, options, etc. So you can’t easily switch IDEs, you have to copy the files and then rebuild the project and all the options in the new IDE. You might also have makefiles, which are used by the “make” utility. Only there are different make utilities too, with incompatible makefile formats.

    The problem is so annoying, someone came up with the idea of CMake. CMake is not a build system. Instead, it’s a generator for different build systems. So it takes a list of all the files you want to compile, and it produces Visual Studio solutions, or makefiles, or any one of a bunch of project files for different IDEs.

    I’d love it if the specification would define a way to describe all this in a standard way that all the IDEs would then support and save us the hassle.

    • DrMcCoy says:

      You might also have makefiles, which are used by the “make” utility. Only there are different make utilities too, with incompatible makefile formats

      Not quite. There is a POSIX standard for Make, the core subset of basically all Make implementations (the big ones being GNU Make and BSD Make).

      Unfortunately, Make itself gives you no way to search for different locations where libraries might be, check for compiler features and the like. So people commonly preface a configure stage, which is a shell script that does these checks and modifies the Makefile (*). Alternatively, CMake (or a dozen other solutions) acts as this configure step.

      (*) To make sure the configure script stays portable, doesn’t do weird stuff and to cut down on the boilerplate required, there’s again several ways to generate the configure script, GNU autoconf being one of those. So a build system using the full GNU autotools range would have a step to parse a Makefile pre-pre-stage into a Makefile pre-stage, parse a configure pre-stage to create the configure script, run that configure step and combine the output with the Makefile pre-stage into Makefiles. Optionally, you can have a config.h pre-stage to generate a config.h where configure places stuff it’s found, and call libtool to handle compatibility things for building libraries. This all runs and is configurable using the m4 macro language that’s never used anywhere else.

      Yes, this all sounds horribly complex and insane, but it creates the most portable system out there that also works relatively comfortably if you want to do weird things like cross-compile Windows binaries on your GNU/Linux box. However, to use this on Windows, you need to bend over backwards and install MinGW or a full-fledged Cygwin environment, since this doesn’t mesh with the usual Visual Studio ways.

      • Bryan says:

        autotools >>> cmake

        So, so, so much better. If you pull down any random package using cmake and try to build it with a slightly different system from the one the person who built the package was using, you’re probably screwed. (How do I tell it where to install? Not everything should be in /usr/local, or /usr. How do I tell it what compiler to use? Not everything uses gcc, or cc; sometimes I need the much newer compiler in /opt somewhere. How do I tell it what compiler flags to use? Half the time I need -m32 and half the time I need -m64. What if I need one different flag when building C++ than when building C? *None* of this is standardized.)

        Autoconf and automake (mostly the latter) provide trivial support for --prefix / ${prefix}, --libdir / ${libdir}, etc., as well as CC, CXX, CFLAGS, CXXFLAGS, etc. *Every* package using autotools supports these.

  7. Bloodsquirrel says:

    The syntax for std::vector isn’t a product of vector being bolted on at the last minute. It’s the standard syntax for templates in C++; vector is just a regular class that could be written by anyone using C++.

    You can actually get rid of the std:: part by adding “using namespace std;” at the top of your file. This is often not done because it might cause namespace conflicts (The compiler doesn’t know what class you’re referring to because something in the std namespace has the same name as another namespace you’re using). This is something that could probably be fixed by organizing the namespace and cleaning up how #includes work.

    The rest of it is standard templates syntax. Java’s template syntax is actually identical (although the implementation is very different, and kind of a mess). Templates exist so that you can write a function or class which handles any type of parameter generically without knowing what it is until the program is compiled (this is usually used for lists and other data structures where you don’t need to know how that type behaves, you just need to store it). So with templates you can write a generic list class that you can then import and use with int, char, String, or any other type/class. They’re an important feature for a language to have. Maybe some other language has come up with better syntax; I don’t know.

    Blow isn’t actually recommending implementing a language-level construct that acts like std::vector (vector does a lot of stuff behind the scenes in order to be able to grow; you don’t want that baked into the language); what he wants is just something that doesn’t require storing the size of the array as a separate variable.

    Also, C++ recommends against using malloc. You’d actually use:
    int sizeOfArray = 5;
    int * foo = new int [sizeOfArray];
    //shit happens
    delete[] foo;

    • DrMcCoy says:

      If you want to get rid of the std:: in front of vector, instead of pulling in everything from std with “using namespace std;”, you can just pull in vector with “using std::vector;”.

      EDIT: And yes, you need to do this for everything you’re using from std manually. This way, you know what you’re pulling in. The thing is, a conflict does not necessarily make the code not compile. It can, in certain circumstances, still compile and just call the wrong function, leaving you with head-scratching bugs.

      • mhoff12358 says:

        I think the real issue is the way that templated classes very easily turn into gobbledygook. It’s rather easy to have an unordered_map consisting of a string mapped to a pair of values. And then you need to take its iterator. Writing out these type names is a pain, but more importantly the compiler’s output any time something goes wrong is absolutely useless. Errors cascade through layers of the templating filling the screen with junk over a one line error.

        • DrMcCoy says:

          Yes, I give you that

          for (std::unordered_map<std::string, Something>::const_iterator i = something.begin(); i != something.end(); ++i)

          is long and unwieldy. typedefs help to cut it down, as does the auto keyword in C++11, but not completely.

          And yes, the error messages are quite often near useless for templates, apart from pointing you at the line of the issue. I am not sure, however, how that could be fixed.

          EDIT: Narf, had to edit and use the HTML entities for the angle brackets.

          • Phil says:

            I used to use, I think it was STLFilter in Visual Studio 6, which involved renaming cl.exe, putting in a new cl.exe that just called the real cl.exe and then cleaned up the output. Definitely not a standard part of the language, but was a nice workaround.

    • HeroOfHyla says:

      [pedantic]you’d be better off doing std::size_t sizeOfArray rather than int. size_t is guaranteed to be large enough to hold any array index, and it’s unsigned automatically.[/pedantic]

      Sorry, I’m taking a data structures class right now and it’s rubbing off on me.

    • CJ Kerr says:

      It's the standard syntax for templates in C++

      I’m pretty sure this (rather than the STL) is actually what Shamus is complaining about – C++ has a weird problem with syntax, because it has to be backwards compatible with C. Every time they add a new feature, they have to find a new syntax to express it which NEVER appears in valid C (well, in valid C89. I don’t think modern C++ promises to be compatible with K&R C).

      This means that big chunks of the syntax aren’t intuitively related to each other (particularly to people migrating from C, and yes, we still exist), and that new bits of the language (C++03 and onwards) don’t necessarily look much like the old bits, to the point where you might not recognise it as the same language at all if you’re looking at brand new C++11 code with auto type deduction and lambda expressions.

      Which is a long winded way of saying this:

      std::vector<int> things;

      is already kind of ugly, and it’s a VERY simple example of a C++ template. More complex examples are a lot uglier.

      • Bloodsquirrel says:

        Like I said, though, that’s the same syntax that Java uses for its degenerate version of templates. I haven’t seen better syntax for templates in any typed language.

        • Zukhramm says:

          The style ML or Haskell goes for is certainly lighter, just the words next to each other, so either “int list” or “List Int”, depending on language.

          And before anyone comments: yes, there are special literals for lists, I’m just using lists to show the general case.

      • kdansky says:

        Luckily, there is “auto” which removes a ton of syntactic garbage. Unluckily, auto is too high concept for the inflexible crowd that are the majority of C++ programmers.

        • Shamus says:

          So I tried using “auto” for the first time last week. (Just recently upgraded to C++11.)

          The most common place I want to use it is iterating through vectors. I always do:

          for (int i=0; i < foovector.size (); i++) {}

          And then the compiler warns me because I'm comparing an int to an unsigned, because size() is unsigned. And then I have to go back and fix it. So after making this mistake a million times I thought, "Ah! good place to use auto. It'll save me five keystrokes and a compiler warning." So:

          for (auto i=0; i < foovector.size (); i++) {}

          But here auto chose to be an int. And then the compiler bitched at me again. You had ONE job, auto!

          • Csirke says:

            Well, this is the C++11 way to go through that vector:

            for(const auto& item : foovector) {}

            Which I think is pretty nice.

            (That is, if you don’t want to modify the items. If you do want to modify individual items, you can ditch the “const”. If you want to modify the vector itself, you can’t do this; you have to use indices. Or even better, often one of the algorithms in #include <algorithm> with a nice lambda will also work :) )

            The automatic type deduction only uses the value you first put in the variable to deduce the type, and “0” is an int by default. When using indices like that I usually go with

            for (size_t i=0; i < foovector.size (); i++) {}

            because "size_t" is guaranteed to work, and is shorter than "unsigned".
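The options Csirke lists can be sketched together as follows. The function names are invented for the example, and `foovector` is assumed to be a `std::vector<int>`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Read-only pass with a range-based for loop.
int sum(const std::vector<int>& foovector) {
    int total = 0;
    for (const auto& item : foovector) {
        total += item;
    }
    return total;
}

// Modifying individual items: same loop without the const.
void double_all(std::vector<int>& foovector) {
    for (auto& item : foovector) {
        item *= 2;
    }
}

// Changing the vector itself (here: removing elements) with an
// <algorithm> call and a lambda instead of a hand-written index loop.
void remove_negatives(std::vector<int>& foovector) {
    foovector.erase(
        std::remove_if(foovector.begin(), foovector.end(),
                       [](int x) { return x < 0; }),
        foovector.end());
}

// Index loop with std::size_t, which avoids the signed/unsigned warning.
std::size_t count_even(const std::vector<int>& foovector) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < foovector.size(); i++) {
        if (foovector[i] % 2 == 0) ++count;
    }
    return count;
}
```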

          • Kian says:

            What Csirke mentions about the new range based for is true, although range based for can sometimes be annoying in that it doesn’t let you iterate over a portion of the container (except for early exit), it’s all or nothing.

            What you can do, however, is use for with iterators instead of indexes. There, auto shines:

            for (auto it = foovector.begin(); it != foovector.end(); ++it)

            As for why it chose int in your example, it’s because auto infers the type of the thing you initialize the variable with. In your case, the literal ‘0’, which is of type ‘int’. Which is why it’s nice to use it with iterators. instead of

            std::vector<T>::iterator it = foovector.begin()

            you can use

            auto it = foovector.begin()

          • Phil says:

            That’s even more fun when you’re also working with Windows code. Now you get to try to resolve signed/unsigned differences in for loops, and calls to the API/MFC (especially if you’re using the same variable for both).

          • Carlos Castillo says:

            That isn’t auto’s job.

            Auto’s purpose is to take an expression like:

            int foo = value;

            And realize that since value’s type is known to the compiler, there’s no need for the user to specify the type again.

            In your for loop example, you’re essentially giving it conflicting information, and thus a warning/error is appropriate. First you try to assign 0 to the variable, then you try to compare it to the result of the method (an unsigned value). I don’t know the spec for auto in C++, but I’m guessing that a raw integer constant is treated as signed by auto, since signed integers are usually the preferable choice (eg: int is signed).
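The deduction rule being discussed can be checked at compile time. This is a sketch under the assumption that `foovector` is a `std::vector<int>`; the function names are made up for the example.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Compile-time checks of what auto deduces from each initializer.
void check_deduced_types(const std::vector<int>& foovector) {
    auto from_literal = 0;               // initializer is the literal 0
    auto from_size = foovector.size();   // initializer is size()'s result

    static_assert(std::is_same<decltype(from_literal), int>::value,
                  "auto deduces int from the literal 0");
    static_assert(std::is_same<decltype(from_size),
                               std::vector<int>::size_type>::value,
                  "auto deduces the container's size_type from size()");
    static_assert(std::is_unsigned<decltype(from_size)>::value,
                  "size_type is unsigned");

    (void)from_literal;  // silence unused-variable warnings
    (void)from_size;
}

// One way to let the index take its type from size() rather than from 0.
std::size_t count_elements(const std::vector<int>& foovector) {
    std::size_t n = 0;
    for (decltype(foovector.size()) i = 0; i < foovector.size(); i++) {
        ++n;
    }
    return n;
}
```

The `decltype` spelling is longer than `size_t`, so in practice the `size_t` loop shown elsewhere in this thread is usually the more readable choice.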

          • Neil Roy says:

            The best way to do this with the standard for() loop rather than an iterator, I found was…

            for (size_t i=0; i < foovector.size (); i++) {}

            you'll get rid of those warnings (which bug me) with that. And I believe .size() returns size_t so it only makes sense.

      • fscan says:

        How is std::vector<T> uglier than, for example, java.util.ArrayList<T>?

        Btw, from a performance standpoint vector is miles ahead of a Java or C# dynamic array (well, C# is good with basic types) as the data is stored sequentially in memory. This is the number one reason why it’s implemented as a template.

        • guy says:

          I am given to understand that Java arrays are usually implemented sequentially in memory. I mean, Java is all about not having to care and I’m not sure if the JVM spec technically requires it, but it’s generally done that way. ArrayList is actually backed by an array, so it’s got the same performance for everything except some insertions; Java arrays are fixed size and if ArrayList is out of space in the old array it needs to make a bigger one and copy over the contents. But it will get extra space so that doesn’t have to happen every time. Vector must work similarly, or expanding one would overwrite subsequent data.

          • fscan says:

            Java does not have any value semantics for classes. What you get is a sequential array of *pointers* to the values.
            Very bad for modern processors to work with.

            • guy says:

              Um, if the C++ vector doesn’t use pointers, you can only use it for fixed-size objects. Granted, C++ does have many more of those than Java.

              Anyways, it’s only a significant issue for modern processors if it causes a cache miss. Modern processors are pretty good at not sitting idle while waiting for something from the cache. Using the pointers only adds one memory access, which would be a single instruction. That’s likely to be lost in the noise of all the other memory accesses you do after getting the object, because registers are tiny.

              • fscan says:

                If you are transforming a large array in a loop, the processor doesn’t have much room to reorder instructions. It just has to load the stuff before using it. Processors are very good at predicting what to load next, but this only works if the data is sequential (prefetcher!!). If you are chasing pointers all over memory, cache misses are guaranteed.
                And having an array of fixed-size memory chunks (coordinates, matrices, etc) is pretty much the common case. If it’s not, you are probably overusing subtyping. Especially in performance critical paths.
                Also, the overhead of having an additional pointer for every array entry adds up pretty fast (eg almost doubling the size of an integer array list).
                Good performance on modern processors mostly comes down to optimized memory access. This is why java/c#/… are slower, not because they are jit’ed.
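To make the layout difference concrete, here is a small sketch contrasting by-value storage with pointer storage in C++. `Vec3` and the function names are assumptions for the example, and the pointer version only approximates how a Java list of objects is laid out.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct Vec3 { float x, y, z; };   // a typical fixed-size chunk (assumed type)

// True if every element sits directly after the previous one in memory.
bool is_contiguous(const std::vector<Vec3>& points) {
    for (std::size_t i = 1; i < points.size(); i++) {
        if (&points[i] != &points[i - 1] + 1) return false;
    }
    return true;
}

// By-value storage: one sequential walk over a single block of memory.
float sum_x_values(const std::vector<Vec3>& points) {
    float total = 0.0f;
    for (const auto& p : points) total += p.x;
    return total;
}

// Pointer storage: the vector holds pointers, so each element costs an
// extra indirection to wherever that object happened to be allocated.
float sum_x_pointers(const std::vector<std::unique_ptr<Vec3>>& points) {
    float total = 0.0f;
    for (const auto& p : points) total += p->x;
    return total;
}
```

Both functions return the same sum; the difference is only in where the processor has to go to fetch each `Vec3`.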

              • Bloodsquirrel says:

                It’s not that C++ has more fixed sized objects- just like Java, you can subclass any object (or struct), at which point it will no longer be the same size if you add any properties.

                What C++ does is give you more control, such that you can actually say “No, allocate the actual memory for this object as part of this array/class here instead of using a pointer”. This requires making the decision that you’re using MyObject, and not any subclasses of MyObject, which requires that the syntax differentiates between declaring and accessing an object as a pointer and doing so as a direct value. It’s one of those tricky things to learn when dealing with C++, but it adds more flexibility to the language.

                Java also can’t do what vector does because it doesn’t have proper templates. When you use C++ templates, you determine what type you’re actually using at compile time. The actual compiled code acts like your vector was written for MyObject all along. Java uses type erasure, which basically means that it has no idea what type it’s dealing with, even at runtime, meaning that it doesn’t even know the size of the object it’s putzing around with.

              • Kian says:

                “Um, if the C++ vector doesn't use pointers, you can only use it for fixed-size objects.”

                Vector can be used for any movable or copyable type, and it guarantees that all the objects will be stored in contiguous memory. Which is the source of the issue with headers: in order to know how large a class is, it needs to know the layout, so it needs to see the declaration of the class while compiling the code, and any classes included in it.

                However, if you have a pointer to a class as a member of one of your classes, you just need to declare the name of the class (called forward declaring), not include the header. Since the size won’t matter, you don’t need to include the header.
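A minimal sketch of the forward declaration Kian mentions, squeezed into one file for illustration. `Car` and `Engine` are made-up classes; in a real project the `Engine` definition would live in its own header, included only by the implementation file.

```cpp
class Engine;   // forward declaration: just the name, no layout

class Car {
public:
    explicit Car(Engine* engine) : engine_(engine) {}
    int horsepower() const;     // defined below, once Engine is complete
private:
    Engine* engine_;            // a pointer's size is known without Engine's layout
};

// From here on is what would normally sit behind the header boundary.
class Engine {
public:
    explicit Engine(int hp) : hp_(hp) {}
    int horsepower() const { return hp_; }
private:
    int hp_;
};

// Using the pointee requires the full definition, so this comes last.
int Car::horsepower() const { return engine_->horsepower(); }
```

If `Car` held an `Engine` by value instead of by pointer, the compiler would need the full `Engine` definition before `Car`, which is exactly the header dependency described above.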

  8. droid says:

    // … A much longer time later we finally close the paren.)

  9. Cybron says:

    I haven’t actually been too big on most of the changes suggested so far (though I won’t pretend there’s not a good amount of sunk cost involved in that), but no header files would be a godsend. Those were one of the things I had the most trouble with when I was learning.

    • Richard says:

      Frankly, I find the idea of not having header files disturbing.

      The header file defines your “public interface”*, which is the description of “How to use this lump of code”.

      This gives you several huge advantages:

      1) When you start trying to use some new code, you need to learn what it does – not how.
      So you read the header file and don’t need to look at the actual code.

      2) If you are writing a ‘portable’ program, you can have the same set of headers for every platform.
      You then put the “Linux-specific” stuff in one set of code files, the Windows in another, the BSD in a third etc.

      Having done this, when you’re writing the ‘general’ code (which is usually most of it), you don’t need to think “Windows does it this way, while Linux does it another”, you just tick a box that says “I’m making this for Linux” and your IDE/toolchain can pick out the Linux-specific implementations.

      When a colleague needs to port the application to VXWorks, they just have to create a new set of code-implementation files. They don’t have to go through the entire program and add the VXWorks-specific stuff.

      3) External Libraries.
      You can compile a massive block of code into a single library**, then hand other people (or You From The Future) this precompiled block, and the header describing how to use it.

      For example, when you write Windows software, you’re using large parts of Windows itself. That’s millions of lines of implementation code, including all kinds of workarounds and special handling for a huge number of different bits of hardware.

      Yet all you need is the DLLs and the header files describing what’s in them.

      * And (at best), it doesn’t say anything whatsoever about how it’s actually implemented.

      ** Static or dynamic, doesn’t matter.

      • Xeorm says:

        The problem I always have with header files is that their benefits are almost always side effects from requiring their use that can be done better with the use of other programs.

        The effects you note are all variations on building good documentation for others to read so that they can use the code. That’s very nice to have, but it’s also not a function that can only be done by using header files. What’s better is to have the documentation written either by the programmer or automatically generated. No reason at all to require that the header files themselves be a part of the code.

        Even the faster compilation is something I would expect could be done much quicker by using a good programming environment. Rather than writing out the header file and all that it involves, it’s the sort of task that should be done by the computer, either as I write or when I build.

        Switching between two files for the same object just results in more locations for errors and friction in developing.

        • Kingmob says:

          “The problem I always have with header files is that their benefits are almost always side effects from requiring their use that can be done better with the use of other programs.”

          Well said. From these comments you can clearly see that a lot of users embrace faulty features because of this. A good benefit does not equal a good system. Especially since this specific case can actually be made by the compiler if it was done properly.

          The header file does not actually add any info (the proof is in basically all modern languages that work without them), that right there is a big warning sign I would think to any programmer.

    • fscan says:

      FWIW, the standards committee is working on an alternative to header files:

      Modules Proposal

  10. I’m curious about how this proposed new programming language would handle templates/generic coding w/o header files.

    • CJ Kerr says:

      Instead of writing a header file, allow the compiler to parse the actual code of the module to find out what it needs to know.

      The only reason C++ templates are implemented in the header is because of the old-fashioned way that C compilation works – one file at a time. The only thing the compiler can see is a single file after it’s been through the preprocessor, which tacks the included headers on the front. Because templates actually generate new classes at compile time, the compiler needs to see the implementation and thus the implementation must go in a header file.

      A modern language designed with the assumption that the compiler has access to lots of RAM doesn’t need to work this way – if it needs to know how something is implemented in another module, it can just go and read it.

      None of this is hypothetical – languages like D work this way already.

      • Richard says:

        I strongly disagree.

        In C++ (and many other languages), you don’t have to use header files at all if you don’t want to.

        You can even write the entire program in a single file if you like.

        However, the code will be almost impossible to understand and will take much longer to compile, because it can’t be multi-threaded as efficiently.

        Compare this:

        DoStuff();
        Flobulate();
        Wibble();

        With this:

        DoStuff()
        {
        // 20 lines of code
        // sit in here
        }

        Flobulate()
        {
        // there is another 50 lines of code
        // in here, so you can’t even see
        }

        Wibble()
        {
        // Containing yet more code
        }

        Fundamentally, code is much easier to write than to read and understand.

        Secondly, being able to build each ‘compilation unit’ without direct reference to any other means that it can be aggressively multithreaded, even spread across hundreds of cores if necessary.

        Even on a normal PC, this can easily reduce total compilation time of a mid-sized program from minutes to seconds – even before considering the fact that it doesn’t have to do anything with code files that haven’t been changed since the last build.

        • Retsam says:

          If the question is “is it better to use code or header files as documentation?”, then the correct answer is neither. There are much better ways to deal with code documentation than either. Python, C#, and Java (and a host of other languages) have built-in capabilities for generating documentation files that are far better than reading header files, and none of them requires you to create a separate file for the purpose.

          And compilation speed is such a minor point in the grand scheme of things, if indeed, header files even do increase it. Of which I’m highly skeptical. This StackOverflow question on C++ compilation speed lists header files as the top reason why C++ compilation is so SLOW.

          Those advantages of headers you list? I’m pretty sure every one is already how languages without header files work. Compiling code units without direct reference to other ones? Yeah, I’m pretty sure C#/Java calls those “classes”. Ability to not recompile code that hasn’t changed? Also a feature of C#/Java.

          Besides, even supposing you’re exactly right and headers significantly improve compilation speed… compilation speed is really only important for saving developers’ time. And you know what else could save developers’ time? Not having to write dang header files!

  11. MadTinkerer says:

    “ALL THOSE TEXT FILES THERE MUST BE NEARLY 100 KILOBYTES OF THEM ARE YOU MAD”

    Actual quote from the Computer Intro manual for the Magnavox Odyssey 2:

    “Ten years from this point, there will be chips capable of remembering a million bits of information.”

    Imagine that. Individual computer chips storing up to an entire megabyte of data at once. Maybe even reading the data from the same sort of magnetic tape cassette you currently use to listen to music! Truly, the future of computing will be amazing.

  12. Tim Keating says:

    The D Programming Language (http://dlang.org/) was actually created to address a lot of these issues with C++ — it has garbage collection, but you can disable that and do manual memory management. It has bounds-checked dynamic arrays (and slicing!). And it doesn’t have a preprocessor (replacing that instead with some cool features like mixins).

    It’s interesting, and worth a look at a bare minimum. You will never look at an angle-bracketed template afterward without throwing up a little.

    • Volfram says:

      I actually came here to write pretty much exactly this. D answers both of the suggestions in this particular annotation. Arrays are a basic type, and it uses the more modern “import” syntax over C/++’s “include” system. No function prototyping, and unions with no name.

      I honestly never figured out how you’re “supposed” to use header files. I was never taught, and the whole system never made any sense to me.

      I know I press on this a lot, but Shamus, after everything you’ve said about programming, you owe it to yourself to check out D. It has access to all your favorite libraries, and even has its own high-performance network library. It answers almost every complaint you’ve voiced about C or C++.

      Make it a blog post series. “Learning D.”

      • Volfram says:

        “No function prototyping, and unions with no name.”

        This sentence (particularly the fragment after the comma) is a vestige from an earlier version of the post in which I commented that D allows what are called “anonymous inner structures.”

  13. 4th Dimension says:

    While it’s boring in C/C++ to manually write down every procedure and where it is, compilers shouldn’t read ALL other files when compiling one file. You might want or simply have inherited procedures with same names in two different unrelated files, so you need to have a way to tell the compiler which one you want.

    C# solution is namespaces. First there are no free floating functions, and every function is a method. Secondly every class, struct etc must belong to a namespace so the definition for your SpaceMarine class would be
    namespace MyAAGame.GameObject {
    class SpaceMarine{
    public void Explode(){

    }

    }
    }

    When you want to work with SpaceMarine classes, at the beginning of the file you type
    using MyAAGame.GameObject;
    and call the SpaceMarine class directly by its name. OR you don’t even have to use using, but then you need to refer to the class by its full path, and that is annoying:
    MyAAGame.GameObject.SpaceMarine marine = new MyAAGame.GameObject.SpaceMarine();
    marine.Explode();

    And C# compilers are smart enough to search for things in the namespaces you specify as using with no need to create header files.

    • Richard says:

      Namespaces came from C++
      In the C++ example, the “std::” bit of “std::vector” is the Standard Library namespace.

      PS: In C#, “using” does not do what you think it does.
      “using” is a way of scoping the lifetime of the object to ensure it gets disposed of when it should.
      It’s how you do RAII in C#.

      • 4th Dimension says:

        Actually it’s both:
        https://msdn.microsoft.com/en-us/library/vstudio/zhdeatwt(v=vs.100).aspx
        and using is not simply used to ensure disposal of any object, but of objects that use unmanaged resources like files and such. They implement the IDisposable interface and thus have a Dispose method that is in charge of disposing of such unmanaged assets. So what using as a statement does is ensure Dispose is called.

        I sort of remember C++ namespaces, but I don’t remember them being a requirement since C++ is a lot freer about things. In C# you have to use namespaces and even tools will automatically make namespaces for you.

  14. tmtvl says:

    Isn’t the best way of using arrays simply “my @things”?

    Oh gods, now I wonder if there’s any game development tools in CPAN…

    EDIT: SDL_Perl. it exists. Heavens me.

  15. wumpus says:

    #import “foo.h”

    static NSArray *myArray;

    What? No other iOS devs here? (Those are both age-old Objective C solutions.)

  16. I take a perverse joy in ignoring the standard library of C/C++ and code in “Pure” C/C++ (mostly just C though as me and C++ really don’t jive).

    It’s always fun to see a compiler panic.

    “You want me to do what? But how am I supposed to…”

    And then I have to yell and scream and slap it around and say:

    “Listen you little **** your predecessors managed to compile without a standard library and so can you, now start compiling you little sniveling binary.”

    C and C++ have not changed that much over the years, the standard libraries on the other hand have changed a lot.
    One issue is they keep adding stuff and not deprecating enough, say what you will about PHP but at least they are pretty good at deprecating stuff smoothly IMO.

    I’m in the process of making my own micro standard library that will be as minimalistic as possible.
    This means no malloc() or new or similar, instead I’ll be using the native Win32 HeapAlloc() which malloc() actually uses itself.

    The finished code ends up very tiny as a result.
    And if you are clever in the way you make your include files and such then porting is not a major issue either, you just need to use something else than HeapAlloc() on other OS platforms for example.

    Why do it this way? Because I learn a whole lot more than if I just throw malloc() and free() around.

  17. kdansky says:

    > No Header Files

    I think it’s a given that no modern language should use a technique that was designed to keep RAM consumption low during compiling, because computers literally did not have enough RAM to do more than sequentially read through files when C was designed. That’s why there’s a preprocessor: So you can do a preliminary pass over a file, write it back to disk, and compile from there. Today, this is a disaster.

    > Arrays

    Again, D does the right thing (as I keep saying: All his ideas are more than solved by D). If you declare int[5] x; then you get a static array with length 5, which is really nice if you want hard metal performance at all costs. If you declare int[] x, then you get the equivalent of a C++ std::vector, with dynamic growth and all the fancy goodness, but of course with all the fancy modern stuff that Blow hasn’t even heard about, like slices.

    After spending half a decade with C++, and then having a go at D, I was completely blown away.

  18. Csirke says:

    To “no header files”

    I wouldn’t be so quick to dismiss the memory and time requirements of compiling C++. Even with a codebase under a hundred thousand lines of code (I’ve worked in open-source projects that were close), compilation time can take minutes if you recompile everything (which is of course sometimes necessary).

    There are problems which make C++ very hard to parse and compile, for example its grammar is not context-free, there are examples where the interpretation of the code very much depends on the state of the compiler at that point. Headers are a way to make it easier on the compiler while manageable by the programmer.

    The solution wouldn’t be to just use the implementation files themselves, but maybe to provide some automatic, general way to generate the headers, or at least analogous information, a summary of each file. That’s what other programming languages use, I think. But since those languages were developed by companies, they just use their internal (probably binary) format for that.

    And if you use a good IDE, you don’t really need to maintain it manually. I use Qt Creator, and I only need to add the function declaration in the header, then press a button, and it creates the definition in the implementation file and jumps the cursor there. If I change the parameter list, or anything, it pops up a tip saying “do you want me to change this in the other place too?” and it does so with a click. I’m sure Visual Studio has similar features.

    Anyway, I don’t think header files are fundamentally bad, but some of the shenanigans (like having to define static class variables separately in the implementation file) are problematic.
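The static-member shenanigan Csirke mentions looks roughly like this. It is shown in a single file for brevity, and `Counter` is a made-up class; the comments mark which part would go in the header and which in the implementation file.

```cpp
// What would go in the header:
class Counter {
public:
    Counter() { ++live_count; }
    ~Counter() { --live_count; }
    static int live_count;      // declaration only; no storage is created here
};

// What must appear in exactly one implementation file:
int Counter::live_count = 0;    // the definition that actually creates the variable
```

Leaving out the second part compiles fine but fails at link time with an undefined reference, which is a large part of why this split surprises people.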

  19. Ingvar M says:

    Grump. He should probably give Go a try. But I guess that’s garbage-collected as well. On the other hand, if I could write a performant 3D game in Common Lisp, using vanilla X11 for the display, 10 years ago, I guess modern machines should be even better. On the flip side, that game never had more than a couple of hundred polygons on screen at any time (and if it had been in the 1000s, it would probably have slowed down, as I had to write the 3D rendering engine as well).

  20. Daemian Lucifer says:

    What I want to know is, why aren’t there H, I, N, U, V*, W and Y programming languages? Someone needs to make them.

    *There appears to be a vvvv, however. Which I approve of.

  21. Marnie says:

    Shamus, it’s been almost thirty years. You need to let go.

  22. Atarlost says:

    Exceptions at least are kind of important.

    Let’s say your game is an online capable game and your network code is a library.

    You can’t muck about with the library code because that makes maintenance harder, but if it’s well written it will throw exceptions when bad things happen.

    You really don’t want to crash. That’s just not polite. If you do crash you’d really like to at least close up your log files so all the queued writes hit the disk including one last one about why you’re crashing.

    You might be able to get the other computer to resend and get the game working again. That would be great, but you can’t do it without being able to have the library throw exceptions for you to catch. The library would have to either crash or do ugly stuff with global variables that you’d have to check every time you called anything from the library.

    If you can’t recover you might be able to save the game state. RTSs sometimes handle dropped connections that way so you can call your friends and arrange to restart the game from the save and hope that someone’s emergency quicksave works.

    At the very least you can almost always avoid crashing and boot the player to the main menu.

    Reading user editable files is another case where maybe you want to be able to catch errors. You probably can’t make correct data structures from malformed XML, but you can tell the user where the problem is and maybe even give them the option of just not loading that mod. If you don’t want to write your own XML parser finding one that throws exceptions you can catch instead of crashing or failing silently sounds like a pretty good idea.

    If you don’t use libraries for things where outside forces can cause errors or don’t care about crashing impolitely whenever anything goes wrong you don’t need exceptions, but lacking them is not by any means a positive feature for a language.
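The network scenario above can be sketched like this. Everything here is hypothetical: `receive_packet` stands in for a third-party library call, and `GameState` and `network_tick` are names invented for the example.

```cpp
#include <stdexcept>
#include <string>

// Stand-in for a library function that throws when something goes wrong.
std::string receive_packet(bool connection_ok) {
    if (!connection_ok) {
        throw std::runtime_error("connection dropped");
    }
    return "packet";
}

enum class GameState { Playing, MainMenu };

// The game catches the library's exception, records why it happened,
// and boots the player to the main menu instead of crashing.
GameState network_tick(bool connection_ok, std::string& log) {
    try {
        std::string packet = receive_packet(connection_ok);
        log += "received " + packet + "\n";
        return GameState::Playing;
    } catch (const std::runtime_error& e) {
        log += std::string("network error: ") + e.what() + "\n";
        return GameState::MainMenu;
    }
}
```

A fuller version might try a resend or an emergency save inside the catch block before giving up; the point is only that the library code itself stays untouched.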

  23. Neil Roy says:

    Now show everyone how to make a 2D vector… you know, something akin to myarray[10][10], LOL… man, does that get UGLY then! I think it’s something like std::vector<std::vector>? I forget, I hate it anyhow… makes me barf just looking at it.
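For reference, one common way to spell a 2D vector is sketched below. `make_grid` and the `Grid` alias are names made up for the example, and this is only one of several possible layouts (a single flat vector indexed by `row * cols + col` is another).

```cpp
#include <cstddef>
#include <vector>

// A typedef hides most of the nesting (hypothetical alias name).
typedef std::vector<std::vector<int>> Grid;

// Builds a rows x cols grid of zeros, roughly like int myarray[rows][cols].
Grid make_grid(std::size_t rows, std::size_t cols) {
    return Grid(rows, std::vector<int>(cols, 0));
}
```

Once built, it is indexed the same way as the plain array: `grid[3][7] = 42;`. Note that each row is its own allocation, so unlike `int myarray[10][10]` the rows are not guaranteed to sit next to each other in memory.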

    I love how you used a pure C example, then showed the “best” way which was in C++. I wrote my own vector/stack functions for C so it does it for me. I don’t “forget” to free the memory just like you shouldn’t forget to use angle brackets for vectors. If you forget to use them there, you have problems. If you forget to free the memory… you may have problems (but probably not as many if the program is exiting anyhow).

    When I allocate memory in C, I immediately write code to free it. Simple, it’s a habit with me now. So “if you forget to…” ummm… learn to not forget? I’m not about to switch to a convoluted language because I’m too lazy to develop good programming habits.

    I think C++ uglifies code with such nonsense. Vectors are also slower (as is C++ in general).
