So they’re going to rewrite gcc in C++, eh?
Came across this: Converting gcc to C++ and by a little googling, came across Ian Lance Taylor’s blog. Ian Lance Taylor is a pretty big name in the open source world. I think I may have exchanged a few emails with him 7 or 8 years ago related to some patches to CVS that I was working on — not that that’s relevant to anything. I also know I’ve seen his name on many a man page, listed as author.
Given my own antipathy towards C++, you might think I would be worried about a move from C to C++ for gcc development, but, eh, not so much. Here’s why. C++ seems to make life difficult for anyone who is not a total badass expert in C++. The guys working on gcc and the toolchain around it, which includes such things as a C++ compiler, linkers, debuggers, cross compilers, etc., are most assuredly the most hard core computer language geeks you’ll ever come across, and the sorts of things which bother me (and loads of other people) about C++ are exactly the sorts of things for which these guys have built a compiler. So, I’m not the least bit worried that these guys will build a buggy, slow gcc by using C++. In fact, if they were to rewrite it in C, they would probably manage to improve it as well.
There are some interesting issues brought up in the comments of the first link though. Many of them are C++ fans expressing approval, some are C++ detractors. One of the more interesting comments is about the “trusting trust” issue, which refers to Ken Thompson’s famous C compiler exploit, described in his lecture “Reflections on Trusting Trust.” That’s worth talking about (in case you aren’t already familiar with it) because it’s just a really interesting hack, and old as the hills, but just really neat.
It goes like this:
The C compiler is written in C. The first executable was “hand compiled” to assembler, and used to compile the 2nd executable. Thereafter, the “hand compiled” compiler was dropped.
Now, let’s say you want to provide a back door into a computer system. Let’s say you want to subvert the “login” program to accept your magic password as a skeleton key, or some such.
So, you could try to introduce rogue code into the login program’s source, but, anyone looking at this source would likely notice it. You could try to subvert the compiler’s source, so that if it noticed it was compiling login, it would insert rogue code, but anybody looking at the compiler’s source would see this too. The step of genius is as follows:
Modify the compiler source so that a) if it notices it’s compiling login, it inserts the rogue code, and b) if it notices it’s compiling the C compiler, it inserts rogue code into the produced compiler, so that the produced compiler in turn does both a) and b): it compiles login with the exploit, and it compiles the C compiler with the exploit.
Now, you distribute binaries for this rogue C compiler, and the sources for the unmodified C compiler. If anyone examines the C compiler source they will find nothing wrong. If they compile the unmodified C compiler source with the distributed rogue C compiler binary, the rogue C compiler will insert the exploit code into the produced C compiler. Likewise, if they compile login with this compiled-from-non-rogue-source C compiler, the exploit will still be inserted into login. To detect the exploit, you’d have to debug the machine code.
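To make the trick concrete, here’s a toy sketch in Python. It is not Thompson’s code and nothing like a real compiler: a “binary” is just a small record, a “compiler” is just a function from source text to a binary, and every name in it (`Binary`, `honest_compile`, `rogue_compile`, the source strings) is made up for illustration. It only shows how the exploit survives a rebuild from clean source.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Toy sources in the form "<program name>: <description>". Both are clean.
CLEAN_CC_SRC = "cc: translate source faithfully"
CLEAN_LOGIN_SRC = "login: accept the stored password only"


@dataclass
class Binary:
    name: str
    backdoored: bool  # only visible by inspecting the "machine code"
    # Set when this binary is itself a compiler: source text -> Binary.
    compiler: Optional[Callable[[str], "Binary"]] = None


def program_name(source: str) -> str:
    return source.split(":", 1)[0]


def honest_compile(source: str) -> Binary:
    """Behaviour the clean compiler source describes: no special cases."""
    name = program_name(source)
    if name == "cc":
        return Binary("cc", backdoored=False, compiler=honest_compile)
    return Binary(name, backdoored=False)


def rogue_compile(source: str) -> Binary:
    """Behaviour of the distributed rogue compiler binary."""
    name = program_name(source)
    if name == "login":
        # (a) compiling login: insert the skeleton-key code
        return Binary("login", backdoored=True)
    if name == "cc":
        # (b) compiling the compiler: the output inherits this same behaviour
        return Binary("cc", backdoored=True, compiler=rogue_compile)
    return Binary(name, backdoored=False)


# The victim has the rogue binary plus clean sources, and rebuilds everything.
distributed_cc = Binary("cc", backdoored=True, compiler=rogue_compile)
rebuilt_cc = distributed_cc.compiler(CLEAN_CC_SRC)
login = rebuilt_cc.compiler(CLEAN_LOGIN_SRC)

# For comparison: the same clean sources built by an honest compiler.
honest_login = honest_compile(CLEAN_CC_SRC).compiler(CLEAN_LOGIN_SRC)

print(rebuilt_cc.backdoored, login.backdoored, honest_login.backdoored)
```

Both source strings are clean, and yet the rebuilt compiler and the login it produces still carry the exploit. The only thing that differs between the two builds is which binary did the compiling.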
How does this relate back to converting gcc to C++?
As brought up by one of the commenters, there are many independently produced C compilers. And gcc has historically been able to bootstrap from even old, pre-ANSI, pre-C99 C compilers. There are far fewer C++ compilers. And a mini C compiler that can bootstrap gcc as it is today is not impossible to write. If you want to ensure you can get around the “trusting trust” bug, you can conceivably audit gcc, write your own mini C compiler, and bootstrap gcc with the C compiler you wrote yourself (and thus know is free of exploits). Now you have a C compiler you can trust. Writing a C++ compiler is much more difficult than writing a C compiler, so this avenue is essentially cut off.
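Here’s the escape route as a toy Python sketch, again with made-up names (`make_compiler`, `bootstrap`, the source strings) and a deliberately silly model where a “binary” is a dict and a “compiler” is a function. It assumes the gcc source really has been audited and is clean; the only thing that varies is the seed compiler you start the bootstrap from.

```python
GCC_SRC = "cc: audited compiler source"      # assumed clean after an audit
LOGIN_SRC = "login: check the password"


def make_compiler(rogue: bool):
    """Return a toy compiler: a function from source text to a 'binary' dict."""
    def compile_source(source: str) -> dict:
        name = source.split(":", 1)[0]
        if name == "cc":
            # A rogue compiler reproduces itself; an honest one follows the source.
            return {"name": "cc", "backdoored": rogue,
                    "compile": make_compiler(rogue)}
        return {"name": name, "backdoored": rogue and name == "login"}
    return compile_source


def bootstrap(seed_compile) -> dict:
    """Build the compiler with the seed, then rebuild it with itself."""
    stage1 = seed_compile(GCC_SRC)
    stage2 = stage1["compile"](GCC_SRC)
    return stage2


downloaded_seed = make_compiler(rogue=True)   # a binary of unknown origin
homemade_seed = make_compiler(rogue=False)    # the mini C compiler you wrote

tainted_gcc = bootstrap(downloaded_seed)
trusted_gcc = bootstrap(homemade_seed)

print(tainted_gcc["compile"](LOGIN_SRC)["backdoored"],
      trusted_gcc["compile"](LOGIN_SRC)["backdoored"])
```

Same audited source both times. Rebuilding the compiler with itself doesn’t wash out a bad seed, which is why the homemade seed compiler is the part that matters.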
Well, that is in some ways a theoretical objection. How many people a) audit gcc, and b) write a mini C compiler and bootstrap gcc with it? Probably close to zero. (There are probably a few computer science guys into languages who’ve written a mini C compiler, bootstrapped gcc with it, and taken a hard look at the gcc code. I could also see the NSA doing this sort of thing, or the opposite of it: I could easily imagine the NSA distributing compromised binaries of gcc. So I wouldn’t say it’s zero, but it’s probably damned close to zero.) And there’s nothing stopping people from writing their own C++ compiler, apart from the inherent difficulty of the task, compared to writing a mini C compiler.
From what I’ve read the gcc source is not exactly easy reading. I read somewhere that the back end code was deliberately tightly integrated with the front end code by RMS to discourage vendors from producing proprietary optimizing backend plugins for gcc. That sounds about right, though I don’t know for sure if it’s true. If it is, the intentions were surely noble.
Well, it’ll be interesting to see how gcc development proceeds.