C++ - Lessons of Failure

February 15, 2010

Trust But Verify, A Consulting Love Story

About 12 years ago, I was a member of the Professional Services Group for a C++ tools company. They created a great framework for C++ classes, particularly dates and strings, that really didn’t exist in a standard format at the time. They dominated the market because their tools were second-to-none.

One of my many gigs at the company took me to Dallas, Texas, home of great bar-b-que and technology companies. My observation upon arrival is that the density of the two is approximately 1:1, or at least it was at the time. Coincidentally, another colleague of mine from the same company was also on assignment during the same two-week period at a company just down the road, a major telecommunications company (let’s call them “Lint Communications”). Not coincidentally, I think lint was all that existed between some of the managers ears at that company.

My friend was tasked with porting the C++ toolkit over to OS/390 for Lint Communications. This sort of work was typical for our group–anyone who wanted support for our libraries that was outside the supported platform list usually hired Professional Services to come on-site and create a custom build for them. After you did a few of those, it was mind-numbing work usually consisting of chasing down obscure compiler parameters and libraries in paper manuals at a time when Google and the Internet were not really up to the task of storing that information.

Oh, did I mention that our company charged about $500,000 for a single OS/390 license? Yeah, so there was serious cash on the line. That might be important later.

This gig started out innocently enough. We had dinner about midway through the week, during which we traded war stories:

Me: How did it go today?

Friend: Uh, it was kind of weird. They are telling me to finish the port, but they are talking to the sales guy like they’re not all that interested in it. Something about it not working right.

Me: Did you run the standard test suite? Did it pass?

Friend: Yeah, flying colors. No problems.

Me: Did you offer training or help?

Friend: Yeah, they tell me they don’t have time right now. One other thing though…

Me: What?

Friend: Every hour, on the hour, the file system slows to a crawl. Kind of seems like they’re taking backups of it or something.

I should tell you that my friend was a junior engineer working in our group and this was one of his first gigs. He seemed to think this felt wrong, but he just wasn’t sure. At this point, my alarm bells were ringing. Here we had a customer that was paying good money to have an engineer on site, but telling the sales rep that the product wasn’t working and they didn’t want help to fix it. And taking snapshots of his work.

Our conversation continued:

Friend: What should I do?

Me: Finish the port like they asked. However, just in case, put in a time-bomb. Something not easy to find on quick inspection inside of our headers that will kill the library after a certain date.

He completed his work and put in the Trojan horse in the string library, something like:

#if !defined(SOME_OBSCURE_COMPILER_OPTION_THEY_WOULDNT_GUESS) {
      if (today > some_magic_date) { exit(DONT_STEAL_ME_BRO); }
}

After we returned from our jobs, we learned from the sales rep that they decided not to purchase the software license and they deleted the software from their mainframe. We considered the matter dropped and went about our other projects.

About 45 days later, our customer support department got a call from Lint Communications. They complained that our toolkit was crashing their application every time they launched it, based on stack dumps. A quick search of their customer information confirmed they had no purchased licenses.

Sure enough, they had continued to use the OS/390 port without paying for it.

Our sales group negotiated a nice settlement with them to the tune of almost $3 million for a license deal after agreeing not to sue them for piracy. After that, my friend turned over the compiler option to Lint Communications that was required to shut it off.

I learned a critically important lesson in consulting that day: Take the customer’s word, but make sure they are telling the truth.

Trust, but verify.

December 21, 2009August 21, 2012

Google Go: Good For What?

My posts on Google’s Go (Part 1 and Part 2) definitely touched a nerve with a few folks. And I appreciate good dialog on ideas like this…

One pervasive question that I keep hearing is “Who is Go good for?” And I’m having a hard time finding a good answer. Even Go’s own FAQ page is uncharacteristically vague about it.

I’d say there are plenty of non-starters to keep Go out of the application programming space. After my arguments pointing out that it won’t replace Java anytime soon, folks are telling me that I wasn’t looking at the right demographic. These people suggest that Go is really for systems programmers. Systems programming has typically been the bastion of C and (more recently) C++ programmers for the past 2 decades. If you’re doing serious systems programming, you’re in one of those two camps, generally speaking. Maybe with a touch of assembly here and there.

OK, I’m game for looking at that. First off, what makes a good systems programming language? Here are few things we might want:

can operate in resource-constrained environments
is very efficient and has little runtime overhead
has a small runtime library, or none at all
allows for direct and “raw” control over memory access and control flow
lets the programmer write parts of the program directly in assembly language

Does Go really fit into that box?

Go’s performance numbers are rough 6x worse than C++, on average. The best performing Go test was comparable to the worst C test. While I gave Go some leniency with Java on performance in an application environment (there are plenty of other non-memory, non-CPU bottlenecks to worry about there), the systems world is far stricter about raw, unabashed execution time and resource consumption. (+10/20 pts)
Go’s memory and execution footprint are higher than C and C++, according to these stats. Not exactly an ideal candidate for replacing either of these languages currently entrenched in this space. An interesting experiment: Compile Hello World in Go and C++. Go’s compiled & linked output: 38K, C++ clocks in at 6K, about 84% smaller. (+10/20 pts)
If you include the garbage collector, the Go runtime footprint is certainly larger than C/C++. But it’s safer than either C/C++ for the same reason. And to top it off: Go’s garbage collector isn’t parallel safe right now. (To be fair, that’s the #1 thing on the TODO list right now for the Go team) (+15/20 pts)
Raw and direct control is possible, so Go checks in fine here. You can use this to investigate runtime structures if you like. (+20/20 pts)
This is similar to Java’s native interface (JNI), but statically linked. So yes, it’s possible. (+20/20 pts)

At 20 pts per question, let’s be kind and give Go a 75/100 possible score there (A solid “C” on the American grading scale, yuck yuck…). If you’re a C/C++ programmer where you’re already at 100/100 on the above chart, where is your motive to switch here? Couple that with the fact that systems programmers are not exactly known for adopting bleeding edge technology at a rapid pace. It was years before C++ ever made substantial inroads with the embedded C crowd. Considering the degree of reliability required to do high quality, bare-metal systems programming, I’d be skeptical of anything new in this space too.

Finally, let’s hit up the syntax argument one more time, because I think this is the crux of the entire problem. Before I do, let me just say I don’t personally have any problems with Go’s syntax one way or the other. I’ve learned a plethora of languages in my tenure as a software nerd and adding one more would not be a big deal if I felt the payoff was big enough. But I think syntax familiarity is a barrier for a lot of people, based on my experience as a language instructor and Parkinson’s Law of Triviality.

Briefly stated, Parkinson’s Law says we unfortunately spend disproportionate amounts of time and energy arguing about things that are more trivial (and we understand) than we do about those that are more substantial (and fail to grasp). This is particularly true with programming languages and syntax. I saw that resistance teaching Java to C++ folks back in the mid-90s. And that wasn’t exactly a big leap. Changing from C++ to Go is likely to be much worse than C++ to Java, and that resistance is critical to adoption rates.

So I’m not feeling the love for Go replacing C/C++ systems programming either. If I was looking for a new tool in my toolbox, I don’t think I’d be buying this one from Google.

All of this leaves me scratching my head and singing:

“Go! Huh! Yeah!

What is it good for?

Absolutely nothing.

Say it again.”

This article is translated to Serbo-Croatian language as well.

December 8, 2009

Google’s Go Isn’t Getting Us Anywhere, Part 2

In Part One of this post, we discussed the Great Concurrency Problem and the promise of Go in taking the throne from Java. Today, I show why Go isn’t going to get us there.

Back in the heady days of C++, if you wanted to add concurrency support to your application, you had to work for it. And I don’t mean just find a few calls and shove them into your application. I mean:

Find a threading library available on your platform (maybe POSIX, maybe something more nightmarish, maybe even a custom thread library that would run you a few hundred bucks per license)
Locate the obscure documentation on threading APIs
Figure out how to create a basic thread
In the process, read the encyclopedia-sized docs about all the real issues you’ll hit when building threads
Decode the myriad of options available to you to synchronize your threaded application via header files
Add the library to your makefile
Code the example and
Make it all work

Contrast that with Java:

Create a Runnable interface
Implement the run() method
Call new Thread(myRunnable).start();
Debug the obscure errors you get after about 6 months of production

Whoa. At least with C++, the Threading Shotgun wasn’t loaded, the safety was on and it was hanging on the wall. You had to do the hard work of loading the gun, removing the safety and pulling the trigger. Java took all that away by handing you the loaded shotgun, safety off. That shotgun is the Great Concurrency Problem.

Java’s great contribution and Achilles Heel, in my opinion, was the choice to make threading so darned easy to do, without making developers innately aware of the implications or difficulties of concurrent programming with the shared memory model. C++ made you wade through all the hard shared-memory stuff just to get to threads, so by the time you wrote one, you at least felt smart enough to give it a go. The concurrency models in Java and C# hide all sorts of ugliness under the covers like shared memory models, caching of values, timing issues, and all the other stuff that the hardware must implement to make these concurrent threads do their jobs. But because we don’t understand those potential pitfalls before we write the software, we blithely assume that the language semantics will keep us safe. And that’s where we fall down.

Write a multi-threaded program in any shared-memory concurrent language and you’ll struggle with subtle synchronization issues and non-deterministic behavior. The timing bugs arising from even moderately concurrent applications will frustrate and annoy the most seasoned of developers. I don’t care if it’s in Java or not–the issues are similar.

My specific beef with Java is the ease with which we can create these constructs without understanding the real problems that plague us down the road. Until we have the right tools to produce concurrent applications in which we can reliably debug and understand their behavior, we can’t possibly benefit from the addition of a new language. In other words, if you want to create a Java killer, you’re going to need to make concurrent programming safer and easier to do. A tall order to say the least.

Enter Google’s Go in November, 2009. The number one feature trumpeted by reviewers is the use of goroutines (the message-based concurrency mechanism for Go) and channels to improve concurrent programming. Initial reviews are mixed at best. But I don’t think we’re anywhere close to killing Java off with this new arrival on the scene for a variety of reasons:

Going nowhere?

Go decided to use a foreign syntax to C++, C and Java programmers. They borrows forward declarations from BASIC (yep, you heard me right…BASIC), creating declarations that are backwards from what we’ve been using for close to 20 years. Incidentally, syntax similarity was one of the main reasons C++ programmers easily migrated to Java during the Language Rush of 1995, so this is disappointing.
Performance benchmarks that put it slower than C++ (and therefore, slower than Java today since Java finally caught up to C++ years ago). OK, I’ll grant you that Java wasn’t fast out of the gate, but Java was also interpreted. Go is statically linked, and not dynamically analyzed at runtime, so it’s not likely to get better immediately.
A partial implementation of Hoare’s CSP model using message-based concurrency. I almost got excited about this once I finally understood that message passing really makes for safer concurrency. But they didn’t get the model quite right. For example, did you know you can take the address of a local variable and pass that via a channel to another goroutine to be modified? Bringing us right back to the same crappy problems we have in Java and C#. Oh yes. Not that you should do that, but even Java was smart enough to drop the address of operator for precisely that reason.
A few low-level libraries bundled with language, but just barely enough to be functional for real world applications. Completely AWOL: Database and GUI. (translation: “I get to rewrite database access. One. More Time.” Neat.) Did I mention Java had those during it’s 1.0 release?
Static linking. OK, I admit I’m an object snob and I like a strongly-typed, dynamically-bound language like Java. I like reflection and dynamic class loading and the fact I can pass strings in at runtime, instantiate objects and execute functions in ways the original code didn’t explicitly define (and yes, I’ve done this in enterprise production systems!). Not with Go, instead we’re back to C++ static linking. What you build is what you get. Dynamic class loading was probably one of the most useful aspects of Java that allowed for novel ways of writing applications previously unseen. Thanks for leaving that one out.
Excepting Exceptions. Go decided to omit exceptions as the error handling mechanism for execution. Instead, you can now use multiple return values from a call. While it’s novel and perhaps useful, it’s probably a non-starter for the Java crowd used to error handling using exceptions.

This feels like some academic research project that will be infinitely pontificated about for years to come, but not a serious language for enterprise development (obligatory XKCD joke). In short, I’m not impressed. And I kind of wanted to be. I mean this is freakin’ Google here. With the horsepower of Robert Griesemer, Rob Pike, Ken Thompson in one building. The #1 search engine in the world. The inventor of Google Wave that created so much buzz, people still don’t have their Wave Invites yet.

Enterprise Languages should be evolutionary steps in a forward direction. But Go doesn’t really get us anywhere new. And it certain isn’t much of a threat to Java. Sorry Google, maybe you need to give it another go?

* Many thanks to my friend Tom Cargill (who you may know from the “Ninety-Nine Rule“) who reviewed early drafts of these 2 posts and corrected my mistaken notions of concurrency, parallelism, Goroutines and Go syntax. He didn’t stop the bad jokes, though. Sorry about that.

December 7, 2009

Google’s Go Isn’t Getting Us Anywhere, Part 1

There’s buzz in the air about Google’s new language Go. Naturally, I was excited hearing about it. After all, Google has produced so many interesting tools and frameworks to date there’s almost automatic interest in any new Google software release. But this wasn’t just a product, this was a Google language release. My programmer brain pricked up immediately.

Language releases always catch my attention. Since 1995, I’ve constantly wondered what is going to be the Great Java Killing Language. Java’s release was the Perfect Storm of Language Timing–the rise of the internet, the frustration with C++, the desire for dynamic web content, a language bundled with a large series of useful libraries (UI, database, remoting, security, threading) previously never seen. Lots of languages have been released since, but none with quite the reception of Java. But with that perfect storm came some serious fallout.

At the same time Java rose to prominence as the defacto web and enterprise language of choice, Moore’s Law was hard at work and hardware companies were creating new kinds of processors–not just faster ones, but also motherboards that supported multiple processors. And then multiple cores on those processors. Concurrency became the new belle of the ball, with every language making sure they added support for it. Which gave rise to the widespread use of concurrency features in languages. In essence, Java brought attention to the Great Concurrency Problem that has haunted us almost two decades now.

Before I address the Great Concurrency Problem, we have to agree that most people confuse Concurrency with Parallelism. Let’s start with the definitions from Sun’s Multithreaded Programming Guide:

Parallelism: A condition that arises when at least two threads are executing simultaneously.
Concurrency: A condition that exists when at least two threads are making progress. A more generalized form of parallelism that can include time-slicing as a form of virtual parallelism.

Parallelism has only come about with multi-processor/multi-core machines in the last decade or so. Previously, we used Concurrency to simulate Parallelism. We program our applications to run as concurrent threads. And we’ve been doing that for years now on multithreaded processors. But the Great Concurrency Problem is really a problem about the differences between Human Thinking and actual Machine Processing. We tend to think about things linearly, going from Breakfast to Lunch to Dinner in a logical fashion. In the background of our mind, we know things are going on. You might even be semi-aware of those yourself. And occasionally, we get those “Aha!” moments from that background processing of previous subjects. We use this mental model and attempt create a similar configuration in our software. But the shared-memory concurrency model used by Java and other languages creates implicit problems that our brains don’t really have. Shared memory is a tricky beast. You have objects and data inside Java that multiple threads can access in ways that aren’t intuitive or easily understood, especially when the objects you share get more and more complex.

There are really two main models for concurrent programming: shared memory and message-passing communication. Both have their ups and downs.

Shared memory communication is the most common of the two and is present in most mainstream languages we use today. Java, C#, C++ and C all used shared memory communication in their thread programming models. Shared memory communication depends on the use of memory locations that two or more threads can access simultaneously. The main danger of shared memory is that we share complex data–whole objects on the heap for example. Each thread can operate on that data independently, and without regard to how other threads need to access it. Access control is granted through monitors, mutexes and semaphores. Making sure you have the right control is the tough part. Too little and you corrupt your data. Too much and you create deadlocks.

Let me give a concrete example to show just how nasty this can get for shared memory communication: Let’s say you’re handling image processing via threads in a shared-memory model–like Photoshop does for image resizing. And let’s say you’re trying to parallelize this processing such that more than one thread handles a given image. (Yes, I understand we don’t do that today and there’s a good reason for that. This is an analogy, just keep your shirt on a sec.) An image is an incredibly complex object: RGB values, size, scale, alpha, layers if you’re in Photoshop, color tables and/or color spaces depending on the format, compressed data, etc. So what happens when Thread A is analyzing the pixel data for transformation and Thread B is trying to display that information on the screen? If Thread A modifies something that Thread B was expecting to be invariant, interesting things happen*. Thread A may accidentally corrupt the state of the image if Thread B doesn’t lock the entire object during read operations. That’s because Threads A and B are sharing the entire object. Oh sure, we can break the image down into smaller, simpler data abstractions but you’re doing that because of the shared memory problem. Fundamentally, Java objects can be shared between threads. That’s just a fact.

Keep in mind this is just a TWO thread example. When you write concurrent systems, two threads is like a warm up before the Big Game–we’re barely getting started. Real systems use dozens, if not hundreds of threads. So if we’re already having trouble keeping things straight with two threads, what happens when we get to 20? 200? The problem is that modeling any system using concurrent programming tools yields a subtle mess of timing bugs and problems that rarely appear until you have mountains of production data or traffic hammering your system. Precisely when it’s too late to do anything about it.

Even Java’s own documentation from ages ago cautions just how hard this problem really is:

‘‘It is our basic belief that extreme caution is warranted when designing and building multi-threaded applications … use of threads can be very deceptive … in almost all cases they make debugging, testing, and maintenance vastly more difficult and sometimes impossible. Neither the training, experience, or actual practices of most programmers, nor the tools we have to help us, are designed to cope with the non-determinism … this is particularly true in Java … we urge you to think twice about using threads in cases where they are not absolutely necessary …’’

Hey, what's behind that Runnable there? Uh oh...

Harsh words (at the bottom) from a language that really opened Pandora’s Box in terms of giving us the tools to make concurrency an everyday part of our applications.

Message-passing communication is perhaps the safer of the two models. Originally derived from Hoare’s Communicating Sequential Processes (CSP), message-passing communication is used in languages like Erlang, Limbo and now, Go. In message-passing communication, threads exchange messages with discreet amounts of local data via channels. I like to think of message-passing communication to be kind of algorithmic atomicity–you are performing some action, say transforming an image and at a certain step, you need the data from the image’s color table. So you wait to get a message from another thread when that data is available. And then continue processing locally in your own algorithm.

Because threads are restricted in what they can share, the risk of corrupt data and deadlocks drops considerably. But this comes with a higher processing cost than shared memory communication. With shared memory, there was no re-writing of the data before thread access. Just the opposite is true for message-passing. Until recently, message-passing communication was considered far to expensive to use for real-time systems. But our multi-core, multi-processor world of the 21st century has finally broken down that barrier.

The question is, does Go really solve that problem in a way that overthrows Java as King of the Enterprise? Tune in tomorrow for Part Two, where we look at Go’s features, whether Go really addresses any of these problems, and if Java is doomed.

* “Interesting” is the default programmer adjective we tend to apply when what we really mean is “incredibly BAD”.