Concatenating Strings – You’re still doing it wrong.

A common anti-pattern I see from novice programmers is the tendency to read a coding tip somewhere, assume it to be a universal truth, and immediately start applying it everywhere in their code without fully understanding it. Usually the coding tip relates to optimization and is interpreted by the coder as “X is faster than Y, so always do Y instead of X.” This fallacy is particularly rampant with respect to the different approaches for concatenating strings.

Car with Wing

But it works on race cars....

I’m not writing this article to chastise the fledgling programmer who has fallen into this trap, nor is this intended as a how-to article on optimizing your code. Heaven knows there the internet is lousy with articles about the most efficient way to cram strings together. I will address the problems associated with some of the mythology about string concatenation, but my primary goal will be to encourage critical thinking and healthy skepticism for silver bullet programming techniques.

The Symptom

Though I possess no paranormal super-powers, I do believe I can read the mind of another human when I see code that looks like this:

    StringBuilder WelcomeMessage = new StringBuilder();
    WelcomeMessage.Append("Hello ");
    WelcomeMessage.Append(userName);

My spirit guide informs me that the programmer responsible for this code remembers reading somewhere that <insert programming language> is really inefficient at concatenating strings, but that you can overcome that limitation by using the StringBuilder class. Based on this information, he/she replaces every string concatenation operator with this clever technique, leaving a scent trail through the code that experienced programmers can smell from miles away.

Problem 1: Premature Optimization is Procrastination

Sure, you want ALL of your code to perform well, but experienced programmers understand that their time is valuable and best spent on activities that deliver actual business value. Premature optimization and its ugly cousin, micro-optimization, are almost always a waste of time. I understand how tempting it is to convince yourself that you could squeeze a little more performance out of your app by re-factoring the whole thing around the technique you learned from a blog post today, especially since it can be like a mini-vacation for your mind from the really complex issues you should be working on. Be strong and resist!

As a rule of thumb: if it isn’t worth creating a jig to profile the performance gains you expect from an optimization re-factoring, then it isn’t worth the time to do the re-factoring in the first place. It is also risky, because without measuring you won’t notice when your supposed “optimization” actually hurt performance.

More on that later.

Problem 2: Cookie Cutter Optimizations Assume the Compiler is Stupid

If you could universally make string concatenation faster by applying a simple formula then the compiler would probably already be applying the transformation anyway. Granted, I think this point is lost on some novice programmers who only have experience in higher level languages.

For them, I’ll clarify with this point.

Your program is running the compiler’s interpretation of your code. Not your actual code.

With that in mind, to think that the StringBuilder approach always runs faster would require you to believe that the people who wrote the programming language were smart enough to make string concatenation fast when they created the StringBuilder class, but forgot how to do it when they built the concatenation operator.

Pop Quiz: Does this code give you any heartburn? Why?

    string querySQL = "SELECT * " +
                      "FROM myTable " +
                      "WHERE (ID=5)";

If you said yes, because the readability isn’t worth incurring the cost of the concatenations, then you aren’t giving the compiler enough credit.

Here is the MSIL output for the above statement:

    ldstr      "SELECT * FROM myTable WHERE (ID=5)"

Amazing, huh?

Compilers are written to do the complex task of reading your code and interpreting what it means. Figuring out that a series of constant strings can be combined is child’s play.

Problem 3: If you don’t understand it, you’ll do it wrong

Cargo Cult programming is a derisive term for doing things in your program because you think you need to, without understanding (or with only a vague notion of) the underlying reason. It is really bad practice to adopt a technique without asking enough “why” questions to grasp why using it is desirable.

As an example, let’s dissect the premise that string concatenation using operators is slow and should be replaced by StringBuilders.

Q: Why do some claim that string concatenation with operators is slow?
In many garbage-collected languages (Java/.NET) string objects are immutable, meaning you can’t change them. So when you append more content onto an existing string, the program must internally create a new string and copy the old and new contents into it. The extra effort to create, destroy, and garbage collect the extra string objects has the potential to create more work for your program and can degrade performance if done excessively.
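To make that cost concrete, here’s a minimal sketch in Java (one of the languages mentioned above, which shares the immutable-string behavior); the loop and its size are invented purely for illustration:

```java
public class ImmutableConcatDemo {
    public static void main(String[] args) {
        // Because String is immutable, each += allocates a brand-new
        // String and copies everything accumulated so far into it,
        // leaving the old object for the garbage collector. For n
        // single-character appends, that is O(n^2) characters copied.
        String s = "";
        for (int i = 0; i < 10000; i++) {
            s += "x";
        }
        System.out.println(s.length()); // prints 10000
    }
}
```

For a handful of appends this is perfectly fine; the cost only matters when done excessively, exactly as described above.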

Q: How does the StringBuilder help?
The StringBuilder class is implemented as a mutable memory buffer that typically has extra unused space allocated so that concatenations can be made in place without the need to create extra objects to juggle the data.

Q: How much extra space does it reserve? What happens if I append more content than will fit in the unused space?
By default (in .NET) 16 characters, unless you specify differently in the constructor. If you append more data than there is space for, the StringBuilder will behave much like a String object: it allocates a new buffer with double the existing capacity and copies the data over.

You: Wait, what?

You mean that you have been using StringBuilder with the default constructor and then appending more than 16 characters to it?

Yeah, well if you are lucky you’ll be no worse off than if you just used the “evil” string concatenation operators. However, due to that neat capacity doubling side-effect, your program might actually be locking up unnecessarily large chunks of memory on top of the additional work required to wrangle all the intermediary objects. Perhaps it is worth investigating and setting the initial capacity of that StringBuilder to avoid such nastiness.
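To make that last suggestion concrete, here’s a hedged sketch in Java (whose StringBuilder likewise defaults to a 16-character buffer and grows by copying); the query fragments and the simple length estimate are invented for illustration:

```java
public class PresizedBuilderDemo {
    public static void main(String[] args) {
        String[] parts = { "SELECT * ", "FROM myTable ", "WHERE (ID=5)" };

        // Estimate the final length up front; it only needs to be in
        // the right ballpark to avoid most of the grow-and-copy cycles.
        int estimate = 0;
        for (String part : parts) {
            estimate += part.length();
        }

        // Pass the capacity to the constructor instead of taking the
        // 16-character default and forcing the buffer to double itself.
        StringBuilder sb = new StringBuilder(estimate);
        for (String part : parts) {
            sb.append(part);
        }

        System.out.println(sb); // prints SELECT * FROM myTable WHERE (ID=5)
    }
}
```

When the final size is unknowable, even a rough upper bound beats the default, since one over-sized allocation is cheaper than a cascade of doublings.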

Bonus: Now that you understand the potential performance benefit is based (at least partly) on mutability, you will see that other string optimization opportunities may exist whenever existing strings need to be modified, not just appended to.

Final Thoughts

Again, the point of all this isn’t about strings, or optimization, or any of that. It is about taking the time to understand what you are doing to avoid falling prey to the potentially harmful myths that are enthusiastically passed around by programmers (see also “The database sorts by the clustered index if you don’t specify an Ordering“).

In any event, I’m curious how many of my readers have at one time subscribed to the cargo cult programming meme of “StringBuilder is always better.” Please let me know in the comments.

Problem Solving 101: First, Understand the Problem

The Null Solution

In my previous article, Collaborative Problem Solving, I talked about the importance of questioning assumptions and reconfirming your understanding of the problem when you hit a brick wall while designing or troubleshooting a system. In fact, one of my favorite strategies for tackling a tough design problem is to take the nihilist approach. In the words of the Zen Master…

Remember that there is no code faster than no code.

Taligent’s Guide to Designing Programs

My corollary to this is that there is also no code simpler than no code. So when confronted with a complex design challenge or optimization problem, my first question is always “How can we get away without implementing this at all?” Granted, this only works about 20% of the time, but when we can step back and realize that a particularly pernicious problem has just vaporized, it is a wonderful thing.

I’m sure some may be incredulous at the idea that this ever works at all. I mean, why would anyone have even started working on a problem that didn’t need to be solved? The answer is that only in rare cases does this approach reveal that the entire project was unnecessary, although I have had that happen. More often it just reveals that the developer has ventured too far down a dead-end path and is working exclusively on meta-problems that can be eliminated by just backing up a few steps.

For example, this Stack Overflow question is a good example of a meta-problem that can be eliminated by taking a step back and reconsidering how to avoid throwing errors instead of how to ignore them, as explained in this answer.

Re-evaluating the Problem

What got me started on this two-day rant about problem solving approaches was a really insightful idea for solving some of the core issues with implementing an electronic voting system. It was a great example of how re-examining the core problem led to a solution to a seemingly intractable dilemma. Here’s a quick breakdown.

Goals: An electronic voting system must satisfy these criteria (among others)

  1. Voters must be able to anonymously cast their votes.
  2. Voters can cast zero or one ballot, but never more than one.
  3. Voters must not be able to prove who they voted for (to preclude vote buying).
  4. Voters should be able to confirm their vote was properly recorded for each candidate they picked.

Problem: If a voter can confirm who they voted for, then they have the means to prove (and receive remuneration for) voting for a particular candidate.

What has stumped many, including myself, in analyzing this problem is that requirements (3) and (4) are apparently incompatible. How can you give a person a way to confirm their vote without also giving them the ability to prove that vote to another person?

The Solution: The solution that David Bismark came up with was fairly straightforward once he realized that we were looking at requirement (4) incorrectly. Consider this clarification of that requirement.

4. Voters should be able to confirm their vote was properly recorded the way they marked it (rather than “for each candidate they picked”).

Under this revised understanding of the problem, the solution became more straightforward. Essentially, his solution involves randomizing the order of the names on each ballot; after voting, the ballot is torn in half and the half with the names is shredded. The bubble-to-candidate correlation is encrypted on the remaining half to prevent tampering while still allowing the vote to be tabulated.

The voter can then check (perhaps online) that the bubbles they marked were counted, but can no longer see which candidate corresponds to each bubble. Assuming they remember the order of the candidates, they can confirm their vote was counted but cannot prove who they voted for. Voila! Brilliant!

Problem Solving 101: Collaborative Problem Solving

As a software development manager, I am frequently visited by developers who are spinning their wheels on a design problem or running out of ideas while troubleshooting an application. Most of these visits end with a eureka moment that clears the logjam for the developer, and although I do like to think I’m a reasonably good troubleshooter/architect, I have accrued quite a bit of unearned reputation as a problem solver from these impromptu brainstorming sessions. A fairer assessment of my contribution might be that I am just a really good sounding board.

I’m sure everyone reading this can remember an instance where you went to ask someone a question and the simple act of asking it aloud made you realize the answer. Consider for a moment why this technique works, and why it often works better when the other person has little advance knowledge of the issue. To bring that other person into the discussion, you are forced to rewind your brain back to when you started working on the problem, eschewing all of the clutter and complexity that you have added since. Then you must revisit each of the steps you have taken, justifying each decision and restating each assumption.

So why should revisiting the history of the problem be effective? After all, having the same person answer the same intermediate questions ought to bring that person to the same conclusion, right? If this were true, then the sounding board approach would be a waste of time, which we know from anecdotal evidence is not the case. The reason you arrive at a different answer is that, unlike the first time through, you now have the benefit of hindsight and are far less likely to take a path that you know will lead you through the brambles.

The longer you walk down those brambly paths the easier it is to mentally commit to pressing forward. On an intellectual level we understand the sunk cost fallacy, but are still susceptible to it and can form mental blocks that make it discomforting to backtrack for more than one or two forks in the road. This is why we need the guide to force us to rethink the whole journey and each decision along the way.

Tips for Being a Good Sounding Board

I'm Listening.

Face it, there are times when a poster of Justin Bieber would be exactly as productive a sounding board as you or anyone else on the team. The core problem is sometimes just so obvious that saying it out loud is enough to expose the flaws in the current approach. I’m going to save the advice on how to be more like Justin for a future article and focus now on how you can be a better guide for the thornier problems.

  1. Establish yourself as a devil’s advocate – The rest of these tips, taken in the wrong context, can easily put the person on the defensive. Establish that you are going to ask a bunch of questions, some naive, some pointed, but that your goal is to help them re-verify their assumptions, not to promote an alternate idea or criticize their approach.
  2. Be Kind, Rewind – Even if you know the back-story, ask lots of “why” questions and continue pushing back towards the root problem. Help them fight the tendency to revisit only the last few decisions. “You are de-normalizing the table for performance? Why did you need to do that?”
  3. Don’t let them linger on meta-problems – When a problem seems too big to hold in your brain, it’s tempting to seek respite by focusing on manageable meta-problems. While those problems may need solving, they are distractions from the real issue that is blocking progress. When the person lingers on details or raises meta-problems, keep rewinding.
  4. Ask Probing Questions – Make them talk through each decision and ask stupid questions. Apply extra scrutiny when the developer clearly thinks a decision was easy or obvious; when we think things are obvious, we take mental shortcuts and don’t think them through as thoroughly. Challenge assumptions; this is where problems hide.
  5. Use Reflective Listening – This is a communication technique where you repeat back a summary of what the other person just said to confirm understanding. Another benefit in this situation is that hearing their own ideas in another person’s voice/words may make it easier for them to be objective.
  6. Avoid injecting your own ideas – At some point you may have a great idea for a better approach. Keep it to yourself. It will just make them defensive and will undermine their sense of ownership of the problem, and inhibit their ability to understand the solution.
  7. Lead them to the answer – If they simply aren’t making progress and you know a good answer, consider leading them to it with a line of questioning that directs them, instead of just hand-feeding them the information. They are more likely to take ownership if they feel they reached the conclusion themselves, and people generally retain information more readily when they arrive at it by their own logic. If you must resort to this, tread lightly with your tone and take extra care not to come off as pedantic.

It may seem strange that my approach advocates being somewhat evasive with information or potential solutions. However, it is given in the context of a manager/mentor acting as a sounding board. In this role, your primary focus should be to create leverage by making things happen through your team. Any veteran manager will tell you that routinely handing out answers to technical problems will only make the line to your office longer and make you the bottleneck. I strongly favor approaches that encourage critical thinking among developers and give the glory to the developer instead of the esteemed leader. Trust me, it pays off in the long run.

Is Agile for Amateurs?

I found myself discussing development practices with a manager of a local startup software company and was a little taken aback when he unequivocally proclaimed that they didn’t need agile because he had a team that consisted solely of extremely talented veteran programmers, with an average tenure of more than a decade.

He claimed they knew what needed to be done and did it without any official methodology. In these parts we might call that “Cowboy Coding.” His premise was that following a standard methodology is for novice teams, or those with immature programmers who haven’t yet gotten “into the groove” of their careers.


Before you trash the guy,  consider the following:

  • The guy and his team really were hardcore, chock full of developers that I’d love to be able to lure over to my shop.
  • This guy had bootstrapped several start-up organizations and is extremely technically competent for a suit.
  • He based this statement on demonstrable results, working software that apparently was popular with his customers.

How can you argue with results? Of course, I’m not absolutely sure about the veracity of his claims of success. In the context of our conversation, it is conceivable he was exaggerating, given that I currently work for a company that is a potential customer for his company’s product.

Personally, despite these points, I still tend to disagree. Even though he may be seeing success with an unstructured development environment, I’d argue that the team may still be performing below its full potential and would benefit from some flavor of an Agile approach.

It seems to me that Agile is better suited to teams with a lot of talent and experience, and if anything would be more problematic with a bunch of rookies. A core Agile concept is to let the development team self-organize, which implies a great deal of trust. Agile is not about babysitting, and certainly not about command-and-control management. It is about communication, coordination, and frequent re-alignment with an emerging picture of the customer’s needs.

I’m curious what you guys think?

Grokking Distributed Version Control Systems

As a longtime fan of FogBugz, I’ve been dying to start moving some of my personal projects over to Fog Creek’s Kiln. Although I won’t get much use from Kiln’s code review functionality as the lone developer on these projects, I like the idea of storing my code in the same tool I use to manage the workflow and other meta-information about those applications.

I’d like to say this preference is based on the potential value-add of the integration features between Kiln and FogBugz, but frankly it is more about my core belief in keeping things as simple as possible. Also, I make a habit of not storing anything I can’t afford to lose on the hard drive of my development computer, where backups are a tad sporadic, so using a hosted source code repository is attractive in that respect as well.

I was sold. All I needed to do to kick off “Project Simplify” was to import my source code into Kiln from my Vault repository and … hmmm. It seems that Kiln is based on an SCM tool called Mercurial that I was unfamiliar with. No worries; I’d adapted to new source control tools before, so it would just be a matter of translating the jargon from Vaulteese to Mercurian, right?



As it happens, distributed version control systems (DVCS) such as Mercurial represent a considerable paradigm shift from their traditional centralized cousins. At least this is what its evangelists are saying. Despite the claim on the Mercurial home page that it is “Easy to Learn”, all the anecdotal evidence I’ve seen seems to indicate a significant learning curve. However, it also appears that once DVCS finally clicks, it is extremely popular with developers. That’s enough for me; I’m still on board, but cautiously optimistic.

So I downloaded and installed the “Kiln Client”, a bundled copy of Mercurial, TortoiseHg, and a few Kiln-specific extensions, and started working my way through Joel’s Mercurial primer. Unfortunately, it still wasn’t clicking for me. I got the mechanics down pretty quickly, but I still couldn’t grok what all the hoopla was about or how I should adapt my current workflow. Things got hectic at work, so I set it aside until the Fog Creek 2010 FogBugz and Kiln World Tour came to town promising to drop some knowledge on us poor schmucks who hadn’t come to Jesus yet.

It is no secret why Joel is hawking Mercurial so fiercely, given that to have any use for his new Kiln software you pretty much have to migrate to Mercurial first. Although I appreciate his zeal, I am admittedly skeptical of his decision to tie the fate of Kiln to a technology that (1) is still in the early adopter phase; (2) puts a substantial barrier between his customers and his solution; and (3) requires a religion change for many users of competing SCM tools.

Still, I’m not going to pass up free training, even if there might be a self-serving agenda behind it. So I went, and I think I am better off for it, but I still don’t feel confident that I fully understand the use case. Maybe it is because DVCS is always pitched as the solution to the nightmare that is branching/merging in Subversion. I haven’t used Subversion for years, use branching sparingly in Vault, and work on a relatively small development team. Perhaps if I were working with hundreds of developers on the big OSS projects that have seen high adoption of DVCS tools, I’d see the appeal more easily, but for now it remains elusive.

One aspect that I totally get is the idea of a two-phase commit. I may not have been involved in a monster merge, but I have on numerous occasions had developers’ machines go down, taking several weeks’ worth of uncommitted code changes with them. The idea of being able to check in code privately resonates very much with me. However, this feature (“shelving”) already exists in some centralized version control systems, e.g. Vault. Further, most DVCS workflow examples I’ve seen seem to encourage keeping the working repository on the developer’s workstation, which doesn’t provide much protection from data loss for uncommitted changes.

I also have a very positive opinion of revision control over entire change-sets as opposed to individual files. But again, while this is a common feature of DVCS systems, it isn’t necessarily unique to them and could be added to more traditional centralized SCM tools.

I Want to Believe

Having said all of that, I’m not giving up yet, but I’m still waiting for that eureka moment where the necessity of DVCS becomes so obvious that I can’t imagine ever going back. For now, I’ll use it for the projects I want to store in Kiln and continue to use Vault for my other projects. If you have a really good use case for DVCS, or a really good way of explaining its benefits that will help me along, I’d definitely appreciate a comment on this post. I want to believe!

FogBugz News Network (FNN) Plugin Version 1.2.1 Released

This update has been in the works for a lot longer than I would have liked, buried under a bunch of higher priority projects. However, I finally found enough spare time to clear the backlog of feature requests and publish a really significant update to my popular FNN plug-in for FogBugz, a project management tool for software development. Thanks to everyone who submitted bug or feature reports.

Where can I get it?

If you are hosting your own install of FogBugz, you can download the new version immediately from the FNN page on the FogBugz PlugIn Gallery.

For those of you using FogBugz On Demand, it will be a bit longer before you can get your hands on it. It has to go through an extended Fog Creek plug-in approval process and be scheduled for inclusion in the next available maintenance cycle. If all goes well, it should make it up to the FBOD site in the next month or so. I’ll update the status in this blog post as things progress.

What’s New?

Despite the minor tick on the version number (1.1.6 -> 1.2.1), there is really quite a bit in the new version, so I encourage you to check it out if you are a FogBugz user.

New Feature: Activity Report

One of the most requested features for the next update of FNN was to enable the specification of a custom date range on the Recent Events Report, in addition to the current functionality of showing the last X events that met the other filter criteria. I considered adding a toggle, but it just looked too cluttered in all my prototypes, so I decided to split this off into a separate report, the Activity Report, which is now also available from the Extras menu in FogBugz.

FNN Activity Report

New Feature:  Subscriptions Report (and Management Tool)

This screen is also new to FNN, and is used to get an overview of who is subscribing to various items (Cases, Wiki Articles, Discussion Topics). Like the other reporting screens it can also be filtered by person and/or project. You can also remove a subscription from this screen for items where you are the subscriber or you are a site admin.

New Feature:  Auto-Updating Recent Events Report

The popular Recent Events Report has a new feature that was requested by someone on the Fog Creek team at the StackOverflow Dev Days event in Austin last year. By checking the new “Auto-Update” checkbox on the report, it will give you a live feed of what is going on in your FogBugz database. New updates will be inserted at the top of the report roughly every 5 seconds, so you are always looking at the most recent X items without having to continually refresh. Perfect for the micro-manager in all of us, right?

FNN Recent Events Report

I also made some other minor tweaks to this screen, including:

Feature-ette: You can now display up to the 1,000 most recent items instead of the paltry 100 allowed by the old version.
Tweak: The description column was empty more often than I liked, so I made it fall back on another location to fill this column if the primary source was blank.
Fix: There were a few scenarios in which cases that should have been displayed on this screen were not.

What does it cost?

Nada. It’s still freeware. I haven’t attached any licensing terms to it yet, but am okay with you doing whatever you want with it so long as you don’t steal my work and claim it as your own. I developed this and the other plugins mentioned on this site principally for my own use as a FogBugz user, but didn’t want to Bogart these useful utilities all to myself.

Also, since some have asked: it’s not open source at this time. I’m considering it, but I want to shepherd it along a little longer on my own before I let other cooks into the kitchen.

Feedback is Appreciated

If you have any complaints, comments, ideas, etc., please either drop me a line using the e-mail link next to the plug-in on the FogBugz PlugIns Admin Page, or just throw a comment up on this post. I also follow the FogBugz StackExchange site semi-regularly, so you can also post ideas/comments there if you want to give other users a chance to vote on them or add to them. Now that the backlog is almost empty on this project, I should be able to get to new requests pretty quickly.

Mad Computer Scientists at VMWare Introduce Zombie IE6

I’ll admit to being somewhat of a Microsoft apologist, but as a web developer, I couldn’t help being a little giddy about the idea that the venerable Internet Explorer 6 finally seemed to be on the ropes thanks to the one-two punch of Google officially dropping support for the 10+ year old browser and Microsoft’s finally releasing an upgrade to Windows that was conceivably worth the effort to ditch XP. Unlike previous quixotic attempts to kill it, I think this time it might take.

Maybe it is a little premature to give the eulogy, considering that by some reports IE6 was still the second most used browser as recently as last month, but given its accelerating decline in market share over the last two years, one can only hope.

In fact, one of the biggest IE6 holdouts on my company’s client list finally took the plunge and upgraded… to IE7. Hey, at least it’s progress, right?

Then I saw this little news nugget on the VMWare ThinApp blog and it gave me pause.

For those who didn’t follow that link, or didn’t make it past the odd “Web Apps are the new DLL Hell” prologue, here’s the terrifying part:

I’m happy to report we now have IE6 fully virtualized and working perfectly on Windows 7 32bit and 64bit.

Although getting IE6 to work “perfectly” on any version of Windows is a noteworthy accomplishment, my initial reaction to this news was not unlike what I’d expect to feel upon hearing that cloning technology had been perfected and Hitler was picked as the prototype.

"Sometimes Dead is Bettah"

Don’t get me wrong. I’m a big fan of application virtualization, and VMWare ThinApp has been instrumental in solving deployment problems for some of my applications in environments with rigorous security controls on employee workstations.

Despite how difficult it is to get someone on the phone at VMWare who has heard of ThinApp (even though it has been almost two years since they acquired it), it really is a solid and easy-to-use virtualization platform.

You want technical support for thin what?
Are you sure you dialed the right number?

I suppose I can’t blame them for facilitating such an abomination. Keeping legacy apps alive for those who can’t bear to give up on them is, after all, a key use-case for an application virtualization platform. And like poor old Dr. Frankenstein, they deserve mad props for creating such an elegant workaround for mortality, even if that workaround did maul a few villagers, or their web pages.

In fact, the ability to associate web pages with different virtualized browser versions seems like a really cool trick… a really cool, massively kludgey trick.

I just wish they didn’t sound quite so gleeful in their announcement. Can’t a fellow enjoy his schadenfreude in peace?

At least they could have made it difficult to accomplish. Even Doc Frank didn’t create an “Easy Button” for re-animating corpses, and he was allegedly almost as insane as James Carville.

Currently the process for creating a ThinApp IE6 package is a little complex… We know this is a popular use case, so we’ve turned the process of capturing IE6 into a few clicks.


On the other hand, maybe I need a more optimistic perspective. I suppose an argument could be made that the availability of this technology could promote the use of modern browsers at curmudgeonly organizations.

For example, you could argue that it is now safe to upgrade everyone’s browser, because application virtualization can keep those ancient internal apps that no one has the budget to update limping along.

Selling it to management shouldn’t be too difficult…

“Application Virtualization?” I’ve never used that buzzword before, and now that my grandmother has heard of “Cloud Computing” I need something snappier to put on my PowerPoint slides. That should just fit if I take out the cloud-with-a-dollar-sign-on-it clip-art.  Let’s do it!