Quick Tip: Comparing a .NET string to multiple values

The Problem

Do you ever wish you could use the SQL IN operator in your C# code to make your conditional blocks more concise and your code easier to read?

Perhaps it’s just my persnickety nature, but I believe that line-wrapped conditional expressions like this are a code smell.

if (animal.Equals("Cow") ||
   animal.Equals("Horse") ||
   animal.Equals("Hen"))
{
   Console.WriteLine("We must be on the farm.");
}

This would be so much cleaner…

if (animal.CompareMultiple("Cow","Horse","Hen")
{
   Console.WriteLine("We must be on the farm.");
}

The Code

With a simple extension class you can upgrade your string classes to do this very thing.

Step 1: Create an extension class as demonstrated here.

C#

namespace extenders.strings
{
  public static class StringExtender {

    public static bool CompareMultiple(this string data, StringComparison compareType, params string[] compareValues) {
      foreach (string s in compareValues) {
        if (data.Equals(s, compareType)) {
          return true;
        }
      }
      return false;
   }
  }
}

VB.NET

Imports System.Runtime.CompilerServices

Namespace Extenders.Strings

  Public Module StringExtender

    <Extension()> _
    Public Function CompareMultiple(ByVal this As String, compareType As StringComparison, ParamArray compareValues As String()) As Boolean

       Dim s As String

       For Each s In compareValues
         If (this.Equals(s, compareType)) Then
            Return True
         End If
       Next s

       Return False
    End Function

  End Module

End Namespace

Step 2: Add a reference to the extension namespace and use it.

C#

using extenders.strings;

namespace MyProgram
{
  static class program {
    static void Main() {

      string foodItem = "Bacon";

      if (foodItem.CompareMultiple(StringComparison.CurrentCultureIgnoreCase, "bacon", "eggs", "biscuit")) {
         System.Console.WriteLine("Breakfast!");
      }
      else {
         System.Console.Write("Dinner");
      }

    }
  }
}

VB.NET

Imports StringExtenderExampleVB.Extenders.Strings

Module Program

  Sub Main()
    Dim foodItem As String = "Bacon"

    If (foodItem.CompareMultiple(System.StringComparison.CurrentCultureIgnoreCase, "bacon", "eggs", "biscuit")) Then
      System.Console.WriteLine("Breakfast!")
    Else
      System.Console.Write("Dinner")
    End If

    System.Console.ReadLine()
  End Sub

End Module

Concatenating Strings – You’re still doing it wrong.

A common anti-pattern I see from novice programmers is the tendency to read a coding tip somewhere, assume it to be a universal truth, and immediately start applying it everywhere in their code without fully understanding it. Usually the coding tip relates to optimization and is interpreted by the coder as “X is faster than Y, so always do Y instead of X.” This fallacy is particularly rampant with respect to the different approaches for concatenating strings.

Car with Wing

But it works on race cars....

I’m not writing this article to chastise the fledgling programmer who has fallen into this trap, nor is this intended as a how-to article on optimizing your code. Heaven knows there the internet is lousy with articles about the most efficient way to cram strings together. I will address the problems associated with some of the mythology about string concatenation, but my primary goal will be to encourage critical thinking and healthy skepticism for silver bullet programming techniques.

The Symptom

Though I possess no paranormal super-powers, I do believe I can read the mind of another human when I see code that looks like this:

    StringBuilder WelcomeMessage = new StringBuilder();
    WelcomeMessage.Append("Hello ");
    WelcomeMessage.Append(firstName);
    WelcomeMessage.Append(" ");
    WelcomeMessage.Append(LastName);
    WelcomeMessage.Append("!\n")

My spirit guide informs me that that programmer responsible for this code remembers reading somewhere that <insert programming language> is really inefficient at concatenating strings, but you can overcome that limitation by using the StringBuilder class. Based on this information he/she replaces every string concatenation operator with this clever technique leaving a scent trail through the code that experienced programmers can smell from miles away.

Problem 1: Premature Optimization is Procrastination

Sure, you want ALL of your code to perform well, but experienced programmers understand that their time is valuable and best spent on activities that deliver actual business value. Premature optimization and its ugly cousin Micro Optimization are almost always a waste of time. I understand how tempting it is to justify in your head that you could squeeze a little bit more performance out of your app by re-factoring the whole thing you learned in the blog post you read today, especially since it can be like a mini-vacation for your mind from the really complex issues you really should be working on, but be strong and resist!

As a rule of thumb: If it isn’t worth creating a jig to profile the performance gains you expect to get from optimization re-factoring, then it isn’t worth the time to do the re-factoring in the first place, plus it is risky because you won’t notice that your supposed “optimization” actually hurt performance.

More on that later.

Problem 2: Cookie Cutter Optimizations Assume the Compiler is Stupid

If you could universally make string concatenation faster by applying a simple formula then the compiler would probably already be applying the transformation anyway. Granted, I think this point is lost on some novice programmers who only have experience in higher level languages.

For them, I’ll clarify with this point.

Your program is running the compiler’s interpretation of your code. Not your actual code.

With that in mind, to think that the StringBuilder approach always runs faster would require you to believe that the people who wrote the programming language were smart enough to make string concatenation fast when they created the StringBuilder class, but forgot how to do it when they built the concatenation operator.

Pop Quiz: Does this code give you any heartburn? Why?

    string querySQL = "SELECT * " +
                      "FROM myTable " +
                      "WHERE (ID=5)";

If you said yes because it isn’t worth incurring the cost of concatenation for code readability then you aren’t giving the compiler enough credit.

Here is the MSIL output for the above statement:

ldstr      "SELECT * FROM myTable WHERE (ID=5)"

Amazing, huh?

Compilers are written to do the complex task of reading your code and interpreting what it means. Figuring out that a series of constant strings can be combined is child’s play.

Problem 3: If you don’t understand it, you’ll do it wrong

Cargo Cult programming is a derisive term for doing things in your program because you think you need to, but don’t understand (or have a vague notion of)  the underlying reason. It is really bad practice to adopt a technique without asking enough “why” questions to grasp the reason using it is desirable.

As an example, let’s dissect the premise that string concatenation using operators is slow and should be replaced by StringBuilders.

Q: Why do some claim that string concatenation with operators is slow?
In many garbage collected languages (Java/.NET) string objects are  immutable, meaning you can’t change them. So when you append more content into an existing string the program must internally create a new string and copy the old and new contents into it. The extra effort to create, destroy and garbage collect the extra string objects has the potential to create more work for your program and can degrade performance if done excessively.

Q: How does the StringBuilder help?
The StringBuilder class is implemented as a mutable memory buffer that typically has extra unused space allocated so that concatenations can be made in place without the need to create extra objects to juggle the data.

Q: How much extra space does it reserve? What happens if I append more content than will fit in the unused space?
By default (in .NET) 16 characters, unless you specify differently in the constructor. If you append more data than there is space, the StringBuilder will behave much like a String object creating a new StringBuilder object with double the existing capacity then copy over the data.

You: Wait, what?

You mean that you have been using StringBuilder with the default constructor and then appending more than 16 characters to it?

Yeah, well if you are lucky you’ll be no worse off than if you just used the “evil” string concatenation operators. However, due to that neat capacity doubling side-effect, your program might actually be locking up unnecessarily large chunks of memory on top of the additional work required to wrangle all the intermediary objects. Perhaps it is worth investigating and setting the initial capacity of that StringBuilder to avoid such nastiness.

Bonus: Now that you understand the potential performance benefit is based (at least partly) on mutability, you will see that other string optimization opportunities may exist whenever existing strings need to be modified, not just appended to.

Final Thoughts

Again, the point of all this isn’t about strings, or optimization, or any of that. It is about taking the time to understand what you are doing to avoid falling prey to the potentially harmful myths that are enthusiastically passed around by programmers (see also “The database sorts by the clustered index if you don’t specify an Ordering“).

In any event, I’m curious as to how many of my readers actually have at one time subscribed to the cargo cult programing meme of “StringBuilder is always better”. Please let me know in the comments.

Quickly Link SQL Server Tables in Access

Long gone are the days when I did much application development in MS Access.  However, I still find it a darn good tool for querying, reporting and doing quick edits to data stored in other platforms.

I wrote this code ages ago to allow me to quickly create linked tables in MS Access for SQL Server tables/views. It is much faster than the manual UI approach and doesn’t require setting up a DSN on the local system to make the connection

Usage

Use the immediate window (ctrl-G) to enter these commands:

Link every table in a SQL database(“MYDB”) from SQL Server Instance (“MYSQLSERVER”)
? LinkTable(MYSQLSERVER,MYDB,True)

Link a single table(“MySQLTable”) in a SQL database (“MYSQLDB”)  from SQL Server Instance named “MYSQLSERVER”
? LinkTable(MyTableAliasInAccess, MYSQLSERVER,MYDB,MYSQLTABLE,True)

The Code

Just paste this code into a new module in the Access database and you are ready to go.

Sub LinkAllTables(Server As String, database As String, OverwriteIfExists As Boolean) 'Usage Example (link all tables in database "SQLDB" on SQL Server Instance SQO01, overwriting any existing linked tables. 'linkalltables "SQL01","SQLDB", true 'This will also update the link if the underlying table definition has been modified. Dim rsTableList As New ADODB.Recordset Dim sqlTableList As String sqlTableList = "SELECT [name] as tablename FROM sysObjects WHERE (type = 'U')" rsTableList.Open sqlTableList, BuildSQLConnectionString(Server, database) While Not rsTableList.EOF If LinkTable(rsTableList("tableName"), Server, database, rsTableList("tableName"), OverwriteIfExists) Then Debug.Print "Linked: " & rsTableList("tableName") End If rsTableList.MoveNext Wend rsTableList.Close Debug.Print "Done." End Sub Function LinkTable(LinkedTableAlias As String, Server As String, database As String, SourceTableName As String, OverwriteIfExists As Boolean) 'This method will also update the link if the underlying table definition has been modified. 'The overwrite parameter will cause it to re-map/refresh the link for LinktedTable Alias, but only if it was already a linked table. ' it will not overwrite an existing query or local table with the name specified in LinkedTableAlias. 'Links to a SQL Server table without the need to set up a DSN in the ODBC Console. Dim dbsCurrent As database Dim tdfLinked As TableDef ' Open a database to which a linked table can be appended. Set dbsCurrent = CurrentDb() 'Check for and deal with the scenario ofthe table alias already existing If TableNameInUse(LinkedTableAlias) Then If (Not OverwriteIfExists) Then Debug.Print "Can't use name '" + LinkedTableAlias + "' because it would overwrite existing table." Exit Function End If 'delete existing table, but only if it is a linked table If IsLinkedTable(LinkedTableAlias) Then dbsCurrent.TableDefs.Delete LinkedTableAlias dbsCurrent.TableDefs.Refresh Else Debug.Print "Can't use name '" + LinkedTableAlias + "' because it would overwrite an existing query or local table." Exit Function End If End If 'Create a linked table Set tdfLinked = dbsCurrent.CreateTableDef(LinkedTableAlias) tdfLinked.SourceTableName = SourceTableName tdfLinked.Connect = "ODBC;DRIVER={SQL Server};SERVER=" & Server & ";DATABASE=" & database & ";TRUSTED_CONNECTION=yes;" On Error Resume Next dbsCurrent.TableDefs.Append tdfLinked If (Err.Number = 3626) Then 'too many indexes on source table for Access Err.Clear On Error GoTo 0 If LinkTable(LinkedTableAlias, Server, database, "vw" & SourceTableName, OverwriteIfExists) Then Debug.Print "Can't link directly to table '" + SourceTableName + "' because it contains too many indexes for Access to handle. Linked to view '" & "vw" & SourceTableName & "' instead." LinkTable = True Else Debug.Print "Can't link table '" + SourceTableName + "' because it contains too many indexes for Access to handle. Create a view named '" & "vw" & SourceTableName & "' that selects all rows/columns from '" & SourceTableName & "' and try again to circumvent this." LinkTable = False End If Exit Function End If On Error GoTo 0 tdfLinked.RefreshLink LinkTable = True End Function Function BuildSQLConnectionString(Server As String, DBName As String) As String BuildSQLConnectionString = "Driver={SQL Server};Server=" & Server & ";Database=" & DBName & ";TRUSTED_CONNECTION=yes;" End Function Function TableNameInUse(TableName As String) As Boolean 'check for local tables, linked tables and queries (they all share the same namespace) TableNameInUse = DCount("*", "MSYSObjects", "(Type = 4 or type=1 or type=5) AND [Name]='" & TableName & "'") > 0 End Function Function IsLinkedTable(TableName As String) As Boolean IsLinkedTable = DCount("*", "MSYSObjects", "(Type = 4) AND [Name]='" & TableName & "'") > 0 End Function