How to diff two folders from a Windows command prompt

In most cases where I need to compare two folders recursively on a Windows system I use my go-to tool Beyond Compare. It is an excellent utility, and one that I think should be among the first utilities any developer should install on a new machine.

However, today I was doing a reconciliation as part of a very large file migration project that required comparing two folders that each contained hundreds of millions of files spread across thousands of sub-folders. BC was having a lot of trouble and choked on many of my comparisons. It just wasn’t the tool for today’s job. I needed another solution.

Necessity mothered some invention and I found an inventive way to use a combination of command switches on  RoboCopy to perform the comparison. If you are not familiar with RoboCopy, and you do a lot of mass copying of files, you need to stop what you are doing and learn about it pronto. It is a supercharged version of XCopy that has been included with Windows since Vista. It has a ton of great features such as multi-threaded file copying, selectively copying changed files, and resumable copies that make it a must especially for big file copy jobs over flaky network connections.

Diff Command Using RoboCopy

So here’s the command to perform a basic comparison of two folders and write a log file listing the differences.

ROBOCOPY “\\FileShare\SourceFolder” “\\FileShare\ComparisonFolder” /e /l /ns /njs /njh /ndl /fp /log:reconcile.txt

Explanation of the command switches used above:

/e  Recurse through sub-directories (including empty ones)
/l  Don’t modify or copy files, log differences only
/fp  Include the full path of files in log (only necessary if you omit /ndl)
/ns  Don’t include file sizes in log
/ndl  Don’t include folders in log
/njs   Don’t include Job Summary
/njh   Don’t include Job Header
/log:reconcile.txt   Write log to reconcile.txt (Recreate if exists)
/log+: reconcile.txt   (Optional variant) Write log to reconcile.txt (Append if exists)

Usage Notes and Warnings Regarding the /NDL Option

The /NDL option is a handy way to suppress the inclusion of every folder checked (regardless of whether it contains differences) in the log, but there because of the way it works it is not a good idea in all circumstances. Consider the following before you use /NDL.

  • Folders that exist only on source or destination are not logged unless at least one mismatched file is present or a source file is missing on destination.
  • Folders that exist only on the destination are not logged at all regardless of contents.

If you omit the /NDL option, it is necessary to include the /FP option if you want full paths listed for each file.

Example Output

(with /NDL option)

*EXTRA File         c:\dest\log.txt
New File              c:\source\newfolder\Blah.txt
Newer                 c:\source\Files\CONCORD.DAT
New File              c:\source\Files\COWCO.DAT

(without NDL Option)

 c:\work\test\source\    (extraneous folder listing)
*EXTRA Dir      c:\dest\newfolderdest\
*EXTRA Dir      c:\dest\newfolderrestempty\
*EXTRA File     c:\dest\log.txt
New Dir           c:\source\newfolder\
New File          c:\source\newfolder\Blah.txt
New Dir           c:\source\newfolderempty\
c:\source\Files\   (extraneous folder listing)
Newer             c:\source\Files\CONCORD.DAT
New File          c:\source\Files\COWCO.DAT
c:\test\source\FilesSame\   (Included despite no diffs)

5 Best Practices for Commenting Your Code

One of the first things you learn to do incorrectly as a programmer is commenting your code. My experience with student and recently graduated programmers tells me that college is a really good place to learn really bad code commenting techniques. This is just one of those areas where in-theory and in-practice don’t align well.

There are two factors working against you learning good commenting technique in college.

  1.  Unlike the real world, you do a lot of small one-off projects as a solo developer.  There’s no one out there fantasizing about dropping a boulder on you for making them decipher your coding atrocity.
  2.  That commenting style you are emulating from your textbook is only a good practice when the comments are intended for a student learning to program. It is downright annoying to professional programmers.

These tips are primarily intended for upstart programmers who are transitioning into the real world of programming, and hopefully will prevent a few from looking quite so n00bish during their first code review. Code Review? Oh yeah, that’s something else they didn’t teach you in school, but that’s a whole other article, I’ll defer to Jason Cohen on that one.

So let’s get started…

(1) Comments are not subtitles

It’s easy to project your own worldview that code is a foreign language understood only by computers, and that you are doing the reader a service by explaining what each line does in some form of human language. Or perhaps you are doing it for the benefit of that non-programmer manager who will certainly want to read your code (Spoiler: He won’t).

Look, in the not too distant future, you will be able to read code almost as easily as your native language, and everyone else who will even glance at it almost certainly already can. By then you will realize how silly it is to write comments like these:

// Loop through all bananas in the bunch
foreach(banana b in bunch) {;  //make the monkey eat one banana

You may have been taught to program by first writing  pseudo-code comments then writing the real code into that wire-frame. This is a perfectly reasonable approach for a novice programmer. Just be sure to replace the comments with the code, and don’t leave them in there.

Computer: Enemy is matching velocity.
Gwen DeMarco: The enemy is matching velocity!
Sir Alexander Dane: We heard it the first time!
Gwen DeMarco: Gosh, I’m doing it. I’m repeating the darn computer!

-Galaxy Quest


  • Code examples used to teach a concept or new programming language.
  • Programming languages that aren’t remotely human readable (Assembly, Perl)

(2) Comments are not an art project

This is a bad habit propagated by code samples in programing books and open source copyright notices that are desperate to make you pay attention to them.

   _     _      _     _      _     _      _     _      _     _      _     _
  (c).-.(c)    (c).-.(c)    (c).-.(c)    (c).-.(c)    (c).-.(c)    (c).-.(c)
   / ._. \      / ._. \      / ._. \      / ._. \      / ._. \      / ._. \
 __\( Y )/__  __\( Y )/__  __\( Y )/__  __\( Y )/__  __\( Y )/__  __\( Y )/__
   || M ||      || O ||      || N ||      || K ||      || E ||      || Y ||
 _.' `-' '._  _.' `-' '._  _.' `-' '._  _.' `-' '._  _.' `-' '._  _.' `-' '._
 `-'     `-'  `-'     `-'  `-'     `-'  `-'     `-'  `-'     `-'  `-'     `-'

                 -It's Monkey Business Time! (Version 1.5)

Why, that’s silly. You’d never do something so silly in your comments.

ORLY? Does this look familiar?

   | Module Name: classMonkey                                   |
   | Module Purpose: emulate a monkey                           |
   | Inputs: Bananas                                              |
   | Outputs: Grunts                                            |
   | Throws: Poop                                               |

Programmers love to go “touch up” their code to make it look good when their brain hurts and they want something easy to do for a while. It may be a waste of time, but at least they are wasting it during periods where they wouldn’t be productive anyway.

The trouble is that it creates a time-wasting maintenance tax imposed on anyone working with the code in the future just to keep the pretty little box intact when the text ruins the symmetry of it. Even programmers who hate these header blocks tend to take the time to maintain them because they like consistency and every other method in the project has one.

How much is it bugging you that the right border on that block is misaligned? Yeah. That’s the point.

(3) Header Blocks: Nuisance or Menace?

This one is going to be controversial, but I’m holding my ground. I don’t like blocks of header comments at the top of every file, method or class.

Not in a boat, not with a goat.
Why? Well let me tell you, George McFly…

They are enablers for badly named objects/methods – Of course, header blocks aren’t the cause for badly named identifiers, but they are an easy excuse to not  put in the work to come up with meaningful names, an often deceptively difficult task. It provides too much slack to just assume the consumer can just read the “inline documentation” to solve the mystery of what the DoTheMonkeyThing method is all about.

JohnFx’s Commandment:
The consumer of thy code should never have to see its source code to use it, not even the comments.

They never get updated: We all know that methods are supposed to remain short and sweet, but real life gets in the way and before you know it you have a 4K line class and the header block is scrolled off of the screen in the IDE 83% of the time. Out of sight, out of mind, never updated.

The bad news is that they are usually out of date. The good news is that people rarely read them so the opportunity for confusion is mitigated somewhat. Either way, why waste your time on something that is more likely to hurt than help?

JohnFx’s Maxim of Plagiarized Ideas :
Bad Documentation is worse than no documentation.

Exception: Some languages (Java/C#) have tools that can digest specially formatted header block comments into documentation or Intellisense/Autocomplete hints. Still, remember rule (2) and stick to the minimum required by the tool and draw the line at creating any form of ASCII art.

(4) Comments are not source control

This issue is so common that I have to assume that programmers (a) don’t know how to use source control; or  (b) don’t trust it.

Archetype 1: “The Historian”

     // method name: pityTheFoo (included for the sake of redundancy)
     // created: Feb 18, 2009 11:33PM
     // Author: Bob
     // Revisions: Sue (2/19/2009) - Lengthened monkey's arms
     //            Bob (2/20/2009) - Solved drooling issue

     void pityTheFoo() {

The programmers involved in the evolution of this method probably checked this code into a source control system designed to track the change history of every file, but decided to clutter up the code anyway. These same programmers more than likely always leave the Check-In Comments box empty on their commits.

I think I hate this type of comment worst of all, because it imposes a duty on other programmers to keep up the tradition of duplicating effort and wasting time maintaining this chaff. I almost always delete this mess from any code I touch without an ounce of guilt. I encourage you to do the same.

Archetype 2: “The Code Hoarder”

     void monkeyShines() {
          if (monkeysOnBed(Jumping).count > max) {

             // code removed, smoothie shop closed.
             // leaving it in case a new one opens.
             // monkeysOnBed.Drink(BananaSmoothie);

Another feature of any tool that has any right to call itself a SCM is the ability to recover old versions of code, including the parts you removed. If you want to be triple super extra sure, create a branch to help you with your trust issues.

(5) Comments are a code smell

Comments are little signposts in your code explaining it to future archaeologists that desperately need to understand how 21st century man sorted lists of purchase orders.

Unfortunately, as Donald Norman explained so brilliantly in The Design of Everyday Things, things generally need signs because their affordances have failed. In plain English, when you add a comment you are admitting that you have written code that doesn’t communicate its purpose well.

Sign:”This is a mop sink.” Why would that be necces… oh.

Despite what your prof told you in college, a high comment to code ratio is not a good thing.  I’m not saying to avoid them completely, but if you have a 1-1 or even a 5-1 ratio of LOC to comments, you are probably overdoing it. The need for excessive comments is a good indicator that your code needs refactoring.

Whenever you think, “This code needs a comment” follow that thought with, “How could I modify the code so its purpose is obvious?”
Talk with your code, not your comments.

Technique 1: Use meaningful identifiers and constants (even if they are single use)

     // Before
     // Calculate monkey's arm length
     // using its height and the magic monkey arm ratio
     double length = h * 1.845; //magic numbers are EVIL!

    // After - No comment required
    double armLength = height * MONKEY_ARM_HEIGHT_RATIO;

Technique 2: Use strongly typed input and output parameters

      // Before
      // input parameter monkeysToFeed:
      // DataSet with one table containing two columns
      //     MonkeyID (int) the monkey to feed
      //     MonkeyDiet (string) that monkey's diet
      void feedMonkeys(DataSet monkeysToFeed) {

     //  After: No comment required
     void feedMonkeys(Monkeys monkeysToFeed) {

Technique 3: Extract commented blocks into another method

      // Before

      // Begin: flip the "hairy" bit on monkeys
      foreach(monkey m in theseMonkeys) {
          // 5-6 steps to flip bit.
      // End: flip the "hairy" bit on monkeys

     // After No comment required

As an added bonus, technique 3 will tend to reduce the size of your methods and minimizing the nesting depth (see also “Flattening Arrow Code”) all of which contribute to eliminating the need for commenting the closing tags of blocks like this:

            } // ... if see evil
         } // ... while monkey do.
      } // ... if monkey see.
    } // ... class monkey
  } // ... namespace primate


Several of the ideas presented here, and a good deal of the fundamental things I know about programming as part of a team, I learned from the book Code Complete by Steve McConnell. If you are a working programmer and have not read this book yet, stop what you are doing and read it before you write another line of code.

Replacing multiple spaces in a string with a single space in SQL

After reading Jeff Moden’s article “Replace Multiple Spaces with One” describing an approach to collapse white space in a string, I was troubled by the need for a temporary placeholder. I’m generally skeptical of any solution that requires picking a character that doesn’t naturally occur in the data. It just feels like you are building a time-bomb into the app even if you are very careful to pick something so zany it has a very little chance of showing up and causing problems. Also, the character you pick depends on the data you are running this against, so it might not make for a great generic solution.

So today’s project was to find another way to skin that cat without  inserting bogus characters into the data.

Statement of The Problem

For those who haven’t read Jeff’s article, here is a basic statement of the task:

Replace any continuous series of repeating spaces of in a database column with a single space.


  1. Although a CLR UDF using .NET’s regular expression library is the most straightforward way to do this, the original article went for a pure SQL approach so I did the same.
  2. My solution is based language features available in MS SQL Server 2005 and later.

Test Data

I used a permanent table to test this instead of a temporary table, but used the same sample data set from Jeff’s article.  Also, I inserted these same rows 100K times so I had a large enough data set to be useful for performance testing.

SomeText varchar(max) NULL,

DECLARE @loopCounter int
set @test=0

WHILE (@loopCounter<100000)
  INSERT INTO spacestest (sometext)
    SELECT '  This      has multiple   unknown                 spaces in        it.   '
    UNION ALL SELECT 'So                     does                      this!'
    UNION ALL SELECT 'As                                does                        this'
    UNION ALL SELECT 'This, that, and the other  thing.'
    UNION ALL SELECT 'This needs no repair.'


If we could assume that no series of spaces in the string consisted of more than two spaces, this challenge could be met with a simple application of the REPLACE statement to replace all double spaces with single spaces. However, given series of arbitrary length you need would need to repeat this operation until you had collapsed the longer series down to single spaces.

You could do it iteratively like this:

WHILE EXISTS(SELECT * FROM SpacesTest WHERE (SomeText like '%  %'))
 UPDATE SpacesTest
 SET SomeText=REPLACE(SomeText,'  ',' ')
 WHERE (SomeText like '%  %')

While the iterative approach works just fine, it does do of repetitive updates to each row and requires you to write the intermediate output to a temporary table if you just want to query the data rather than updating the source table.  I came up with the following recursive solution that doesn’t require the intermediate updates and is more of  “set based” approach.

Recursive Solution:

WITH spacesCollapsed (ID,SomeText) AS
   SELECT ID, cast(LTRIM(RTRIM(SomeText)) as varchar(max)) FROM SpacesTest
   SELECT ID, cast(REPLACE(SomeText,'  ',' ') as varchar(max)) as SomeText FROM spacesCollapsed WHERE (SomeText like '%  %')
   SELECT ID,SomeText,  ROW_NUMBER() OVER(PARTITION BY ID ORDER BY min(len(sometext))) as GroupRowNum
   FROM spacesCollapsed
   GROUP BY ID,SomeText
 ) as reductions
WHERE reductions.GroupRowNum=1;


My new recursive approach did generate valid output and does solve the dilemma of choosing a character that won’t occur in the data. The bad news is that it didn’t perform all that well. In fact, it was over 26x slower than Jeff’s approach. The iterative approach performed a lot better, but was still almost 6x slower than the benchmark. I suspect the performance problems with both of my alternatives come from the need to make multiple passes at the source data using a non-SARGable expression in the where clause.

Query Stats for Jeff's Approach

Jeff's Approach (for comparison)

Query Stats for Recursive Approach

Recursive Approach (Ugh!)

So, although it was a fun exercise, I have to concede this one. If performance is a major consideration (and when isn’t it?) then Jeff actually has a superior technique. Here is an example query using that technique, just be careful to understand the potential side-effects if you use it.

 SELECT REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(sometext)),'  ',' '+ CHAR(7)) , CHAR(7)+' ',''), CHAR(7),'') AS CleanString
 FROM SpacesTest
 WHERE CHARINDEX('  ',sometext) > 0


Given how dramatic the difference is for a similar operation, I have to think that maybe I am missing an optimization on my recursive solution that could reign in the performance. If you see anything I’m missing that might help my solution in terms of performance please let me know in the comments.

How to join on memo fields in Microsoft Access

Rambling Intro, Nostalgia, and Crankiness

This week I got a request troubleshoot a legacy Microsoft Access application that has been floating around our company for ages, but still gets used daily because dang it, it does the job and always has. Seems like most companies that are standardized on MS Office have a few of these lurking out on the network.

Earlier in my career I did a ton of work in MS Access and have garnered a reputation within my company for being an expert in this oft maligned platform so I got the call to look into the problem. It had been quite a while since I’d done any real work on MS Access and I’d forgotten about how quirky it could be. Also, I am more than a little disappointed at how Microsoft has mangled the UI of my old friend Access in the 2007 version. It is almost painful to work with it as a power-user in the current incarnation.

The Problem

So anyway, the issue turned out to be that someone increased the length of a field in the underlying SQL Server table linked into the Access application. They increased it past the magical border (255 characters) between what Access considers a text and a memo field, which imposed new limits on how it could be used. In particular, Access doesn’t allow either end of a join in a query to be a memo field.

Can't join on memo fields

This won't fly, McFly

The Solution

The solution is painfully simple. So much so that I have to wonder why Access doesn’t just do it behind the scenes. Perhaps it is just trying to discourage you from building databases that link on big text fields for your own good (see “The Caveat” below).

The trick is to move the join into the WHERE clause of the query  like so:

SELECT Table1.*, Table2.* FROM Table1, Table2 WHERE (Table1.MemoField=table2.MemoField);

Here is the same query in the query builder for those of you who prefer it to the SQL view:

Graphical display of query

Remove the join between the tables and add a criterion

Access will raise nary a complaint if you run this query which is logically equivalent to the one it abhorred. That’s all there is to it.

The Caveat

A final note. It is a definite database smell for an application to be joining tables on long text fields and will likely be the source of some performance issues in a database of non-trivial size. However, as was the case for the application I was tweaking, joining on long text fields is sometimes necessary in queries used for data clean-up, validation, or replication.  Still, use this type of join with caution avoiding it whenever possible.

Quickly Link SQL Server Tables in Access

Long gone are the days when I did much application development in MS Access.  However, I still find it a darn good tool for querying, reporting and doing quick edits to data stored in other platforms.

I wrote this code ages ago to allow me to quickly create linked tables in MS Access for SQL Server tables/views. It is much faster than the manual UI approach and doesn’t require setting up a DSN on the local system to make the connection


Use the immediate window (ctrl-G) to enter these commands:

Link every table in a SQL database(“MYDB”) from SQL Server Instance (“MYSQLSERVER”)

Link a single table(“MySQLTable”) in a SQL database (“MYSQLDB”)  from SQL Server Instance named “MYSQLSERVER”
? LinkTable(MyTableAliasInAccess, MYSQLSERVER,MYDB,MYSQLTABLE,True)

The Code

Just paste this code into a new module in the Access database and you are ready to go.

Sub LinkAllTables(Server As String, database As String, OverwriteIfExists As Boolean) 'Usage Example (link all tables in database "SQLDB" on SQL Server Instance SQO01, overwriting any existing linked tables. 'linkalltables "SQL01","SQLDB", true 'This will also update the link if the underlying table definition has been modified. Dim rsTableList As New ADODB.Recordset Dim sqlTableList As String sqlTableList = "SELECT [name] as tablename FROM sysObjects WHERE (type = 'U')" rsTableList.Open sqlTableList, BuildSQLConnectionString(Server, database) While Not rsTableList.EOF If LinkTable(rsTableList("tableName"), Server, database, rsTableList("tableName"), OverwriteIfExists) Then Debug.Print "Linked: " & rsTableList("tableName") End If rsTableList.MoveNext Wend rsTableList.Close Debug.Print "Done." End Sub Function LinkTable(LinkedTableAlias As String, Server As String, database As String, SourceTableName As String, OverwriteIfExists As Boolean) 'This method will also update the link if the underlying table definition has been modified. 'The overwrite parameter will cause it to re-map/refresh the link for LinktedTable Alias, but only if it was already a linked table. ' it will not overwrite an existing query or local table with the name specified in LinkedTableAlias. 'Links to a SQL Server table without the need to set up a DSN in the ODBC Console. Dim dbsCurrent As database Dim tdfLinked As TableDef ' Open a database to which a linked table can be appended. Set dbsCurrent = CurrentDb() 'Check for and deal with the scenario ofthe table alias already existing If TableNameInUse(LinkedTableAlias) Then If (Not OverwriteIfExists) Then Debug.Print "Can't use name '" + LinkedTableAlias + "' because it would overwrite existing table." Exit Function End If 'delete existing table, but only if it is a linked table If IsLinkedTable(LinkedTableAlias) Then dbsCurrent.TableDefs.Delete LinkedTableAlias dbsCurrent.TableDefs.Refresh Else Debug.Print "Can't use name '" + LinkedTableAlias + "' because it would overwrite an existing query or local table." Exit Function End If End If 'Create a linked table Set tdfLinked = dbsCurrent.CreateTableDef(LinkedTableAlias) tdfLinked.SourceTableName = SourceTableName tdfLinked.Connect = "ODBC;DRIVER={SQL Server};SERVER=" & Server & ";DATABASE=" & database & ";TRUSTED_CONNECTION=yes;" On Error Resume Next dbsCurrent.TableDefs.Append tdfLinked If (Err.Number = 3626) Then 'too many indexes on source table for Access Err.Clear On Error GoTo 0 If LinkTable(LinkedTableAlias, Server, database, "vw" & SourceTableName, OverwriteIfExists) Then Debug.Print "Can't link directly to table '" + SourceTableName + "' because it contains too many indexes for Access to handle. Linked to view '" & "vw" & SourceTableName & "' instead." LinkTable = True Else Debug.Print "Can't link table '" + SourceTableName + "' because it contains too many indexes for Access to handle. Create a view named '" & "vw" & SourceTableName & "' that selects all rows/columns from '" & SourceTableName & "' and try again to circumvent this." LinkTable = False End If Exit Function End If On Error GoTo 0 tdfLinked.RefreshLink LinkTable = True End Function Function BuildSQLConnectionString(Server As String, DBName As String) As String BuildSQLConnectionString = "Driver={SQL Server};Server=" & Server & ";Database=" & DBName & ";TRUSTED_CONNECTION=yes;" End Function Function TableNameInUse(TableName As String) As Boolean 'check for local tables, linked tables and queries (they all share the same namespace) TableNameInUse = DCount("*", "MSYSObjects", "(Type = 4 or type=1 or type=5) AND [Name]='" & TableName & "'") > 0 End Function Function IsLinkedTable(TableName As String) As Boolean IsLinkedTable = DCount("*", "MSYSObjects", "(Type = 4) AND [Name]='" & TableName & "'") > 0 End Function

Quandry: How to get a screen-shot of the Windows login prompt.

The Dilemma

I was working on some documentation recently and needed to grab a screen-shot of the Windows login prompt for an exhibit in a manual. I quickly realized that my normal techniques for capturing screen-shots were impractical.

Method 1: Windows Built in Screen Capture (Failed)

Unless you have been living in a cave, or are new to Windows, you are probably familiar with this technique. Press the PrintScrn key to copy a bitmap of the desktop to the clipboard, or use AltPrintScrn to capture just the active window. It’s a simple and free solution for screen-shots. Here is a good reference, if you would like more information on this approach.

The problem is that it doesn’t work from the login prompt, probably as part of the security mechanism of Windows that correctly doesn’t allow unauthenticated users to monkey with anything including the clipboard.

Method 2: Screen Capture Software (Failed)

The problem with using any screen capture software is that it has to be running and accepting commands (usually via hot keys) to take a picture of what you see on the screen. Because you can’t run an interactive user program before you log in to Windows, this method is dead-on-arrival. I am unaware of any screen capture software that offers a workaround for grabbing the login prompt.


 The Solution

After experimenting with a few ideas, I arrived at a solution that works pretty well. As with most things, the answer was quite simple, it just required some sideways thinking about the problem.

Step 1:Using terminal software, such as Windows Remote Desktop, open a connection to another machine, but don’t go into fullscreen mode as shown here.  And yes, I appreciate the irony of using a screen-shot to demonstrate taking screen-shots.

Using Remote Desktop to get a screen-shot of the login prompt.

Using Remote Desktop to get a screen-shot of the login prompt.

Step 2 (option A): Grab the screen-shot of using the PrintScrn approach. The AltPrintScrn technique, unfortunately, will grab the whole remote desktop window and not just the login dialog, so you will need to do some cropping in a graphics package to get down to just the dialog.

Step 2 (Option B): Take the screen-shot with your screen-capture software, which should allow you to only capture the part of the screen with that contains the dialog.

That’s it! The consumer of your screen-shot  will be none the wiser that they are looking at a capture from  a nested window unless you do a really shoddy job on the cropping.