How to diff two folders from a Windows command prompt

In most cases where I need to compare two folders recursively on a Windows system I use my go-to tool Beyond Compare. It is an excellent utility, and one that I think should be among the first utilities any developer should install on a new machine.

However, today I was doing a reconciliation as part of a very large file migration project that required comparing two folders that each contained hundreds of millions of files spread across thousands of sub-folders. BC was having a lot of trouble and choked on many of my comparisons. It just wasn’t the tool for today’s job. I needed another solution.

Necessity mothered some invention and I found an inventive way to use a combination of command switches on  RoboCopy to perform the comparison. If you are not familiar with RoboCopy, and you do a lot of mass copying of files, you need to stop what you are doing and learn about it pronto. It is a supercharged version of XCopy that has been included with Windows since Vista. It has a ton of great features such as multi-threaded file copying, selectively copying changed files, and resumable copies that make it a must especially for big file copy jobs over flaky network connections.

Diff Command Using RoboCopy

So here’s the command to perform a basic comparison of two folders and write a log file listing the differences.

ROBOCOPY “\\FileShare\SourceFolder” “\\FileShare\ComparisonFolder” /e /l /ns /njs /njh /ndl /fp /log:reconcile.txt

Explanation of the command switches used above:

/e  Recurse through sub-directories (including empty ones)
/l  Don’t modify or copy files, log differences only
/fp  Include the full path of files in log (only necessary if you omit /ndl)
/ns  Don’t include file sizes in log
/ndl  Don’t include folders in log
/njs   Don’t include Job Summary
/njh   Don’t include Job Header
/log:reconcile.txt   Write log to reconcile.txt (Recreate if exists)
/log+: reconcile.txt   (Optional variant) Write log to reconcile.txt (Append if exists)

Usage Notes and Warnings Regarding the /NDL Option

The /NDL option is a handy way to suppress the inclusion of every folder checked (regardless of whether it contains differences) in the log, but there because of the way it works it is not a good idea in all circumstances. Consider the following before you use /NDL.

  • Folders that exist only on source or destination are not logged unless at least one mismatched file is present or a source file is missing on destination.
  • Folders that exist only on the destination are not logged at all regardless of contents.

If you omit the /NDL option, it is necessary to include the /FP option if you want full paths listed for each file.

Example Output

(with /NDL option)

*EXTRA File         c:\dest\log.txt
New File              c:\source\newfolder\Blah.txt
Newer                 c:\source\Files\CONCORD.DAT
New File              c:\source\Files\COWCO.DAT

(without NDL Option)

 c:\work\test\source\    (extraneous folder listing)
*EXTRA Dir      c:\dest\newfolderdest\
*EXTRA Dir      c:\dest\newfolderrestempty\
*EXTRA File     c:\dest\log.txt
New Dir           c:\source\newfolder\
New File          c:\source\newfolder\Blah.txt
New Dir           c:\source\newfolderempty\
c:\source\Files\   (extraneous folder listing)
Newer             c:\source\Files\CONCORD.DAT
New File          c:\source\Files\COWCO.DAT
c:\test\source\FilesSame\   (Included despite no diffs)

5 Best Practices for Commenting Your Code

One of the first things you learn to do incorrectly as a programmer is commenting your code. My experience with student and recently graduated programmers tells me that college is a really good place to learn really bad code commenting techniques. This is just one of those areas where in-theory and in-practice don’t align well.

There are two factors working against you learning good commenting technique in college.

  1.  Unlike the real world, you do a lot of small one-off projects as a solo developer.  There’s no one out there fantasizing about dropping a boulder on you for making them decipher your coding atrocity.
  2.  That commenting style you are emulating from your textbook is only a good practice when the comments are intended for a student learning to program. It is downright annoying to professional programmers.

These tips are primarily intended for upstart programmers who are transitioning into the real world of programming, and hopefully will prevent a few from looking quite so n00bish during their first code review. Code Review? Oh yeah, that’s something else they didn’t teach you in school, but that’s a whole other article, I’ll defer to Jason Cohen on that one.

So let’s get started…

(1) Comments are not subtitles

It’s easy to project your own worldview that code is a foreign language understood only by computers, and that you are doing the reader a service by explaining what each line does in some form of human language. Or perhaps you are doing it for the benefit of that non-programmer manager who will certainly want to read your code (Spoiler: He won’t).

Look, in the not too distant future, you will be able to read code almost as easily as your native language, and everyone else who will even glance at it almost certainly already can. By then you will realize how silly it is to write comments like these:

// Loop through all bananas in the bunch
foreach(banana b in bunch) {
    monkey.eat(b);  //make the monkey eat one banana
}

You may have been taught to program by first writing  pseudo-code comments then writing the real code into that wire-frame. This is a perfectly reasonable approach for a novice programmer. Just be sure to replace the comments with the code, and don’t leave them in there.

Computer: Enemy is matching velocity.
Gwen DeMarco: The enemy is matching velocity!
Sir Alexander Dane: We heard it the first time!
Gwen DeMarco: Gosh, I’m doing it. I’m repeating the darn computer!

-Galaxy Quest

Exceptions:

  • Code examples used to teach a concept or new programming language.
  • Programming languages that aren’t remotely human readable (Assembly, Perl)

(2) Comments are not an art project

This is a bad habit propagated by code samples in programing books and open source copyright notices that are desperate to make you pay attention to them.

/*
   _     _      _     _      _     _      _     _      _     _      _     _
  (c).-.(c)    (c).-.(c)    (c).-.(c)    (c).-.(c)    (c).-.(c)    (c).-.(c)
   / ._. \      / ._. \      / ._. \      / ._. \      / ._. \      / ._. \
 __\( Y )/__  __\( Y )/__  __\( Y )/__  __\( Y )/__  __\( Y )/__  __\( Y )/__
(_.-/'-'\-._)(_.-/'-'\-._)(_.-/'-'\-._)(_.-/'-'\-._)(_.-/'-'\-._)(_.-/'-'\-._)
   || M ||      || O ||      || N ||      || K ||      || E ||      || Y ||
 _.' `-' '._  _.' `-' '._  _.' `-' '._  _.' `-' '._  _.' `-' '._  _.' `-' '._
(.-./`-'\.-.)(.-./`-'\.-.)(.-./`-'\.-.)(.-./`-'\.-.)(.-./`-'\.-.)(.-./`-'\.-.)
 `-'     `-'  `-'     `-'  `-'     `-'  `-'     `-'  `-'     `-'  `-'     `-'

                 -It's Monkey Business Time! (Version 1.5)
*/

Why, that’s silly. You’d never do something so silly in your comments.

ORLY? Does this look familiar?

   +------------------------------------------------------------+
   | Module Name: classMonkey                                   |
   | Module Purpose: emulate a monkey                           |
   | Inputs: Bananas                                              |
   | Outputs: Grunts                                            |
   | Throws: Poop                                               |
   +------------------------------------------------------------+

Programmers love to go “touch up” their code to make it look good when their brain hurts and they want something easy to do for a while. It may be a waste of time, but at least they are wasting it during periods where they wouldn’t be productive anyway.

The trouble is that it creates a time-wasting maintenance tax imposed on anyone working with the code in the future just to keep the pretty little box intact when the text ruins the symmetry of it. Even programmers who hate these header blocks tend to take the time to maintain them because they like consistency and every other method in the project has one.

How much is it bugging you that the right border on that block is misaligned? Yeah. That’s the point.

(3) Header Blocks: Nuisance or Menace?

This one is going to be controversial, but I’m holding my ground. I don’t like blocks of header comments at the top of every file, method or class.

Not in a boat, not with a goat.
Why? Well let me tell you, George McFly…

They are enablers for badly named objects/methods – Of course, header blocks aren’t the cause for badly named identifiers, but they are an easy excuse to not  put in the work to come up with meaningful names, an often deceptively difficult task. It provides too much slack to just assume the consumer can just read the “inline documentation” to solve the mystery of what the DoTheMonkeyThing method is all about.

JohnFx’s Commandment:
The consumer of thy code should never have to see its source code to use it, not even the comments.

They never get updated: We all know that methods are supposed to remain short and sweet, but real life gets in the way and before you know it you have a 4K line class and the header block is scrolled off of the screen in the IDE 83% of the time. Out of sight, out of mind, never updated.

The bad news is that they are usually out of date. The good news is that people rarely read them so the opportunity for confusion is mitigated somewhat. Either way, why waste your time on something that is more likely to hurt than help?

JohnFx’s Maxim of Plagiarized Ideas :
Bad Documentation is worse than no documentation.

Exception: Some languages (Java/C#) have tools that can digest specially formatted header block comments into documentation or Intellisense/Autocomplete hints. Still, remember rule (2) and stick to the minimum required by the tool and draw the line at creating any form of ASCII art.

(4) Comments are not source control

This issue is so common that I have to assume that programmers (a) don’t know how to use source control; or  (b) don’t trust it.

Archetype 1: “The Historian”

     // method name: pityTheFoo (included for the sake of redundancy)
     // created: Feb 18, 2009 11:33PM
     // Author: Bob
     // Revisions: Sue (2/19/2009) - Lengthened monkey's arms
     //            Bob (2/20/2009) - Solved drooling issue

     void pityTheFoo() {
          ...
     }

The programmers involved in the evolution of this method probably checked this code into a source control system designed to track the change history of every file, but decided to clutter up the code anyway. These same programmers more than likely always leave the Check-In Comments box empty on their commits.

I think I hate this type of comment worst of all, because it imposes a duty on other programmers to keep up the tradition of duplicating effort and wasting time maintaining this chaff. I almost always delete this mess from any code I touch without an ounce of guilt. I encourage you to do the same.

Archetype 2: “The Code Hoarder”


     void monkeyShines() {
          if (monkeysOnBed(Jumping).count > max) {
             monkeysOnBed.breakHead();

             // code removed, smoothie shop closed.
             // leaving it in case a new one opens.
             // monkeysOnBed.Drink(BananaSmoothie);
          }
     }

Another feature of any tool that has any right to call itself a SCM is the ability to recover old versions of code, including the parts you removed. If you want to be triple super extra sure, create a branch to help you with your trust issues.

(5) Comments are a code smell

Comments are little signposts in your code explaining it to future archaeologists that desperately need to understand how 21st century man sorted lists of purchase orders.

Unfortunately, as Donald Norman explained so brilliantly in The Design of Everyday Things, things generally need signs because their affordances have failed. In plain English, when you add a comment you are admitting that you have written code that doesn’t communicate its purpose well.

Sign:"This is a mop sink." Why would that be necces... oh.

Despite what your prof told you in college, a high comment to code ratio is not a good thing.  I’m not saying to avoid them completely, but if you have a 1-1 or even a 5-1 ratio of LOC to comments, you are probably overdoing it. The need for excessive comments is a good indicator that your code needs refactoring.

Whenever you think, “This code needs a comment” follow that thought with, “How could I modify the code so its purpose is obvious?”
Talk with your code, not your comments.

Technique 1: Use meaningful identifiers and constants (even if they are single use)

     // Before
     // Calculate monkey's arm length
     // using it's height and the magic monkey arm ratio
     double length = h * 1.845; //magic numbers are EVIL!

    // After - No comment required
    double armLength = height * MONKEY_ARM_HEIGHT_RATIO;

Technique 2: Use strongly typed input and output parameters

      // Before
      // input parameter monkeysToFeed:
      // DataSet with one table containing two columns
      //     MonkeyID (int) the monkey to feed
      //     MonkeyDiet (string) that monkey's diet
      void feedMonkeys(DataSet monkeysToFeed) {
      }

     //  After: No comment required
     void feedMonkeys(Monkeys monkeysToFeed) {
     }

Technique 3: Extract commented blocks into another method

      // Before

      // Begin: flip the "hairy" bit on monkeys
      foreach(monkey m in theseMonkeys) {
          // 5-6 steps to flip bit.
      }
      // End: flip the "hairy" bit on monkeys

     // After No comment required
     flipHairyBit(theseMonkeys);

As an added bonus, technique 3 will tend to reduce the size of your methods and minimizing the nesting depth (see also “Flattening Arrow Code”) all of which contribute to eliminating the need for commenting the closing tags of blocks like this:

            } // ... if see evil
         } // ... while monkey do.
      } // ... if monkey see.
    } // ... class monkey
  } // ... namespace primate

Acknowledgments

Several of the ideas presented here, and a good deal of the fundamental things I know about programming as part of a team, I learned from the book Code Complete by Steve McConnell. If you are a working programmer and have not read this book yet, stop what you are doing and read it before you write another line of code.

Quick Tip: Comparing a .NET string to multiple values

The Problem

Do you ever wish you could use the SQL IN operator in your C# code to make your conditional blocks more concise and your code easier to read?

Perhaps it’s just my persnickety nature, but I believe that line-wrapped conditional expressions like this are a code smell.

if (animal.Equals("Cow") ||
   animal.Equals("Horse") ||
   animal.Equals("Hen"))
{
   Console.WriteLine("We must be on the farm.");
}

This would be so much cleaner…

if (animal.CompareMultiple("Cow","Horse","Hen")
{
   Console.WriteLine("We must be on the farm.");
}

The Code

With a simple extension class you can upgrade your string classes to do this very thing.

Step 1: Create an extension class as demonstrated here.

C#

namespace extenders.strings
{
  public static class StringExtender {

    public static bool CompareMultiple(this string data, StringComparison compareType, params string[] compareValues) {
      foreach (string s in compareValues) {
        if (data.Equals(s, compareType)) {
          return true;
        }
      }
      return false;
   }
  }
}

VB.NET

Imports System.Runtime.CompilerServices

Namespace Extenders.Strings

  Public Module StringExtender

    <Extension()> _
    Public Function CompareMultiple(ByVal this As String, compareType As StringComparison, ParamArray compareValues As String()) As Boolean

       Dim s As String

       For Each s In compareValues
         If (this.Equals(s, compareType)) Then
            Return True
         End If
       Next s

       Return False
    End Function

  End Module

End Namespace

Step 2: Add a reference to the extension namespace and use it.

C#

using extenders.strings;

namespace MyProgram
{
  static class program {
    static void Main() {

      string foodItem = "Bacon";

      if (foodItem.CompareMultiple(StringComparison.CurrentCultureIgnoreCase, "bacon", "eggs", "biscuit")) {
         System.Console.WriteLine("Breakfast!");
      }
      else {
         System.Console.Write("Dinner");
      }

    }
  }
}

VB.NET

Imports StringExtenderExampleVB.Extenders.Strings

Module Program

  Sub Main()
    Dim foodItem As String = "Bacon"

    If (foodItem.CompareMultiple(System.StringComparison.CurrentCultureIgnoreCase, "bacon", "eggs", "biscuit")) Then
      System.Console.WriteLine("Breakfast!")
    Else
      System.Console.Write("Dinner")
    End If

    System.Console.ReadLine()
  End Sub

End Module

Adding combo-box style dropdowns to your web page

Announcing the Improved Dropdown jQuery plugin!

I’m happy to announce that I’ve released a new jQuery plugin today that will enable web page authors to create a richer user experience than is offered by the standard implementation of the SELECT element in poplar browser platforms. The plugin will supercharge the dropdown controls into combo-box style controls that allow filtering the selection list based on keyboard input.

The following two screen shots tell the story of this plugin better than a 100 words ever could.

An Improved Dropdown demonstrating groups and disabling items

Headings and disabled items are fully supported.

Filtering down to the potential matches based on keyboard input.

This list is filtered to only matching items as the user types.

Another auto-complete plugin? Why?

Okay. I am fully aware that there exist a number of freely available jQuery plugins with similar functionality. However, none of them seemed to fit exactly what I needed on one of my development projects so I built ImprovedDropDown originally for my own use, and later decided it was worth sharing with the developer community.

Key differences between ImprovedDropdown and the typical auto-complete plugin:

  1. It allows free-text entry to filter the list, but forces the user to pick an item that is in the list.
  2. When filtering/suggesting items it uses open-ended contains logic, not begins-with logic.
  3. It pulls the suggestion list from plain-old client side HTML rather than requiring an AJAX call for suggestions.
  4. It supports non-selectable group headers in the list similar to an OPTGROUP tag in a SELECT control.

Item #3 at first glance may look like a step back since AJAX is a cool way to serve up suggestions for a long list of choices. However, I was looking for a super simple way to add functionality to a dropdown without adding the additional complexity of  extra server-side code.

For small to medium sized lists it seemed overkill to implement AJAX handlers on the server all the time. As a bonus, this control will work just fine for static web-sites where the hosting provider doesn’t make available a server-side scripting capabilities.

Do I have to pay for it?

Nope. I’m releasing this project as open-source freeware as a thank you to all the other developers that have contributed greatly to many of my development projects by  contributing their own code to the community. I’ve licensed it under the Apache 2.0 License, which is fairly liberal in terms of what you can do with the code either in commercial or non-commercial software.

Although it isn’t mandated by the license, the one thing I ask in return for using the software is to let me know that you are using it. A shout-out on twitter (@JohnFx), an e-mail, blog post, or comment somewhere on this blog will suffice.  You don’t have to tell me any specifics of your project, unless you just want to brag on it. Primarily, I am just using this request as a way to gauge how much use it is actually getting in the wild.

Is it complicated to use?

If you are new to jQuery, there is a small learning curve there, but even if you don’t know jQuery you can pretty much just copy the included examples to get up and running.

Step 1: Copy the files to your site (jQuery + the 3 ImprovedDropdown files)

Step 2: Add a few references to the HEAD section of the page where you want the control.

Step 3: In the $(document).ready event, call the improveDropdown method to upgrade one or more controls on that page.

Example code (steps 2-3) to upgrade all SELECT elements on the  page to Improved Dropdowns:


<head>
<script type="text/javascript" src="js/jquery.js"></script>
<script type="text/javascript" src="js/improvedDropDown.js"></script>
<link rel="StyleSheet" href="css/improvedDropdown.css" type="text/css" />

    <script type="text/javascript">
       $(document).ready(function () {
            $("select").improveDropDown();
       });
    </script>

</head>

Sounds Great. Where can I get it?

You can download all the files you need to get started from the Improved Dropdown project page at BitBucket (Go Mercurial!) in case you’d like to contribute, follow the project, or just laugh at all the bug-fix changesets that it took to get to the 1.x version.

What do you think?

I put considerable thought into making  the integration of this functionality into existing web pages super quick and easy. I hope that you will find it easy to work with and your users will like the functionality it offers. If you have any comments, complaints, or suggestions for improvement I would love to hear them. Just post a comment somewhere on the blog or drop me an e-mail (see “About” tab for my address).

Replacing multiple spaces in a string with a single space in SQL

After reading Jeff Moden’s article “Replace Multiple Spaces with One” describing an approach to collapse white space in a string, I was troubled by the need for a temporary placeholder. I’m generally skeptical of any solution that requires picking a character that doesn’t naturally occur in the data. It just feels like you are building a time-bomb into the app even if you are very careful to pick something so zany it has a very little chance of showing up and causing problems. Also, the character you pick depends on the data you are running this against, so it might not make for a great generic solution.

So today’s project was to find another way to skin that cat without  inserting bogus characters into the data.

Statement of The Problem

For those who haven’t read Jeff’s article, here is a basic statement of the task:

Replace any continuous series of repeating spaces of in a database column with a single space.

Constraints:

  1. Although a CLR UDF using .NET’s regular expression library is the most straightforward way to do this, the original article went for a pure SQL approach so I did the same.
  2. My solution is based language features available in MS SQL Server 2005 and later.

Test Data

I used a permanent table to test this instead of a temporary table, but used the same sample data set from Jeff’s article.  Also, I inserted these same rows 100K times so I had a large enough data set to be useful for performance testing.


CREATE TABLE SpacesTest(
ID int IDENTITY(1,1) NOT NULL,
SomeText varchar(max) NULL,
)

go
DECLARE @loopCounter int
set @test=0

WHILE (@loopCounter<100000)
BEGIN
  INSERT INTO spacestest (sometext)
    SELECT '  This      has multiple   unknown                 spaces in        it.   '
    UNION ALL SELECT 'So                     does                      this!'
    UNION ALL SELECT 'As                                does                        this'
    UNION ALL SELECT 'This, that, and the other  thing.'
    UNION ALL SELECT 'This needs no repair.'
END

Solutions

If we could assume that no series of spaces in the string consisted of more than two spaces, this challenge could be met with a simple application of the REPLACE statement to replace all double spaces with single spaces. However, given series of arbitrary length you need would need to repeat this operation until you had collapsed the longer series down to single spaces.

You could do it iteratively like this:

WHILE EXISTS(SELECT * FROM SpacesTest WHERE (SomeText like '%  %'))
BEGIN
 UPDATE SpacesTest
 SET SomeText=REPLACE(SomeText,'  ',' ')
 WHERE (SomeText like '%  %')
END

While the iterative approach works just fine, it does do of repetitive updates to each row and requires you to write the intermediate output to a temporary table if you just want to query the data rather than updating the source table.  I came up with the following recursive solution that doesn’t require the intermediate updates and is more of  “set based” approach.

Recursive Solution:

WITH spacesCollapsed (ID,SomeText) AS
(
   SELECT ID, cast(LTRIM(RTRIM(SomeText)) as varchar(max)) FROM SpacesTest
   UNION ALL
   SELECT ID, cast(REPLACE(SomeText,'  ',' ') as varchar(max)) as SomeText FROM spacesCollapsed WHERE (SomeText like '%  %')
)
SELECT ID, SomeText
FROM (
   SELECT ID,SomeText,  ROW_NUMBER() OVER(PARTITION BY ID ORDER BY min(len(sometext))) as GroupRowNum
   FROM spacesCollapsed
   GROUP BY ID,SomeText
 ) as reductions
WHERE reductions.GroupRowNum=1;

Conclusions

My new recursive approach did generate valid output and does solve the dilemma of choosing a character that won’t occur in the data. The bad news is that it didn’t perform all that well. In fact, it was over 26x slower than Jeff’s approach. The iterative approach performed a lot better, but was still almost 6x slower than the benchmark. I suspect the performance problems with both of my alternatives come from the need to make multiple passes at the source data using a non-SARGable expression in the where clause.

Query Stats for Jeff's Approach

Jeff's Approach (for comparison)

Query Stats for Recursive Approach

Recursive Approach (Ugh!)

So, although it was a fun exercise, I have to concede this one. If performance is a major consideration (and when isn’t it?) then Jeff actually has a superior technique. Here is an example query using that technique, just be careful to understand the potential side-effects if you use it.

 SELECT REPLACE(REPLACE(REPLACE(LTRIM(RTRIM(sometext)),'  ',' '+ CHAR(7)) , CHAR(7)+' ',''), CHAR(7),'') AS CleanString
 FROM SpacesTest
 WHERE CHARINDEX('  ',sometext) > 0

Feedback

Given how dramatic the difference is for a similar operation, I have to think that maybe I am missing an optimization on my recursive solution that could reign in the performance. If you see anything I’m missing that might help my solution in terms of performance please let me know in the comments.

How to join on memo fields in Microsoft Access

Rambling Intro, Nostalgia, and Crankiness

This week I got a request troubleshoot a legacy Microsoft Access application that has been floating around our company for ages, but still gets used daily because dang it, it does the job and always has. Seems like most companies that are standardized on MS Office have a few of these lurking out on the network.

Earlier in my career I did a ton of work in MS Access and have garnered a reputation within my company for being an expert in this oft maligned platform so I got the call to look into the problem. It had been quite a while since I’d done any real work on MS Access and I’d forgotten about how quirky it could be. Also, I am more than a little disappointed at how Microsoft has mangled the UI of my old friend Access in the 2007 version. It is almost painful to work with it as a power-user in the current incarnation.

The Problem

So anyway, the issue turned out to be that someone increased the length of a field in the underlying SQL Server table linked into the Access application. They increased it past the magical border (255 characters) between what Access considers a text and a memo field, which imposed new limits on how it could be used. In particular, Access doesn’t allow either end of a join in a query to be a memo field.

Can't join on memo fields

This won't fly, McFly

The Solution

The solution is painfully simple. So much so that I have to wonder why Access doesn’t just do it behind the scenes. Perhaps it is just trying to discourage you from building databases that link on big text fields for your own good (see “The Caveat” below).

The trick is to move the join into the WHERE clause of the query  like so:

SELECT Table1.*, Table2.* FROM Table1, Table2 WHERE (Table1.MemoField=table2.MemoField);

Here is the same query in the query builder for those of you who prefer it to the SQL view:

Graphical display of query

Remove the join between the tables and add a criterion

Access will raise nary a complaint if you run this query which is logically equivalent to the one it abhorred. That’s all there is to it.

The Caveat

A final note. It is a definite database smell for an application to be joining tables on long text fields and will likely be the source of some performance issues in a database of non-trivial size. However, as was the case for the application I was tweaking, joining on long text fields is sometimes necessary in queries used for data clean-up, validation, or replication.  Still, use this type of join with caution avoiding it whenever possible.

Usability Hall of Shame – AutoCorrect

Cupertino, we have a problem.

Q: How do you know when your software has a usability problem?

A: When someone creates an entire humor site dedicated to mocking it.

Okay, you might have said something involving reports from the usability testing, or maybe even some axiomatic list of UX smells. However, if you ever find that a piece of software you wrote has inspired an entire new category of parodic humor, there just might be a problem.

How has it come to this?

The emergence of this UX gaffe leaves me incredulous for a number of reasons. First, one of the biggest offending implementations is baked into iOS, a product of a company that gets a lot of respect for designing great interfaces. Secondly, and more importantly, most of the Mobile OS platforms with this issue have gone through several major upgrades and not fixed the problem despite all the bad press about it.

Fail x 439,000

What’s the Problem?

So to reel this conversation back in, and make it instructional and not just another ranting voice on the Internet that the developers of this software will ignore, let’s talk about what it is about the auto-correct feature on your cell phone that makes it such a usability disaster.

The core problem is that it violates the usability principle that the user should always feel like they are in control of the software.

The users of your software should be the drivers, not the passengers. Ask Toyota how their ‘users’ felt about their products deciding independently that more speed might be nice .

If you are going to make your spelling checker jump in and autonomously decide it knows what the typist meant to say, it had better be right 100% of the time.  Is that bar too high? Well then stop making your app take over the controls. A well behaved app is heard and not stomping all over my carefully written prose and/or review of the wing place where I am currently receiving inferior service.

Not only does this implementation of auto-correct make the user feel impotent, it also violates the usability commandment that the user’s input is sacred. Violations of this principle contribute mightily to the computer-phobic person’s notion that the machine is going all SkyNet on them.

Extra Credit: Why do you think applications like MS Office that move your menu items around by usage frequency so infuriating? How about browser pop-ups?

Reinventing the Wheel – Let’s try square this time!

This would all be more excusable if a perfectly user-friendly auto-correct interface didn’t already exist. What’s more aggravating is that the existing workable system is so close to the new broken version that it is inconceivable that the designers of the new version weren’t familiar with the existing one.

Consider for a moment how auto-correct is implemented in MS Office, Blackberry OS, and the Firefox spell-checker. In almost identical fashion it gives you a list of suggested alternatives for the suspected misspelled word.  The key difference is that it SUGGESTS, but doesn’t CORRECT the user’s text unless the user takes affirmative action. This puts the user fully in control of the system, which makes them happy. Sure some typos may slip by the user, but if you are giving good enough visual indicators that should be minimized.

If only auto-correct had an anti-troll feature...

Is Anyone Listening?

Perhaps I’ve completely missed the point here, and there is a good reason why something that was routine on my old Blackberry is nigh impossible on the other smart phone platforms. If you have any inside scoop, or just have a convincing theory to share I’d love to hear about it in the comments. If you just want to complain about it, that’s fine too, maybe we can make this square wheel squeaky enough to get some satisfaction.

Follow

Get every new post delivered to your Inbox.

Join 128 other followers