Interviews Should Include Code Discussions

I’m amazed by the number of interviews I’ve been through, or heard about,  where real code wasn’t discussed, or code examples were not written.  The primary responsibility of a software developer is to write code. It’s not the only responsibility, but it’s really what you’re hiring a software developer to do on a daily basis.  As a manager, you have a very short period of time to select a candidate that you will work with for the next few years, if not more.  You’re taking a huge risk if you don’t ask the candidate about specific code examples, or have them write some code that can be discussed during an interview.

It’s possible that many hiring managers don’t use real code because they have no idea what to look for or what questions to ask. If this is the case, you need a guru on your team who can help you with this.  There are a lot of bad programmers out there. Many people can talk all day about technology, but write code like a 3-year-old writes their name…it’s a cute effort, but it’s ugly.

An example that I’ve used in the past is not a question that I made up, but it’s a common question that can tell you a lot about a candidate. I’ll typically have the candidate turn in the code example before an in-person interview.  The question is as follows:

Write a function that takes an array of strings and returns an array of strings with no duplicates. Consider an algorithm that is most efficient from a “Big O” perspective. Treat the function as if you were writing it for production code.

A solution for this question can be written in a short amount of time (which is important for interview purposes), but it also tells a lot about how a candidate thinks and how they write code.

Below is an example solution I have seen from a candidate:

        public string[] RemoveDuplicates(string[] localArray) {
            //Internal arraylists used for processing the inputs
            ArrayList original = new ArrayList();
            ArrayList final = new ArrayList();
            int j = 0;

            try {
                //load the input array into an arraylist
                for (int i = 0; i <= localArray.GetUpperBound(0); i++) {
                    original.Add(localArray[i]);
                }

                //sort the array (Arraylist.Sort implements Quicksort)
                original.Sort();

                //Loop through the sorted arraylist and removes any duplicate strings, the
                //duplicate check is NOT case-sensitive
                while (j < original.Count) {
                    if (j == 0) {
                        //add the first element
                        final.Add(original[j]);
                    } else {
                        //check for duplicate and if not found add the element
                        if (String.Compare(original[j].ToString(), original[j - 1].ToString(), true) != 0) {
                            final.Add(original[j]);
                        }
                    }
                    j++;
                }
            } catch (Exception e) {
                System.Diagnostics.Debug.WriteLine(e.Message);
            }

            //return the final cleaned array
            return (string[])final.ToArray(typeof(string));

        }

OK…let’s talk about the positives. The solution will return the correct answer, but a correct answer is not all I’m looking for. If the function doesn’t return the correct result, the interview will be over pretty quickly.  I’m looking for an answer that is not only correct, but is also well written.  Here are some things that stand out about this example:

  • The input array is immediately enumerated and placed into an ArrayList. When I asked this particular candidate why he did this, his answer was “I prefer ArrayLists over the standard Array.” OK…wrong answer.
  • The input parameter is not validated. Why don’t developers validate their input anymore?
  • He used the standard ArrayList object (this example was written in C# 2.0). This forced him into calling ToString inside the String.Compare call on line 24. This tells me he probably doesn’t know a thing about generics (something most C# developers should know about by now).
  • A for loop would probably be tighter than using while.
  • What’s with the try/catch? Nothing of value is being done with the exception; it is written to the debug output and swallowed. Let it propagate up the call stack.
  • The code just seems longer than it should be.

Let’s see another, more concise example:

        public string[] RemoveDuplicates(string[] input) {
            if (input == null) {
                throw new ArgumentNullException("input");
            }
            List<string> returnList = new List<string>();
            Array.Sort<string>(input);
            for (int index = 0; index < input.Length; index++) {
                if (index == 0) {
                    // Add the first element.
                    returnList.Add(input[index]);
                } else {
                    if (String.Compare(input[index], input[index - 1], true) != 0) {
                        returnList.Add(input[index]);
                    }
                }
            }
            return returnList.ToArray();
        }

This solution is much cleaner and tighter than the previous example: parameters are checked and generics are used.  Now, I would not eliminate the candidate who wrote the first example, but I would ask a bunch of questions about the code during the interview. The candidate may be able to explain their solution and may also be able to discuss alternatives…this would be a big plus for the candidate.
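
One follow-up worth raising about the sort-based version: Array.Sort sorts in place, so the caller’s array comes back reordered. Below is a minimal sketch of one way around that, assuming a copy of the input is acceptable. The class name SortedDuplicateRemover and the choice of StringComparer.OrdinalIgnoreCase for both the sort and the duplicate check are my assumptions, not part of any candidate’s answer:

```csharp
using System;
using System.Collections.Generic;

public static class SortedDuplicateRemover
{
    // Hypothetical variant of the sort-based solution that leaves the input untouched.
    public static string[] RemoveDuplicates(string[] input)
    {
        if (input == null)
        {
            throw new ArgumentNullException("input");
        }

        // Copy first: Array.Sort sorts in place and would reorder the caller's array.
        string[] sorted = (string[])input.Clone();

        // Sort with the same case-insensitive rule that the duplicate check uses,
        // so strings that differ only by case are guaranteed to be adjacent.
        Array.Sort(sorted, StringComparer.OrdinalIgnoreCase);

        List<string> result = new List<string>();
        for (int index = 0; index < sorted.Length; index++)
        {
            // Keep an element only if it differs from its predecessor.
            if (index == 0 ||
                !StringComparer.OrdinalIgnoreCase.Equals(sorted[index], sorted[index - 1]))
            {
                result.Add(sorted[index]);
            }
        }
        return result.ToArray();
    }
}
```

Because Array.Sort is not a stable sort, which casing survives among strings that differ only by case is not something this sketch promises.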

The second example leaves a much better first impression. It shows that the candidate can write clear and concise code, but what about the part of the question that asks for the algorithm that is most efficient from a “Big O” perspective? With this solution the array must be sorted and then enumerated.  Array.Sort uses the Quicksort algorithm, which is O(n log n) on average. That is fine for sorting, but you really don’t need to sort the array to begin with. Why not use a hash table?

        public string[] RemoveDuplicates(string[] input) {
            if (input == null) {
                throw new ArgumentNullException("input");
            }
            List<string> returnList = new List<string>();
            // The case-insensitive comparer handles the duplicate check, so the
            // strings themselves are returned unchanged (no ToLower needed).
            Dictionary<string, string> lookup = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
            foreach (string stringItem in input) {
                if (!lookup.ContainsKey(stringItem)) {
                    returnList.Add(stringItem);
                    lookup.Add(stringItem, stringItem);
                }
            }
            return returnList.ToArray();
        }

When a candidate provides this solution, it immediately puts a smile on my face. It’s actually a big jump for a candidate to go from the sort solution to the hash table solution. This solution is clean and runs in roughly O(n) time on average…much better than the O(n log n) example that sorts the array.  Of course, I would still like the candidate to know that the sort-based solution is possible and that the hash table solution potentially uses more memory because a hash table is created, etc.  That’s the beauty of having a candidate provide a code example…it generates a bunch of follow-up questions that will tell you a lot about the habits of the candidate, how they think, etc.

During a recent interview, one of my co-workers mentioned that .NET 3.5 actually has a new HashSet<T> class.  I had forgotten about this, but it was a good point.  A candidate would really make me happy if they provided the following example:

        public string[] RemoveDuplicates(string[] input) {
            if (input == null) {
                throw new ArgumentNullException("input");
            }
            HashSet<string> stringSet = new HashSet<string>(input, StringComparer.OrdinalIgnoreCase);
            string[] returnList = new string[stringSet.Count];
            stringSet.CopyTo(returnList);
            return returnList;
        }

Wow. This code is extremely clean and simple…4 lines without the parameter validation. The original example had around 20 lines or so. It also shows that the candidate keeps up with new tools.
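
A natural follow-up question for this version is ordering: HashSet<T> is an unordered collection, so the order of the returned array is not something to rely on. If the caller cares about keeping the first occurrence of each string in its original position, something like the following sketch would work. The class and method names are hypothetical, and it assumes the same case-insensitive comparison as the example above:

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateRemover
{
    // Hypothetical variant: keeps the first occurrence of each string, in input order.
    public static string[] RemoveDuplicatesPreservingOrder(string[] input)
    {
        if (input == null)
        {
            throw new ArgumentNullException("input");
        }

        // The set only tracks what has been seen; the list carries the output order.
        HashSet<string> seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        List<string> result = new List<string>(input.Length);

        foreach (string item in input)
        {
            // HashSet<T>.Add returns false when an equal element is already present.
            if (seen.Add(item))
            {
                result.Add(item);
            }
        }

        return result.ToArray();
    }
}
```

It costs a few more lines and a second collection, which is exactly the kind of trade-off I would want the candidate to be able to talk through.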

Now…some may think this is unnecessary and that most developers don’t need to analyze their algorithms so closely. I disagree. Multiply the first example by the thousands of other functions that the candidate will write if they are hired. The best developers pay attention to this kind of detail and are always trying to write clear and concise code.  One function is not that bad, but thousands written without this attention to detail lead to applications that are difficult to support.

Something else to keep in mind is that I don’t dictate to the candidate what language to use when they write the code example. We currently use C#, but our interviews are not necessarily technology specific. Good programmers can master new languages quite easily.

Interviews should include discussions about real code that the candidate has written. It’s not the only thing you will use to evaluate a candidate, but it is one more tool for your interviewing toolbox.

2 Responses to “Interviews Should Include Code Discussions”

  1. Pallavi Says:

    Nice entry.

  2. Ahmmed Rahman Says:

    Using LINQ, we can weed out duplicates from a string array with only one line of code. Here is the code snippet to remove duplicates:

    using System.Linq;

    public string[] RemoveDuplicates(string[] input)
    {
        // Distinct() uses the default string equality, so this check is case-sensitive.
        return input.Distinct().ToArray();
    }

    Another old-school “quick and dirty” technique can be used to remove duplicates from a string array by using the new-school (.NET 3.5) HashSet class. A set is a collection of distinct objects, so a HashSet will not store the same value more than once. The following function therefore removes duplicates from a string array without explicitly writing code for string comparison.

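    public string[] RemoveDuplicates2(string[] input)
    {
        HashSet<string> hs = new HashSet<string>();
        foreach (string s in input)
        {
            // Add returns false (it does not throw) when s is already in the set,
            // so no try/catch is needed.
            hs.Add(s);
        }

        // ToArray() here is the LINQ extension method from System.Linq.
        return hs.ToArray();
    }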

    HashSet provides some robust set operations like IsSubsetOf and IsSupersetOf, and it needs no Distinct() operator like LINQ has because its elements are unique by construction. The main difference between the LINQ set operations and the HashSet operations is that the LINQ ones (Union, Intersect, Except) always return a new IEnumerable collection, while the HashSet equivalents (UnionWith, IntersectWith, ExceptWith) modify the current collection.
