C# Linq: Find Missing Values in a Sequence of Numbers and Other Sequence Related lists as IEnumerable Extensions

Water bubble and wavesI recently had a need to determine if a sequence of integers was solid, not broken, and whether it contained a gap of any missing numbers to report to the end user. I researched the issue and was dismayed that the examples found on the internet seemed to have unnecessary overheads of copied arrays or other artifacts and were too cumbersome for my needs. From that I wrote these C# extensions which work for any .Net above 3.5. Below I document the extension methods to get the missing numbers of a sequence, quickly determine if a sequence is broken and finally report where the existing numbers before the break of the sequence. See the end for the whole extension class for easier copying.

Problem Definition

If one has a set of numbers say

{ 2, 4, 7, 9 }

it has a broken sequence because there are missing numbers from the set 2-9. Those missing numbers are:

{ 3, 5, 6, 8 }

Find Missing Numbers In a Sequence

This method is the one I created first. It uses the Linq Aggregate extension to enumerate over the numbers in the set. If there is a gap which is greater than 1 between the numbers (the difference below is > 0 but same concept) then it reports the numbers missing between the two.

public static IEnumerable<int> SequenceFindMissings(this IList<int> sequence)
{

    var missing = new List<int>();

    if ((sequence != null) && (sequence.Any()))
    {
        sequence.Aggregate((seed, aggr) =>
                            {
                                var diff = (aggr - seed) - 1;

                                if (diff > 0)
                                    missing.AddRange(Enumerable.Range((aggr - diff), diff));

                                return aggr;
                            });
    }

    return missing;
}

Quickly Determine Broken Sequence

Is the sequence broken from the first number to the last in the set?

public static bool IsSequenceBroken(this IEnumerable<int> sequence)
{
    bool broken = false;

    if (sequence != null) 
    {
        var sequenceAsList = sequence.ToList();

        if (sequenceAsList.Any())
        {
            int lastValue = sequence.First();

            broken = sequence.Any(value =>
                                    {
                                        if ((value - lastValue) > 1)
                                            return true;

                                        lastValue = value;

                                        return false;
                                    }); 
        }
    }

    return broken;
}

Report Last Valid Number Before The Break

This is useful in situations where one needs to report where the break happens, say the user is editing in a grid and one highlights the existing number which precedes the missing number(s).

Example here returns a 2 and 5 which are the numbers which precede the break.

   (new List() { 1, 2, 4, 5, 7, 8}).SequenceReportMissingsBreakStarts()

Here is the method:

public static IEnumerable<int> SequenceReportMissingsBreakStarts(this IList<int> sequence)
{

    var breaks = new List<int>();

    if ((sequence != null) && (sequence.Any()))
    {

        sequence.Aggregate((seed, aggr) =>
                            {
                                var diff = (aggr - seed) - 1;

                                if (diff > 0)
                                    breaks.Add(seed);
                                return aggr;
                            });
    }

    return breaks;
}

Full Extension Source With Comments

Here is the code for an easier copy

public static class SequenceExtensions
{
    /// <summary>
    /// Take a sequence of numbers and if there are any gaps greater than 1 between the numbers,
    /// report true.
    /// </summary>
    /// <param name="sequence">A set of numbers to check.</param>
    /// <returns>True if the there is a break in the sequence of numbers.</returns>
    public static bool IsSequenceBroken(this IEnumerable<int> sequence)
    {
        bool broken = false;

        if (sequence != null)
        {
            var sequenceAsList = sequence.ToList();

            if (sequenceAsList.Any())
            {
                int lastValue = sequence.First();

                broken = sequence.Any(value =>
                                        {
                                            if ((value - lastValue) > 1)
                                                return true;

                                            lastValue = value;

                                            return false;
                                        });
            }
        }

        return broken;
    }

    /// <summary>
    /// Take a sequence of numbers and report the missing numbers. Stop at first break found.
    /// </summary>
    /// <param name="sequence">Set of Numbers</param>
    /// <returns>True of sequence has missing numbers</returns>
    public static IEnumerable<int> SequenceFindMissings(this IList<int> sequence)
    {

        var missing = new List<int>();

        if ((sequence != null) && (sequence.Any()))
        {
            sequence.Aggregate((seed, aggr) =>
                                {
                                    var diff = (aggr - seed) - 1;

                                    if (diff > 0)
                                        missing.AddRange(Enumerable.Range((aggr - diff), diff));

                                    return aggr;
                                });
        }

        return missing;

    }

    /// <summary>
    /// A missing break start in a sequence is where the drop off occurs in the sequence.
    /// For example 3, 5, has a missing break start of the #3 for #4 is the missing.
    /// </summary>
    /// <param name="sequence">Set of Numbers</param>
    /// <returns>The list of break numbers which exist before the missing numbers.</returns>
    public static IEnumerable<int> SequenceReportMissingsBreakStarts(this IList<int> sequence)
    {

        var breaks = new List<int>();

        if ((sequence != null) && (sequence.Any()))
        {

            sequence.Aggregate((seed, aggr) =>
                                {
                                    var diff = (aggr - seed) - 1;

                                    if (diff > 0)
                                        breaks.Add(seed);
                                    return aggr;
                                });
        }

        return breaks;

    }
}

Hope this helps!

Share

2 Comments

  1. While these options work – they all are creating “queries” with significant side effects. All of the same operations can be done without side effects via Skip(), ie:

    public static IEnumerable SequenceFindMissings(this IEnumerable sequence)
    {
       return sequence.Zip(sequence.Skip(1), (a,b) => Enumerable.Range(a+1, (b-a)-1))
                      .SelectMany(s => s);
    }
    
    public static bool IsSequenceBroken(this IEnumerable sequence)
    {
       return sequence.Zip(sequence.Skip(1), (a,b) => b-a)
                      .Any(v => v != 1);
    }
    
    public static IEnumerable SequenceReportMissingsBreakStarts(this IList sequence)
    {
       return sequence.Zip(sequence.Skip(1), (a,b) => new{b,a})
                      .Where(v => v.b - v.a > 1)
                      .Select(v => v.a);
    }
    
  2. Bart says:

    Try this:

    Enumerable.Range(seq.Min(), seq.Max() - seq.Min()).Except(seq)
    

Leave a Reply