Archive for the ‘Linq To Object’ Category.

C#: Finding List Duplicates Using the GroupBy Linq Extension and Other GroupBy Operations and Usage Explained

When dealing with lists of items such as strings, one can run into duplicates, and one way of finding identical strings is to use the Linq extension GroupBy. This article also provides an in-depth explanation of that lesser-used and somewhat misunderstood extension. Examples are given with related and non-related keys to help one understand the flexibility of the extension.
Note that this code can be used in any .Net version from 3.5 onward.

Finding Identical Strings using GroupBy

Searching for duplicates in lists can be done in different ways, but with the introduction of the GroupBy extension in Linq to Object queries one has a powerful tool to find those duplicates. GroupBy's premise is to group items by a key, and since each distinct key produces exactly one group, much like the unique keys of a dictionary, any group holding more than one item marks a duplicate.
So let us define our problem in terms of a list of strings, where somewhere within that list are duplicates which we want reported. For the sake of simplicity I won’t deal with case sensitivity, to keep the example tight. The solution is below, followed by a line by line explanation of what is going on.
List<string> theList = new List<string>() { "Alpha", "Alpha", "Beta", "Gamma", "Delta" };

theList.GroupBy(txt => txt)
        .Where(grouping => grouping.Count() > 1)
        .ToList()
        .ForEach(groupItem => Console.WriteLine("{0} duplicated {1} times with these values {2}",
                                                 groupItem.Key, 
                                                 groupItem.Count(),
                                                 string.Join(" ", groupItem.ToArray())));
// Writes this out to the console:
//
// Alpha duplicated 2 times with these values Alpha Alpha

Line By Line Explanation

Line 1: Our generic list is defined with the duplicate string “Alpha”, while Beta, Gamma and Delta are only found once.
Line 3:
Using the extension method GroupBy. This extension works off of an enumerable (IEnumerable<TSource>), which is of course our list. The primary argument is a lambda function (Func<TSource, TKey>) which takes a TSource as the input (a string from our list) and returns the key for our grouping.
The key in this scenario is the string itself, since that is what we want to find duplicates of. If we were dealing with a complex object, the key might be a property or field of that object, but here the key is the actual item found within our list. GroupBy produces exactly one group per distinct key, much as a dictionary holds each key only once, and that is the crux of how we can use GroupBy to divine all identical strings.
Line 3: Before moving on we must understand what GroupBy returns. By definition it returns IEnumerable<IGrouping<TKey, TSource>>. This can be broken down as such:

  • IEnumerable simply means that it will return a sequence of items, each of which is of the IGrouping<> type.
  • IGrouping holds the key of the group in its Key property and is itself an enumerable of the items which share that key. The nuance is that when a grouping is enumerated directly it simply yields the TSource items (the non key part, just the values).
If one is familiar with the Dictionary class then one has worked with the KeyValuePair. An IGrouping is similar, except that it carries a sequence of values rather than a single Value, and that sequence is reached by enumerating the grouping itself.
Line 4: With GroupBy returning one IGrouping per key, we need to weed out the keys with a single item and keep only the ones with two or more; the Where does this job for us. By specifying a lambda which drops the groups that hold only one item, we are left with the duplicates sought.
Line 5:
Change from the IEnumerable returned by the Where extension to an actual List object. This is done for two reasons.
The first is that ForEach is a method of List<T> and is not available on a raw IEnumerable.
Secondly, and possibly more importantly, the GroupBy extension is a deferred execution operation, meaning that the data is not generated until it is actually needed. By calling ToList we execute the operation and get the data back immediately. This can come into play if the original data may change. If the data can change, it's best to get the data upfront from such deferred execution methods.
Line 6: The goal is to display all the duplicates found and demonstrate how an IGrouping object is used. In our example we only have one IGrouping result list but if the data were changed we could have multiple. The following will display the information about our current IGrouping list returned.
Line 7: By accessing the Key property of an IGrouping object we get the unique key which identifies that group. Note that just because we have grouped our data by a key which is the same as the data doesn’t mean that in another use of GroupBy the key and the data will be the same. In our example the key is “Alpha”, which fills {0} of the format string.
Line 8: The Count() extension method shows us how many values are found in the grouping. Our example returns two.
Line 9: We will enumerate the values of this current grouping and list out the data values. In this case there are two values both being “Alpha”.
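
Two points from the explanation above can be tried out in isolation: the comparer overload of GroupBy (for the case sensitivity which the example skipped) and deferred execution. The following is only a sketch; the class and method names (DuplicateFinder, FindDuplicates, DeferredDemo) are made up for illustration and are not part of the original example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFinder
{
    // Returns each duplicated value with the number of times it occurs.
    // The comparer parameter is an assumption of this sketch; the article's
    // example relies on the default, case sensitive comparison.
    public static Dictionary<string, int> FindDuplicates(IEnumerable<string> items,
                                                         IEqualityComparer<string> comparer)
    {
        return items.GroupBy(txt => txt, comparer)
                    .Where(grouping => grouping.Count() > 1)
                    .ToDictionary(grouping => grouping.Key,
                                  grouping => grouping.Count(),
                                  comparer);
    }

    // Shows deferred execution: the query is defined before the list changes
    // but only runs when ToList() enumerates it.
    public static List<string> DeferredDemo()
    {
        var theList = new List<string> { "Alpha", "Alpha", "Beta" };

        IEnumerable<IGrouping<string, string>> query =
            theList.GroupBy(txt => txt).Where(g => g.Count() > 1);

        theList.Add("Beta"); // Added after the query was defined.

        // The second "Beta" is seen because nothing ran until this point.
        return query.Select(g => g.Key).ToList();
    }
}
```

Calling DuplicateFinder.FindDuplicates(theList, StringComparer.OrdinalIgnoreCase) would treat "Alpha" and "alpha" as the same key, and DeferredDemo() reports both Alpha and Beta even though Beta only became a duplicate after the query was written.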

GroupBy Usage with Only Two Defined Keys Regex Example

Now that one understands GroupBy, one must not think that multiple unique keys are the be-all and end-all of its usage. Sometimes we may want to group things into found and not found groupings. The following example takes our Greek letter list above and finds all the words ending in “ta”.
Here is how it is done:
List<string> theList = new List<string>() { "Alpha", "Alpha", "Beta", "Gamma", "Delta" };

theList.GroupBy( txt => Regex.IsMatch( txt, "ta$" ))
       .ToList()
       .ForEach(groupItem => Console.WriteLine("{0} found {1} times with these values: {2}",
                                                 groupItem.Key,
                                                 groupItem.Count(),
                                                 string.Join(" ", groupItem.ToArray())));
// Output
// False found 3 times with these values: Alpha Alpha Gamma
// True found 2 times with these values: Beta Delta
Using our old friend Regex we are going to check to see if the current string ends in ta. If it does, it will be in the key grouping of True, and if not it will be found in the False grouping, as decided by the result of IsMatch. The result shows how we have manipulated the groupings to divine that Beta and Delta are the only two in our list which match the criteria, demonstrating how we can further use the GroupBy method.
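
One thing to keep in mind with a found/not found grouping is that GroupBy only produces a group for keys which actually occur, so if nothing matches there is simply no True group. If both sides should always be reachable, ToLookup is one alternative, since its indexer hands back an empty sequence for a missing key. A minimal sketch, where EndingPartition and SplitByPattern are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class EndingPartition
{
    // Splits the words into those matching the pattern (key true)
    // and the rest (key false). Unlike GroupBy, ToLookup runs immediately.
    public static ILookup<bool, string> SplitByPattern(IEnumerable<string> words, string pattern)
    {
        return words.ToLookup(txt => Regex.IsMatch(txt, pattern));
    }
}
```

With the list from the example, SplitByPattern(theList, "ta$")[true] yields Beta and Delta; with a list that has no match, the same indexer yields an empty sequence instead of throwing.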

GroupBy Usage with one Key or a Non Related Key

I have actually had a need to group all items under one key and perform an aggregate method on the result. The tip here is to show that one doesn’t have to group items by related keys. In the following example we throw everything into group 1. We could have called the group anything, frankly, and sometimes it is needed.
This final example shows how the GroupBy can be flexible.
List<string> theList = new List<string>() { "Alpha", "Alpha", "Beta", "Gamma", "Delta" };

theList.GroupBy(txt => 1 )
        .ToList()
        .ForEach(groupItem => Console.WriteLine("The Key ({0}) found a total of {1} times with a total letter count of {2} characters",
                                                 groupItem.Key,
                                                 groupItem.Count(),
                                                 groupItem.Sum(it => it.Length)));

// Output:
// The Key (1) found a total of 5 times with a total letter count of 24 characters
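
The same single key idea extends to computing several aggregates over the one group. The sketch below is an illustration only (ListSummary and Summarize are invented names); it also shows that an empty source produces no group at all, which has to be handled separately.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListSummary
{
    // Groups everything under one constant key, then computes
    // several aggregates over that single group.
    public static string Summarize(IEnumerable<string> items)
    {
        return items.GroupBy(txt => 1)
                    .Select(g => string.Format("{0} items, {1} letters, longest is {2}",
                                               g.Count(),
                                               g.Sum(it => it.Length),
                                               g.Max(it => it.Length)))
                    // An empty list yields no group, so fall back to a fixed text.
                    .SingleOrDefault() ?? "empty list";
    }
}
```

For the Greek letter list this gives "5 items, 24 letters, longest is 5".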

C#: WPF and Silverlight DataGrid Linq Binding Example Using Predefined Column Header Names in Xaml

This snippet is from my archives and reminds me how to set up column names for a WPF/Silverlight DataGrid and bind them to the data without using the AutoGenerateColumns feature. (See below for the visual end result.)

Ω Check out the related post Xaml: Adding Visibility Behaviors Using Blend to A DataGrid for WPF or Silverlight

DataGrid Column Names Setup in Xaml

In this example we will have two columns where the data will be filename strings. We will specify the columns in the Xaml using DataGridTextColumn and tell our DataGrid not to generate columns dynamically:

<DataGrid x:Name="dgOperation" AutoGenerateColumns="False">
    <DataGrid.Columns>
        <DataGridTextColumn Header="File Name Before" Binding="{Binding Path=Original}"/>
        <DataGridTextColumn Header="File Name After"  Binding="{Binding Path=New}"/>
    </DataGrid.Columns>
</DataGrid>

The result is that the header row will have two columns named “File Name Before” and “File Name After”, which will subsequently bind to a data object with the property names “Original” and “New”.

Actual Binding During the Loading of the Grid

We will read in a directory from the hard drive for the WPF example and fill the first column with the original file names and the second column with a generated name. Since we have specified in the Xaml that we are binding to the “Original” and “New” properties, the anonymous object created by the Linq projection has those properties.

dgOperation.ItemsSource = Directory.GetFiles( @"C:\" )
                                   .Select( ( nm, index ) => new
                                       {
                                          Original = System.IO.Path.GetFileName( nm ),
                                          New = string.Format( "{0}_{1}{2}", System.IO.Path.GetFileNameWithoutExtension( nm ),
                                                                             index,
                                                                             System.IO.Path.GetExtension( nm ) )
                                        } );

The result is as below:

[Screenshot: DataGridBind]

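Note: Out of the box, editing the rows will throw an exception because the properties of the anonymous objects are read only. To fix that, make each of the rows read only, for example by setting IsReadOnly="True" on the DataGrid.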
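
If editing the rows is actually wanted, one alternative is to project into a small named class with settable properties instead of an anonymous object, and to hand the grid a concrete list. This is only a sketch under that assumption; FileRenameRow and FileRenameRows are hypothetical names and not part of the original snippet.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical row type; the property names match the Binding paths in the Xaml.
public class FileRenameRow
{
    public string Original { get; set; }
    public string New { get; set; }
}

public static class FileRenameRows
{
    // Builds one row per path. ToList() runs the query once so the grid
    // binds to a concrete list rather than to a deferred query.
    public static List<FileRenameRow> Build(IEnumerable<string> paths)
    {
        return paths.Select((nm, index) => new FileRenameRow
                    {
                        Original = Path.GetFileName(nm),
                        New = string.Format("{0}_{1}{2}",
                                            Path.GetFileNameWithoutExtension(nm),
                                            index,
                                            Path.GetExtension(nm))
                    })
                    .ToList();
    }
}
```

Usage would then be dgOperation.ItemsSource = FileRenameRows.Build( Directory.GetFiles( @"C:\" ) ); whether edits are then accepted still depends on how the grid itself is configured.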


C#: Access a Resource .resx File and a Corresponding Enum To Create a Dictionary

(Update 5/16/2011: Fixed misspelling)
I ran across a situation where the code I was working on had an enum with values and I needed to display a user friendly text which was related to the enum but not the enum’s actual text value.

The standard way of mapping values across cultures is to create one or more localized resource files (.resx) and to put in a string key and a string value, where the value will be shown. The code can then do an if check to map between the two, but that gets laborious real quick. It would be better to have a dictionary where the enum is the key and the value is the value from the resource file.

The following snippet of code takes in an enum and its corresponding resource resx data and creates that dictionary. See the section Steps to Test for an example of its usage.

/// <summary>
/// Take an enum and a created corresponding resx resource file and join the two together 
/// into a dictionary. The dictionaries key is the actual enum and the value is the user readable text
/// found in the value of the resource file.
/// </summary>
/// <typeparam name="T">The Enum type which contains the target enums</typeparam>
/// <param name="rm">The .resx resource's manager which contains the mapped text</param>
/// <returns>A dictionary whose key is the enum and the result is the mapped text in the resource file.</returns>
public static Dictionary<T, string> LoadResourceEnumMappings<T>( ResourceManager rm )  
{
    T[] eValues = (T[])System.Enum.GetValues( typeof( T ) ); // Gets the actual values of the enum (T) type

    // Puts those into a local dictionary/hash for use in later.
    var dictEnum = eValues.ToDictionary( it => it.ToString(), it => it ); 

    // Work through the key value pairs (KVP) of the resource set and marry them
    // to a new dictionary where the key is the enum and the value output is the user string
    // as found in the resource's value of its kvp.
    return rm.GetResourceSet( Thread.CurrentThread.CurrentCulture, true, false )
             .OfType<DictionaryEntry>()
             .ToDictionary( kvp => dictEnum[kvp.Key.ToString()], kvp => kvp.Value.ToString() );

}
NOTES
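  • One can have more enums than resource keys and the above will work, although the enums without a resource entry will be missing from the dictionary. But if the resource file has a key which does not match any enum name, the above code will throw a KeyNotFoundException.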
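
If the resource keys and the enum names cannot be guaranteed to line up, a more tolerant variation is to skip the keys which do not name an enum member. The sketch below is an assumption on my part rather than the code used above: TolerantEnumMapper and MapEntries are invented names, it takes any enumerable of DictionaryEntry items (which is what a ResourceSet enumerates as), and Enum.TryParse requires .Net 4 or later.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class TolerantEnumMapper
{
    // Maps resource entries to enum members, skipping entries whose key
    // is not a defined name of the enum T instead of throwing.
    public static Dictionary<T, string> MapEntries<T>(IEnumerable entries) where T : struct
    {
        var result = new Dictionary<T, string>();

        foreach (DictionaryEntry entry in entries.OfType<DictionaryEntry>())
        {
            string name = entry.Key.ToString();
            T parsed;

            // IsDefined guards against numeric strings, which TryParse would accept.
            if (Enum.IsDefined(typeof(T), name) && Enum.TryParse<T>(name, out parsed))
                result[parsed] = Convert.ToString(entry.Value);
        }

        return result;
    }
}
```

It could be called as TolerantEnumMapper.MapEntries<MappedValues>( rm.GetResourceSet( ... ) ) with the same GetResourceSet arguments as above. Looking up an enum which had no resource entry will still throw, so TryGetValue is the safer way to read the result.
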
Enumerate the Embedded Resource

The above code uses the ability to enumerate or iterate the resource file by calling GetResourceSet. That call returns a ResourceSet whose items enumerate as DictionaryEntry objects, each holding a key value pair of the resource file, so it could also be used with a foreach.

Steps To Test
  1. Create console application
  2. Create enum named MappedValues with these enums : Alpha, Beta, Gamma.
    public enum MappedValues
    {
        Alpha,
        Beta,
        Gamma
    }
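  3. Create Resource File named UserText.Resx with these values (name = value), as reflected in the output further below: Alpha = First Value, Beta = Second Item, Gamma = Third Wave.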

Test as such with this code by calling EnumMapper.Usage():

public static class EnumMapper
{
    /// <summary>
    /// This is just for show and not meant for production.
    /// </summary>
    public static void Usage()
    {
        try
        {
            Console.WriteLine( "Load resource and show: " + UserText.Alpha );

            Dictionary<MappedValues, string> mapped = LoadResourceEnumMappings<MappedValues>( UserText.ResourceManager );

            Console.WriteLine( mapped[MappedValues.Alpha] );

            Console.WriteLine( mapped[MappedValues.Beta] );

            Console.WriteLine( mapped[MappedValues.Gamma] );
        }
        catch ( KeyNotFoundException )
        {
            Console.WriteLine("The Resource File has a key which is not found in the enum.");
        }

/* outputs (Note "first value" was shown to initialize the ResourceManager otherwise it would be null from GetResourceSet)
Load resource and show: First Value
First Value
Second Item
Third Wave
*/
    }

    /// <summary>
    /// Take an enum and a created corresponding resx resource file and join the two together 
    /// into a dictionary. The dictionaries key is the actual enum and the value is the user readable text
    /// found in the value of the resource file.
    /// </summary>
    /// <typeparam name="T">The Enum type which contains the target enums</typeparam>
    /// <param name="rm">The .resx resource's manager which contains the mapped text</param>
    /// <returns>A dictionary whose key is the enum and the result is the mapped text in the resource file.</returns>
    public static Dictionary<T, string> LoadResourceEnumMappings<T>( ResourceManager rm )  
    {
        T[] eValues = (T[])System.Enum.GetValues( typeof( T ) ); // Gets the actual values of the enum (T) type

        // Puts those into a local dictionary/hash for use in later.
        var dictEnum = eValues.ToDictionary( it => it.ToString(), it => it ); 

        // Work through the key value pairs (KVP) of the resource set and marry them
        // to a new dictionary where the key is the enum and the value output is the user string
        // as found in the resource's value of its kvp.
        return rm.GetResourceSet( Thread.CurrentThread.CurrentCulture, true, false )
                    .OfType<DictionaryEntry>()
                    .ToDictionary( kvp => dictEnum[kvp.Key.ToString()], kvp => kvp.Value.ToString() );

    }

}

C#: Handling Title Case in Strings with Articles and Prepositions


This was an issue a user presented in the forums; I answered it and felt that the response was intricate enough to share with the world as a whole. The user wanted to have a string converted to title case but wanted the first letter of any article or preposition within the sentence not to be upper case. This article discusses how to do that in C#.

For example the user was interested in changing

“ALL QUIET ON THE WESTERN FRONT”

   to

“All Quiet on the Western Front.”

.Net Framework Almost Does It

Thanks to the TextInfo class and a helping hint from a current CultureInfo object we can use the method ToTitleCase to work with our current language. The problem is that when ToTitleCase is called with the sentence (lower cased first, since ToTitleCase leaves words which are entirely upper case alone) we get this:

“All Quiet On The Western Front”
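
ToTitleCase treats a word which is entirely upper case as an acronym and leaves it alone, which is why the input has to be lower cased before the call. A minimal sketch to see the difference, with TitleCaseProbe and Probe as made up names and the invariant culture assumed in place of the thread's culture:

```csharp
using System;
using System.Globalization;

public static class TitleCaseProbe
{
    // Runs ToTitleCase with the invariant culture,
    // optionally lower casing the text first.
    public static string Probe(string text, bool lowerFirst)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        return textInfo.ToTitleCase(lowerFirst ? text.ToLowerInvariant() : text);
    }
}
```

Probe("ALL QUIET ON THE WESTERN FRONT", false) hands the text back unchanged, while passing true gives the "All Quiet On The Western Front" result shown above.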

Give it some Help

The .Net code does not know to skip the articles and prepositions, so we will augment it. The following code uses Linq-to-Object and Regex and processes the majority of the target articles and prepositions. I have placed it into an extension method below:

/*
using System.Globalization;
using System.Threading;
using System.Text.RegularExpressions;
*/

/// <summary>
/// An Extension Method to allow us to do "The Title Of It".asTitleCase()
/// which would return a TitleCased string.
/// </summary>
/// <param name="title">Title to work with.</param>
/// <returns>Output title as TitleCase</returns>
public static string asTitleCase ( this string title)
{
    string WorkingTitle = title;

    if ( string.IsNullOrEmpty( WorkingTitle ) == false )
    {
        char[] space = new char[] { ' ' };

        List<string> artsAndPreps = new List<string>()
            { "a", "an", "and", "any", "at", "from", "into", "of", "on", "or", "some", "the", "to", };

        //Get the culture property of the thread.
        CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
        //Create TextInfo object.
        TextInfo textInfo = cultureInfo.TextInfo;

        //Convert to title case.
        WorkingTitle = textInfo.ToTitleCase( title.ToLower() );

        List<string> tokens = WorkingTitle.Split( space, StringSplitOptions.RemoveEmptyEntries ).ToList();

        WorkingTitle = tokens[0];

        tokens.RemoveAt(0);

        WorkingTitle += tokens.Aggregate<String, String>( String.Empty, ( String prev, String input )
                                => prev +
                                    ( artsAndPreps.Contains( input.ToLower() ) // If True
                                        ? " " + input.ToLower()              // Return the prep/art lowercase
                                        : " " + input ) );                   // Otherwise return the valid word.

        // Handle an "Out Of" but not in the start of the sentence
        WorkingTitle = Regex.Replace( WorkingTitle, @"(?!^Out)(Out\s+Of)", "out of" );
    }

    return WorkingTitle;

}
Explanation
  • Line 21: Here is our English list of words not to capitalize. We would have to change this for other languages.
  • Line 25: We get the current culture from the running thread so that ToTitleCase can do its job.
  • Line 30: The title is lower cased first, which drops any upper case letters, and ToTitleCase then does the first run and upper cases the first letter of every word.
  • Line 32: We split the line on the spaces between the words into word tokens and put them in a list.
  • Line 34: We save off the first word because regardless of what it is, it is correct.
  • Line 36: We remove the first word so as not to process it again.
  • Line 38: Using the Aggregate extension to accumulate each word token we will add a space. We are using the Aggregate method in lieu of string.Join to add spaces to our words (the accumulation), but also to check each word as it goes by, which string.Join on its own can’t help us with.
  • Line 40: As the tokens (words) are handed to us, check to see if they are in the list we set up in line 21. If a word exists in the list, add a space in front and make the whole word lower case (Line 41); otherwise add a space and just return the word.
  • Line 45: Handle any two word Out Of issues, but ignore if it is the first word as found in “Out of Africa”.
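
For comparison, the per word check can also be expressed with the index overload of Select and then handed to string.Join. This is a sketch of an alternative and not the method above: TitleCaseSketch and ToSmartTitleCase are hypothetical names, the invariant culture is assumed instead of the thread's culture, and the "Out Of" regex step is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class TitleCaseSketch
{
    // Same English word list as above; a HashSet avoids scanning a list per word.
    private static readonly HashSet<string> SmallWords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "a", "an", "and", "any", "at", "from", "into", "of", "on", "or", "some", "the", "to" };

    public static string ToSmartTitleCase(string title)
    {
        if (string.IsNullOrEmpty(title)) return title;

        // Assumption: invariant culture, so the result does not vary by machine.
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;

        string[] words = textInfo.ToTitleCase(title.ToLowerInvariant())
                                 .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // The index lets the first word keep its capital letter.
        return string.Join(" ", words.Select((word, index) =>
                   index > 0 && SmallWords.Contains(word) ? word.ToLowerInvariant() : word)
               .ToArray());
    }
}
```

A title made up only of spaces comes back as an empty string here, since there are no words to join.
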
Tests and Results

 

Console.WriteLine( "ALL QUIET ON THE WESTERN FRONT".asTitleCase() );
Console.WriteLine( "Bonfire OF THE Vanities".asTitleCase() );
Console.WriteLine( "The Out-of-Sync Child: Recognizing and Coping with Sensory Processing Disorder".asTitleCase() );
Console.WriteLine( "Out OF AFRICA".asTitleCase() );

/* Results
All Quiet on the Western Front
Bonfire of the Vanities
The Out-Of-Sync Child: Recognizing and Coping With Sensory Processing Disorder
Out of Africa
*/