C#: Handling Title Case in Strings with Articles and Prepositions
This is a question I answered in the forums, and I felt the response was intricate enough to share with the world as a whole. The user wanted a string converted to title case, but did not want the first letter of any article or preposition capitalized along with the rest of the sentence. This article discusses how to do that in C#.
For example, the user was interested in changing
“ALL QUIET ON THE WESTERN FRONT”
to
“All Quiet on the Western Front.”
.NET Framework Almost Does It
Thanks to the TextInfo class and a helping hint from the current CultureInfo object, we can use the ToTitleCase method to work with our current language. ToTitleCase leaves words that are entirely upper case untouched (it treats them as acronyms), so the sentence has to be lower-cased first. The problem is that even then we get this:
“All Quiet On The Western Front”
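A minimal sketch of that behavior, assuming the invariant culture instead of the thread's current culture so the output is reproducible. The class and method names are hypothetical and exist only for this illustration:

```csharp
using System;
using System.Globalization;

public static class TitleCaseDemo
{
    // Hypothetical helper: runs ToTitleCase, optionally lower-casing the input first.
    public static string Convert( string text, bool lowerFirst )
    {
        // Assumption: invariant culture, chosen only to keep the demo deterministic.
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        return textInfo.ToTitleCase( lowerFirst ? text.ToLower() : text );
    }

    public static void Main()
    {
        string original = "ALL QUIET ON THE WESTERN FRONT";

        // All-uppercase words are treated as acronyms and left unchanged.
        Console.WriteLine( Convert( original, false ) ); // ALL QUIET ON THE WESTERN FRONT

        // Lower-casing first lets every word be title-cased, including "On" and "The".
        Console.WriteLine( Convert( original, true ) );  // All Quiet On The Western Front
    }
}
```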
Give It Some Help
The .NET code is not robust enough to ignore the articles and prepositions, so we will augment it. The following code uses LINQ to Objects and Regex, and handles the majority of the target articles and prepositions. I have placed it into an extension method below:
// using System.Collections.Generic;
// using System.Globalization;
// using System.Linq;
// using System.Text.RegularExpressions;
// using System.Threading;

/// <summary>
/// An extension method that lets us write "The Title Of It".asTitleCase(),
/// which returns a title-cased string. Declare it inside a static class.
/// </summary>
/// <param name="title">Title to work with.</param>
/// <returns>Output title as TitleCase</returns>
public static string asTitleCase ( this string title)
{
    string WorkingTitle = title;

    if ( string.IsNullOrWhiteSpace( WorkingTitle ) == false )
    {
        char[] space = new char[] { ' ' };

        List<string> artsAndPreps = new List<string>()
            { "a", "an", "and", "any", "at", "from", "into", "of", "on", "or", "some", "the", "to", };

        //Get the culture property of the thread.
        CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
        //Create TextInfo object.
        TextInfo textInfo = cultureInfo.TextInfo;

        //Convert to title case.
        WorkingTitle = textInfo.ToTitleCase( title.ToLower() );

        List<string> tokens = WorkingTitle.Split( space, StringSplitOptions.RemoveEmptyEntries ).ToList();

        WorkingTitle = tokens[0];

        tokens.RemoveAt(0);

        // Re-append the remaining words, lower-casing
        // any article or preposition as it goes by.
        WorkingTitle += tokens.Aggregate<String, String>( String.Empty, ( String prev, String input )
            => prev +
                ( artsAndPreps.Contains( input.ToLower() ) // If True
                    ? " " + input.ToLower()                // Return the prep/art lowercase
                    : " " + input ) );                     // Otherwise return the valid word.
        // Lower-case a mid-title "Out of" ("of" is already lower case); skip it at the start.
        WorkingTitle = Regex.Replace( WorkingTitle, @"(?!^Out)(Out\s+of)", "out of" );
    }

    return WorkingTitle;
}
Explanation
- Line 21: Here is our English list of words not to capitalize. We would have to change this for other languages.
- Line 25: We get the current culture from the running thread so that ToTitleCase can do its job.
- Line 30: ToTitleCase does the first run and upper cases the first letter of every word. The title is lower-cased first because ToTitleCase leaves words that are entirely upper case unchanged.
- Line 32: We split the line on space between the words into word tokens and put them in a list.
- Line 34: We save off the first word because regardless of what it is, it is correct.
- Line 36: We remove the first word so as not to process it.
- Line 40: We use the Aggregate extension to accumulate the word tokens, adding a space before each one. We use Aggregate in lieu of string.Join because, besides adding the spaces between our words (the accumulation), it lets us check each word as it goes by, which string.Join alone can’t help us with.
- Line 42: As the tokens (words) are handed to us, we check whether they are in the list we set up in line 21. If a word is in the list, we add a space in front and make the whole word lower case (Line 43); otherwise we add a space and return the word as is.
- Line 46: Handle any two-word “Out of” that appears mid-title by lower-casing “Out”, but leave it alone when it begins the title, as in “Out of Africa”.
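To see the last two steps in isolation, here is a small sketch of the Aggregate accumulation and the “Out of” replacement. The class and helper names (AggregateSketch, JoinRest, FixOutOf) are hypothetical and not part of the extension method. The pattern looks for a lower-case “of”, on the assumption that the Aggregate step has already lower-cased it:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class AggregateSketch
{
    // Hypothetical helper: prefixes each token with a space and
    // lower-cases it when it appears in the small-word list.
    public static string JoinRest( IEnumerable<string> tokens, ICollection<string> smallWords )
    {
        return tokens.Aggregate( String.Empty, ( prev, input ) =>
            prev + " " + ( smallWords.Contains( input.ToLower() ) ? input.ToLower() : input ) );
    }

    // Hypothetical helper: the negative lookahead (?!^Out) rejects a match
    // that begins at the very start of the string, so a leading "Out" survives.
    public static string FixOutOf( string title )
    {
        return Regex.Replace( title, @"(?!^Out)(Out\s+of)", "out of" );
    }

    public static void Main()
    {
        var smallWords = new List<string> { "of", "on", "the" };
        var rest = new[] { "Quiet", "On", "The", "Western", "Front" };

        Console.WriteLine( "All" + JoinRest( rest, smallWords ) ); // All Quiet on the Western Front
        Console.WriteLine( FixOutOf( "Running Out of Time" ) );    // Running out of Time
        Console.WriteLine( FixOutOf( "Out of Africa" ) );          // Out of Africa
    }
}
```

Because the seed is an empty string and every token is prefixed with a space, the accumulated text starts with a space, which is why the first word is held back and prepended separately.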
Tests and Results
Console.WriteLine( "ALL QUIET ON THE WESTERN FRONT".asTitleCase() );
Console.WriteLine( "Bonfire OF THE Vanities".asTitleCase() );
Console.WriteLine( "The Out-of-Sync Child: Recognizing and Coping with Sensory Processing Disorder".asTitleCase() );
Console.WriteLine( "Out OF AFRICA".asTitleCase() );

/* Results
All Quiet on the Western Front
Bonfire of the Vanities
The Out-Of-Sync Child: Recognizing and Coping With Sensory Processing Disorder
Out of Africa
*/