Archive for the ‘.Net’ Category.

Entity Framework Stored Procedure Instructions

This is a how-to on getting Entity Framework (EF) version 5 and 6 to include stored procs and how to consume the resulting entities in code.

  1. In the EF designer choose `Update Model From Database`.
  2. When the page to `Choose Your Database Objects and Settings` comes up which allows one to add new tables/views/stored procs, select the stored proc of interest. Remember the name for the resulting data mapping entity will be the name with the extension `_Result`.2015-06-12_19-10-39
  3. Once the wizard is finished EF will contain the stored proc in the `Model Browser`. The model browser can be displayed by right clicking the EF design surface and selecting `Model Browser`.2015-06-12_19-51-20
  4. Here is an explanation of what has happened.
    (1) You have added the stored proc into the `Stored Procedures __abENT__#8260; Functions` as an item of interest.
    (2) EF has created a function import of the stored proc and placed it into `Function Imports`.
    (3) If EF was able to determine the *result set entity* it will most likely be in the `Complex Types` folder.
  5. If the mapping has gone right you should be able to call the stored proc off of the EF context in code and it will return a list of the complex type `xxx_Result`. If it works you will know it, but there could be problems with the mapping.

Mapping Problems and How to Resolve

  • One can delete at anytime the any object in the folders of 1/2 or 3 shown above and regenerate or create a custom mapping. Don’t be afraid to delete.
  • Sometimes very complex stored procs will not divulge the right mapping of the entity in the result set and the resulting complex type will cause failures. One way around that is to create a faux data return in the stored proc which leaves no ambiguity for Ef.
          1. In the database change the stored proc as follows.  Comment out the meat of the result select and replace it with a one-to-one column stub faux select such as this example: “SELECT 1 AS ResultId, ‘Power1’ AS GroupName, ‘Test’ AS Description”. Note to be clear you will need to match every column and name.
          2. In EF’s Model Browser delete all things associated with the stored proc in folders 1, 2 and 3 above.
          3. Regenerate all by using`Update Model From Database`.
          4. Check the results.
  • If the above steps fail one can always create a function mapping by hand. Be careful not to create duplicates, if so delete all and start over.
        • Open up and find the stored proc you inserted into folder #3 above. Right click and select`Add Function Import…`2015-06-12_20-09-09
        • One can get the column information, change items on the above screen; it is trial and error.
        • You will need to play around with this until you have the right columns for your function import. Be wary of numbered copiesof the complex types which may be created from the mapping.

Remember to reset the stored proc back to its original state instead of the faux stub mentioned.

Share

Entity Framework Cascading Deletes; Set it from the database.

To achieve cascading deletes, one must specify the cascading deletes on the FK relationships from the top level table in the database. The default is not to cascade.

Here is the visual Process in SQL Server Management Studio.

  1. Select the top level table which will handle the delete and right click.
  2. Select design mode.
  3. Right click any row in the design mode.
  4. Select Relationships.
  5. Find all the FK relationships and set them to cascade.

Then in Entity Framework update the edmx file after these changes are made so entity framework knows about the cascading constraint.  Once all this is done a cascaded delete is possible using Entity Framework.

 

omegacoder_dot_com_EF

Share

WCF: Creating Custom Headers, How To Add and Consume Those Headers

When creating a C# WCF service (version .Net 3.0 and above) there may be a value in identifying the clients (consumers) which a web service is providing operational support to. This article demonstrates in C# and config Xml how to have clients identify themselves and pass pertinent information within the soap message’s header. That information in turn will be processed by the Web Service accordingly.

Client Identifies Itself

The goal here is to have the client provide some sort of information which the server can use to determine who is sending the message. The following C# code will add a header named `ClientId`:

var cl = new ActiveDirectoryClient();

var eab = new EndpointAddressBuilder(cl.Endpoint.Address);

eab.Headers.Add( AddressHeader.CreateAddressHeader("ClientId",       // Header Name
                                                   string.Empty,     // Namespace
                                                    "OmegaClient")); // Header Value
cl.Endpoint.Address = eab.ToEndpointAddress();

// Now do an operation provided by the service.
cl.ProcessInfo("ABC");

What that code is doing is adding an endpoint header named `ClientId` with a value of `OmegaClient` to be inserted into the soap header without a namespace.

Custom Header in Client’s Config File

There is an alternate way of doing a custom header. That can be achieved in the Xml config file of the client where all messages sent by specifying the custom header as part of the endpoint as so:

<configuration>
    <startup> 
        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5" />
    </startup>
    <system.serviceModel>
        <bindings>
            <basicHttpBinding>
                <binding name="BasicHttpBinding_IActiveDirectory" />
            </basicHttpBinding>
        </bindings>
        <client>
          <endpoint address="http://localhost:41863/ActiveDirectoryService.svc"
              binding="basicHttpBinding" bindingConfiguration="BasicHttpBinding_IActiveDirectory"
              contract="ADService.IActiveDirectory" name="BasicHttpBinding_IActiveDirectory">
            <headers>
              <ClientId>Console_Client</ClientId>
            </headers>
          </endpoint>
        </client>
    </system.serviceModel>
</configuration>

The above config file is from a .Net 4.5 client.

Server Identifies Client Request

Finally the web service will read the custom header and distinquish between any WCF client and process it accordingly.

var opContext = OperationContext.Current; // If this is null, is this code in an async block? If so, extract it before the async call.

var rq = opContext.RequestContext; 

var headers = rq.RequestMessage.Headers;

int headerIndex = headers.FindHeader("ClientId", string.Empty);

var clientString = (headerIndex < 0) ? "UNKNOWN" : headers.GetHeader<string>(headerIndex);
Share

Xaml: ViewModel Main Page Instantiation and Loading Strategy for Easier Binding.

BasicBindingUpdate 11.07.2013 : Added ICommanding example.
Update 10.25.2013 : Added how to use the CallerMemberName attribute with INotifyPropertyChanged in .Net 4.

While answering a question on StackOverflow I ran across a situation where the user was having troubles due to binding to a view model which was statically initiated in Xaml. While there is nothing wrong with creating the VM in Xaml, it can miss some benefits for most control and page situations where the all controls want to access one source of data. This article describes and provides example code to creating the view model in the code behind which I tend to prefer over the others methods.

Create a .Net 4 and Above VM

The view model adheres to the standard where it implements INotifyPropertyChange. The following ViewModel implements it using the .Net 4.5 way with the CallerMemberName attribute which saves us from having to explicitly define it on every member call. (If you are using .Net 4, add the Nuget Package: Microsoft BCL into you project to use CallerMemberName attribute).

Note for reference there is a situation below where we actually do specify the property name but it is to create in a change notification on another property.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading.Tasks;

namespace SO_WPF.ViewModels {

public class MainVM : INotifyPropertyChanged 
{

    private List<string> _Members;
    private int _MemberCount;
    private bool _IsMembershipAtMax;
    public bool IsMembershipAtMax 
    {
        get { return MemberCount > 3; }
    }
    public int MemberCount 
    { 
        get { return _MemberCount; }
        set
        {
            _MemberCount = value; 
            OnPropertyChanged();
            OnPropertyChanged("IsMembershipAtMax");
        } 
    }

    public List<string> Members 
    { 
        get { return _Members; }
        set { _Members = value; OnPropertyChanged(); } 
    }
    public MainVM()
    {
        // Simulate Asychronous access, such as to a db.

        Task.Run(() =>
                    {
                        Members = new List<string>() {"Alpha", "Beta", "Gamma", "Omega"};
                        MemberCount = Members.Count;
                    });
    }
    /// <summary>Event raised when a property changes.</summary>
    public event PropertyChangedEventHandler PropertyChanged;

    /// <summary>Raises the PropertyChanged event.</summary>
    /// <param name="propertyName">The name of the property that has changed.</param>
    protected virtual void OnPropertyChanged([CallerMemberName] string propertyName = null)
    {
        PropertyChangedEventHandler handler = PropertyChanged;
        if (handler != null)
        {
            handler(this, new PropertyChangedEventArgs(propertyName));
        }
    }

}
}

Main Page Code Behind

We now need to hook up the main page view model to the main page. To do that we will do two things, one, make a non INotifyPropertyChanged property on the class which will hold our view model. Two we will instantiate it, hook it up to the page’s data along with our property. Doing that ensures that everything on our page will get access to our view model, thanks to the data context and how controls inherit their parents data context.

using SO_WPF.ViewModels;

namespace SO_WPF
{
    public partial class MainWindow : Window
    {

        public MainVM ViewModel { get; set; }

        public MainWindow()
        {
            InitializeComponent();

            // Set the windows data context so all controls can have it.
            DataContext = ViewModel = new MainVM();

        }

    }
}

Xaml

Now in the xaml we can simply bind to the View Models properties directly without any fuss.

<Window x:Class="SO_WPF.MainWindow"
        xmlns:viewModels="clr-namespace:SO_WPF.ViewModels"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:i="http://schemas.microsoft.com/expression/2010/interactivity"
        xmlns:ei="http://schemas.microsoft.com/expression/2010/interactions"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        mc:Ignorable="d"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        d:DataContext="{d:DesignInstance {x:Type viewModels:MainVM}}"
        Title="MainWindow"
        Height="300"
        Width="400">

<StackPanel Orientation="Vertical">
    <ListBox Name="lbData"
                ItemsSource="{Binding Members}"
                SelectionMode="Multiple"
                Margin="10" />

    <Button Height="30"
            Width="80"
            Margin="10"
            Content="Click Me" />
</StackPanel>

</Window>

In the above code we bind the Members to the listbox just by specifying the Members property name. As to the highlighted lines, I have added them as a debug design option. That lets Visual Studio and Blend know that our data context is our MainVM. That allows for the editor to present us with the options of data binding to the appropriate items, and not having it blank.

This has been a simple example, but a powerful one which can be used as a binding strategy for any WPF, Silverlight or Windows Phone Xaml based applications in C#.

Note though it is not shown, sometimes in styles one needs the element name to bind to, where the data context will fail due to the nature of the style binding, the above page we would bind to the page name as provided (“MainWindow”) and then the property name as usual!

Extra Credit ICommanding

I won’t go into much detail about commanding, but the gist is that the ViewModel is not directly responsible for actions which can happen due to the ICommanding process, but allow for binding operations to occur against those actions and those actions are performed elsewhere usually on a view.

Below is our view model with the commanding public variables which can be consumed by controls (or other classes which have access to the VM).

public class MainVM : INotifyPropertyChanged
{
    #region Variables
       #region Commanding Operations

    public ICommand ToggleEditing { get; set; }
    public ICommand ReportError   { get; set; }
    public ICommand CheckSequence { get; set; }

       #endregion
       #region Properties
    public string ErrorMessage { // the usual INotifyProperty as shown before }
       #endregion
    #endregion
}

Then on our main page which consumes the view model we then process those requests by creating methods to fulfill those operations. Note that the example below references properties on the VM which were not shown in the example,
but that is not important in this article. But one variable is shown and that is the error message. This allow anyone to push an error using the commanding to the viewmodel which is subsequently shown. That is a peek at the power of commanding right there to allow a dependency injection of setting an error variable to be done outside the VM but used by those which consume the VM!

public partial class MainWindow : Window
{

    public MainWindow()
    {
        InitializeComponent();

        // Set the windows data context so all controls can have it.
        DataContext = ViewModel = new MainVM();

        SetupCommanding();
    }

    private void SetupCommanding()
    {
        // Commanding using OperationCommand class
        ViewModel.ToggleEditing         = new OperationCommand((o) => ViewModel.IsEditing = !ViewModel.IsEditing);
        ViewModel.ReportError           = new OperationCommand((o) => ViewModel.ErrorMessage = (string)o);
        ViewModel.CheckSequence         = new OperationCommand(CheckSequences, CanExecute);

    }

     private void CheckSequences(object obj)
     {
        ...
     }

    private bool CanExecute(object obj)
    {
        return !ViewModel.UnsavedsExist;
    }
}

Finally the ICommanding class used.

public class OperationCommand : ICommand
{

    #region Variables

    Func<object, bool> canExecute;
    Action<object> executeAction;

    public event EventHandler CanExecuteChanged;

    #endregion

    #region Properties

    #endregion

    #region Construction/Initialization

    public OperationCommand(Action<object> executeAction)
        : this(executeAction, null)
    {
    }

    public OperationCommand(Action<object> executeAction, Func<object, bool> canExecute)
    {
        if (executeAction == null)
        {
            throw new ArgumentNullException("Execute Action was null for ICommanding Operation.");
        }
        this.executeAction = executeAction;
        this.canExecute = canExecute;
    }

    #endregion

    #region Methods

    public bool CanExecute(object parameter)
    {
        bool result = true;
        Func<object, bool> canExecuteHandler = this.canExecute;
        if (canExecuteHandler != null)
        {
            result = canExecuteHandler(parameter);
        }

        return result;
    }

    public void RaiseCanExecuteChanged()
    {
        EventHandler handler = this.CanExecuteChanged;
        if (handler != null)
        {
            handler(this, new EventArgs());
        }
    }

    public void Execute(object parameter)
    {
        this.executeAction(parameter);
    }

    #endregion
Share

Xaml: Call Binding Converter Without Defining StaticResource in Xaml Thanks to Markup Derived Base Class in C#

When developing Xaml everyone has to create a converter in code at some point for binding data conversion. Of course to expose that converter to the Xaml bindings one has to specify it in as a static instantiated resource to be available to the binding call(s). In this post I demonstrate how to remove the middle man of that Xaml static instantiation to reside in a common base class which can be derived by the existing converters with minimal change. Once that is in place the converter will also be a MarkupExtension which can be called directly within the { } brackets such as shown on the highlighted line:

<Window xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:converters="clr-namespace:Omega.Operation.Converters"
        ...
        >

<DataGrid Grid.Row="1"
          Visibility="{Binding IsEditing, 
                       Converter={ converters:BooleanToVisibilityReverseConverter } 
                      }">
Convert the Converter

The change to any converter is quite minimal and once the base class (shown later) is in place it is simply a one line change. Here is the code for the converter used above.

The highlighted line shows the change;  simply adding the base class to its definition with a generic template of itself:

namespace Omega.Operation.Converters
{
/// <summary>Does the reverse where if a value is true the control is collapsed and if false the control is visibile</summary>
public class BooleanToVisibilityReverseConverter : CoverterBase<BooleanToVisibilityReverseConverter>, 
                                                   System.Windows.Data.IValueConverter
{
    public object Convert(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)
    {
        return (value is bool && (bool)value) ? Visibility.Collapsed : Visibility.Visible;
    }

    public object ConvertBack(object value, Type targetType, object parameter, System.Globalization.CultureInfo culture)
    {
        return value is Visibility && (Visibility)value == Visibility.Collapsed;
    }
}
}
Base Class Magic

The following generic base class will instantiate the derived class and return a single static instance of the derived class for usage in Xaml:

/// <summary>
/// This creates a Xaml markup which can allow converters (which inheirit form this class) to be called directly
/// without specify a static resource in the xaml markup.
/// </summary>
public  class CoverterBase<T> : MarkupExtension where T : class, new()
 {
    private static T _converter = null;

    public CoverterBase() { }

    /// <summary>Create and return the static implementation of the derived converter for usage in Xaml.</summary>
    /// <returns>The static derived converter</returns>
    public override object ProvideValue(IServiceProvider serviceProvider)
    {
        return _converter ?? (_converter = (T) Activator.CreateInstance(typeof (T), null));
    }
}

With the base class providing the required implementation of for the MarkupExtension, the derived class can be simply called in the Xaml and avoiding having to use the static resource implementation.

Things to Consider
  • Works in WPF, Silverlight, Windows Phone 8 (WP8), and Windows 8 Store apps.
  • Visual Studio’s Xaml designer may give blue squiggly warning “No constructor type for ‘xxx’ has 0 parameters.”. Ignore that warning…for there is a default constructor which doesn’t have to be defined in C#.
Share

C# Regex for Parsing Known texts

A user wanted to parse basic text. Here is a regex pattern which  breaks out the user text.

 

string @pattern = @"
(?:OS\s)                     # Match but don't capture (MDC) OS, used an an anchor
(?<Version>\d\.\d+)          # Version of OS
(?:;)                        # MDC ;
(?<Phone>[^;]+)              # Get phone name up to ;
(?:;)                        # MDC ;
(?<Type>[^;]+)               # Get phone type up to ;
(?:;)                        # MDC ;
(?<Major>\d\.\d+)            # Major version
(?:;)
(?<Minor>\d+)                # Minor Version
";

string data = 
@"Windows Phone Search (Windows Phone OS 7.10;Acer;Allegro;7.10;8860)
Windows Phone Search (Windows Phone OS 7.10;HTC;7 Mozart T8698;7.10;7713)
Windows Phone Search (Windows Phone OS 7.10;HTC;Radar C110e;7.10;7720)";

 // Ignore pattern white space allows us to comment the pattern, it is not a regex processing command
var phones = Regex.Matches(data, pattern, RegexOptions.IgnorePatternWhitespace)
                  .OfType<Match>()
                  .Select (mt => new 
                  {
                    Name = mt.Groups["Phone"].Value.ToString(),
                    Type = mt.Groups["Type"].Value.ToString(),
                    Version = string.Format( "{0}.{1}", mt.Groups["Major"].Value.ToString(),
                                                        mt.Groups["Minor"].Value.ToString())
                  }
                  );
                  
Console.WriteLine ("Phones Supported are:");            

phones.Select(ph => string.Format("{0} of type {1} version ({2})", ph.Name, ph.Type, ph.Version))
      .ToList()
      .ForEach(Console.WriteLine);
      
/* Output
Phones Supported are:
Acer of type Allegro version (7.10.8860)
HTC of type 7 Mozart T8698 version (7.10.7713)
HTC of type Radar C110e version (7.10.7720)
*/
Share

Xaml: Adding Visibility Behaviors Using Blend to A DataGrid for WPF or Silverlight

iStock_000015143879XSmallIn Xaml the determining when to trigger the visibility, or the hiding of  controls or their functionality is a key concept of doing either WPF or Silverlight programming. This article builds upon my article C#: WPF and Silverlight DataGrid Linq Binding Example Using Predefined Column Header Names in Xaml where we are going to add behaviors to the datagrid shown.  (Don’t worry about reading the article, for I will get you up to speed with the code snippets in this article.) We will use Microsoft’s Expression Blend product to do the dirty work of xaml modification to our DataGrid and it will be shown in a step by step process.

Concept

In the previous article the idea was to load our datagrid with two columns of data. The first column showed us a filename and the second column displayed a modified filename with a count in it. We will take that one step further and have a description show up with the file size. Here is the resulting look:

Inital with Description

The user gets the description when the row is clicked.

But what if we wanted to disable that functionality and automatically show the user all the items when the mouse hovers over the datagrid such as

Result

Setup

First thing we need to do is setup our datagrid. Below is the xaml and the code behind to load our datagrid. Note the datagrid has the RowDetailsVisibiltyMode set to collapsed. That means that when a user clicks on the row, it will only select it and not open up our description. The loading of the ItemsSource happens during the construction and after the initial initialization and is shown in C# in the second pane.

<DataGrid x:Name="dgPrimary"
            RowDetailsVisibilityMode="Collapsed">
    <DataGrid.RowDetailsTemplate>
        <DataTemplate>

            <TextBlock FontWeight="Bold"
                        Text="{Binding Size, StringFormat=Size \{0\} (bytes)}" />

        </DataTemplate>
    </DataGrid.RowDetailsTemplate>
    <DataGrid.Columns>
        <DataGridTextColumn Binding="{Binding Original}"
                            Header="File Name Before"
                            IsReadOnly="True" />
        <DataGridTextColumn Binding="{Binding New}"
                            Header="File Name After"
                            IsReadOnly="True" />
    </DataGrid.Columns>
</DataGrid>
dgPrimary.ItemsSource = 
    new DirectoryInfo( "c:\\" ).GetFiles()
                               .Select( ( fInfo, index ) => new
                    {
                        Original = fInfo.Name,
                        New = string.Format( "{0}_{1}{2}", 
                                    System.IO.Path.GetFileNameWithoutExtension( fInfo.Name ), 
                                    index, 
                                    System.IO.Path.GetExtension( fInfo.Name ) ),
                        Size = fInfo.Length
                    } );

Behaviors and Blend

One of the easiest ways to add a behavior [of the action] to a control is to use Blend to add an interaction behavior.  In our case we want a mouse over action to open up all of the Row Details and a secondary action to close them when the mouse leaves. The behavior we need to search for in blend is the ChangePropertyAction. Below we drag (or add) two behaviors to the datagrid and change the RowDetailsVisibilityMode to visible on mouse enter and to collapsed on mouse leave.

cpaEnter cpaLeave

Then when we build and run the app, the mouse hover over makes the descriptions visible and collapsed depending on the mouse. Here is the final xaml:

<DataGrid x:Name="dgPrimary"
            RowDetailsVisibilityMode="Collapsed">
    <DataGrid.RowDetailsTemplate>
        <DataTemplate>

            <TextBlock FontWeight="Bold"
                        Text="{Binding Size, StringFormat=Size \{0\} (bytes)}" />

        </DataTemplate>
    </DataGrid.RowDetailsTemplate>
    <DataGrid.Columns>
        <DataGridTextColumn Binding="{Binding Original}"
                            Header="File Name Before"
                            IsReadOnly="True" />
        <DataGridTextColumn Binding="{Binding New}"
                            Header="File Name After"
                            IsReadOnly="True" />
    </DataGrid.Columns>
    <i:Interaction.Triggers>
        <i:EventTrigger EventName="MouseEnter" SourceObject="{Binding ElementName=dgPrimary}">
            <ei:ChangePropertyAction x:Name="cpaEnter" PropertyName="RowDetailsVisibilityMode">
                <ei:ChangePropertyAction.Value>
                    <DataGridRowDetailsVisibilityMode>Visible</DataGridRowDetailsVisibilityMode>
                </ei:ChangePropertyAction.Value>
            </ei:ChangePropertyAction>
        </i:EventTrigger>
        <i:EventTrigger EventName="MouseLeave">
            <ei:ChangePropertyAction x:Name="cpaLeave"
                PropertyName="RowDetailsVisibilityMode" />
        </i:EventTrigger>
    </i:Interaction.Triggers>
Share

.Net Regex: Can Regular Expression Parsing be Faster than XmlDocument or Linq to Xml?

iStock_000017256683XSmallMost of the time one needs the power of the xml parser whether it is the XmlDocument or Linq to Xml to manipulate and extract data. But what if I told you that in some circumstances regular expressions might be faster?

Most conventional development thinking has branded regex processing as slow and the thought of using regex on xml might seem counter intuitive. In a continuation of articles I again want to dispel those thoughts and provide a real world example where Regular Expression parsing is not only on par with other tools in the .Net world but sometimes faster. The results of my speed test may surprise you;  and hopefully show that regular expressions are not as slow as believed, if not faster!

See: Are C# .Net Regular Expressions Fast Enough for You?

Real World Scenario

There was a developer on the MSDN forums who needed the ability to count URLs in multiple xml files. (See the actual post count the urls in xml file on Msdn) The poster received three distinct replies, one to use XMLDocument, another provided a Linq to XML solution and I chimed in with the regular expression method. The poster took the XMLDocument method and marked as the answer, but could he have done better?

I thought so…

So I took the three replies and distilled them down into their core processing and wrapped them in a similar IO extraction layer and proceeded to time them. I created 48 xml files with over one hundred thousand urls to find for a total of 13 meg on disk. I then proceeded to run the test all in release mode to get the results.  (See below section Setup to get a gist repository of the code).

Real World Result

Five tests, each test name is the technology and the user as found on the original msdn post. In red is the slowest and fastest time. Remember XmlDoc is the one the user choose as the answer.

Test 1
Regex           found 116736 urls in 00:00:00.1843576
XmlLinq_Link_FR found 116736 urls in 00:00:00.2662190
XmlDoc_Hasim()  found 116736 urls in 00:00:00.3534628

Test 2
Regex           found 116736 urls in 00:00:00.2317883
XmlLinq_Link_FR found 116736 urls in 00:00:00.2792730
XmlDoc_Hasim()  found 116736 urls in 00:00:00.2694969

Test 3
Regex           found 116736 urls in 00:00:00.1646719
XmlLinq_Link_FR found 116736 urls in 00:00:00.2333891
XmlDoc_Hasim()  found 116736 urls in 00:00:00.2625176

Test 4
Regex           found 116736 urls in 00:00:00.1677931
XmlLinq_Link_FR found 116736 urls in 00:00:00.2258825
XmlDoc_Hasim()  found 116736 urls in 00:00:00.2590841

Test 5
Regex           found 116736 urls in 00:00:00.1668231
XmlLinq_Link_FR found 116736 urls in 00:00:00.2278445
XmlDoc_Hasim()  found 116736 urls in 00:00:00.2649262

 

Wow! Regex consistently performed better, even when there was no caching of the files as found for the first run! Note that the time is Hours : Minutes : Seconds and regex’s is the fastest at 164 millseconds to parse 48 files! Regex worst time of 184 milleseconds is still better than the other two’s best times.

How was this all done? Let me show you.

Setup

Ok what magic or trickery have I played? All tests are run in a C# .Net 4 Console application in release mode. I have created a public Gist (Regex vs Xml) repository of the code and data which is actually valid Git repository for anyone how may want to add their tests, but let me detail what I did here on the blog as well.

The top level operation found in the Main looks like this where I run the tests 5 times

Enumerable.Range( 1, 5 )
            .ToList()
            .ForEach( tstNumber =>
            {
                Console.WriteLine( "Test " + tstNumber );
                Time( "Regex", RegexFindXml );
                Time( "XmlLinq_Link_FR", XmlLinq_Link_FR );
                Time( "XmlDoc_Hasim()", XmlDoc_Hasim );
                Console.WriteLine( Environment.NewLine );
            }

while the Time generic method looks like this and dutifully runs the target work and reports the results in “Test X found Y Urls in X [time]”:

public static void Time<T>( string what, Func<T> work )
{
    var sw = Stopwatch.StartNew();
    var result = work();
    sw.Stop();
    Console.WriteLine( "\t{0,-15} found {1} urls in {2}", what, result, sw.Elapsed );
}

Now in the msdn post the different methods had differing ways of finding each xml file and opening it, I made them all adhere to the way I open and sum the ULR counts. Here is its snippet:

return Directory.EnumerateFiles( @"D:\temp", "*.xml" )
            .ToList()
            .Sum( fl =>
            {

            } );

Contender  –  XML Document

This is one which the poster marked as the chosen one he used and I dutifully copied it to the best of my ability.

public static int XmlDoc_Hasim()
{
    return Directory.EnumerateFiles( @"D:\temp", "*.xml" )
                .ToList()
                .Sum( fl =>
                {
                    XmlDocument doc = new XmlDocument();
                    doc.LoadXml( System.IO.File.ReadAllText( fl ) );

                    if (doc.ChildNodes.Count > 0)
                        if (doc.ChildNodes[1].HasChildNodes)
                            return doc.ChildNodes[1].ChildNodes.Count;

                    return 0;

                } );

}

I used the sum extension method which is a little different from the original sum operation used, but it brings the tests closer in line by using the Extension.

Contender – Linq to Xml

Of the other two attempts, this one I felt was the more robust of the two, because it actually handled the xml namespace. Sadly it appeared to be ignored by the original poster. Here is his code

public static int XmlLinq_Link_FR()
{
    XNamespace xn = "http://www.sitemaps.org/schemas/sitemap/0.9";

    return Directory.EnumerateFiles( @"D:\temp", "*.xml" )
                    .Sum( fl => XElement.Load( fl ).Descendants( xn + "loc" ).Count() );

}

Contender – Regular Expression

Finally here is the speed test winner. I came up with the pattern design Upon by looking at the xml and it appeared one didn’t need to match the actual url, but just the two preceding  tags and any possible space between. That is the key to regex, using good patterns can achieve fast results.

public static int RegexFindXml()
{
    string pattern = @"(<url>\s*<loc>)";

    return Directory.EnumerateFiles( @"D:\temp", "*.xml" )
                    .Sum( fl => Regex.Matches( File.ReadAllText( fl ), pattern ).OfType<Match>().Count() );

}

XML1 (Shortened)

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://www.linkedin.com/directory/companies/internet-web2.0-startups-social-networking/barcelona.html</loc><changefreq>weekly</changefreq></url>
<url><loc>http://www.linkedin.com/directory/companies/internet-web2.0-startups-social-networking/basel.html</loc><changefreq>weekly</changefreq></url>
<url><loc>http://www.linkedin.com/directory/companies/internet-web2.0-startups-social-networking/bath.html</loc><changefreq>weekly</changefreq></url>
<url><loc>http://www.linkedin.com/directory/companies/computer-networking/sheffield.html</loc><changefreq>weekly</changefreq></url>
<url><loc>http://www.linkedin.com/directory/companies/computer-networking/singapore.html</loc><changefreq>weekly</changefreq></url>
<url><loc>http://www.linkedin.com/directory/companies/computer-networking/slough.html</loc><changefreq>weekly</changefreq></url>
<url><loc>http://www.linkedin.com/directory/companies/computer-networking/slovak-republic.html</loc><changefreq>weekly</changefreq></url>
</urlset>

Xml2 Shortened

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url><loc>http://www.linkedin.com/groups/gid-2431604</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2430868</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/Wireless-Carrier-Reps-Past-Present-2430807</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2430694</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2430575</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2431452</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2432377</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2428508</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2432379</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2432380</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2432381</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2432383</loc><changefreq>monthly</changefreq></url>
<url><loc>http://www.linkedin.com/groups/gid-2432384</loc><changefreq>monthly</changefreq></url>
</urlset>

Summary

It really comes down to the right tool for the right situation and this one regex really did well. But Regex is not good at most xml parsing needs, but for certain scenarios it really shines. If the xml has malformed or the namespace was wrong, then the parser has its own unique problems which would lead to a bad count. All the technologies had to do some upfront loading and that is key to how they performed. Regex is optimized to handle large data efficiently and as long as the pattern is spot on, it can really be quick.

My thought is don’t dismiss regular expression parsing out of hand, while the learning of it can pay off in some unique text parsing situations.

Share