Pranay Rana: C# State machine - Yield

Sunday, June 17, 2012

C# State machine - Yield

The yield keyword was introduced in C# 2.0. It lets the compiler build a state machine for you, so that a method can hand back the objects of a collection one by one.

yield is a contextual keyword used in iterator methods in C#. It is used in an iterator block like this:
public IEnumerable methodname(params)
{
      foreach(type element in listofElement)
      {
         ...code for processing 
         yield return result;
      }
}
Note: here IEnumerable can be replaced by IEnumerable<T>.
What does the yield keyword do? When an iterator block reaches a yield return statement, it pauses execution and returns the current element to the caller. When the caller asks for the next element, execution resumes right after that statement and runs until the next yield return, which supplies the new current element. This continues until the last element of the collection has been returned.
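To make the pause-and-resume behaviour visible, here is a minimal sketch. The names YieldDemo, Numbers and log are hypothetical and used only for illustration; the log simply records the order in which the iterator and the caller run.

```csharp
using System;
using System.Collections.Generic;

public static class YieldDemo
{
    // Hypothetical iterator: records a message before each yield return
    // so the interleaving with the caller can be observed.
    public static IEnumerable<int> Numbers(List<string> log)
    {
        log.Add("before 1");
        yield return 1;      // execution pauses here until the next element is requested
        log.Add("before 2");
        yield return 2;
        log.Add("end");      // runs when the caller asks for a third element
    }

    public static void Main()
    {
        var log = new List<string>();
        foreach (int n in Numbers(log))
        {
            log.Add("got " + n);
        }
        // prints: before 1 | got 1 | before 2 | got 2 | end
        Console.WriteLine(string.Join(" | ", log));
    }
}
```

The iterator and the loop body take turns: nothing inside Numbers runs until the foreach asks for the first element.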

Now I am going to show how you can save some memory and work by making use of the yield keyword.
In this example I check each DataRow of a DataTable to see whether it is empty or not.

Code With Yield Keyword
static void Main(string[] args)
{
     int[] arr = new int[] { 1, 2, 3 };

     DataTable table = new DataTable();
     table.Columns.Add("ItemName", typeof(string));
     table.Columns.Add("Quantity", typeof(int));
     table.Columns.Add("Price", typeof(float));
     table.Columns.Add("Process", typeof(string));
     //
     // Here we add six DataRows.
     //
     table.Rows.Add("Indocin", 2, 23);
     table.Rows.Add("Enebrel", 1, 10);
     table.Rows.Add(null, null, null);
     table.Rows.Add("Hydralazine", 1, null);
     table.Rows.Add("Combivent", 3, 5);
     table.Rows.Add("Dilantin", 1, 6);

     foreach (DataRow dr in GetRowToProcess(table.Rows))
     {
         if (dr != null)
         {                    
            dr["Process"] = "Processed";
            Console.WriteLine(dr["ItemName"].ToString() 
+ dr["Quantity"].ToString() + " : " + dr["Process"].ToString());
            //bool test = dr.ItemArray.Any(c => c == DBNull.Value);
         }
      }
      Console.ReadLine();
}
private static IEnumerable<DataRow> GetRowToProcess(DataRowCollection dataRowCollection)
{
     foreach (DataRow dr in dataRowCollection)
     {
          bool isempty = dr.ItemArray.All(x => x == null || 
(x!= null && string.IsNullOrWhiteSpace(x.ToString())));

          if (!isempty)
          {
             yield return dr;
             //dr["Process"] = "Processed";
          }
          else
          {
             yield return null;
             //dr["Process"] = " Not having data ";
          }
          //yield return dr;
     }
}
Code Without Yield Keyword
private static IList<DataRow> GetRowToProcess(DataRowCollection dataRowCollection)
{
    // All matching rows are collected first and returned together.
    List<DataRow> rowsToProcess = new List<DataRow>();
    foreach (DataRow dr in dataRowCollection)
    {
        bool isempty = dr.ItemArray.All(x => x == null || 
                           (x!= null && string.IsNullOrWhiteSpace(x.ToString())));

        if (!isempty)
        {
          rowsToProcess.Add(dr);
        }
    }
    return rowsToProcess;
}

static void Main(string[] args)
{
   //code as above function to create datatable 
   IList<DataRow> drs = GetRowToProcess(table.Rows);
   foreach (DataRow dr in drs)
   {
     //code to process the rows 
   } 
}

Now the difference between the two versions
Code without the yield keyword
This version creates an extra list that points to the rows matching the condition, and only then loops over that list to process each row.
The disadvantage is that the extra list occupies additional memory, and no row can be processed until every row has been checked.
Code with the yield keyword
This version creates no extra list. With the help of yield, rows are checked and handed to the caller one at a time. Note that it yields null for an empty row instead of skipping it, which is why the caller checks dr != null.
The advantage is that no extra list is created, and processing can start as soon as the first row has been checked.
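The difference can be sketched with plain integers instead of DataRows. Everything here (LazyVsEager, EvensLazy, EvensEager and the Inspected counter) is hypothetical and only for illustration; it counts how many source elements each approach looks at when the caller needs just the first match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVsEager
{
    // Counts how many source elements have been inspected so far.
    public static int Inspected;

    // Iterator version: elements are inspected only as the caller asks for them.
    public static IEnumerable<int> EvensLazy(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            Inspected++;
            if (n % 2 == 0) yield return n;
        }
    }

    // List version: every element is inspected before anything is returned.
    public static List<int> EvensEager(IEnumerable<int> source)
    {
        var result = new List<int>();
        foreach (int n in source)
        {
            Inspected++;
            if (n % 2 == 0) result.Add(n);
        }
        return result;
    }

    public static void Main()
    {
        int[] data = { 1, 2, 3, 4, 5, 6 };

        Inspected = 0;
        int firstLazy = EvensLazy(data).First();
        // prints: lazy: first=2, inspected=2
        Console.WriteLine("lazy: first=" + firstLazy + ", inspected=" + Inspected);

        Inspected = 0;
        int firstEager = EvensEager(data).First();
        // prints: eager: first=2, inspected=6
        Console.WriteLine("eager: first=" + firstEager + ", inspected=" + Inspected);
    }
}
```

When the caller consumes the whole sequence, both versions inspect every element; the iterator version mainly saves the intermediate list and lets the caller stop early.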

The following example uses LINQ together with the yield keyword:
void Main()
{
   // This uses a custom 'Pair' extension method, defined below.
   List<string> list1 = new List<string>()
 {
     "Pranay",
     "Rana",
     "Hemang",
     "Vyas"
 };
   IEnumerable<string>  query = list1.Select (c => c.ToUpper())
  .Pair()         // Custom iterator, evaluated lazily like the other LINQ operators.
  .OrderBy (n => n.Length);
}

public static class MyExtensions
{
 public static IEnumerable<string> Pair (this IEnumerable<string> source)
 {
  string firstHalf = null;
  foreach (string element in source)
  if (firstHalf == null)
   firstHalf = element;
  else
  {
   yield return firstHalf + ", " + element;
   firstHalf = null;
  }
 }
}
There is another statement besides yield return:
yield break
It stops returning sequence elements (this also happens automatically when control reaches the end of the iterator method body).
The iterator code uses the yield return statement to return each element in turn; yield break ends the iteration.
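As a small sketch of yield break, the hypothetical method TakeUntilNegative below (the name and the sample data are mine, for illustration only) returns elements until it sees a negative number and then ends the sequence early.

```csharp
using System;
using System.Collections.Generic;

public static class YieldBreakDemo
{
    // Yields elements in order and ends the sequence at the first negative value.
    public static IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            if (n < 0)
            {
                yield break;   // ends the iteration; later elements are never visited
            }
            yield return n;
        }
    }

    public static void Main()
    {
        int[] data = { 3, 7, -1, 9 };
        // prints: 3, 7
        Console.WriteLine(string.Join(", ", TakeUntilNegative(data)));
    }
}
```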

Constraints
The yield statement can only appear inside an iterator block, which might be used as a body of a method, operator, or accessor. The body of such methods, operators, or accessors is controlled by the following restrictions:
  • Unsafe blocks are not allowed.
  • Parameters to the method, operator, or accessor cannot be ref or out.
  • A yield statement cannot appear in an anonymous method.
  • When used with expression, a yield return statement cannot appear in a catch block or in a try block that has one or more catch clauses.
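The last restriction concerns catch clauses only; a yield return inside a try block that has just a finally clause is permitted. The sketch below assumes hypothetical names (TryFinallyDemo, WithCleanup, log) and shows that the finally block still runs when the caller stops iterating early, because foreach disposes the enumerator.

```csharp
using System;
using System.Collections.Generic;

public static class TryFinallyDemo
{
    // yield return is allowed here because the try block has no catch clause.
    public static IEnumerable<int> WithCleanup(List<string> log)
    {
        try
        {
            yield return 1;
            yield return 2;
        }
        finally
        {
            log.Add("cleanup");   // runs on normal completion or when the enumerator is disposed
        }
    }

    public static void Main()
    {
        var log = new List<string>();
        foreach (int n in WithCleanup(log))
        {
            log.Add("got " + n);
            break;                // stop after the first element
        }
        // prints: got 1 | cleanup
        Console.WriteLine(string.Join(" | ", log));
    }
}
```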
