How list.Where works

I wanted to know how many iterations Where will make.

var list = new List<byte>(new byte[] { 0, 0, 1, 0, 1 });
int LinqCnt = 0;
var tmp = list.Where(x =>
{
    LinqCnt++;
    Console.WriteLine(LinqCnt + "");
    return x > 0;
});
Console.WriteLine(LinqCnt + "");

The result was predictable:

0

Apparently the "optimizer" kicked in and the Where block simply never ran. Okay, let's add the following code, whose body is unreachable:

if (tmp.First() == null)
{
    var i = tmp.Count();
    Console.Write("(i =" + i + ")");
}

We get:

0 1 2 3

Strange: the full pass over the list was not performed, yet the call to First made something run. Let's continue and add one more check, one that is sure to succeed:

if (tmp.First() == tmp.Last()) Console.WriteLine(LinqCnt + "");

Result:

0 1 2 3 4 5 6 7 8 9 10 11 11

So all 5 elements were eventually visited, and the calls to First and Last incremented the counter as well.

Dear experts, tell me what is happening?


Answer 1, authority 100%

You don’t quite understand how LINQ works.

This happens because LINQ uses lazy evaluation. Where actually runs not when you define tmp, but when you enumerate it.

The sequence passed to Where may be very large, read from a file, or generated on the fly, and so may even be infinite.

For example, code like this:

IEnumerable<double> RandomSequence()
{
    var r = new Random();
    while (true)
        yield return r.NextDouble();
}

produces an infinite sequence of real numbers.

Therefore, all LINQ operations only remember how the computation should be done, but do not perform it until it is actually needed.
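
As a small illustration of the point above, here is a sketch that filters an infinite sequence and still terminates. The names LazyDemo, Naturals and FirstEvens are hypothetical and chosen only for this example; a deterministic counter is used instead of Random so that the output is predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Hypothetical infinite generator, for illustration only: 0, 1, 2, ...
    public static IEnumerable<int> Naturals()
    {
        int n = 0;
        while (true)
            yield return n++;
    }

    // Where and Take only describe the computation; ToList triggers it.
    // Take stops pulling from the generator after `count` matches,
    // so the call returns even though the source never ends.
    public static List<int> FirstEvens(int count)
    {
        return Naturals().Where(n => n % 2 == 0).Take(count).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FirstEvens(3))); // prints 0, 2, 4
    }
}
```

Without Take, the ToList call would never return, because materializing an infinite sequence cannot finish.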


In your first case, tmp contains only a "recipe" for how to get the sequence of numbers greater than 0. If you want to actually apply this recipe and get a List<byte> (this is called "materializing" the sequence), you can call .ToList(), or walk through it with a foreach loop:

var list = new byte[] { 0, 0, 1, 0, 1 };
int LinqCnt = 0;
var tmp = list.Where(x =>
{
    LinqCnt++;
    Console.WriteLine("In where: " + LinqCnt);
    return x > 0;
});
Console.WriteLine("After where: " + LinqCnt);
foreach (var n in tmp)
    Console.WriteLine("Enumerating: " + n);
Console.WriteLine("After enumeration: " + LinqCnt);

You will see that each next element of the sequence is computed only when the loop asks for it: the lines printed inside the Where predicate are interleaved with the lines printed by the loop:

After where: 0
In where: 1
In where: 2
In where: 3
Enumerating: 1
In where: 4
In where: 5
Enumerating: 1
After enumeration: 5


What happens in the second case? Again, tmp only records the "recipe" for producing the sequence one element at a time. Since you call First, the whole sequence is not needed: only the first element is computed. To get it, the source sequence is enumerated until Where yields the first element of the result, and processing stops there. Since execution never enters the body of the if, nothing else happens.


And finally, the third case. You call .First() and .Last() on the same "recipe", tmp. This means the recipe is applied twice: once the sequence is enumerated to get the first element (only up to the point where that first element is produced), and then again to get the last element. Let's modify the code a little:

var list = new byte[] { 0, 0, 2, 0, 1 };
int LinqCnt = 0;
var tmp = list.Where(x =>
{
    LinqCnt++;
    Console.WriteLine("In where: " + LinqCnt + ", x = " + x);
    return x > 0;
});
Console.WriteLine("After where: " + LinqCnt);
if (tmp.First() == tmp.Last())
    Console.WriteLine(LinqCnt + "");
Console.WriteLine("After enumeration: " + LinqCnt);

We get the result:

After where: 0
In where: 1, x = 0
In where: 2, x = 0
In where: 3, x = 2
In where: 4, x = 0
In where: 5, x = 0
In where: 6, x = 2
In where: 7, x = 0
In where: 8, x = 1
After enumeration: 8

Computing First took three elements of the original sequence: 0, 0 and 2. Computing Last took all five.
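
If the repeated work in the third case is unwanted, one option is to materialize the filtered sequence once and query the stored result. The sketch below assumes the same input data as above; the names MaterializeDemo and CountCallsWithToList are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MaterializeDemo
{
    // Returns how many times the predicate ran when the filter
    // is materialized once with ToList.
    public static int CountCallsWithToList()
    {
        var list = new byte[] { 0, 0, 2, 0, 1 };
        int calls = 0;

        // ToList applies the "recipe" once and stores the matching elements.
        List<byte> filtered = list.Where(x => { calls++; return x > 0; }).ToList();

        // First and Last now read the stored list; the predicate is not run again.
        byte first = filtered.First();
        byte last = filtered.Last();
        Console.WriteLine("first = " + first + ", last = " + last);

        return calls;
    }

    public static void Main()
    {
        Console.WriteLine("Predicate calls: " + CountCallsWithToList());
    }
}
```

Here the predicate runs once per source element, five times in total, no matter how many times the stored list is queried afterwards. The trade-off is that the whole result is held in memory, which is not an option for very large or infinite sequences.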


By the way, the absence of "unnecessary" computation is not the optimizer's doing. It is the documented behavior of lazy LINQ sequences, and it does not depend on whether compiler optimizations are turned on.


Why do the results look so strange with LINQ? LINQ came to us from functional programming, where pure functions (functions without side effects) are preferred. For such functions it does not matter when or how many times they are called: the result is the same. So with pure functions lazy evaluation causes no surprises, because it makes no difference whether a computation (for example, the filtering in Where) is performed lazily or eagerly.

Your filtering function, however, has side effects: it modifies an external variable and prints text to the console. That is why the behavior of Where seems unexpected to you.

It is best to use pure functions in LINQ expressions; then you do not need to track state or keep in mind which LINQ methods are lazy and which are not. If you do use functions with side effects, remember that methods returning a sequence (for example Where, Select, Distinct) are usually lazy, while methods returning a single value (First, Sum, Max) are eager.
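
The lazy/eager distinction can be observed with a call counter, sketched below. The counter is itself a side effect and is used only to make the timing visible; the names LazyVsEagerDemo and Run are hypothetical.

```csharp
using System;
using System.Linq;

public static class LazyVsEagerDemo
{
    // Returns the selector call count right after Select, and again after Sum.
    public static int[] Run()
    {
        var source = new[] { 1, 2, 3 };
        int calls = 0;

        // Select is lazy: building the query does not call the selector.
        var doubled = source.Select(x => { calls++; return x * 2; });
        int afterSelect = calls;

        // Sum is eager: it enumerates the whole query immediately.
        int sum = doubled.Sum();
        int afterSum = calls;

        Console.WriteLine("sum = " + sum);
        return new[] { afterSelect, afterSum };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine("After Select: " + counts[0]); // prints After Select: 0
        Console.WriteLine("After Sum: " + counts[1]);    // prints After Sum: 3
    }
}
```

With a pure selector such as x => x * 2 and no counter, the same query gives the same sum regardless of when or how often it is enumerated, which is the situation the advice above aims for.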
