4 December 2007 in .Net, Code, Tools | Comments enabled

After yesterday’s post I thought I should write up a basic sample to test the effectiveness of the Parallel Extensions. Admittedly it is a contrived example and you are unlikely to see this sort of performance increase in the real world, since your applications are unlikely to be this primitive.

My sample iterates through a number sequence from 0 upwards and works out whether each value is a prime number. There are two implementations, one using a standard loop and the other using a Parallel.For(). To ride out any spikes I run each test 25 times and then average the outcome. The tests were not run in a clean environment, but they do give a roughly indicative result of using the Parallel Extensions.
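A minimal sketch of the kind of test described above, not the downloadable sample itself. The class and method names (PrimeBenchmark, CountSequential, CountParallel, AverageMilliseconds) are made up for illustration, and the using directives assume Parallel lives in System.Threading.Tasks, which may differ between drops of the library.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class; none of these names come from the original sample.
public static class PrimeBenchmark
{
    // Naive trial division, deliberately simple so each iteration is pure CPU work.
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
        {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Standard for() loop: counts the primes in [0, limit).
    public static int CountSequential(int limit)
    {
        int count = 0;
        for (int i = 0; i < limit; i++)
        {
            if (IsPrime(i)) count++;
        }
        return count;
    }

    // Parallel.For() version: iterations may run on several threads,
    // so the shared counter is updated atomically.
    public static int CountParallel(int limit)
    {
        int count = 0;
        Parallel.For(0, limit, i =>
        {
            if (IsPrime(i)) Interlocked.Increment(ref count);
        });
        return count;
    }

    // Runs the given counter 'runs' times and returns the average time per run.
    public static double AverageMilliseconds(Func<int, int> counter, int limit, int runs)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < runs; r++)
        {
            counter(limit);
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / runs;
    }
}
```

Something like PrimeBenchmark.AverageMilliseconds(PrimeBenchmark.CountParallel, 100000, 25) would then give an average per run for the parallel version; swapping in CountSequential gives the baseline. Timings will of course vary by machine.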

Using a dual core system, checking the numbers up to 100,000 and running 25 iterations of each test, I had the following outcome:

Using a normal for() loop: 3104 milliseconds average per run
Using a Parallel.For() loop: 1607 milliseconds average per run

This speed-up (roughly 1.9x on two cores) is acceptable and, as you can imagine, these sorts of results should only become more impressive as we move to 8, 16 and 32 core systems.

A few things to consider in a real-world application (consider this my “don’t blame me if you think this will solve all your problems” line! :) ):

  • Often slowness is caused by some slow resource – a web connection, a database call etc. The parallel extensions library will default to spinning up as many threads as there are cores and therefore if you have a slow dependent resource you may wish to investigate bumping up the thread count or writing your own threading code.
  • The architecture of a solution is more likely to impact the overall performance of the application. Improving the speed of a few loops and LINQ queries will not improve performance by any order of magnitude.
  • Amdahl’s Law applies – effectively this law states that the maximum parallel improvement that is possible for an application is limited by the amount of sequential code remaining. For example, if I can only make 10% of the code run in parallel then even with infinite parallel processes running I’m still running slow sequential code 90% of the time – this feeds back to the previous point.
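To put numbers on the Amdahl’s Law point, here is a small sketch of the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of run time that can be parallelised and n is the number of workers. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical helper for illustrating Amdahl's Law; not part of any library.
public static class Amdahl
{
    // parallelFraction: share of the run time that can run in parallel (0..1).
    // workers: number of cores/threads applied to that share.
    public static double Speedup(double parallelFraction, int workers)
    {
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / workers);
    }

    // Upper bound on the speed-up as the number of workers grows without limit.
    public static double MaxSpeedup(double parallelFraction)
    {
        return 1.0 / (1.0 - parallelFraction);
    }
}
```

With only 10% of the code parallelisable, Amdahl.MaxSpeedup(0.10) comes out at about 1.11, so no number of cores gets you much past an 11% improvement; on two cores Amdahl.Speedup(0.10, 2) is only about 1.05.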

Download my sample application here (with source)

Note: You will need the .Net 3.5 framework and the Parallel Extensions Library installed.

– JD



5 comments.

Bart Czernicki says 4 December 2007 @ 18:21

I think u need to do some investigation, before writing an article on the topic. First…there is a degree of parallelism option which determines how many cores you want to use. A DOP setting of 1…u guessed it uses one core on a proc.

I agree with u on the code percentage and running items in parallel has minimal effect if you are running a query that takes 10% of the resources to execute. This is why making it configurable would be a good idea along with the DOP option.

IF you are working with large amounts of data (BI) and combine this with hash indexes and parallel…u are talking about some SICK performance gains. The numbers are factors of 100s!

traskjd says 4 December 2007 @ 18:49

Hi Bart,

Thanks for your comment. Perhaps I have missed something but I did show that I’m not dealing with the PLINQ which is the only method that takes a DOP.

Further to this, the global variable, PLINQ_DOP, can be configured however:

1. I haven’t tested to see if this affects task based work since all the documentation suggests (indeed, the name itself) it only affects PLINQ.

2. Clearly you shouldn’t be altering the PLINQ_DOP variable even if it does affect the tasks because you don’t want to be altering a global variable, as tracking it and its effects on all parallel work would be messy. Not to mention that altering a single global variable constantly (as you may wish to alter it depending on the load attributed to a thread) is bad design – especially when you could be altering it from parallel threads, making it a nightmare to debug :)

I would love to see that future drops of the Parallel Extensions Library include the ability to set the DOP value on normal tasks methods (like For(), ForEach() etc).

BI work would of course be an excellent candidate however my interest lies in problems that are not inherently parallel. We’re all geeks and I like a challenge :-)

Perhaps before suggesting I do investigation we agree that we’re all still exploring this framework and perhaps we should both do more investigation? ;-) It did force me to double check that DOP couldn’t be set on task work so your comment was valuable in helping me learn more. If you still believe I’ve missed something then please show me where I can set the DOP for tasks (and ideally not globally! :-)

Cheers and thanks again for your comment.

traskjd says 4 December 2007 @ 19:09

Hi again Bart,

Just having more of an explore – you can configure the “IdealThreads” on the ThreadManagerPolicy in order to adjust the thread count, which defaults to the processor count. So you could alter it with that.

My sample however is ideally suited to defaulting to the processor count as each thread will use 100% proc. I might put together another sample that will only spike to 50% or something in order to play more with the ThreadManagerPolicy.

Thanks for nudging me into exploring the framework more.

– JD

Bart Czernicki says 5 December 2007 @ 04:15

Discussing semantics is not what I do. I read this line:

“The parallel extensions library will default to spinning up as many threads as there are cores and therefore if you have a slow dependent resource you may wish to investigate bumping up the thread count or writing your own threading code”

and it’s simply not true. This is what made me respond..not anything else really.

1. Furthermore, your comment that it’s ONLY for PLINQ is NOT true again. AsParallel is an Extension method that converts IEnumerable to an IParallelEnumerable (since it implements all of the IEnumerable members). So we have that down. We can now write code like this (I made u 3 examples…note all three have nothing to do with LINQ):

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<int> numbers = new List<int> { 1, 2, 3, 4, 5, 6, 8, 9, 10 };

        // simple test
        double sum = numbers.AsParallel(1).Sum();

        // sequential: notice one proc and all numbers come back in sequence
        numbers.AsParallel(1).ForAll(i => writeLine(i));
        // parallel: notice numbers don’t come back in sequence (2 procs used)
        // could use Preserve order if you want
        Console.WriteLine("Parallel");
        numbers.AsParallel(2).ForAll(i => writeLine(i));
        Console.ReadLine();
    }

    // prints only the even values
    private static void writeLine(int value)
    {
        if (value % 2 == 0)
        {
            Console.WriteLine("even number =" + value.ToString());
        }
    }
}

2. DOP is a parameter inside the method. (see example above). You can call it with 2 or 1 to get one core doing all the work. I am not sure where you get the “global” option part. I am not sure what that method does and if it does set something weird, I would assume that there would be some kind of Strategy Design Pattern in there and if 1 do this and if 4 spin up 4 threads etc. It wouldn’t make sense to make it a global variable…but I’ve seen stranger things.

Ur not the only one I troll(ed). I always check blog sites for incorrect information…Rick Strahl (writes for Code Magazine and owns his own shop) had an article entitled something nightmares with LINQ and he wrote a whole paragraph on how LINQ was not dynamic and is static. All wrong information on TOP of that..he also entitled it so pessimistically it puts bad vibes/info out there for a new technology.

Why is this a big deal to me? Well “most” developers are lazy..so they will go on google.com and do a search and find the first thing that appears and whether it be a magazine article/blog/sample project whatever that “first hit” is usually turns itself into gold and thats the way it is.

traskjd says 5 December 2007 @ 08:23

Hi Bart,

First off, I don’t consider your posts trolling – it’s always good to get some comments if you’re heading off the beaten track. I have fundamental experience in concurrency however not a lot with this particular framework (which is why my posts usually have some comment about just playing with it). Comments that help me to learn AND are beneficial to those reading the post are always more than welcome.

Regarding .AsParallel(), if you checked out my sample you would see I don’t have a list, I just run a loop as the test. I could have pre-populated a list of the numbers I was going to calculate however that seemed somewhat redundant when I could just use a for() loop to compute the values. I could have done many things in a more complicated manner but I was simply trying to highlight the benefits of using this library and, like you, draw developers’ attention to the multi-core issue and the impact that it will have on them.

The .AsParallel() method is still considered to be part of PLINQ, not the task based work as it lives inside the System.Linq.ParallelQuery class however I’m happy that, given you require .Net 3.5, there is no real point in arguing that being able to set DOP is specific to “LINQ”. You could always find a way of calling off to .AsParallel() but in some situations this would require that you’re changing your code to suit the framework which is not generally a direction I like to see things going. While there is also the ThreadManagerPolicy I’m still holding out for an additional overload to For(), ForEach(), Do() etc that takes a DOP value :)

I’m new with this framework and only playing with it in my own time – any feedback is appreciated Bart.

Thanks,

– JD
