Concurrent Programming in .NET Core

Time: 2019-7-19

Concurrent programming – asynchronous vs. multithreaded code

Concurrent programming is a broad term, and it is best explored by first looking at the difference between asynchronous methods and actual multithreading. Although .NET Core uses tasks to express both concepts, there is a key difference in how they are processed internally. An asynchronous method runs in the background while the calling thread does something else. This makes asynchronous methods best suited for I/O-bound work, i.e. operations that spend most of their time waiting on input and output, such as file or network access. Whenever possible, it makes sense to use asynchronous I/O methods instead of their synchronous counterparts: while the operation runs, the calling thread can handle user interaction in a desktop application or process other requests in a server application, instead of just waiting for the operation to complete.
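For example, reading a file asynchronously releases the calling thread for the duration of the I/O operation. A minimal sketch (the file name is just a placeholder):

// synchronous: the calling thread blocks until the file has been read
string contents = File.ReadAllText("data.txt");

// asynchronous: the calling thread is free while the I/O is in progress
string contentsAsync = await File.ReadAllTextAsync("data.txt");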

CPU-bound methods, in contrast, require CPU cycles to do their work and can only run in the background on a dedicated thread. The number of CPU cores limits the number of threads that can run in parallel; the operating system is responsible for switching between the remaining threads, giving each of them a chance to execute. Such threads still run concurrently, but not necessarily in parallel: although this means they do not always execute at the same time, they can make progress while other threads are suspended.

Parallel vs. concurrent execution

Apart from the last section, this article focuses on multithreaded concurrent programming in .NET Core.

Task Parallel Library

.NET Framework 4 introduced the Task Parallel Library (TPL) as the preferred API for writing concurrent code, and .NET Core uses the same programming model. To run a piece of code in the background, you need to wrap it in a task:


var backgroundTask = Task.Run(() => DoComplexCalculation(42));
// do other work
var result = backgroundTask.Result;

The Task.Run method accepts a Func when the code needs to return a result, and an Action when it does not. In either case a lambda expression can be used, as in the example above, where it invokes a long-running method with one parameter. A thread from the thread pool processes the task: .NET Core ships with a default task scheduler that queues tasks to the thread pool, though you can implement your own scheduling algorithm by deriving from the TaskScheduler class, which is beyond the scope of this article. As shown above, I use the Result property to join the background task back to the calling thread. For tasks that do not return a result, I can call Wait() instead. Both approaches block the calling thread until the background task completes.
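For example, a task that does not return a result can be created from an Action and joined with Wait(). A minimal sketch, where LogComplexCalculation is a hypothetical method returning void:

// LogComplexCalculation is a hypothetical void method used for illustration
var voidTask = Task.Run(() => LogComplexCalculation(42)); // Action overload
// do other work
voidTask.Wait(); // blocks until the task completes

To avoid blocking the calling thread, e.g. in ASP.NET Core applications, you can use the await keyword instead: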


var backgroundTask = Task.Run(() => DoComplexCalculation(42));
// do other work
var result = await backgroundTask;

This releases the calling thread so it can process other incoming requests. Once the task completes, an available worker thread continues processing the request. Of course, the controller action method must be declared as asynchronous:


public async Task<IActionResult> Index() { /* method body */ }
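Put together, an action method that offloads CPU-bound work might look like the following sketch, with DoComplexCalculation again standing in for any long-running method:

public async Task<IActionResult> Index()
{
    // offload the CPU-bound work; the calling thread is released while it runs
    var result = await Task.Run(() => DoComplexCalculation(42));
    return View(result);
}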

Handling exceptions

When a background task is joined back to the calling thread, any exception thrown by the task is passed on to the calling thread:

If you use Result or Wait(), the exception will be wrapped in an AggregateException; the actual exception that was thrown is stored in its InnerException property.

If you use await, the original exception will not be wrapped.

In both cases, the original call stack information is preserved.
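A short sketch of the difference, assuming DoFailingWork is a hypothetical method that throws an InvalidOperationException:

var failingTask = Task.Run(() => DoFailingWork());

try
{
    failingTask.Wait(); // or: var result = failingTask.Result;
}
catch (AggregateException e)
{
    // the original exception is wrapped
    var original = e.InnerException; // the InvalidOperationException
}

try
{
    await failingTask;
}
catch (InvalidOperationException)
{
    // await rethrows the original exception without wrapping it
}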

Cancellation of tasks

Since tasks can run for a long time, you may want the option to cancel them early. To enable this, pass a cancellation token when creating the task, and later use the token to trigger the cancellation:


var tokenSource = new CancellationTokenSource();
var cancellableTask = Task.Run(() =>
{
    for (int i = 0; i < 100; i++)
    {
        if (tokenSource.Token.IsCancellationRequested)
        {
            // clean up before exiting
            tokenSource.Token.ThrowIfCancellationRequested();
        }
        // do long-running processing
    }
    return 42;
}, tokenSource.Token);

// cancel the task
tokenSource.Cancel();

try
{
    await cancellableTask;
}
catch (OperationCanceledException e)
{
    // handle the exception
}

To support early cancellation, the task itself must check the cancellation token and react when cancellation has been requested: after performing any necessary cleanup, it calls ThrowIfCancellationRequested() to exit the task. This method throws an OperationCanceledException, which the calling thread can then handle accordingly.

Coordination of multitasking

If you need to run multiple background tasks, there are several methods that help you coordinate them. To run multiple tasks simultaneously, simply start them one after another and collect their references, for example in an array:


var backgroundTasks = new[]
{
    Task.Run(() => DoComplexCalculation(1)),
    Task.Run(() => DoComplexCalculation(2)),
    Task.Run(() => DoComplexCalculation(3))
};

Now you can use the static methods of the Task class to wait for them to complete, either synchronously or asynchronously:


// wait synchronously
Task.WaitAny(backgroundTasks);
Task.WaitAll(backgroundTasks);
// wait asynchronously
await Task.WhenAny(backgroundTasks);
await Task.WhenAll(backgroundTasks);

The two asynchronous methods (WhenAny and WhenAll) each return a task of their own, which can be manipulated like any other task. To get the result of an individual task, you can read its Result property.

Handling exceptions from multiple tasks is a bit trickier. WaitAll and WhenAll throw an exception whenever any of the awaited tasks fails. However, with WaitAll all the exceptions are collected in the AggregateException's InnerExceptions property, whereas with await and WhenAll only the first exception is rethrown. To determine which task threw which exception, you have to inspect the Status and Exception properties of each task individually.

Be careful with WaitAny and WhenAny: they wait until the first task completes, whether successfully or not, and do not throw an exception even if a task fails. They only return the index of the completed task (WaitAny) or the completed task itself (WhenAny). To observe a failure, you have to await the completed task or access its Result property, for example:


var completedTask = await Task.WhenAny(backgroundTasks);
try
{
    var result = await completedTask;
}
catch (Exception e)
{
    // handle exception
}
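The difference in exception reporting between WaitAll and WhenAll can be sketched like this (assuming some tasks in backgroundTasks fail):

try
{
    Task.WaitAll(backgroundTasks);
}
catch (AggregateException e)
{
    // WaitAll collects the exceptions from all failed tasks
    foreach (var inner in e.InnerExceptions)
    {
        // handle each exception
    }
}

try
{
    await Task.WhenAll(backgroundTasks);
}
catch (Exception)
{
    // await rethrows only the first exception; check each task's
    // Status and Exception properties for the others
}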

If you want to run multiple tasks in sequence rather than concurrently, you can use continuations:


var compositeTask = Task.Run(() => DoComplexCalculation(42))
    .ContinueWith(previous => DoAnotherComplexCalculation(previous.Result),
        TaskContinuationOptions.OnlyOnRanToCompletion);
The ContinueWith() method lets you execute multiple tasks one after another. The continuation receives a reference to the previous task, giving it access to that task's result and status. You can also attach conditions that determine whether the continuation runs at all, e.g. only if the previous task succeeded, or only if it threw an exception. This provides more flexibility than simply waiting on tasks in sequence. Of course, continuations can be combined with all the features discussed so far (exception handling, cancellation, and running tasks in parallel), and there are many ways to put them together:


var multipleTasks = new[]
{
    Task.Run(() => DoComplexCalculation(1)),
    Task.Run(() => DoComplexCalculation(2)),
    Task.Run(() => DoComplexCalculation(3))
};
var combinedTask = Task.WhenAll(multipleTasks);
var successfulContinuation = combinedTask.ContinueWith(task =>
    CombineResults(task.Result), TaskContinuationOptions.OnlyOnRanToCompletion);
var failedContinuation = combinedTask.ContinueWith(task =>
    HandleError(task.Exception), TaskContinuationOptions.NotOnRanToCompletion);
await Task.WhenAny(successfulContinuation, failedContinuation);

Task synchronization

If the tasks are completely independent, the coordination methods we have just seen are sufficient. However, as soon as data needs to be shared between them, additional synchronization is required to prevent data corruption. When two or more threads update a data structure at the same time, it quickly becomes inconsistent, as in the following sample code:


var counters = new Dictionary<int, int>();
if (counters.ContainsKey(key))
{
    counters[key]++;
}
else
{
    counters[key] = 1;
}

When multiple threads execute the above code at the same time, a particular interleaving of their instructions can produce incorrect data. For example, with two threads:

  • Both threads check whether the same key already exists in the collection.
  • As a result, both enter the else branch and set the value for that key to 1.
  • The final result is 1 instead of 2, which is what sequential execution would have produced.

To fix the above code, it must be placed in a critical section, i.e. a block that only one thread can enter at a time. In C#, this can be implemented with the lock statement:


var counters = new Dictionary<int, int>();
lock (syncObject)
{
    if (counters.ContainsKey(key))
    {
        counters[key]++;
    }
    else
    {
        counters[key] = 1;
    }
}

For this to work, all threads must share the same syncObject. As a best practice, syncObject should be a dedicated Object instance used exclusively to protect access to a single critical section, and it should not be accessible from outside. Inside the lock statement, only one thread is allowed at a time; any other thread that tries to enter is blocked until the previous one exits. This ensures that a thread executes the critical section in its entirety without interruption by another thread. Of course, this reduces parallelism and slows down overall execution, so you should minimize the number of critical sections and keep them as short as possible.
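A typical declaration for such a field, as a sketch of this convention:

// inside the class that owns the critical section:
// a dedicated instance used only for locking, never exposed externally
private readonly object syncObject = new object();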

The lock statement is actually a shorthand for code based on the Monitor class; it expands to the equivalent of:


var lockWasTaken = false;
var temp = syncObject;
try
{
    Monitor.Enter(temp, ref lockWasTaken);
    // lock statement body
}
finally
{
    if (lockWasTaken)
    {
        Monitor.Exit(temp);
    }
}

Although most of the time you will want to use the lock statement, the Monitor class gives you additional control when you need it. For example, you can use TryEnter() instead of Enter() and specify a time limit, to avoid waiting indefinitely for the lock to be released.
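A sketch of TryEnter() with a time limit (the one-second timeout is an arbitrary example):

if (Monitor.TryEnter(syncObject, TimeSpan.FromSeconds(1)))
{
    try
    {
        // critical section
    }
    finally
    {
        Monitor.Exit(syncObject);
    }
}
else
{
    // the lock could not be acquired within 1 second
}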

Other synchronization primitives

Monitor is just one of many synchronization primitives in .NET Core. Depending on the situation, others may be a better fit.

Mutex is a more heavyweight version of Monitor that relies on the underlying operating system, which allows it to synchronize access to resources across multiple processes. When cross-process synchronization is not required, Monitor is the recommended alternative to Mutex.
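A sketch of cross-process synchronization with a named Mutex (the name is a placeholder; all cooperating processes must use the same one):

using (var mutex = new Mutex(false, "Global\\MyAppMutex"))
{
    mutex.WaitOne(); // blocks until the mutex is acquired
    try
    {
        // access the resource shared between processes
    }
    finally
    {
        mutex.ReleaseMutex();
    }
}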

SemaphoreSlim and Semaphore can limit the maximum number of threads that access a resource at the same time, instead of restricting access to a single thread as Monitor does. SemaphoreSlim is more lightweight than Semaphore but limited to a single process, so you should prefer SemaphoreSlim whenever possible.
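For example, SemaphoreSlim can cap concurrent access at an arbitrary limit; a sketch with a limit of 3:

var semaphore = new SemaphoreSlim(3); // at most 3 threads at a time
await semaphore.WaitAsync();
try
{
    // access the limited resource
}
finally
{
    semaphore.Release();
}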

ReaderWriterLockSlim distinguishes between two ways of accessing a resource. It allows an unlimited number of readers to access the resource simultaneously, while limiting write access to a single writer at a time. This fits resources that are thread-safe to read but require exclusive access when their data is modified.
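Typical usage separates the read and write paths; a minimal sketch:

var rwLock = new ReaderWriterLockSlim();

// any number of readers may hold the lock at the same time
rwLock.EnterReadLock();
try
{
    // read the shared data
}
finally
{
    rwLock.ExitReadLock();
}

// only a single writer may hold the lock
rwLock.EnterWriteLock();
try
{
    // modify the shared data
}
finally
{
    rwLock.ExitWriteLock();
}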

AutoResetEvent, ManualResetEvent, and ManualResetEventSlim block incoming threads until they receive a signal (i.e. a call to Set()), after which the waiting threads continue executing. AutoResetEvent resets automatically after releasing a single thread and blocks again until the next call to Set(). ManualResetEvent and ManualResetEventSlim remain signaled, letting threads through, until Reset() is called. ManualResetEventSlim is more lightweight and recommended over the other two.
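A sketch of signaling with ManualResetEventSlim:

var readySignal = new ManualResetEventSlim(false); // starts unsignaled

var worker = Task.Run(() =>
{
    readySignal.Wait(); // blocks until Set() is called
    // continue processing
});

// ... prepare the shared state ...
readySignal.Set(); // lets all waiting threads through until Reset()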

Interlocked provides a selection of atomic operations, which are a better alternative to locks and other synchronization primitives where applicable:


// non-atomic operation with a lock
lock (syncObject)
{
    counter++;
}
// equivalent atomic operation that doesn't require a lock
Interlocked.Increment(ref counter);

Concurrent collections

When a critical section is needed only to ensure atomic access to a data structure, a dedicated data structure designed for concurrent access can be a better and more efficient alternative. For example, using ConcurrentDictionary instead of Dictionary can simplify the lock statement example:


var counters = new ConcurrentDictionary<int, int>();
counters.TryAdd(key, 0);
lock (syncObject)
{
    counters[key]++;
}

You might be tempted to simplify this further, as follows:


counters.AddOrUpdate(key, 1, (oldKey, oldValue) => oldValue + 1);

However, because the update delegate is invoked outside the critical section, a second thread could read the same old value before the first thread has updated it, effectively overwriting the first thread's update with its own value and losing one increment. Misusing concurrent collections in this way is just another example of the problems multithreading can cause.

An alternative to concurrent collections are immutable collections. Like concurrent collections they are thread-safe, but their underlying implementation is different: no operation changes the original instance. Operations that modify the data structure return a modified copy instead and leave the original instance unchanged:


var original = new Dictionary<int, int>().ToImmutableDictionary();
var modified = original.Add(key, value);

As a result, any change made to the collection in one thread is invisible to the other threads, which still reference the original, unmodified collection. This is why immutable collections are inherently thread-safe. Of course, this makes them effective for a different set of problems. They work best when multiple threads independently modify data derived from the same input collection, possibly with a final step that merges the changes from all threads. With regular collections, you would need to create a copy of the collection for each thread in advance.
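A minimal sketch of that scenario, where two tasks derive independent results from the same immutable input and a final step merges them:

var input = ImmutableList.Create(1, 2, 3, 4);

// each task works from the same input; the original is never changed
var doubled = Task.Run(() => input.Select(n => n * 2).ToImmutableList());
var squared = Task.Run(() => input.Select(n => n * n).ToImmutableList());

// merge the independent results in a final step
var merged = (await doubled).AddRange(await squared);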

Parallel LINQ (PLINQ)

Parallel LINQ (PLINQ) is an alternative to the Task Parallel Library. As the name implies, it relies heavily on LINQ (Language Integrated Query) functionality. It is useful in scenarios where the same expensive operation is performed on each item of a large collection. Unlike ordinary LINQ to Objects, where all operations are performed sequentially, PLINQ can execute these operations in parallel on multiple CPU cores. The code changes required to take advantage of this are minimal:


// sequential execution
var sequential = Enumerable.Range(0, 40)
    .Select(n => ExpensiveOperation(n))
    .ToArray();

// parallel execution
var parallel = Enumerable.Range(0, 40)
    .AsParallel()
    .Select(n => ExpensiveOperation(n))
    .ToArray();

As you can see, the only difference between the two snippets is the call to AsParallel(). This converts the IEnumerable into a ParallelQuery and causes the rest of the query to run in parallel. To switch back to sequential execution, you can call AsSequential(), which returns an IEnumerable again. By default, PLINQ does not preserve the order of items in the collection, which makes processing more efficient. When the order matters, you can call AsOrdered():


var parallel = Enumerable.Range(0, 40)
    .AsParallel()
    .AsOrdered()
    .Select(n => ExpensiveOperation(n))
    .ToArray();

Similarly, you can call AsUnordered() to switch back.
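Parallel and sequential stages can also be mixed within a single query; a sketch, with ExpensiveOperation and CheapOperation as placeholder methods:

var mixed = Enumerable.Range(0, 40)
    .AsParallel()
    .Select(n => ExpensiveOperation(n)) // runs in parallel
    .AsSequential()
    .Select(n => CheapOperation(n)) // runs sequentially again
    .ToArray();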

Concurrent programming in the full .NET Framework

Because .NET Core is a simplified implementation of the full .NET Framework, all of the parallel programming approaches from the .NET Framework are available in .NET Core as well. The only exception is immutable collections, which are not part of the full .NET Framework itself: they are distributed as a separate NuGet package (System.Collections.Immutable) that you need to install into your project.
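For example, in an SDK-style project the package can be added from the command line (the Package Manager Console command Install-Package System.Collections.Immutable is an alternative in classic projects):

dotnet add package System.Collections.Immutable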

Conclusion

Whenever an application contains CPU-intensive code that can run in parallel, it makes sense to use concurrent programming to improve performance and hardware utilization. The APIs in .NET Core abstract away many of the details, making it much easier to write concurrent code. However, there are some potential issues to watch out for, most of which involve accessing shared data from multiple threads. If you can, avoid shared data altogether; if not, make sure you choose the most appropriate synchronization method or data structure.

