An excerpt from Grokking Concurrency by Kirill Bobrov
This article explains the difference between concurrency and parallelism, and why it's important to know which is which.
In everyday conversation, the words “parallel” and “concurrent” are mostly synonymous, which is a source of significant confusion that extends even to the computer science literature. Distinguishing between parallel and concurrent programming is very important because they pursue different goals at different conceptual levels.
Concurrency is about multiple tasks that start, run, and complete in overlapping time periods, in no specific order. Parallelism is about multiple tasks, or subtasks of the same task, that literally run at the same time on hardware with multiple computing resources, such as a multi-core processor. As you can see, concurrency and parallelism are similar but not identical.
Concurrency is a semantic property of a program or system. Concurrency is when multiple tasks are in progress for overlapping periods of time. Note that here we are not talking about the actual execution of the tasks, but about the design of the system: the tasks are order-independent. So concurrency is a conceptual property of a program or system. It's more about how the program or system has been designed.
Imagine that one chef is chopping salad while occasionally stirring the soup on the stove. He has to stop chopping, check the stove top, and then start chopping again, repeating this process until everything is done.
As you can see, we only have one processing resource here, the chef, and his concurrency is mostly a matter of logistics; without concurrency, the chef would have to wait until the soup on the stove is ready before he could start chopping the salad.
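The one-chef kitchen can be sketched in Python. This is a minimal illustration rather than anything from the book: it assumes `asyncio` as the scheduling mechanism, and the names `chop_salad`, `stir_soup`, `cook`, and `run_kitchen` are hypothetical. Everything runs on a single thread, so the two tasks are concurrent but never parallel.

```python
import asyncio

log = []  # records the order in which the single chef does things


async def chop_salad(pieces):
    for i in range(1, pieces + 1):
        log.append(f"chop {i}")
        # Put the knife down and let the other task make progress.
        await asyncio.sleep(0)


async def stir_soup(times):
    for i in range(1, times + 1):
        log.append(f"stir {i}")
        # Step away from the stove and let the other task make progress.
        await asyncio.sleep(0)


async def cook():
    # Both tasks are "in progress" over the same period of time,
    # but only one of them is ever running at any given instant.
    await asyncio.gather(chop_salad(3), stir_soup(3))


def run_kitchen():
    log.clear()
    asyncio.run(cook())
    return list(log)


if __name__ == "__main__":
    print(run_kitchen())
```

Printing the log shows the chopping and stirring steps interleaved: the chef switches between tasks instead of finishing one before starting the other.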
Parallelism is an implementation property. Parallelism is literally the simultaneous physical execution of tasks at runtime, and it requires hardware with multiple computing resources. It resides on the hardware layer.
Back in the kitchen, we now have two chefs: one stirs the soup while the other chops the salad. We've divided the work by adding another processing resource, another chef.
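The two-chef kitchen might look like the sketch below. It assumes Python's `concurrent.futures.ProcessPoolExecutor` as the way to get a second "chef"; the function names and the arithmetic standing in for kitchen work are hypothetical. Whether the two tasks truly overlap in time depends on the machine having at least two free cores.

```python
from concurrent.futures import ProcessPoolExecutor


def chop_salad(pieces):
    # CPU-bound stand-in for chopping: sum of squares below `pieces`.
    return sum(i * i for i in range(pieces))


def stir_soup(times):
    # CPU-bound stand-in for stirring: sum of integers below `times`.
    return sum(range(times))


def cook_in_parallel():
    # Two worker processes play the role of two chefs.
    with ProcessPoolExecutor(max_workers=2) as kitchen:
        salad = kitchen.submit(chop_salad, 1_000_000)
        soup = kitchen.submit(stir_soup, 1_000_000)
        # Each result() call blocks until that chef has finished.
        return salad.result(), soup.result()


if __name__ == "__main__":
    salad_total, soup_total = cook_in_parallel()
    print(salad_total, soup_total)
```

Processes are used here instead of threads because, in the standard CPython interpreter, threads generally take turns on CPU-bound Python code rather than running it simultaneously.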
Parallelism usually builds on concurrency: before you can do several tasks at once, you have to manage several tasks first.
The essence of the relationship between concurrency and parallelism is that concurrent computations can be parallelized without changing the correctness of the result, but concurrency itself does not imply parallelism. Furthermore, parallelism does not imply concurrency; it is often possible for an optimizer to take programs with no semantic concurrency and break them down into parallel components via such techniques as pipeline processing, wide vector SIMD operations, or divide and conquer.
As Rob Pike pointed out, “Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.” On a single-core CPU, you can have concurrency but not parallelism. But both go beyond the traditional sequential model in which things happen one at a time.
To get a better idea of the distinction between concurrency and parallelism, consider the following points:
- An application can be concurrent but not parallel, which means that it makes progress on more than one task over the same period of time, but no two tasks are executing at the same instant.
- An application can be parallel but not concurrent, which means that it processes multiple subtasks of a single task at the same time.
- An application can be neither parallel nor concurrent, which means that it processes one task at a time, sequentially, and the task is never broken into subtasks.
- An application can be both parallel and concurrent, which means that it works on multiple tasks, or subtasks of a single task, concurrently and executes at least some of them in parallel.
I don't want to be that guy, but terminology is important. Too often, a conversation about a problem gets confusing because one person is thinking of concurrency and the other is thinking of parallelism. In practice, the distinction between concurrency and parallelism is not absolute. Many programs have aspects of each.
Imagine you have a program that inserts values into a hash table. If you spread the insert operations across multiple cores, that's parallelism. But coordinating access to the shared hash table is concurrency.
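The coordination side of the hash table example can be sketched as follows. This is an illustration under stated assumptions, not the book's code: a Python `dict` stands in for the hash table, "inserting" means counting how often each key occurs, and `LockedTable`, `insert_all`, and `count_with_workers` are hypothetical names. Note that in standard CPython, threads generally take turns on Python code, so this sketch mainly shows the concurrency half (coordinating access); truly spreading the inserts across cores would need processes or a runtime without that restriction.

```python
import threading


class LockedTable:
    """A dict guarded by a lock so that inserts from different workers do not clash."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def insert(self, key):
        # The read-modify-write must happen as one step,
        # so the lock is held for the whole update.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1

    def snapshot(self):
        with self._lock:
            return dict(self._data)


def insert_all(table, keys):
    for key in keys:
        table.insert(key)


def count_with_workers(keys, workers=4):
    table = LockedTable()
    # Splitting the keys between workers is the "spread the work" part.
    chunks = [keys[i::workers] for i in range(workers)]
    threads = [
        threading.Thread(target=insert_all, args=(table, chunk))
        for chunk in chunks
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return table.snapshot()


if __name__ == "__main__":
    print(count_with_workers(["a", "b", "a", "c"] * 250))
```

Dividing `keys` into chunks is the part that could be parallelized; the lock inside `LockedTable` is the part that makes sharing one table safe no matter how the workers are scheduled.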
That’s all for now. Thanks for reading. Check out Grokking Concurrency for more!