From Concurrency in .NET by Riccardo Terrell
In this article, we discuss the need for concurrency, the common issues that arise when developing concurrent applications in the imperative and object-oriented programming (OOP) paradigms, and why the functional programming paradigm is ideal for solving them.
“The trouble is that essentially all the interesting applications of concurrency involve the deliberate and controlled mutation of shared state, such as screen real estate, the file system, or the internal data structures of the program. The right solution, therefore, is to provide mechanisms which allow the safe mutation of shared state.”
—Peyton Jones, Andrew Gordon, and Sigbjorn Finne in Concurrent Haskell
Why the need for concurrency?
Concurrency is a natural part of life—we humans are accustomed to multitasking. We can read an email while drinking a cup of coffee, or type on the keyboard while listening to our favorite song. The main reason to use concurrency in an application is to increase performance and responsiveness, and to achieve low latency. It is common sense that if one person does two tasks, one after the other, it’d take longer than if two people did those same two tasks simultaneously; it is the same with applications. The problem is that most applications aren’t written to evenly split the tasks required among the CPUs available.
Computers are used in many different fields, such as analytics, finance, science, and health care. The amount of data analyzed is increasing year by year. In 2012, Google received more than 2 million search queries per minute; in 2014, that number had more than doubled. I found this Pixar story interesting and inspiring. In 1995, Pixar produced the first completely computer-generated movie, Toy Story. In computer animation, a multitude of details and information must be rendered for each image, such as shading and lighting. This information changes at the rate of 24 frames per second. In a 3D movie, an exponential increase in changing information is required.
The creators of Toy Story used 100 connected dual-processor machines to create their movie, and the utilization of parallel computation was indispensable. A few years later, Pixar evolved for Toy Story 2, using an incredible 1,400 computer processors for digital movie editing, thereby vastly improving digital quality and editing time. At the beginning of 2000, this incredible computer power increased even more, to 3,500 processors. Sixteen years later, the computer power used to process a fully animated movie reached an absurd 24,000 cores. The need for parallel computing is increasing exponentially.
Consider a processor with N cores, where N is any number. A single-threaded application keeps only one of those cores busy. The same application can run faster when its work is split across multiple threads, and as the demand for performance grows, so will the demand for N, making parallel programming the standard programming model for the future.
If you run an application that wasn’t designed with concurrency in mind on a multicore machine, you’re wasting computing power, because the application executes its steps sequentially and uses only a portion of the available cores. In this case, if you open Task Manager, or any CPU performance counter, you’ll notice only one core running high, possibly at 100%, while all the other cores are under-utilized or idle. On a machine with eight cores, running non-concurrent programs means the overall utilization of the resources could be as low as 15% (figure 1).
Figure 1. Windows Task Manager shows a program poorly utilizing the CPU resources.
Such waste of computing power unequivocally illustrates that sequential code isn’t the correct programming model for multicore processors. To maximize the utilization of the computational resources available, the Microsoft .NET platform provides parallel execution of code through multithreading. By leveraging parallelism, a program can take full advantage of the resources available, as illustrated by the CPU performance counter in figure 2, where you’ll notice that all the processor cores are running high, possibly at 100%.
Figure 2. A program written with concurrency in mind can maximize CPU resources, possibly up to 100 percent.
Current hardware trends predict more cores instead of faster clock speeds; therefore, developers have no choice but to embrace this evolution and become parallel programmers.
Why choose functional programming for concurrency
Functional programming (FP) is about minimizing and controlling side effects; this is commonly called pure functional programming. For example, using the concept of transformation, a function creates a copy of a value x and then modifies the copy, leaving the original value x unchanged and free to be used by other parts of the program. FP encourages you to consider whether mutability and side effects are necessary when designing a program. FP allows mutability and side effects, but in a strategic and explicit manner, isolating them from the rest of the code by encapsulating them in methods.
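The copy-and-transform idea described above can be sketched in a few lines. This is a minimal illustration, not code from the book: the article targets C# and F#, Python is used here only for brevity, and `Account` and `deposit` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)  # frozen=True makes instances immutable
class Account:
    owner: str
    balance: int


def deposit(account: Account, amount: int) -> Account:
    # Build a modified copy; the value passed in is left untouched.
    return replace(account, balance=account.balance + amount)


original = Account("alice", 100)
updated = deposit(original, 50)

print(original.balance)  # 100
print(updated.balance)   # 150
```

Because `original` never changes, any other part of the program (or any other thread) holding it can keep using it without coordination.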
The main reason for adopting functional paradigms is to solve the problems that exist in the multicore era. Highly concurrent applications, such as web servers and data-analysis databases, suffer from several architectural issues. These systems must be scalable to respond to many concurrent requests, which leads to design challenges for handling maximum resource contention and high scheduling frequency. Moreover, race conditions and deadlocks are common, which makes troubleshooting and debugging code difficult.
In OOP, we deal with objects as the base construct. In terms of concurrency, however, objects come with caveats when you move from a single-threaded program to massively parallel work, which is a challenging and entirely different scenario.
The traditional solution for these problems is to synchronize access to resources, avoiding contention between threads. However, this solution is a double-edged sword, because synchronization primitives, such as lock for mutual exclusion, can lead to deadlocks and, when misapplied, still leave race conditions. In fact, the state of a variable (as the name variable implies) can mutate. In OOP, a variable usually represents an object, which is liable to change over time. Because of this, you can never rely on its state and, consequently, you must check its current value to avoid unwanted behaviors (figure 3).
Figure 3. In the functional paradigm, due to immutability as a default construct, concurrent programming guarantees deterministic execution, even in the case of a shared state. On the other hand, imperative and OOP use mutable states, which are hard to manage in a multithreaded environment, leading to non-deterministic programs.
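To make the lock-based approach to shared mutable state concrete, here is a minimal sketch. It is an illustration only: the article's languages are C# and F#, Python is used for brevity, and `Counter` and `run` are hypothetical names.

```python
import threading


class Counter:
    """Shared mutable state guarded by a lock (mutual exclusion)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute this read-modify-write.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run(n_threads=4, n_increments=10_000):
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


total = run()
print(total)  # 40000
```

The result is correct only because every access goes through the lock. Forgetting the lock in one place reintroduces the race, and code that acquires two such locks in different orders on different threads can deadlock.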
It is important to consider that components of systems that embrace the FP concept can no longer interfere with each other, and they can be used in a multithreaded environment without using any locking strategies.
Developing safe parallel programs that use shared mutable variables and side-effecting functions takes substantial effort from the programmer, who must make critical decisions, often leading to synchronization in the form of locking. By removing those fundamental problems through functional programming, we also remove the concurrency-specific issues they cause. This is why FP makes an excellent concurrent programming model: it lets concurrent programmers achieve correct, high-performance results in highly multithreaded environments using simple code. At the heart of FP, neither variables nor state are mutable, so there’s no mutable state to share, and functions may not have side effects.
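As a sketch of how removing shared mutable state removes the need for locking, the example below splits work into chunks and maps a pure function over them on a thread pool. The names `sum_of_squares` and `parallel_sum_of_squares` are hypothetical, and Python is used only for illustration (the article targets C# and F#). Note the assumption that this shows the lock-free structure, not a speedup: in CPython, threads generally do not run CPU-bound code in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(chunk: tuple) -> int:
    # Pure function: the result depends only on the argument,
    # and nothing outside the function is read or modified.
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, n_chunks=4):
    values = tuple(values)  # immutable snapshot of the input
    size = max(1, -(-len(values) // n_chunks))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # Each worker gets its own chunk; no locks are needed because
        # no worker writes to state that another worker can see.
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


result = parallel_sum_of_squares(range(1, 101))
print(result)  # 338350
```

The partial results are combined only after every worker has finished, so the outcome is the same regardless of how the threads are scheduled.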
Functional programming is the most practical way to write concurrent programs. Trying to write concurrent programs in imperative languages is not only difficult; it also leads to bugs that are difficult to discover, reproduce, and fix.
How are you going to take advantage of every computer core available to you? Embrace the functional paradigm!
Benefits of functional programming
Real advantages to learning functional programming exist, even if you’ve no plans to adopt this style in the immediate future. Still, it’s hard to convince someone to spend their time on something new without showing immediate benefits. The benefits come in the form of idiomatic language features that can seem overwhelming at first. Functional programming is a paradigm that gives you great coding power and has a positive impact on your programs after a short learning curve. Within a few weeks of using functional programming techniques, you’ll improve the readability and correctness of your applications.
The benefits of functional programming (with focus on concurrency) include:
- Immutability: A property that prevents modification of an object’s state after creation. In FP there’s no concept of variable assignment. Once a value has been associated with an identifier (the FP counterpart of a variable name), it can’t change. Functional code is, by definition, immutable. Immutable objects can be safely transferred between threads, leading to great optimization opportunities. Immutability removes the problems of memory corruption (race conditions) and deadlocks, because no mutual exclusion is needed.
- Pure functions: No side effects, which means that functions don’t change any variables or data of any type outside the function. Functions are said to be pure if they’re transparent to the user, and their return value depends only on the input arguments. By passing the same arguments into a pure function, the result won’t change, and each call returns the same value, producing a consistent and expected behavior.
- Referential transparency: The idea of a function whose output depends on and maps only to its input. Each time a function receives the same arguments, the result is the same. This concept is valuable in concurrent programming because the definition of the expression can be replaced with its value and will have the exact same meaning. Referential transparency guarantees that a set of functions can be evaluated in any order and in parallel, without changing the application’s behavior.
- Lazy evaluation: Used in functional programming to retrieve the result of a function on demand or to defer the analysis of a big data stream until needed.
- Composability: Used to compose functions and create higher-level abstraction out of simple functions. Composability is the most powerful tool to defeat complexity, letting you define and build solutions for complex problems.
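Several of the properties in the list above (pure functions, referential transparency, lazy evaluation, and composability) can be seen together in a short sketch. This is an illustration under assumptions, not code from the book: the article's languages are C# and F#, Python is used for brevity, and `compose`, `increment`, and `double` are hypothetical helpers.

```python
from functools import reduce
from itertools import count, islice


def compose(*functions):
    """Compose left to right: compose(f, g)(x) == g(f(x))."""
    return lambda x: reduce(lambda acc, f: f(acc), functions, x)


def increment(x):
    return x + 1  # pure: same input, same output, no side effects


def double(x):
    return x * 2  # pure


# Composability: build a larger function out of two simple ones.
increment_then_double = compose(increment, double)

# Lazy evaluation: count(1) is an infinite sequence, and map computes
# nothing until values are actually requested.
lazy_results = map(increment_then_double, count(1))
first_three = list(islice(lazy_results, 3))

print(first_three)  # [4, 6, 8]
```

Because the functions are referentially transparent, the expression `increment_then_double(5)` can be replaced by its value, 12, anywhere in the program without changing its meaning.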
Learning to program functionally allows you to write more modular, expression-oriented, and conceptually simple code. The combination of these functional programming assets will let you understand what your code is doing, regardless of how many threads the code is executing.
Why use F# and C# for functional concurrent programming
The idea behind all this is to develop and design highly scalable and performant systems, by adopting the functional paradigm to write correct concurrent code. This doesn’t mean you must learn a new language; you can apply the functional paradigm by using tools you’re already familiar with, such as multipurpose languages like C# and F#. Several functional features have been added to the C# and F# languages over the last few years that make it easier for you to adopt this paradigm.
Their intrinsically different approaches to solving problems are the reasons these languages have been chosen. Both programming languages can be used to solve the same problem in different ways, which makes a case for choosing the “best tool” for the job. With a well-rounded toolset, you can design a better and easier solution. In fact, as software engineers, we should think of programming languages as tools.
Ideally, a solution should be a combination of C# and F# projects that work cohesively together. Each language covers a different programming model, but the option to choose which tool to use for the job provides an enormous benefit in terms of productivity and efficiency. Another reason for selecting these languages is their support for different concurrent programming models, which can be mixed. For instance:
- F# offers a simpler model than C# for asynchronous computation, called asynchronous workflows.
- Both C# and F# are strongly typed, multipurpose programming languages with support for multiple paradigms, encompassing functional, imperative, and OOP techniques.
- Both languages are part of the .NET ecosystem and can draw equally on its rich set of libraries.
- F# is a functional-first programming language, which provides an enormous productivity boost. In fact, programs written in F# tend to be more succinct and lead to less code maintenance.
- F# combines the benefits of a functional declarative programming style with support for the imperative object-oriented style. This lets you develop applications using your existing object-oriented and imperative programming skills.
- F# has a set of built-in data structures that can be used without locks, because they are immutable by default. Examples are the discriminated union and the record types. These types have structural equality and don’t allow nulls, which makes it easier to trust the integrity of the data and to compare values.
- Unlike C#, F# strongly discourages the use of null values, also known as the billion-dollar mistake, and instead encourages the use of immutable data structures. The absence of null references helps to minimize the number of bugs in programs.
The Null Reference Origin
The null reference was introduced by Tony Hoare in 1965, while he was designing the type system for references in the ALGOL W language. Some 44 years later, he apologized for inventing it by calling it the billion-dollar mistake. He also said:
“I couldn’t resist the temptation to put in a null reference, simply because it was so easy to implement. This has led to innumerable errors, vulnerabilities, and system crashes” https://en.wikipedia.org/wiki/Tony_Hoare#Apologies_and_retractions
- F# is naturally parallelizable because it uses immutability by default, and because of its .NET foundation, it integrates with the C# language with state-of-the-art capability at the implementation level.
- C# is designed as an imperative-first language with full support for OOP. (I like to define this as imperative OO.) Since the release of .NET 3.5, the functional paradigm has influenced the C# language with the addition of features like lambda expressions and LINQ for list comprehension.
- C# also has great concurrency tools that let you easily write parallel programs and readily solve tough real-world problems. Indeed, the multicore development support within the C# language is versatile and enables rapid development and prototyping of highly parallel symmetric multiprocessing (SMP) applications. Both languages are great tools for writing concurrent software, and the options for workable solutions multiply when they’re used together.
- Furthermore, F# and C# can interoperate. In fact, an F# function can call a method in a C# library, and vice versa.
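The list above mentions F# record and discriminated union types, which are immutable and compared by structure. The sketch below is only a rough analogue of that behavior, written in Python for brevity rather than in F#; `Circle`, `Rectangle`, `Shape`, and `area` are hypothetical names, and frozen dataclasses are an assumption standing in for F# records.

```python
import math
from dataclasses import dataclass, FrozenInstanceError
from typing import Union


@dataclass(frozen=True)
class Circle:
    radius: float


@dataclass(frozen=True)
class Rectangle:
    width: float
    height: float


# A closed set of alternatives, loosely mirroring a discriminated union.
Shape = Union[Circle, Rectangle]


def area(shape: Shape) -> float:
    if isinstance(shape, Circle):
        return math.pi * shape.radius ** 2
    if isinstance(shape, Rectangle):
        return shape.width * shape.height
    raise TypeError(f"unsupported shape: {shape!r}")


a = Rectangle(2.0, 3.0)
b = Rectangle(2.0, 3.0)

print(a == b)   # True: equality compares field values, not identity
print(a is b)   # False: two distinct objects
print(area(a))  # 6.0

try:
    a.width = 10.0  # mutation is rejected
except FrozenInstanceError:
    print("instances are immutable")
```

Values like these can be handed to any number of threads without locks, because no thread can change them after construction.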
It is evident that the industry is looking for a reliable and simple concurrent programming model, as shown by the fact that software companies are investing in libraries that raise the level of abstraction above the traditional and complex memory-synchronization models. Examples of these higher-level libraries are Intel’s TBB (Threading Building Blocks) and Microsoft’s TPL (Task Parallel Library), as well as open standards such as OpenMP (which provides pragmas that you can insert into a program to make parts of it parallel) and OpenCL (a low-level framework for programming Graphics Processing Units (GPUs)). GPU programming has a lot of traction and has been endorsed by Microsoft with the C++ AMP extensions and Accelerator.NET.
The present and future of concurrent programming
Mastering concurrency to deliver scalable programs has become a required skill. Companies are interested in hiring and investing in engineers who have a deep knowledge of writing concurrent code. In fact, writing correct parallel computation can save time and money. It is cheaper to build scalable programs that use the available computational resources on fewer servers than to buy and add under-utilized, expensive hardware that won’t reach the same level of performance. In addition, more hardware requires more maintenance and electric power to operate.
This is an exciting time to learn to write multithreaded code, and it’s rewarding to be able to improve the performance of your program with the functional programming approach. It is a bit unnerving to think in a new paradigm, but the initial challenge of learning parallel programming diminishes quickly, and the reward for perseverance is infinite. There’s something magical and spectacular about opening the Windows Task Manager and proudly noticing that the CPU usage spikes to 100% after your code changes. Performance can be measured, which is why you should care about improving it. Once you become familiar and comfortable with writing highly scalable systems using the functional paradigm, it’ll be difficult to go back to the slow style of sequential code.
Concurrency is the next innovation that’ll dominate the computer industry, and it’ll transform how developers write software. The evolution of software requirements in the industry and the demand for high-performance software that delivers great user experience through non-blocking user interfaces will continue to spur the need for concurrency. In lockstep with the direction of hardware, it’s evident that concurrency and parallelism will be the future of programming.
 Symmetric Multiprocessing (SMP) is the processing of programs by multiple processors that share a common operating system and memory.
 Pragmas are compiler-specific definitions that can be used to create new preprocessor functionality or to send implementation-defined information to the compiler.