Prabhu Eshwarla is a senior IT professional with over 26 years in the software engineering and services industry. He currently runs SudhanvaTech, a company focused on developing distributed systems software and cloud-native applications, where he has used Rust professionally. He is passionate about helping others learn to use it. His book will be released in the spring of 2021. |
At Manning, we’re always interested in learning more about our authors’ unique perspectives on the technologies that have captured their passions. We spoke with Prabhu Eshwarla, who is writing a new book entitled “Building Network Services with Rust,” and got an education in memory management, garbage collection, and how Rust offers advantages of both high- and low-level programming languages.
Prabhu, thank you for speaking with us. Today, we want to dig into the Rust programming language. It’s increasingly popular, and we have a few books on the topic. What we see in our liveBook discussion forums and in the feedback we collect from peer reviews is that developers really enjoy coding in it. They also have a tendency to pigeonhole it as a low-level language. |
That is so true – I think it happens because of the nature of languages. |
What do you mean? |
Well, programming languages are either low-level, like C or C++ — languages that let you manage the entire memory layer. Or, they’re high-level, which means you get some insight into memory management, but fundamentally, the language manages it for you. The natural conclusion is that if the language is capable of functioning at that fundamental level, it must be meant for that – only that. But the truth is, it doesn’t have to be. Unlike many other languages that are strictly high-level or strictly low-level, Rust gives you all the benefits of both. |
So, Prabhu, I am a publisher, not a developer. To me it’s not obvious why memory management is such a big consideration. Isn’t memory cheap? |
In the old days, scaling went vertical. It was generally about buying bigger and bigger servers. When the load would become large and the application’s response time slows down, you’d call up the IT department and have them increase the memory on the server. |
But now we have the cloud. |
Exactly. Today, the world in the cloud is horizontally scalable. You have these commodity servers, and you can add more of them, or add more cores to each of them, and let each core function as a separate virtual machine, and on each core, you can run programs in parallel – provided the programs are written in the right way. |
What we typically think of as multi-threaded programming. But hasn’t that sort of fallen out of favor? |
It has, because relying on multi-threading for concurrency, even though it can be efficient, introduces a lot of potential for problems. |
Can they be understood simply? |
Yes. Suppose one thread tries accessing data that’s also used by another thread. This causes what’s known as a race condition — where the system is trying to do two tasks at the same time such as reading and writing. But to be done correctly, the tasks actually need to be done in sequence. When this happens, you start seeing some really strange results, and it can be really difficult to figure out what’s going wrong. With a low-level language like C, you have to deliberately ensure that problems like race conditions aren’t happening. But the awesome thing about Rust is that it will prevent you from writing code that might cause race conditions in the first place. |
Prevent it? How does it do that? |
It’s linked to ownership. When one thread takes ownership of a variable, it completely takes ownership. The other threads have no access to it. That way, you can be sure that the same piece of memory is not accessed by multiple threads at the same time, and that data is not being modified in unpredictable ways. Of course, there may be situations where you want many threads to use the same data, but Rust provides you ways to do that safely too. There is also another aspect, around locks, that Rust helps with. To modify a piece of shared data, a thread locks it for its own use and makes the change. In other languages, if you forget to release the lock, all the other threads keep waiting for it to be released, and the system may hang as a result. But in Rust, you don’t have that problem. As soon as the lock guard goes out of scope, the lock is automatically released for use by other threads. |
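As a rough illustration of the two ideas in this answer (moving ownership into a thread, and a lock that releases itself at the end of a scope), here is a minimal sketch using only the Rust standard library. The function names `sum_on_worker` and `count_in_parallel` are made up for this example; they do not come from the interview.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Moves `data` into a worker thread, which becomes its only owner.
fn sum_on_worker(data: Vec<i32>) -> i32 {
    // `move` transfers ownership of `data` into the closure.
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    // Using `data` here would be a compile-time error: it has been moved.
    handle.join().unwrap()
}

/// Shares one counter between several threads behind `Arc<Mutex<_>>`.
fn count_in_parallel(workers: usize, per_worker: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::with_capacity(workers);

    for _ in 0..workers {
        // Each thread gets its own handle to the same shared counter.
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..per_worker {
                let mut guard = counter.lock().unwrap();
                *guard += 1;
                // `guard` goes out of scope here, which releases the lock.
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("sum = {}", sum_on_worker(vec![1, 2, 3]));
    println!("count = {}", count_in_parallel(4, 1000));
}
```

There is no explicit unlock call anywhere in this sketch. What the compiler rules out here is unsynchronized shared access to the counter; it does not rule out every possible locking mistake, such as two threads taking two locks in opposite order.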
So memory safety, data safety, all those kinds of problems are directly eliminated by the compiler? |
Exactly. And all of that really fascinated me when I first began to learn about Rust. Of course, I quickly realized that many of the things that make Rust so powerful also mean that it’s really quite complex. It takes significant effort to master it, even for an experienced person, mostly because you have to adopt a different sort of philosophy, if you will, when you set out to work in Rust. The difficulty isn’t in the syntax. It has to do with the fact that you almost have to unlearn what you have always done in order to apply the constructs that the Rust toolset provides. |
So, is that effort worth it? |
Oh, definitely. Rust gives you conciseness, compactness, low latency, memory safety, and data race safety. All those benefits are typically associated only with low-level languages. It also gives you high-level productivity, functional programming constructs, and even lets you write high-level frameworks, with things like macros that let you very quickly generate code, so you don’t have to write boilerplate. A lot of code can be generated. |
What do you mean it can be generated? It has tools for code generation? |
Yes! There are two kinds of tools for that. One is called a macro, which lets you write a single-line statement that the compiler then expands into a whole set of Rust code at compile time and builds into the binary. All the developer has to do is write a one-line macro construct. The second kind of code generation is what is known as generics. |
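To make the macro idea concrete, here is a small sketch of a declarative macro. The macro name `define_greeting` and the functions it generates are invented for this illustration.

```rust
// Each invocation of this macro expands, at compile time, into a full function.
macro_rules! define_greeting {
    ($fn_name:ident, $greeting:expr) => {
        fn $fn_name(name: &str) -> String {
            format!("{}, {}!", $greeting, name)
        }
    };
}

// Two one-line invocations generate two separate functions.
define_greeting!(hello, "Hello");
define_greeting!(welcome, "Welcome");

fn main() {
    println!("{}", hello("Rust"));
    println!("{}", welcome("reader"));
}
```

The functions `hello` and `welcome` never appear in the source as hand-written code; the compiler produces them from the one-line invocations.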
And that means…? |
Well, normally the type of each variable has to be predetermined. You have to specify the type when you declare it. You say it’s an integer or it’s a string or whatever it is. But the concept of a generic solves the problem that sometimes you don’t know beforehand what type a variable is going to be. As a result, you have to write programs that can accommodate multiple types. For example, if I have an ID field, I may not know whether the value will come in as an integer, or as a string containing the number one. That means I have to write the code twice, once for each of those two potential types. On the other hand, if the language supports generics, I can write code that says the value can be either a string or a number. I write only one piece of code, which means I can reuse it without any performance hit. Rust generates code that’s as efficient as if you had written it yourself for each of the possible types. Rust supports both compile-time generics (simply called generics) and run-time generics (called trait objects). |
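The ID example can be sketched in a few lines. This is only an illustration: the names `format_id`, `format_all`, and `mixed_ids` are hypothetical, and `Display` is used as a stand-in for "anything that can be printed as an ID".

```rust
use std::fmt::Display;

/// Compile-time generic: the compiler generates one specialized copy of this
/// function for each concrete type it is called with.
fn format_id<T: Display>(id: T) -> String {
    format!("ID-{}", id)
}

/// Trait objects: the concrete type behind each `Box<dyn Display>` is only
/// known at run time, so integers and strings can share one list.
fn format_all(ids: &[Box<dyn Display>]) -> Vec<String> {
    ids.iter().map(|id| format!("ID-{}", id)).collect()
}

/// Builds a list that mixes an integer ID and a string ID.
fn mixed_ids() -> Vec<Box<dyn Display>> {
    vec![Box::new(7), Box::new("seven")]
}

fn main() {
    println!("{}", format_id(1)); // integer ID
    println!("{}", format_id("1")); // string ID, same source code
    println!("{:?}", format_all(&mixed_ids()));
}
```

The generic version is the one that comes with no run-time cost. The trait-object version trades a small amount of run-time dispatch for the ability to mix types in one collection.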
So you get better productivity. |
Correct. You can use these constructs to get things done really quickly. Rust is not the only language which has generics and macro support. The point is, the combination of all these things brings it together. The language designers have put in a lot of effort to balance system programming and developer productivity. I think it’s a very delightful language for the programmer, and at the same time it gives you a lot of control. |
Back to the matter of memory control: Don’t we have servers with plenty of memory? And we can run things in parallel, right, so essentially don’t we have as much memory as we need? Why does the developer still need to control memory manually? |
There are two reasons to control memory manually. The first reason is to prevent memory bloat. Let’s say you allocate memory and then you forget to de-allocate it. If you run the Unix system tools, you might see that a program is occupying, you know, 20 megabytes of memory. But as the program keeps running for long periods of time, you’ll start to see that after a few days or weeks, slowly, the memory consumption increases from 20 megabytes to 25 megabytes and then on to 30 megabytes. A memory leak is happening, which slows down the processing of the system. |
But isn’t that what garbage collection handles? |
Yes, but there is a cost for garbage collection: you don’t know at what point in time the garbage will be collected. I mean, what does it really do, this garbage collector? It’s basically a scavenger. At some point in time, it will just say “Hey guys, stop everything.” If there are incoming requests coming to the server, the garbage collector says “Hey, no, you’ll have to stop them while I clean up.” It’s like when your company has a cleaning service and they put up a sign outside the restroom that says Out of service, this restroom is being cleaned. Even if you happen to have an urgent need to go to the restroom, sorry, you’ll have to wait. |
Ha ha. Good example. |
Right? And the best thing is that they don’t tell you when they’re going to clean. It’s not a pre-scheduled maintenance. At any point in time, they might decide to come in and clean, to do the garbage collection, and then you stop. And for those precious milliseconds, you wait — milliseconds that can make the difference between missing a transaction that is coming to you and handling it successfully. |
Okay, so can we go back to how Rust does it, compared to how you do it manually? The use of memory, the amount of data you’re keeping in memory, increases in any process over time. How do you do this cleaning process, or constraining the amount of memory used, when you do it manually? How do you think about that? |
So the way Rust handles this is using something called scope. Every variable in a program has a scope. For example, suppose I have some code within curly braces. I declare a variable within those curly braces. The moment the code execution goes outside the curly braces, the variable goes out of scope. That’s called scoping, and most languages work like that. The way Rust does it is that whenever a value goes out of scope, it automatically runs something called the “destructor” or “Drop” code. Rust lets you customize what happens when a value goes out of scope: you can say that if this type goes out of scope, it should release its resources. So basically it goes and de-allocates the memory. If the memory is allocated on the stack, it will remove it from the stack; if it’s on the heap, it will remove the pointer and free the actual memory that’s stored on the heap. All the clean-up happens automatically, the moment the variable goes out of scope. And all this can be embedded in the binary, so at compile time you tell the compiler, “Generate this code so that you won’t have a runtime problem.” The point is shifting the problem of memory clean-up from runtime to compile time. That’s how Rust solves the problem. The compiler simply uses predefined rules to generate the code that handles de-allocation for you, so that you don’t forget to deal with it. It’s as simple as that. |
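Here is a minimal sketch of scope-based clean-up with a custom `Drop` implementation. The `Connection` type and the shared log are invented for the illustration; the log exists only so that the order of events can be observed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// A hypothetical resource that records when it is cleaned up.
struct Connection {
    name: &'static str,
    log: Rc<RefCell<Vec<String>>>,
}

impl Drop for Connection {
    // The compiler inserts a call to this wherever a Connection goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("closed {}", self.name));
    }
}

/// Returns the events recorded up to the point of return.
fn run_scopes() -> Vec<String> {
    let log = Rc::new(RefCell::new(Vec::new()));
    let _outer = Connection { name: "outer", log: Rc::clone(&log) };
    {
        let _inner = Connection { name: "inner", log: Rc::clone(&log) };
        log.borrow_mut().push("inside block".to_string());
    } // `_inner` goes out of scope here, so its Drop code runs now.
    log.borrow_mut().push("after block".to_string());

    let snapshot = log.borrow().clone();
    snapshot
    // `_outer` is dropped as the function returns, after the snapshot was taken.
}

fn main() {
    for event in run_scopes() {
        println!("{}", event);
    }
}
```

Nothing in `run_scopes` calls `drop` explicitly. The "closed inner" entry lands between "inside block" and "after block" purely because of where the closing brace sits.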
I don’t know what the stack and the heap are, but we don’t have time for you to teach me. Still, I wonder what the difference is between the way Rust does it and how a garbage collector does it? |
The garbage collector does not look at scope; it just looks at the process at some point in time. For example, there might be lots of objects sitting in memory. The garbage collector basically checks whether there’s any reference to a given object, whether anybody is still using it. If nobody’s using it, then the garbage collector deletes it. |
So this is like garbage collection at a lower level. You’re cleaning up your own trash – instead of leaving it out for the scavenger to get, you just clean it up as soon as you’re done. It’s the difference between how my wife handles the kitchen and how I handle it. She’s a lot faster than me, but I clean up as I go. |
Exactly. That’s an apt analogy. |
Ok, but aren’t there kinds of memory leaks that can potentially spike, so to speak, and create a whole lot of garbage very quickly? Like, you make a mistake and suddenly your kitchen is full of garbage. Does the kind of manual memory management you’re describing essentially avoid those situations altogether, so that you just never have a massive garbage incident? |
The two most common memory-related issues in programming are memory safety violations and memory leaks. Rust makes it difficult, though not strictly impossible, to write code that leaks memory. For example, one could theoretically write some exotic code, like cyclical shared references, that never deallocates memory, and the compiler cannot detect such a thing, but those are edge cases. When it comes to memory safety, where no operation is allowed on invalid memory, as with dangling pointers or double frees, Rust shines. |
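The "cyclical shared references" edge case can be reproduced with the standard library's reference-counted pointer `Rc`. This is a sketch with made-up names (`Node`, `leak_with_cycle`, `no_cycle`); the `Weak` pointer is used here only as an observer that can tell whether the node was freed.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// A node that may hold a shared, counted reference to another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a -> b -> a. Each node keeps the other alive, so neither is freed.
fn leak_with_cycle() -> Weak<Node> {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b)); // closes the cycle
    Rc::downgrade(&a)
    // `a` and `b` go out of scope here, but each count only falls to 1.
}

/// Builds b -> a with no cycle. Both nodes are freed at the end of the scope.
fn no_cycle() -> Weak<Node> {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let _b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    Rc::downgrade(&a)
}

fn main() {
    // `upgrade` succeeds only if the node is still allocated.
    println!("cycle still alive: {}", leak_with_cycle().upgrade().is_some());
    println!("no cycle still alive: {}", no_cycle().upgrade().is_some());
}
```

Both functions compile without complaint, which is the point made in the answer: this kind of leak is not something the compiler detects. It is a leak, though, not a memory safety violation.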
Can you explain? |
It all comes down to the difference between the stack and the heap. |
(Silently: Here we go again.) |
Essentially, there are two types of variables: primitive data types like integers and characters, and reference data types like vectors and hashmaps. Primitive data types are never a problem because they’re always stored on the stack, as their size is known to the compiler up front. And the stack is last in, first out. The moment something is done, the stack discards the last variable and moves on to the next one. The problem occurs when you allocate memory on the heap. The heap is dynamic memory: it’s allocated at runtime. Let’s say you’re going to host a conference call. If you know beforehand that only four people will join, you could write code to statically allocate space for four people, and then you’d know there’s no memory leak. But what if you don’t know how many people are going to join? Then everything has to happen dynamically: the code has to allocate more memory whenever new people join the call. That’s where the problem becomes complex. Let’s say one hundred people joined the call, and I allocate memory for each one. What will happen if I forget to de-allocate that memory? Those are the kinds of things Rust makes easy for you. You don’t have to remember to clean up. It does it for you. |
And so how does the automatic de-allocation happen in that example? |
Well, let’s say a hundred people are joining the conference call. Normally, each value has a certain scope associated with it. As soon as control goes outside that scope, for example when control returns from the function that was called, Rust adds a piece of code that says, let all the memory associated with it get released. It takes additional effort on the part of the programmer to plan these things out, but once you do the planning and tell Rust, this is how it is, Rust will generate additional code that will clean up the system for you. |
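The conference-call example might look roughly like this in code. The `Participant` type, the `host_call` function, and the release counter are all assumptions made for the sketch; the counter is there only to show that every participant was cleaned up.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// A hypothetical caller whose data lives on the heap.
struct Participant {
    _name: String,
    released: Rc<Cell<usize>>,
}

impl Drop for Participant {
    // Counts clean-ups so the effect can be observed from outside.
    fn drop(&mut self) {
        self.released.set(self.released.get() + 1);
    }
}

/// Returns (participants who joined, participants released after the call).
fn host_call(callers: usize) -> (usize, usize) {
    let released = Rc::new(Cell::new(0));
    let joined;
    {
        // The vector grows on the heap as people join; no size is fixed up front.
        let mut call = Vec::new();
        for i in 0..callers {
            call.push(Participant {
                _name: format!("caller-{}", i),
                released: Rc::clone(&released),
            });
        }
        joined = call.len();
    } // The call ends: the vector and every Participant in it are freed here.
    (joined, released.get())
}

fn main() {
    let (joined, released) = host_call(100);
    println!("{} joined, {} released", joined, released);
}
```

The number of callers is only known at run time, yet there is no clean-up loop anywhere: the closing brace of the inner block is what frees all of them.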
So, keeping with your example, that doesn’t mean you can write a program that will accept an infinite number of callers, right? Or does it? |
Yes and no. Normally, you specify an assumed upper limit for things, but there are so many situations in the world where you can’t. For example, can you guess how many tweets are going to come up in your timeline on a given day? You can’t predict these kinds of things beforehand, so programmatically, you look for flexibility. But it’s constrained by the amount of memory that’s available in the system. That’s why people monitor memory use: it’s not enough to just write code. You keep on monitoring your programs and you say, I need to implement load balancing. I’m not going to load one single server with all the requests that come in; instead, I’m going to send some requests to this server and some to another server. |
Now, your book will be about system development with Rust but also about application development, right? |
You know, I really want to highlight the point you made at the beginning of the conversation: it’s very natural for anyone who’s only slightly familiar with Rust to think that because you’re dealing with very low-level memory management—thread management, concurrency, those kinds of things—that it’s more suited for systems programming. And maybe we can actually look at replacing C++ with Rust, in the long run. That seems to be what the industry thinking is, and a lot of work is done with that in mind. But what most people don’t seem to know is that it’s equally effective to use it as a replacement for one of the high-level languages. That even though Rust does not have a garbage collector, thanks to its unique ownership model, it can give you a lot of the high-level features that you have come to expect from languages like Java and Python. I think it’s a well-kept secret that the libraries and frameworks are evolving very rapidly. One or two years ago, I could not have thought about writing a book because Rust was still very, very immature. The whole asynchronous feature, as an inherent part of the language, only came aboard just a few months ago. And with that, things have started to become a lot more interesting. Also, some of the best frameworks associated with it now have matured or undergone multiple iterations. So I think the time is right to show the world what Rust can do. |
Thanks for the conversation, Prabhu. It was really enlightening. And best of luck with your writing – it’s going to be a great book. |