From Machine Learning Bookcamp by Alexey Grigorev

In this series, we cover model deployment: the process of putting models to use. In particular, we’ll see how to package a model inside a web service, allowing other services to use it. We also show how to deploy the web service to a production-ready environment.

Take 40% off Machine Learning Bookcamp by entering fccgrigorev into the discount code box at checkout.

See part 1, part 2, part 3, and part 4 if you missed them.

We don’t run production services on our laptops: we need special servers for that.

In this article, we’ll cover one possible option: Amazon Web Services, or AWS. We chose AWS for its popularity; we’re not affiliated with Amazon or AWS.

Other popular public clouds exist, including Google Cloud, Microsoft Azure, and Digital Ocean. We don’t cover them in this article, but you should be able to find similar instructions online and deploy a model to your favourite cloud provider.

To follow the instructions in this section, you need to have an AWS account.

AWS Elastic Beanstalk

AWS provides a lot of services, and there are many possible ways of deploying a web service there. For example, you can rent an EC2 machine (a server in AWS) and manually set up a service on it, use a “serverless” approach with AWS Lambda, or use a range of other services.

In this article, we’ll use AWS Elastic Beanstalk, one of the simplest ways of deploying a model to AWS. Additionally, our service is simple enough to stay within the free-tier limits, so we can use it for free for the first year.

Elastic Beanstalk automatically takes care of many things that we typically need in production, including:

  • Deploying our service to EC2 instances
  • Scaling up: adding more instances to handle the load during peak hours
  • Scaling down: removing these instances when the load goes away
  • Restarting the service if it crashes for any reason
  • Balancing the load between instances

We’ll also need a special utility — Elastic Beanstalk command-line interface (CLI) — to use Elastic Beanstalk. The CLI is written in Python, and we can install it with pip, like any other Python tool.

Because we use Pipenv, we can add it as a development dependency: this way, we’ll install it only for our project and not system-wide.

 pipenv install awsebcli --dev

NOTE:  Development dependencies are the tools and libraries that we use for developing our application. Usually, we need them only locally and don’t need them in the package deployed to production.
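After the install, Pipenv records the CLI as a development dependency in the Pipfile. The entry looks roughly like this (a sketch — your version specifier may differ):

```toml
[dev-packages]
awsebcli = "*"
```

Because it’s under [dev-packages] rather than [packages], it won’t be included when we install only the production dependencies.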

After installing it, we can enter the virtual environment of our project:

 pipenv shell

Now the CLI should be available. Let’s check it:

 eb --version

It should print the version:

 EB CLI 3.18.0 (Python 3.7.7)

Next, we run the initialization command:

 eb init -p docker churn-serving

Note that we use “-p docker”: this way, we specify that this is a Docker-based project.

If everything is fine, it creates a couple of files, including a config.yml file in the .elasticbeanstalk folder.
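The config.yml file records the settings we just chose. Its exact contents depend on your setup, but it looks roughly like this (a sketch — the region, profile, and platform values here are assumptions):

```yaml
branch-defaults:
  default:
    environment: null
global:
  application_name: churn-serving
  default_platform: Docker
  default_region: us-west-2
  profile: eb-cli
  workspace_type: Application
```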

Now we can test our application locally using the local run command:

 eb local run --port 9696

This should work in the same way as in the previous section with Docker: it’ll first build an image and then run the container.

To test it, we can use the same code as previously and get the same answer:

 {'churn': False, 'churn_probability': 0.061875678218396776}
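If you don’t have the test code from the previous part at hand, the request can be sketched as follows. This is a minimal example: predict and make_url are helper names introduced here, and customer is a dictionary with the feature values your model expects:

```python
import requests

def make_url(host):
    # The service exposes a single POST endpoint at /predict
    return 'http://%s/predict' % host

def predict(customer, host='localhost:9696'):
    # Send one customer to the churn service and return the parsed JSON response
    response = requests.post(make_url(host), json=customer)
    return response.json()
```

With the service running locally (eb local run --port 9696), calling predict(customer) returns the dictionary shown above.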

After verifying that it works well locally, we’re ready to deploy it to AWS. We can do it with one command:

 eb create churn-serving-env

This simple command takes care of setting up everything we need, from the EC2 instances to the auto-scaling rules:

 Creating application version archive "app-200418_120347".
 Uploading churn-serving/ to S3. This may take a while.
 Upload Complete.
 Environment details for: churn-serving-env
   Application name: churn-serving
   Region: us-west-2
   Deployed Version: app-200418_120347
   Environment ID: e-3xkqdzdjbq
   Platform: arn:aws:elasticbeanstalk:us-west-2::platform/Docker running on 64bit Amazon Linux 2/3.0.0
   Tier: WebServer-Standard-1.0
   Updated: 2020-04-18 10:03:52.276000+00:00
 Printing Status:
 2020-04-18 10:03:51    INFO    createEnvironment is starting.
  -- Events -- (safe to Ctrl+C)

It’ll take a few minutes to create everything. We can monitor the process and see what it’s doing in the terminal.

When it’s ready, we should see the following information:

 2020-04-18 10:06:53    INFO    Application available at
 2020-04-18 10:06:53    INFO    Successfully launched environment: churn-serving-env   

The URL in the logs is important: this is how we reach our application. Now we can use this URL to make predictions (figure 1).

Figure 1. Our service is deployed inside a container on AWS Elastic Beanstalk. To reach it, we use its public URL.

Let’s test it:

 import requests

 host = ''  # the URL of the deployed service, taken from the logs
 url = 'http://%s/predict' % host
 response = requests.post(url, json=customer)
 result = response.json()

As previously, we should see the same response:

 {'churn': False, 'churn_probability': 0.05960590758316393}

That’s it! We have a running service.

WARNING:  This is a toy example, and the service we created is accessible by anyone in the world. If you do this inside an organization, access should be restricted as much as possible. It’s not difficult to extend this example to be secure, but it’s out of scope for this article. Consult the security department at your company before doing it at work.

We can do everything from the terminal using the CLI, but it’s also possible to manage it from the AWS Console. To do it, we find “Elastic Beanstalk” there and select the environment we created (figure 2).

Figure 2. We can manage the Elastic Beanstalk environment in the AWS Console

To turn it off, choose “terminate deployment” in the “Environment actions” menu in the AWS Console.

WARNING: Even though Elastic Beanstalk is free-tier eligible, we should always be careful and turn it off as soon as we no longer need it. 

Alternatively, we can use the CLI to do it:

 eb terminate churn-serving-env

After a few minutes, the deployment is removed from AWS — and the URL is no longer accessible.

AWS Elastic Beanstalk is a great tool for getting started with serving machine learning models. There are more advanced ways of doing it, involving container orchestration systems like AWS ECS or Kubernetes, or going “serverless” with AWS Lambda.

That’s all for this series.

If you want to learn more about the book, check it out on our browser-based liveBook platform here.