From Deep Learning Patterns and Practices by Andrew Ferlitsch
Like the best software engineering, modern deep learning uses a pipeline architecture based on reusable patterns.
You’ve likely seen this before. A successful ML engineer will need to decompose a machine learning solution into the following steps:
- Identify the Type of Model for the Problem
- Design the Model
- Prepare the Data for the Model
- Train the Model
- Deploy the Model
ML engineers organized these steps into a two-stage end-to-end (e2e) pipeline. The first e2e pipeline covered every step before deployment, depicted in figure 1 below as modeling, data engineering, and training. Once the ML engineers were successful with this stage, it was coupled with the deployment step to form a second e2e pipeline. Typically, the model was deployed into a container environment and accessed via a REST-based or microservice interface.
Fig. 1 The prevailing practice in 2017 for an end-to-end machine learning pipeline
That was the prevailing practice in 2017. I refer to it as the discovery phase, when the questions were: what are the parts, and how do they fit together?
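The two-stage arrangement described above can be sketched in a few lines of Python. This is only an illustration, not code from the book: the stage functions, the state dictionary, and the `/v1/predict` endpoint string are all hypothetical placeholders, and the "training" step is a stand-in that simply averages the data.

```python
from typing import Callable, Dict, List

# A stage takes the pipeline state and returns an updated copy (hypothetical design).
Stage = Callable[[Dict], Dict]


def modeling(state: Dict) -> Dict:
    """Pick and design a model; here just an untrained placeholder."""
    state = dict(state)
    state["model"] = {"type": "classifier", "weights": None}
    return state


def data_engineering(state: Dict) -> Dict:
    """Prepare the data; here, scale raw values by their maximum."""
    state = dict(state)
    peak = max(state["raw_data"])
    state["dataset"] = [x / peak for x in state["raw_data"]]
    return state


def training(state: Dict) -> Dict:
    """Stand-in for fitting: store the mean of the dataset as the 'weights'."""
    state = dict(state)
    model = dict(state["model"])
    model["weights"] = sum(state["dataset"]) / len(state["dataset"])
    state["model"] = model
    return state


def deployment(state: Dict) -> Dict:
    """Stand-in for deploying to a container behind a REST interface."""
    state = dict(state)
    state["endpoint"] = "/v1/predict"  # hypothetical route, for illustration only
    return state


def run_pipeline(stages: List[Stage], state: Dict) -> Dict:
    """Run each stage in order, feeding the output of one into the next."""
    for stage in stages:
        state = stage(state)
    return state


# First e2e pipeline: modeling, data engineering, and training.
training_pipeline = [modeling, data_engineering, training]
# Second e2e pipeline: the first one coupled with deployment.
production_pipeline = training_pipeline + [deployment]

trained = run_pipeline(training_pipeline, {"raw_data": [2.0, 4.0, 8.0]})
deployed = run_pipeline(production_pipeline, {"raw_data": [2.0, 4.0, 8.0]})
```

Because the second pipeline is built by extending the first, the stages that were validated during training are reused unchanged when deployment is added.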
Machine Learning as a CI/CD production process
In 2018, businesses were formalizing the continuous integration/continuous delivery (CI/CD) production process, which I refer to as the exploration phase. Figure 2 is a slide I used in a Google presentation to business decision makers in late 2018, and it captures where we were then. It wasn’t just a technical process anymore, but included the integration of planning and quality assurance. Data engineering became more defined as extraction, analysis, transformation, management and servicing steps. Model design and training included feature engineering, and deployment expanded to include continuous learning.
Fig. 2 By 2018, Google and other large enterprise businesses were formalizing the production process to include the planning and quality assurance stages as well as the technical process.
Model Amalgamation in production
Models in production today don’t have a single output layer. Instead, they have multiple output layers, spanning essential feature extraction (common layers), representational space, latent space (feature vectors, encodings), and probability distribution space (soft and hard labels). The models are now the whole application; there is no backend. They learn the optimal way to interface and communicate data. The enterprise ML engineer of 2020 now guides the search space within an amalgamation of models. You can see a generalized example of a model amalgamation in figure 3.
Fig. 3 Model Amalgamation – when the models become the entire application!
Let’s break down this generalized example. On the left side is the input to the amalgamation. The input is processed by a common set of convolutional layers, referred to as the shared model bottom. The output from the shared model bottom in this depiction has four learned output representations: 1) high dimensional latent space, 2) low dimensional latent space, 3) pre-activation conditional probability distribution, and 4) post-activation independent probability distribution. Each of these learned output representations is reused by specialized downstream learned tasks which perform an action (e.g., state transition change or transformation). Each task, represented in the figure as tasks 1, 2, 3 and 4, reuses the output representation that is best suited (in size, speed, and accuracy) to the task’s goal.
These individual tasks may then produce multiple learned output representations, or combine learned representations from multiple tasks (dense embeddings), for reuse by further downstream tasks, as you saw in the sports broadcasting example.
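To make the shared-bottom idea concrete, here is a minimal sketch in plain Python. It is not the architecture in figure 3: small random dense layers stand in for the trained convolutional layers, the layer sizes are arbitrary, using a sigmoid for the post-activation independent probabilities is an assumption, and the four task functions are hypothetical examples of downstream consumers.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def dense(n_in, n_out):
    """Random weight matrix with n_out rows of n_in weights (stand-in for trained layers)."""
    return [[random.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply(weights, x):
    """Matrix-vector product."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def sigmoid(x):
    """Element-wise sigmoid: each output is an independent probability."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


BOTTOM = dense(8, 16)    # shared layers (convolutional in the article)
PROJECT = dense(16, 4)   # projection to a compact embedding
CLASSIFY = dense(4, 3)   # classifier head


def shared_bottom(x):
    """Compute the shared layers once and expose four output representations."""
    high = [max(0.0, v) for v in apply(BOTTOM, x)]  # 1) high dimensional latent space
    low = apply(PROJECT, high)                      # 2) low dimensional latent space
    logits = apply(CLASSIFY, low)                   # 3) pre-activation outputs
    probs = sigmoid(logits)                         # 4) post-activation probabilities
    return {"high": high, "low": low, "logits": logits, "probs": probs}


def task_count_active(outputs):
    """Task 1 (hypothetical): inspect the high dimensional features."""
    return sum(1 for v in outputs["high"] if v > 0.0)


def task_similarity(outputs, other):
    """Task 2 (hypothetical): cosine similarity of two low dimensional embeddings."""
    a, b = outputs["low"], other["low"]
    norm = math.sqrt(sum(v * v for v in a)) * math.sqrt(sum(v * v for v in b))
    return sum(p * q for p, q in zip(a, b)) / norm if norm else 0.0


def task_top_class(outputs):
    """Task 3 (hypothetical): pick the class with the largest pre-activation score."""
    logits = outputs["logits"]
    return max(range(len(logits)), key=logits.__getitem__)


def task_flags(outputs, threshold=0.5):
    """Task 4 (hypothetical): threshold each independent probability."""
    return [p >= threshold for p in outputs["probs"]]


outputs = shared_bottom([0.1 * i for i in range(1, 9)])
results = {
    "active": task_count_active(outputs),
    "similarity": task_similarity(outputs, outputs),
    "top_class": task_top_class(outputs),
    "flags": task_flags(outputs),
}
```

The shared bottom runs once per input, and each task reads only the representation it needs, which is the reuse pattern the figure describes.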
Not only do serving pipelines enable these types of solutions, but the components within the pipelines can also be version controlled and reconfigured. This makes the components reusable, which is a fundamental principle in modern software engineering.
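One way to picture version-controlled, reconfigurable components is a small registry keyed by name and version, with each pipeline described by a configuration that pins the versions it uses. The sketch below is an assumption about how this could look, not a description of any particular serving framework; the component names, version strings, and the `register` and `build` helpers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Component:
    """A reusable pipeline step identified by name and version."""
    name: str
    version: str
    run: Callable[[List[float]], List[float]]


# Hypothetical in-memory registry; a real system would back this with storage.
REGISTRY: Dict[Tuple[str, str], Component] = {}


def register(name: str, version: str, run: Callable[[List[float]], List[float]]) -> None:
    """Store a component under its (name, version) key."""
    REGISTRY[(name, version)] = Component(name, version, run)


# Two versions of the same component coexist in the registry.
register("normalize", "1.0", lambda xs: [x / 255 for x in xs])
register("normalize", "2.0", lambda xs: [(x - 127.5) / 127.5 for x in xs])
register("clip", "1.0", lambda xs: [min(1.0, max(0.0, x)) for x in xs])


def build(config: List[Tuple[str, str]]) -> Callable[[List[float]], List[float]]:
    """Assemble a pipeline from pinned (name, version) pairs."""
    components = [REGISTRY[key] for key in config]  # KeyError if a version is unknown

    def pipeline(xs: List[float]) -> List[float]:
        for component in components:
            xs = component.run(xs)
        return xs

    return pipeline


# Reconfiguring means editing the config, not the component code.
config_a = [("normalize", "1.0"), ("clip", "1.0")]
config_b = [("normalize", "2.0"), ("clip", "1.0")]

pipeline_a = build(config_a)
pipeline_b = build(config_b)
```

Both configurations reuse the same `clip` component, and switching `normalize` from one version to the other is a one-line configuration change.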
That’s all for now. If you want to learn more about the book, check it out on Manning’s liveBook platform.