Tag: processing

Robust Machine Learning with ML Pipelines

From Data Analysis with Python and PySpark by Jonathan Rioux

This chapter covers using transformers and estimators to prepare data into ML features.

Big Data is Just a Lot of Small Data: using pandas UDF, part 2

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers

·   Using pandas Series UDFs to accelerate column transformations compared to Python UDFs.

·   Addressing the cold start of some UDFs using Iterator of Series UDFs.

Big Data is Just a Lot of Small Data: using pandas UDF

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers

·   Using pandas Series UDFs to accelerate column transformations compared to Python UDFs.

·   Addressing the cold start of some UDFs using Iterator of Series UDFs.

Your Data under a Different Lens: window functions

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers window functions and the kind of data transformation they enable.

The Layers of a Cloud Data Platform

From Designing Cloud Data Platforms by Danil Zburivsky and Lynda Partner

In this article, we’ll layer some of the critical and more advanced functionality needed for most data platforms today. Without this added layer of sophistication, your data platform would work, but it wouldn’t scale easily, nor would it meet the growing data velocity challenges. It would also be limited in the types of data consumers (people and systems that consume the data from the platform) it supports, as they’re also growing in both number and variety.

Introducing Edge Computing

From Making Sense of Edge Computing by Cody Bumgardner

Conceptually, edge computing is concerned with when it’s best to migrate computational functionality toward the source of data and when it is best to move the data itself. This abstract concept of function versus data migration drives not only the fundamental motivations of edge computing, but also the broader field of distributed systems. The act of distributing processes makes even the simplest tasks more complicated.

Getting a Handle on Edge Computing

From Making Sense of Edge Computing by Cody Bumgardner & Caylin Hickey

Moving Your Analytics Data into the Cloud

From Designing Cloud Data Platforms by Danil Zburivsky and Lynda Partner

© 2023 Manning