Tag

simplified

Robust Machine Learning with ML Pipelines

From Data Analysis with Python and PySpark by Jonathan Rioux

This chapter covers using transformer and estimators to prepare data into ML features.

Big Data is Just a Lot of Small Data: using pandas UDF, part 2

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers

·         Using pandas Series UDF to accelerate column transformation compared to Python UDF.

·         Addressing the cold start of some UDF using Iterator of Series UDF.

Big Data is Just a Lot of Small Data: using pandas UDF

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers

·   Using pandas Series UDF to accelerate column transformation compared to Python UDF.

·   Addressing the cold start of some UDF using Iterator of Series UDF.

Your Data under a Different Lens: window functions

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers window functions and the kind of data transformation they enable.

© 2023 Manning — Design Credits