Articles

Big Data is Just a Lot of Small Data: using pandas UDF, part 2

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers

·   Using pandas Series UDFs to accelerate column transformations compared to Python UDFs.

·   Addressing the cold start of some UDFs using Iterator of Series UDFs.

Big Data is Just a Lot of Small Data: using pandas UDF

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers

·   Using pandas Series UDFs to accelerate column transformations compared to Python UDFs.

·   Addressing the cold start of some UDFs using Iterator of Series UDFs.

Dive into clojure.java.io

From Clojure, The Essential Reference by Renzo Borgatti

clojure.java.io contains a collection of functions that simplify interaction with the Java Input/Output system (or simply IO). Over the years, Java evolved the original InputStream and OutputStream abstractions into Reader and Writer, eventually also adding asynchronous IO. During this transformation, Java put a lot of effort into maintaining backward compatibility, a principle Clojure shares. Unfortunately, the result is several coexisting IO APIs that hurt usability, forcing Java developers through bridges and adapters to move between different styles of IO.

Parameter Server Pattern: Tagging Entities in 8 Millions of YouTube Videos

From Distributed Machine Learning Patterns by Yuan Tang

In this article, we introduce the parameter server pattern, which comes in handy when a model is too large to fit on a single machine, such as the one we would have to build for tagging entities in 8 million YouTube videos.

Fetching Data from the Database

From Data-Oriented Programming by Yehonathan Sharvit

This article explores how data-oriented programming deals with retrieving data from a database.

Cleaning Data

From Pandas Workout by Reuven Lerner

This article discusses cleaning data for use with pandas.

HTTP Session Management

From Full Stack Python Security by Dennis Byrne

HTTP sessions are a necessity for all but the most trivial web applications. Web applications use HTTP sessions to isolate the traffic, context, and state of each user. This is the basis for every form of online transaction. If you’re buying something on Amazon, messaging someone on Facebook, or transferring money from your bank, the server must be able to identify you across multiple requests. This article illustrates these concepts with Django.

Defining Infrastructure Declaratively with Crossplane

From Continuous Delivery with Kubernetes by Mauricio Salatino

This article discusses using Crossplane to provision real infrastructure in a declarative way.

Your Data under a Different Lens: window functions

From Data Analysis with Python and PySpark by Jonathan Rioux

This article covers window functions and the kind of data transformation they enable.

Writing to SQL Server

From Learn dbatools in a Month of Lunches by Chrissy LeMaire, Rob Sewell, Jess Pomfret, Cláudio Silva

This article focuses on saving data to the place that SQL Server DBAs feel most comfortable keeping data: a table in an SQL Server database!

You’ll learn different ways to write data to an SQL Server table using dbatools.

© 2022 Manning