Subject

Big Data

Aggregating Your Data with Spark

From Spark in Action, Second Edition by Jean-Georges Perrin

This article teaches you how to perform an aggregation using Apache Spark. You first review what an aggregation is. You may already know and use aggregations in your job, in which case this is a reminder that you can safely skim: Apache Spark’s aggregations are standard. The second part shows you how to translate a SQL aggregation statement into Spark.

The Inner Workings of Spark

From Spark in Action, Second Edition by Jean-Georges Perrin

The Random Cut Forest Algorithm

From Machine Learning for Business by Doug Hudgeon and Richard Nichol

In this article, you’ll see how SageMaker and the Random Cut Forest algorithm can be used to create a model that will highlight the invoice lines that Brett should query with the law firm. The result will be a repeatable process that Brett can apply to every invoice that will keep the lawyers working for his bank on their toes and will save the bank hundreds of thousands of dollars per year. Off we go!

Building Linear Models with Dask ML

From Data Science at Scale with Python and Dask by Jesse C. Daniel

This article delves into building linear models using Dask-ML.

Ingesting Data from Files with Spark, Part 4

From Spark in Action, 2nd Ed. by Jean-Georges Perrin

This is the last in a series of 4 articles on the topic of ingesting data from files with Spark. This section deals with ingesting a TXT file.

Finding Valuable Insights in Complex Data

From Graph Databases in Action by Dave Bechberger


How Does Computer Vision Work?

From Deep Learning for Vision Systems by Mohamed Elgendy

Ingesting Data from Files with Spark, Part 2

By Jean-Georges Perrin

This is the second in a series of 4 articles on the topic of ingesting data from files with Spark. This section deals with ingesting a JSON file.

Ingesting Data from Files with Spark, Part 1

From Spark in Action, 2nd Ed. by Jean-Georges Perrin

This is the first in a series of 4 articles on the topic of ingesting data from files with Spark. This section deals with ingesting data from CSV.

The Majestic Role of the Dataframe in Spark

From Spark with Java by Jean-Georges Perrin

In this article, you’ll learn what a dataframe is, how it’s organized, and about immutability.

© 2019 Manning