Tag: spark

Ingesting Data from Files with Spark, Part 2

By Jean Georges Perrin

This is the second in a series of four articles on ingesting data from files with Spark. This section deals with ingesting a JSON file.
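
As a quick preview of what the article walks through, here is a minimal sketch of JSON ingestion using Spark's DataFrameReader. The application name and the file path data/books.json are placeholders for illustration, not the example used in the article.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class JsonIngestionSketch {
        public static void main(String[] args) {
            // Local session for experimentation; a real deployment would target a cluster.
            SparkSession spark = SparkSession.builder()
                    .appName("JSON ingestion sketch")
                    .master("local[*]")
                    .getOrCreate();

            // data/books.json is a placeholder path. By default Spark expects
            // one JSON document per line (JSON Lines format).
            Dataset<Row> df = spark.read().format("json")
                    .load("data/books.json");

            df.printSchema();  // schema inferred from the data
            df.show(5);        // display the first five rows
            spark.stop();
        }
    }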

Ingesting Data from Files with Spark, Part 1

From Spark in Action, 2nd Ed. by Jean Georges Perrin

This is the first in a series of four articles on ingesting data from files with Spark. This section deals with ingesting data from CSV.
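
For a sense of what CSV ingestion looks like, here is a minimal sketch. It assumes an existing SparkSession named spark; the file path and options are illustrative.

    // Assumes an existing SparkSession named "spark"; path and options are illustrative.
    Dataset<Row> df = spark.read().format("csv")
            .option("header", "true")       // first line holds column names
            .option("inferSchema", "true")  // let Spark guess the column types
            .load("data/authors.csv");
    df.show(5);

Without the inferSchema option, Spark reads every CSV column as a string.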

The Majestic Role of the Dataframe in Spark

From Spark with Java by Jean Georges Perrin

In this article, you’ll learn what a dataframe is, how it’s organized, and about immutability.
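
As a hint at the immutability discussion, note that transformations on a dataframe return a new Dataset rather than modifying the original. In this sketch, the dataframe df and its price column are assumptions for illustration.

    // Assumes an existing Dataset<Row> "df" with a numeric "price" column (illustrative),
    // and a static import of org.apache.spark.sql.functions.col.
    Dataset<Row> discounted = df.withColumn("discounted_price", col("price").multiply(0.9));
    // "df" itself is unchanged; "discounted" is a new dataframe with the extra column.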

Build a Full-Featured Data Solution

From Fusion in Action by Guy Sperry

What Happens behind the Scenes with Spark

From Spark with Java by Jean Georges Perrin

You’ve probably seen a simple use case in which Spark ingests data from a CSV file, performs an operation, and stores the result in a database. In this article, you’re going to see what happens behind the scenes.
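
To make that use case concrete, a typical version of it might look like the following sketch. The file path, column names, and JDBC settings are placeholders, and the snippet assumes an existing SparkSession named spark, org.apache.spark.sql.SaveMode, and a static import of org.apache.spark.sql.functions.*.

    // Ingest: read a CSV file with a header row (placeholder path).
    Dataset<Row> df = spark.read().format("csv")
            .option("header", "true")
            .load("data/authors.csv");

    // Transform: build a "name" column from two assumed columns, lname and fname.
    Dataset<Row> transformed = df.withColumn("name",
            concat(col("lname"), lit(", "), col("fname")));

    // Save: write the result to a relational database over JDBC (placeholder settings).
    transformed.write()
            .mode(SaveMode.Overwrite)
            .format("jdbc")
            .option("url", "jdbc:postgresql://localhost/spark_labs")
            .option("dbtable", "authors")
            .option("user", "postgres")
            .option("password", "****")
            .save();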

Using Apache Spark with Java

From Spark with Java by Jean Georges Perrin

Getting up and Running with Spark

From Spark in Motion by Jason Kolter

Running Spark: an overview of Spark’s runtime architecture

From Spark in Action by Petar Zečević and Marko Bonaći

When talking about Spark runtime architecture, we can distinguish the specifics of the various cluster types from the components that are common to all of them. Here we describe those common components, which stay the same regardless of the runtime mode you choose.

Spark in Action: The Notion of Resilient Distributed Dataset (RDD)

By Marko Bonaći and Petar Zečević

In this article, excerpted from Spark in Action, we talk about RDD, the fundamental abstraction in Spark.
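
As a taste of the RDD API, here is a minimal sketch using the Java API; the values and application name are illustrative.

    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.SparkSession;

    public class RddSketch {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("RDD sketch")
                    .master("local[*]")
                    .getOrCreate();
            JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

            // Distribute a local collection across the cluster as an RDD.
            JavaRDD<Integer> numbers = jsc.parallelize(Arrays.asList(1, 2, 3, 4, 5));

            // Transformations are lazy and return new RDDs; nothing runs until an action.
            JavaRDD<Integer> doubled = numbers.map(n -> n * 2);

            // collect() is an action: it triggers the computation and returns the results.
            List<Integer> result = doubled.collect();
            System.out.println(result);  // [2, 4, 6, 8, 10]

            spark.stop();
        }
    }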

How to start developing Spark applications in Eclipse

By Marko Bonaći, author of Spark in Action

In this article, you will learn to write Spark applications using Eclipse, one of the most widely used development environments for JVM-based languages.

How to start developing Spark applications in Eclipse (PDF)
