From Spark in Action, Second Edition by Jean-Georges Perrin
This article teaches you how to perform an aggregation using Apache Spark. You first look at the definition of an aggregation. If you already use aggregations in your job, this serves as a reminder and you can safely skim through it: Apache Spark’s aggregations are standard. The second part shows you how to translate a SQL aggregation statement into Spark.
By Jean-Georges Perrin
This is the third in a series of four articles on the topic of ingesting data from files with Spark. This section deals with ingesting an XML file.
By Jean-Georges Perrin
This is the second in a series of four articles on the topic of ingesting data from files with Spark. This section deals with ingesting a JSON file.
From Spark in Action by Petar Zečević and Marko Bonaći.
When talking about Spark runtime architecture, we can distinguish the specifics of various cluster types from the typical Spark components shared by all. Here we describe typical Spark components that are the same regardless of the runtime mode you choose.