Tag: big data

What Happens behind the Scenes with Spark

From Spark with Java by Jean Georges Perrin

You’ve probably seen a simple use case in which Spark ingests data from a CSV file, performs a simple operation, and stores the result in a database. In this article, you’ll see what happens behind the scenes.

What do Cooking Pasta and Data Science Have in Common?

From Data Science at Scale with Python and Dask by Jesse C. Daniel

This article discusses Dask, how it compares to Apache Spark, and how to create and understand directed acyclic graphs using the example of the delicious Italian pasta dish bucatini all’Amatriciana.

Streaming Data with KSQL

From Kafka Streams in Action by Bill Bejeck

This article discusses KSQL, a brand-new, open source (Apache 2.0) streaming SQL engine that enables stream processing with Kafka. It makes it easy to read, write, and process streaming data in real time, at scale, using SQL-like semantics.

Using Apache Spark with Java



From Spark with Java
By Jean Georges Perrin

Getting up and Running with Spark



From Spark in Motion
By Jason Kolter

Say Hello to Kafka


From Kafka in Action
By Dylan Scott

How Streams Relate to Database Tables in Kafka

From Kafka Streams in Action by Bill Bejeck

In this article, we’re going to look deeper into adding state. Along the way, we’ll be introduced to a new abstraction, the KTable, and then discuss how event streams and database tables relate to one another in Apache Kafka (KStream and KTable, respectively).

Constructing a Yelling App with Kafka Streams

From Kafka Streams in Action by Bill Bejeck

This article will quickly get you off the ground and show you how Kafka Streams works. We’re going to build a toy application that takes incoming messages and uppercases their text, effectively yelling at anyone who reads them. This application is called the “Yelling Application.”

How Can I Improve Data Flow Downstream?


From Kafka Streams in Action
By Bill Bejeck

Running Spark: an overview of Spark’s runtime architecture

From Spark in Action by Petar Zečević and Marko Bonaći

When talking about Spark runtime architecture, we can distinguish the specifics of various cluster types from the typical Spark components shared by all. Here we describe typical Spark components that are the same regardless of the runtime mode you choose.

 

© 2018 Manning