By Jean Georges Perrin
This is the third in a series of 4 articles on the topic of ingesting data from files with Spark. This section deals with ingesting an XML file.
By Jean Georges Perrin
This is the second in a series of 4 articles on the topic of ingesting data from files with Spark. This section deals with ingesting a JSON file.
From Spark with Java by Jean Georges Perrin
You’ve probably seen a simple use case where Spark ingests data from a CSV file, performs a simple operation, and then stores the result in a database. In this article, you’re going to see what happens behind the scenes.
From Data Science at Scale with Python and Dask by Jesse C. Daniel
This article discusses Dask, how it compares to Apache Spark, and how to create and understand directed acyclic graphs using the example of the delicious Italian pasta dish bucatini all’Amatriciana.
From Kafka Streams in Action by Bill Bejeck
This article discusses KSQL, a brand-new open source (Apache 2.0 licensed) streaming SQL engine that enables stream processing with Kafka. It makes it easy to read, write, and process streaming data in real time, at scale, using SQL-like semantics.