Subject

Data Science

Running Spark: an Overview of Spark’s Runtime Architecture

By Petar Zečević and Marko Bonaći.

This article is excerpted from Spark in Action.

Slideshare: Cultivating search engines



Slideshare: How can I use Graphs to better interpret data?



Relevant Search: Debugging Query Matching

By Doug Turnbull and John Berryman

Systems monitoring and the unified log

By Alexander Dean

In this article, excerpted from the book Unified Log Processing, we generate a simple stream of events related to systems monitoring.

Calculate the percentage of New Orleans that is wetlands

From Geoprocessing with Python


Real-World Machine Learning: Model Evaluation & Optimization

By Henrik Brink, Joseph W. Richards, and Mark Fetherolf

In this article, excerpted from Real-World Machine Learning, we describe the difficulties that arise when evaluating ML models.

Spark in Action: The Notion of Resilient Distributed Dataset (RDD)

By Marko Bonaći and Petar Zečević

In this article, excerpted from Spark in Action, we talk about RDD, the fundamental abstraction in Spark.

Pre-processing data for modeling

By Henrik Brink, Joseph W. Richards, and Mark Fetherolf

In this article, excerpted from Real-World Machine Learning, we will look at a few of the most common data pre-processing steps needed for real-world machine learning.

What Is a Graph and Why Is It Useful?

By Corey L. Lanum

In this article, excerpted from Visualizing Graph Data, we’ll introduce the concept of a graph and its history and uses.

© 2017 Manning