5-12-2014, 08:16

Pluralsight - Data Science and Hadoop Workflows at Scale With Scalding

Category: Tutorials / Other





Size: 1.16GB

This course teaches you how to use Scalding, a domain-specific language built on Scala and Cascading, to build distributed applications on Hadoop. It also covers the data science side, using Algebird, an abstract algebra library for Scala, to solve real-world sketching and streaming problems on distributed systems.

You will learn how to reason about a variety of problems, how to build and test locally, and how to deploy on Hadoop. You will also learn the algorithms used to solve problems at scale, where performance, compute and memory resources, and the window of time available for processing streaming data are all constraints, and how Scalding and Algebird help you work within them. The course also covers some Scala basics to get you up to speed, and looks at how to monitor, visualize, and troubleshoot your application's workflow and performance problems.

Watch this course if you are considering, or already use, Pig, Hive, or another DSL for Hadoop, and you want not only more power over your workflows but also a DSL that is actively being developed to support up-and-coming execution frameworks like Apache Tez and Apache Spark, with all the flexibility that a full functional programming language like Scala has to offer. If you're serious about learning how to build enterprise-grade applications on Hadoop, data science, and Lambda architectures, then this course is for you.
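As a rough illustration of the style of pipeline the description refers to, here is a minimal word-count sketch written with plain Scala collections. It does not use Scalding or Algebird, and it is not taken from the course; the names `WordCountSketch`, `countWords`, and `merge` are invented for this example. It only shows the flatMap-then-group shape of a word count, plus a merge step for partial results whose associativity is the kind of property a monoid captures.

```scala
object WordCountSketch {
  // Count words in one chunk of input:
  // split each line into words, group equal words, count each group.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.toLowerCase.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, hits) => word -> hits.size }

  // Merge two partial results by adding counts per word.
  // The operation is associative and Map.empty acts as its identity,
  // so partial counts can be combined in any grouping.
  def merge(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (word, n)) =>
      acc.updated(word, acc.getOrElse(word, 0) + n)
    }

  def main(args: Array[String]): Unit = {
    // Two hypothetical input chunks, counted separately and then merged.
    val part1 = countWords(Seq("hello scalding", "hello hadoop"))
    val part2 = countWords(Seq("hadoop at scale"))
    merge(part1, part2).toSeq.sortBy(_._1).foreach {
      case (word, n) => println(s"$word\t$n")
    }
  }
}
```

On a cluster the chunks would be processed on different machines and merged afterwards; this sketch keeps everything in memory on one machine.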

Introduction to Scalding [37:15]
  Introduction [0:37]
  Why Scalding & Course Preview [4:01]
  Course Outline & Focus [1:25]
  Technologies Covered [2:27]
  Cascading [1:17]
  Scalding vs. Pig and Hive [2:08]
  Scalding APIs [0:40]
  Scala (map) [4:21]
  Scalding (map) [2:07]
  Distributed Applications [2:10]
  Simple Pipeline [4:50]
  Pipe Transformations [4:39]
  Passing Arguments [0:35]
  Scalding Sources [1:23]
  Installing the Scala IDE [1:46]
  Configuring Eclipse & Maven [1:43]
  Summary [1:00]

Building Applications With Scalding [37:46]
  Introduction [1:09]
  Scalding REPL [1:35]
  Eclipse Scala Worksheet With Scalding [5:13]
  Word Count [2:15]
  FlatMap [3:36]
  Union [2:38]
  RITA Dataset - Exploring Data [2:02]
  Group Operations [1:24]
  Join Operations [2:09]
  Skew Joins [1:22]
  Joining Pipes [2:26]
  Reduction Operations [1:38]
  Fold and FoldLeft [1:52]
  FoldLeft in Practice [6:59]
  Summary [1:22]

Scalding on Hadoop [16:43]
  Introduction [0:29]
  Running on Hadoop [1:49]
  Multiple Input Sources [1:24]
  Sequencing Jobs [1:40]
  Timed Input Sources [3:29]
  Visualizing Application Workflow With Dot Files [2:12]
  Driven - Application Performance Monitoring & Management [4:59]
  Summary [0:38]

Data Science With Scalding [37:10]
  Introduction and Outline [1:07]
  Monoids [3:38]
  Priority Queues [2:48]
  Bloom Filters [7:38]
  Bloom Filter Joins [3:21]
  BF Joins With Scalding [6:27]
  Partitioning Pipes [2:05]
  HyperLogLog [5:23]
  HyperLogLog With Scalding [2:35]
  References [0:41]
  Summary


