Domain: Infrastructure

2:38 with Craig Dennis and Jared Smith

How do you keep data flowing and scale?

Infrastructure in the world of big data allows the data to keep flowing, 0:00

and it also allows the systems that we have discussed to run at scale and 0:01

on large data sets. 0:04

The fundamental unit in big data infrastructure is 0:06

often a cluster of machines. 0:09

Now typically, this is a group of networked Linux servers. 0:11

Managing clusters of machines is a non-trivial task. 0:15

You can't just write your own homegrown software to manage 0:18

all the servers you have available. 0:21

You need to have the ability to run your processing code across the cluster and 0:23

then gather the results to display them to the client. 0:28
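
To make that last point concrete, here is a minimal sketch of the "run across the cluster, then gather the results" pattern. It is an illustration only: a local thread pool stands in for the networked machines, and the names `count_words`, `split_into_chunks`, and `run_across_workers` are hypothetical, not part of any cluster manager.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Processing code that each worker runs on its own slice of the data."""
    return sum(len(line.split()) for line in chunk)


def split_into_chunks(lines, n_chunks):
    """Partition the data set so that every worker gets one slice."""
    return [lines[i::n_chunks] for i in range(n_chunks)]


def run_across_workers(lines, n_workers=4):
    """Scatter the chunks to the workers, then gather the partial results."""
    chunks = split_into_chunks(lines, n_workers)
    # Assumption: threads on one machine stand in for servers in a cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Gather step: combine the per-worker results into one answer.
    return sum(partial_counts)


if __name__ == "__main__":
    data = ["big data keeps flowing", "clusters run at scale"] * 1000
    print(run_across_workers(data))
```

A real cluster also has to handle scheduling, machine failures, and moving data over the network, which is part of why managing one is described here as a non-trivial task.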
