As you know, MapReduce is the heart of Hadoop. If you are interested in Hadoop, you cannot avoid learning about MapReduce; it's really important.
If you are a developer, or anyone interested in design patterns for the MapReduce framework, I want to mention a book from O'Reilly: MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems, by Donald Miner and Adam Shook. All code examples in the book are written for Hadoop, and you will learn from many of them. The book reads like a cookbook: for each pattern you will see the problem, how to solve it, the idea behind it, example code, and a comparison with SQL and Pig. Readers should already know Hadoop and Java programming, or at least be able to read Java code, because every example is written in Java. Still, it's a good idea to use this book as a reference, and readers can adapt the code in the book to their own work or real-world problems.
You will see:
- Summarization patterns: get a top-level view by summarizing and grouping data
- Filtering patterns: view data subsets such as records generated from one user
- Data organization patterns: reorganize data to work with other systems, or to make MapReduce analysis easier
- Join patterns: analyze different datasets together to discover interesting relationships
- Metapatterns: piece together several patterns to solve multi-stage problems, or to perform several analytics in the same job
- Input and output patterns: customize the way you use Hadoop to load or store data
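To give a feel for the first category, here is a minimal sketch of a summarization pattern (counting records per user). It is my own illustration, not code from the book: it uses plain Java collections instead of the Hadoop API so that it runs without a cluster, and the class name `SummarizationSketch`, the method names, and the `userId,action` record layout are all made up for this example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of a summarization job; no Hadoop classes are used.
class SummarizationSketch {

    // "Map" step: turn one input record into a (userId, 1) pair.
    // The "userId,action" layout is an assumed example format.
    static Map.Entry<String, Integer> map(String record) {
        String userId = record.split(",", 2)[0].trim();
        return new SimpleEntry<>(userId, 1);
    }

    // "Reduce" step: sum all values that were grouped under the same key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs map, then groups by key (what the shuffle does in Hadoop), then reduce.
    static Map<String, Integer> countPerUser(List<String> records) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String record : records) {
            Map.Entry<String, Integer> pair = map(record);
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            counts.put(e.getKey(), reduce(e.getValue()));
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> records = List.of("alice,login", "bob,login", "alice,logout");
        System.out.println(countPerUser(records)); // prints {alice=2, bob=1}
    }
}
```

In a real Hadoop job the same two steps would live in a Mapper and a Reducer class, and the framework would do the grouping between them; the book's examples show that form.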