
Goals


- Integrate Big Data components to build an appropriate Data Lake

- Select suitable Big Data warehouses to manage multiple data sets

- Process large data sets with Hadoop to facilitate technical and business decision making

- Query voluminous data sets in real time

Program

The four dimensions of Big Data: volume, velocity, variety, veracity
Presentation of the MapReduce framework, storage and queries

Measure the importance of Big Data within a company
Succeed in extracting useful data
Integrate Big Data with traditional data

Select the data sources to analyze
Remove duplicate records (see the sketch after this list)
Define the role of NoSQL
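
As an illustration of duplicate removal when consolidating sources, here is a minimal Python sketch; the record layout and the customer_id key are assumptions, not part of the course material.

    # Minimal sketch: keep one record per assumed business key while merging sources.
    def deduplicate(records, key="customer_id"):
        seen = set()
        unique = []
        for record in records:
            if record[key] not in seen:
                seen.add(record[key])
                unique.append(record)
        return unique

    crm_rows = [{"customer_id": 1, "email": "a@example.com"}]
    web_rows = [{"customer_id": 1, "email": "a@example.com"},
                {"customer_id": 2, "email": "b@example.com"}]
    print(deduplicate(crm_rows + web_rows))  # one record per customer_id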

Data models: key-value, graph, document, column family (illustrated after this list)
Hadoop Distributed File System (HDFS)
HBase
Hive
Cassandra
Hypertable
Amazon S3
BigTable
DynamoDB
MongoDB
Redis
Riak
Neo4J
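
To illustrate two of the data models listed above, here is a minimal Python sketch using the redis and pymongo client libraries; the local connection details and the sample data are assumptions.

    # Key-value vs. document models, assuming local Redis and MongoDB instances.
    import redis                     # key-value store client
    from pymongo import MongoClient  # document store client

    # Key-value: one opaque value stored under a key.
    kv = redis.Redis(host="localhost", port=6379)
    kv.set("session:42", "alice")
    print(kv.get("session:42"))      # b'alice'

    # Document: schema-free JSON-like records, queried by field.
    customers = MongoClient("localhost", 27017)["demo"]["customers"]
    customers.insert_one({"name": "alice", "country": "UK"})
    print(customers.find_one({"country": "UK"}))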

Choose a data warehouse based on the characteristics of your data
Inject code into the data, implement polyglot (multi-store) data storage solutions
Choose a data warehouse capable of aligning with business objectives

Map data with a programming framework, connect to the data and extract it from the storage warehouse, transform the data for processing
Split data for Hadoop MapReduce

Create Hadoop MapReduce task components (sketched after this list)
Distribute data processing across multiple server farms, run Hadoop MapReduce tasks
Monitor progress of task flows
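
As a sketch of the two task components of a Hadoop MapReduce job, here is a word count written for Hadoop Streaming in Python; the script names and data paths are assumptions.

    # mapper.py -- emits one "word<TAB>1" pair per word read from standard input.
    import sys

    for line in sys.stdin:
        for word in line.split():
            print(f"{word}\t1")

    # reducer.py -- sums the counts for each word; Hadoop Streaming delivers
    # the mapper output sorted by key, so identical words arrive together.
    import sys

    current, total = None, 0
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        word, count = line.rsplit("\t", 1)
        if word != current and current is not None:
            print(f"{current}\t{total}")
            total = 0
        current = word
        total += int(count)
    if current is not None:
        print(f"{current}\t{total}")

A typical invocation passes both scripts to the hadoop-streaming jar with the -files, -mapper, -reducer, -input and -output options; the exact jar path depends on the installation.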

Identify Hadoop Daemons
Examine the Hadoop Distributed File System (HDFS)
Choose Execution Mode: Local, Pseudo-distributed, Fully Distributed

Compare real-time processing models
Use Storm to extract live events
Fast processing with Spark and Shark (see the PySpark sketch after this list)
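
For the Spark part, a minimal PySpark word count gives a sense of the in-memory batch model; the local master setting and the input path data/events.txt are assumptions.

    # Minimal PySpark word count, assuming a local Spark installation
    # and an input file at the hypothetical path data/events.txt.
    from pyspark import SparkContext

    sc = SparkContext("local[*]", "wordcount-sketch")
    counts = (sc.textFile("data/events.txt")
                .flatMap(lambda line: line.split())
                .map(lambda word: (word, 1))
                .reduceByKey(lambda a, b: a + b))
    print(counts.take(10))  # first ten (word, count) pairs
    sc.stop()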

Communicate with Hadoop in Pig Latin
Execute commands with the Grunt shell
Streamline high-level processing

Duration

3 days

Price

£ 1804

Audience

Anyone wishing to take advantage of the many benefits of technologies dedicated to Big Data

Prerequisites

Have working knowledge of the Microsoft Windows platform

Familiarity with programming concepts is useful but not required

Reference

BAS100301-F

Ensure data persistence in the Hive Metastore
Launch queries with HiveQL (see the sketch after this list)
Examine the format of Hive files
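
As a sketch of launching a HiveQL query from a client, here is a minimal Python example assuming a HiveServer2 endpoint on localhost and the PyHive library; the web_logs table and its columns are hypothetical.

    # Run a HiveQL aggregation through HiveServer2 (assumed at localhost:10000);
    # the table web_logs and its columns are hypothetical.
    from pyhive import hive

    conn = hive.connect(host="localhost", port=10000)
    cursor = conn.cursor()
    cursor.execute("""
        SELECT status, COUNT(*) AS hits
        FROM web_logs
        GROUP BY status
    """)
    for status, hits in cursor.fetchall():
        print(status, hits)
    cursor.close()
    conn.close()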

Analyze data with Mahout, use reporting tools to display the results of processing
Query in real time with Impala

Define Big Data needs
Achieve objectives through relevant data
Evaluate the various market tools dedicated to Big Data
Meet the expectations of company personnel

Identify the importance of business processes
Identify the problem
Choose the right tools
Obtain exploitable results

Choose the right providers and hosting options
Find the right balance between the costs incurred and the value provided to the company
Stay ahead

Sessions

Contact us for more information about session dates