Goals
- Integrate Big Data components to build an appropriate data lake
- Select suitable Big Data warehouses to manage multiple data sets
- Process large data sets with Hadoop to facilitate technical and business decision making
- Query voluminous data sets in real time
Program
The four dimensions of Big Data: volume, velocity, variety, veracity
Presentation of the MapReduce framework, storage and queries
Measure the importance of Big Data within a company
Succeed in extracting useful data
Integrate Big Data with traditional data
Select the data sources to analyze
Remove duplicates
Define the role of NoSQL
Data models: key-value, graph, document, column family (a brief sketch follows this list)
Hadoop Distributed File System (HDFS)
HBase
Hive
Cassandra
Hypertable
Amazon S3
BigTable
DynamoDB
MongoDB
Redis
Riak
Neo4J
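To make the difference between these data models concrete, here is a minimal sketch contrasting a key-value store (Redis) with a document store (MongoDB). It assumes local servers on their default ports and the redis and pymongo client libraries; the keys, database and field names are purely illustrative.

    import redis
    from pymongo import MongoClient

    # Key-value model: opaque values addressed by a single key.
    kv = redis.Redis(host="localhost", port=6379)
    kv.set("customer:42:name", "Acme Ltd")
    print(kv.get("customer:42:name"))          # b'Acme Ltd'

    # Document model: self-describing, nested records that can be queried on their fields.
    db = MongoClient("mongodb://localhost:27017")["demo"]
    db.customers.insert_one({"_id": 42, "name": "Acme Ltd", "orders": [{"sku": "X1", "qty": 3}]})
    print(db.customers.find_one({"name": "Acme Ltd"}))

The key-value store treats each value as opaque and retrieves it only by its key, while the document store can filter on fields inside the nested record; this difference drives the warehouse choices discussed below.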
Choose a data warehouse based on the characteristics of your data
Inject code into the data
Implement multilingual data storage solutions
Choose a data warehouse capable of aligning with business objectives
Map data with a programming framework
Connect to the data and extract it from the storage warehouse
Transform the data for processing
Split data for Hadoop MapReduce
Create the components of a Hadoop MapReduce job (see the mapper/reducer sketch after this module)
Distribute data processing across multiple server farms
Run Hadoop MapReduce jobs
Monitor the progress of job flows
Identify Hadoop Daemons
Examine the Hadoop Distributed File System (HDFS)
Choose Execution Mode: Local, Pseudo-distributed, Fully Distributed
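To illustrate what the components of a MapReduce job look like, here is a minimal word-count sketch written for Hadoop Streaming, which pipes records through standard input and output. The file name, HDFS paths and the exact hadoop jar invocation are placeholders that depend on your installation.

    #!/usr/bin/env python3
    # wordcount.py -- minimal mapper/reducer pair for Hadoop Streaming.
    # Local test:  cat input.txt | python3 wordcount.py map | sort | python3 wordcount.py reduce
    # On a cluster (placeholder paths):
    #   hadoop jar hadoop-streaming.jar -files wordcount.py -input /data/in -output /data/out \
    #     -mapper "python3 wordcount.py map" -reducer "python3 wordcount.py reduce"
    import sys
    from itertools import groupby

    def mapper():
        # Emit one "word<TAB>1" record per word.
        for line in sys.stdin:
            for word in line.split():
                print(f"{word}\t1")

    def reducer():
        # Hadoop sorts the mapper output by key, so counts for a word arrive contiguously.
        pairs = (line.rstrip("\n").split("\t", 1) for line in sys.stdin)
        for word, group in groupby(pairs, key=lambda kv: kv[0]):
            print(f"{word}\t{sum(int(count) for _, count in group)}")

    if __name__ == "__main__":
        mapper() if sys.argv[1] == "map" else reducer()

Piping a text file through the mapper, sort and the reducer on a single machine mimics the shuffle phase and is a convenient way to test the job in local mode before distributing it.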
Compare real-time processing models
Use Storm to extract live events
Fast in-memory processing with Spark and Shark (a brief sketch follows)
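For comparison with the MapReduce version above, here is a rough sketch of the same word count expressed with Spark's Python API; it assumes a local Spark installation and uses a placeholder input path.

    from pyspark.sql import SparkSession

    # Start a local Spark session (placeholder application name).
    spark = SparkSession.builder.appName("wordcount-sketch").master("local[*]").getOrCreate()

    counts = (
        spark.sparkContext.textFile("hdfs:///data/in")   # placeholder path
        .flatMap(lambda line: line.split())              # one record per word
        .map(lambda word: (word, 1))
        .reduceByKey(lambda a, b: a + b)                 # shuffle and sum per word
    )
    for word, total in counts.take(10):
        print(word, total)
    spark.stop()

Because Spark keeps intermediate results in memory and evaluates the chain of transformations lazily, iterative and interactive workloads typically run much faster than the equivalent sequence of MapReduce jobs.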
Communicate with Hadoop in Pig Latin
Execute commands with the Grunt shell
Streamline high-level processing
Ensure data persistence in the Hive Metastore
Launch queries with HiveQL (a brief sketch follows this module)
Examine Hive file formats
Analyze data with Mahout
Use reporting tools to display processing results
Query data in real time with Impala
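As a taste of HiveQL, the sketch below submits a query to a HiveServer2 endpoint from Python, assuming the pyhive client library; the host, database, table and column names are all hypothetical.

    from pyhive import hive

    conn = hive.Connection(host="hive-server.example.com", port=10000, database="sales")
    cur = conn.cursor()
    # HiveQL: aggregate a (hypothetical) partitioned orders table.
    cur.execute("""
        SELECT region, COUNT(*) AS nb_orders, SUM(amount) AS revenue
        FROM orders
        WHERE ds = '2024-01-01'
        GROUP BY region
        ORDER BY revenue DESC
    """)
    for region, nb_orders, revenue in cur.fetchall():
        print(region, nb_orders, revenue)
    conn.close()

Impala accepts a very similar SQL dialect over the same tables, so a comparable query can typically be submitted through an Impala client (for example impyla) when lower-latency, interactive answers are needed.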
Define Big Data needs
Achieve objectives through relevant data
Evaluate the various Big Data tools on the market
Meet the expectations of company personnel
Identify the importance of business processes
Identify the problem
Choose the right tools
Obtain exploitable results
Choose the right providers and hosting options
Find the right balance between costs incurred and value delivered to the company
Stay ahead
Duration
3 days
Price
£1,804
Audience
Anyone wishing to take advantage of the many benefits of technologies dedicated to Big Data
Prerequisites
Working knowledge of the Microsoft Windows platform
Programming concepts are useful but not compulsory
Reference
BAS100301-F
Sessions
Contact us for more information about session dates