Goals
- Apply data mining techniques to internal and external data sources to improve business decision making
- Get ahead of your competition by analyzing structured and unstructured data
- Predict an outcome by using supervised machine learning techniques
Program
Load, query and manipulate data with R
Clean up raw data before modeling
Reduce dimensions with principal component analysis (PCA)
Extend the functionality of R with user-defined packages
Explore the characteristics of a dataset through visualization
Graph the distribution of data with boxplots, histograms and density plots
Identify outliers
Preprocess and prepare unstructured data for further analysis
Describe a set of documents with a term-document matrix
Examine MapReduce and Hadoop architectures
Integrate R and Hadoop with RHadoop
Model the relationship between an output variable and several input variables
Correctly interpret the coefficients of continuous and categorical variables
Process large datasets with RHadoop
Create regression models with RHadoop
Use decision trees to predict target values
Apply probability rules to predict outcomes with the Naive Bayes model
Combine tree-based predictors into random forests in RHadoop
Visualize model performance with an ROC curve
Evaluate classification models with confusion matrices
Segment the customer market with the K-Means algorithm
Find similarities with distance measures
Build tree-shaped clusters with hierarchical clustering
Cluster tweets and text files to better understand them
Identify important connections with social media analytics
Understand the use of social media analytics results for marketing purposes
Identify real customer preferences from a set of transactional data to improve user experience
Calculate support, confidence and lift to distinguish good rules from bad ones
Duration
5 days
Price
£2956
Audience
Database professionals, managers, data analysts, data scientists and project management assistants
Professionals responsible for managing forecasts and trends
Prerequisites
Knowledge of programming and statistics is useful but not compulsory
Reference
BUS100294-F
Sessions
Contact us for more information about session dates