Preface
Chapter 1: Practical Machine Learning with R
  Introduction
  Downloading and installing R
  Downloading and installing RStudio
  Installing and loading packages
  Reading and writing data
  Using R to manipulate data
  Applying basic statistics
  Visualizing data
  Getting a dataset for machine learning
Chapter 2: Data Exploration with RMS Titanic
  Introduction
  Reading a Titanic dataset from a CSV file
  Converting types on character variables
  Detecting missing values
  Imputing missing values
  Exploring and visualizing data
  Predicting passenger survival with a decision tree
  Validating the power of prediction with a confusion matrix
  Assessing performance with the ROC curve
Chapter 3: R and Statistics
  Introduction
  Understanding data sampling in R
  Operating a probability distribution in R
  Working with univariate descriptive statistics in R
  Performing correlations and multivariate analysis
  Operating linear regression and multivariate analysis
  Conducting an exact binomial test
  Performing Student's t-test
  Performing the Kolmogorov-Smirnov test
  Understanding the Wilcoxon Rank Sum and Signed Rank test
  Working with Pearson's Chi-squared test
  Conducting a one-way ANOVA
  Performing a two-way ANOVA
Chapter 4: Understanding Regression Analysis
  Introduction
  Fitting a linear regression model with lm
  Summarizing linear model fits
  Using linear regression to predict unknown values
  Generating a diagnostic plot of a fitted model
  Fitting a polynomial regression model with lm
  Fitting a robust linear regression model with rlm
  Studying a case of linear regression on SLID data
  Applying the Gaussian model for generalized linear regression
  Applying the Poisson model for generalized linear regression
  Applying the Binomial model for generalized linear regression
  Fitting a generalized additive model to data
  Visualizing a generalized additive model
  Diagnosing a generalized additive model
Chapter 5: Classification (I) - Tree, Lazy, and Probabilistic
  Introduction
  Preparing the training and testing datasets
  Building a classification model with recursive partitioning trees
  Visualizing a recursive partitioning tree
  Measuring the prediction performance of a recursive partitioning tree
  Pruning a recursive partitioning tree
  Building a classification model with a conditional inference tree
  Visualizing a conditional inference tree
  Measuring the prediction performance of a conditional inference tree
  Classifying data with the k-nearest neighbor classifier
  Classifying data with logistic regression
  Classifying data with the Naive Bayes classifier
Chapter 6: Classification (II) - Neural Network and SVM
  Introduction
  Classifying data with a support vector machine
  Choosing the cost of a support vector machine
  Visualizing an SVM fit
  Predicting labels based on a model trained by a support vector machine
  Tuning a support vector machine
  Training a neural network with neuralnet
  Visualizing a neural network trained by neuralnet
  Predicting labels based on a model trained by neuralnet
  Training a neural network with nnet
  Predicting labels based on a model trained by nnet
Chapter 7: Model Evaluation
  Introduction
  Estimating model performance with k-fold cross-validation
  Performing cross-validation with the e1071 package
  Performing cross-validation with the caret package
  Ranking the variable importance with the caret package
  Ranking the variable importance with the rminer package
  Finding highly correlated features with the caret package
  Selecting features using the caret package
  Measuring the performance of the regression model
  Measuring prediction performance with a confusion matrix
  Measuring prediction performance using ROCR
  Comparing an ROC curve using the caret package
  Measuring performance differences between models with the caret package
Chapter 8: Ensemble Learning
  Introduction
  Classifying data with the bagging method
  Performing cross-validation with the bagging method
  Classifying data with the boosting method
  Performing cross-validation with the boosting method
  Classifying data with gradient boosting
  Calculating the margins of a classifier
  Calculating the error evolution of the ensemble method
  Classifying data with random forest
  Estimating the prediction errors of different classifiers
Chapter 9: Clustering
  Introduction
  Clustering data with hierarchical clustering
  Cutting trees into clusters
  Clustering data with the k-means method
  Drawing a bivariate cluster plot
  Comparing clustering methods
  Extracting silhouette information from clustering
  Obtaining the optimum number of clusters for k-means
  Clustering data with the density-based method
  Clustering data with the model-based method
  Visualizing a dissimilarity matrix
  Validating clusters externally
Chapter 10: Association Analysis and Sequence Mining
  Introduction
  Transforming data into transactions
  Displaying transactions and associations
  Mining associations with the Apriori rule
  Pruning redundant rules
  Visualizing association rules
  Mining frequent itemsets with Eclat
  Creating transactions with temporal information
  Mining frequent sequential patterns with cSPADE
Chapter 11: Dimension Reduction
  Introduction
  Performing feature selection with FSelector
  Performing dimension reduction with PCA
  Determining the number of principal components using the scree test
  Determining the number of principal components using the Kaiser method
  Visualizing multivariate data using biplot
  Performing dimension reduction with MDS
  Reducing dimensions with SVD
  Compressing images with SVD
  Performing nonlinear dimension reduction with ISOMAP
  Performing nonlinear dimension reduction with Local Linear Embedding
Chapter 12: Big Data Analysis (R and Hadoop)
  Introduction
  Preparing the RHadoop environment
  Installing rmr2
  Installing rhdfs
  Operating HDFS with rhdfs
  Implementing a word count problem with RHadoop
  Comparing the performance between an R MapReduce program and a standard R program
  Testing and debugging the rmr2 program
  Installing plyrmr
  Manipulating data with plyrmr
  Conducting machine learning with RHadoop
  Configuring RHadoop clusters on Amazon EMR
Appendix A: Resources for R and Machine Learning
Appendix B: Dataset - Survival of Passengers on the Titanic
Index