Preface

Chapter 1: The era of big data
  Big data - the monster re-defined
  Big data toolbox - dealing with the giant
  Hadoop - the elephant in the room
  Databases
  Hadoop Spark-ed up
  R - the unsung big data hero
  Summary

Chapter 2: Introduction to R programming language and statistical environment
  Learning R
  Revisiting R basics
  Getting R and RStudio ready
  Setting the URLs to R repositories
  R data structures
  Vectors
  Scalars
  Matrices
  Arrays
  Data frames
  Lists
  Exporting R data objects
  Applied data science with R
  Importing data from different formats
  Exploratory data analysis
  Data aggregations and contingency tables
  Hypothesis testing and statistical inference
  Tests of differences
  Independent t-test example (with power and effect size estimates)
  ANOVA example
  Tests of relationships
  An example of Pearson's r correlations
  Multiple regression example
  Data visualization packages
  Summary

Chapter 3: Unleashing the power of R from within
  Traditional limitations of R
  Out-of-memory data
  Processing speed
  To the memory limits and beyond
  Data transformations and aggregations with the ff and ffbase packages
  Generalized linear models with the ff and ffbase packages
  Logistic regression example with ffbase and biglm
  Expanding memory with the bigmemory package
  Parallel R
  From bigmemory to faster computations
  An apply() example with the big.matrix object
  A for() loop example with the ffdf object
  Using apply() and for() loop examples on a data.frame
  A parallel package example
  A foreach package example
  The future of parallel processing in R
  Utilizing graphics processing units with R
  Multi-threading with Microsoft R Open distribution
  Parallel machine learning with H2O and R
  Boosting R performance with the data.table package and other tools
  Fast data import and manipulation with the data.table package
  Data import with data.table
  Lightning-fast subsets and aggregations on data.table
  Chaining, more complex aggregations, and pivot tables with data.table
  Writing better R code
  Summary

Chapter 4: Hadoop and MapReduce framework for R
  Hadoop architecture
  Hadoop Distributed File System
  MapReduce framework
  A simple MapReduce word count example
  Other Hadoop native tools
  Learning Hadoop
  A single-node Hadoop in cloud
  Deploying Hortonworks Sandbox on Azure
  A word count example in Hadoop using Java
  A word count example in Hadoop using the R language
  RStudio Server on a Linux RedHat/CentOS virtual machine
  Installing and configuring RHadoop packages
  HDFS management and MapReduce in R - a word count example
  HDInsight - a multi-node Hadoop cluster on Azure
  Creating your first HDInsight cluster
  Creating a new resource group
  Deploying a virtual network
  Creating a network security group
  Setting up and configuring an HDInsight cluster
  Starting the cluster and exploring Ambari
  Connecting to the HDInsight cluster and installing RStudio Server
  Adding a new inbound security rule for port 8787
  Editing the virtual network's public IP address for the head node
  Smart energy meter readings analysis example - using R on HDInsight cluster
  Summary

Chapter 5: R with relational database management systems (RDBMSs)
  Relational database management systems (RDBMSs)
  A short overview of used RDBMSs
  Structured Query Language (SQL)
  SQLite with R
  Preparing and importing data into a local SQLite database
  Connecting to SQLite from RStudio
  MariaDB with R on an EC2 instance
  Preparing the EC2 instance and RStudio Server for use
  Preparing MariaDB and data for use
  Working with MariaDB from RStudio
  PostgreSQL with R on RDS
  Launching an RDS database instance
  Preparing and uploading data to RDS
  Remotely querying PostgreSQL on RDS from RStudio
  Summary

Chapter 6: R with non-relational (NoSQL) databases
  Introduction to NoSQL databases
  Review of leading non-relational databases
  MongoDB with R
  Introduction to MongoDB
  MongoDB data models
  Installing MongoDB with R on EC2
  Processing big data using MongoDB with R
  Importing data into MongoDB and basic MongoDB commands
  MongoDB with R using the rmongodb package
  MongoDB with R using the RMongo package
  MongoDB with R using the mongolite package
  HBase with R
  Azure HDInsight with HBase and RStudio Server
  Importing the data to HDFS and HBase
  Reading and querying HBase using the rhbase package
  Summary

Chapter 7: Faster than Hadoop - Spark with R
  Spark for big data analytics
  Spark with R on a multi-node HDInsight cluster
  Launching HDInsight with Spark and R/RStudio
  Reading the data into HDFS and Hive
  Getting the data into HDFS
  Importing data from HDFS to Hive
  Bay Area Bike Share analysis using SparkR
  Summary

Chapter 8: Machine learning methods for big data in R
  What is machine learning?
  Supervised and unsupervised machine learning methods
  Classification and clustering algorithms
  Machine learning methods with R
  Big data machine learning tools
  GLM example with Spark and R on the HDInsight cluster
  Preparing the Spark cluster and reading the data from HDFS
  Logistic regression in Spark with R
  Naive Bayes with H2O on Hadoop with R
  Running an H2O instance on Hadoop with R
  Reading and exploring the data in H2O
  Naive Bayes on H2O with R
  Neural networks with H2O on Hadoop with R
  How do neural networks work?
  Running deep learning models on H2O
  Summary

Chapter 9: The future of R - big, fast, and smart data
  The current state of big data analytics with R
  Out-of-memory data on a single machine
  Faster data processing with R
  Hadoop with R
  Spark with R
  R with databases
  Machine learning with R
  The future of R
  Big data
  Fast data
  Smart data
  Where to go next
  Summary

Index