Machine Learning with Spark (Spark机器学习)

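Brand new, genuine edition, fast shipping

Price: ¥33.15 (49% of the ¥68 list price), brand new

In stock: 2 copies

Ships from: Guangzhou, Guangdong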


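Author: Nick Pentreath (彭特里思) (UK)

Publisher: Southeast University Press (东南大学出版社)

ISBN: 9787564160913

Publication date: 2016-01

Binding: Paperback

Format: 16开 (16mo)

List price: ¥68

Item number: 1201273924

Listed on: 2024-09-05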


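Product details

Condition: brand new

Product description: About the author
Nick Pentreath (彭特里思). If you are a Scala, Java, or Python developer with an interest in machine learning and data analysis, and you are eager to learn how to apply common machine learning techniques at scale using the Spark framework, this book is written for you. A basic understanding of Spark will help, but it is not required.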

Table of contents
Preface
Chapter 1: Getting Up and Running with Spark
Installing and setting up Spark locally
Spark clusters
The Spark programming model
SparkContext and SparkConf
The Spark shell
Resilient Distributed Datasets
Creating RDDs
Spark operations
Caching RDDs
Broadcast variables and accumulators
The first step to a Spark program in Scala
The first step to a Spark program in Java
The first step to a Spark program in Python
Getting Spark running on Amazon EC2
Launching an EC2 Spark cluster
Summary
Chapter 2: Designing a Machine Learning System
Introducing MovieStream
Business use cases for a machine learning system
Personalization
Targeted marketing and customer segmentation
Predictive modeling and analytics
Types of machine learning models
The components of a data-driven machine learning system
Data ingestion and storage
Data cleansing and transformation
Model training and testing loop
Model deployment and integration
Model monitoring and feedback
Batch versus real time
An architecture for a machine learning system
Practical exercise
Summary
Chapter 3: Obtaining, Processing, and Preparing Data with Spark
Accessing publicly available datasets
The MovieLens 100k dataset
Exploring and visualizing your data
Exploring the user dataset
Exploring the movie dataset
Exploring the rating dataset
Processing and transforming your data
Filling in bad or missing data
Extracting useful features from your data
Numerical features
Categorical features
Derived features
Transforming timestamps into categorical features
Text features
Simple text feature extraction
Normalizing features
Using MLlib for feature normalization
Using packages for feature extraction
Summary
Chapter 4: Building a Recommendation Engine with Spark
Types of recommendation models
Content-based filtering
Collaborative filtering
Matrix factorization
Extracting the right features from your data
Extracting features from the MovieLens 100k dataset
Training the recommendation model
Training a model on the MovieLens 100k dataset
Training a model using implicit feedback data
Using the recommendation model
User recommendations
Generating movie recommendations from the MovieLens 100k dataset
Item recommendations
Generating similar movies for the MovieLens 100k dataset
Evaluating the performance of recommendation models
Mean Squared Error
Mean average precision at K
Using MLlib's built-in evaluation functions
RMSE and MSE
MAP
Summary
Chapter 5: Building a Classification Model with Spark
Types of classification models
Linear models
Logistic regression
Linear support vector machines
The naive Bayes model
Decision trees
Extracting the right features from your data
Extracting features from the Kaggle/StumbleUpon evergreen classification dataset
Training classification models
Training a classification model on the Kaggle/StumbleUpon evergreen classification dataset
Using classification models
Generating predictions for the Kaggle/StumbleUpon evergreen classification dataset
Evaluating the performance of classification models
Accuracy and prediction error
Precision and recall
ROC curve and AUC
Improving model performance and tuning parameters
Feature standardization
Additional features
Using the correct form of data
Tuning model parameters
Linear models
Decision trees
The naive Bayes model
Cross-validation
Summary
Chapter 6: Building a Regression Model with Spark
Types of regression models
Least squares regression
Decision trees for regression
Extracting the right features from your data
Extracting features from the bike sharing dataset
Creating feature vectors for the linear model
Creating feature vectors for the decision tree
Training and using regression models
Training a regression model on the bike sharing dataset
Evaluating the performance of regression models
Mean Squared Error and Root Mean Squared Error
Mean Absolute Error
Root Mean Squared Log Error
The R-squared coefficient
Computing performance metrics on the bike sharing dataset
Linear model
Decision tree
Improving model performance and tuning parameters
Transforming the target variable
Impact of training on log-transformed targets
Tuning model parameters
Creating training and testing sets to evaluate parameters
The impact of parameter settings for linear models
The impact of parameter settings for the decision tree
Summary
Chapter 7: Building a Clustering Model with Spark
Types of clustering models
K-means clustering
Initialization methods
Variants
Mixture models
Hierarchical clustering
Extracting the right features from your data
Extracting features from the MovieLens dataset
Extracting movie genre labels
Training the recommendation model
Normalization
Training a clustering model
Training a clustering model on the MovieLens dataset
Making predictions using a clustering model
Interpreting cluster predictions on the MovieLens dataset
Interpreting the movie clusters
Evaluating the performance of clustering models
Internal evaluation metrics
External evaluation metrics
Computing performance metrics on the MovieLens dataset
Tuning parameters for clustering models
Selecting K through cross-validation
Summary
Chapter 8: Dimensionality Reduction with Spark
Types of dimensionality reduction
Principal Components Analysis
Singular Value Decomposition
Relationship with matrix factorization
Clustering as dimensionality reduction
Extracting the right features from your data
Extracting features from the LFW dataset
Exploring the face data
Visualizing the face data
Extracting facial images as vectors
Normalization
Training a dimensionality reduction model
Running PCA on the LFW dataset
Visualizing the Eigenfaces
Interpreting the Eigenfaces
Using a dimensionality reduction model
Projecting data using PCA on the LFW dataset
The relationship between PCA and SVD
Evaluating dimensionality reduction models
Evaluating k for SVD on the LFW dataset
Summary
Chapter 9: Advanced Text Processing with Spark
What's so special about text data?
Extracting the right features from your data
Term
