Items related to PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes - Softcover

 
9781484243367: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

This ISBN edition is no longer available.

Synopsis

Chapter 1: Introduction to PySparkSQL

Chapter Goal: The reader will gain an understanding of PySpark, PySparkSQL, the Catalyst Optimizer, Project Tungsten, and Hive.

No. of pages: 20-30

Sub-Topics:

1. PySpark

2. PySparkSQL

3. Hive

4. Catalyst

5. Project Tungsten

 

Chapter 2: Some Time with Installation

Chapter Goal: The learner will understand how to install Spark, Hive, PostgreSQL, MySQL, MongoDB, Cassandra, etc.

No. of pages: 30-40

Sub-Topics:

1. Installing Spark

2. Installing Hive

3. Installing MySQL

4. Installing MongoDB

Chapter 3: IO in PySparkSQL

Chapter Goal: This chapter provides recipes that enable the reader to create PySparkSQL DataFrames from different sources.

No. of pages: 40-50

Sub-Topics:

1. Creating a DataFrame from data

2. Reading a CSV file to create a DataFrame

3. Reading a JSON file to create a DataFrame

4. Saving DataFrames to different formats

 

Chapter 4: Operations on PySparkSQL DataFrames

Chapter Goal: The reader will learn about data filtering, data manipulation, descriptive data analysis, dealing with missing values, etc.

No. of pages: 40-50

1. Data filtering

2. Data manipulation

3. Row and column manipulation

 

Chapter 5: Data Merging and Data Aggregation using PySparkSQL

Chapter Goal: The reader will learn about data merging and aggregation using PySparkSQL.

1. Data merging

2. Data aggregation

 

Chapter 6: SQL, NoSQL and PySparkSQL

Chapter Goal: The reader will learn to run SQL and HiveQL queries on DataFrames.

No. of pages: 30-40

Sub-Topics:

1. Running SQL on a DataFrame

2. Running HiveQL

 

Chapter 7: Structured Streaming

Chapter Goal: The reader will gain an understanding of structured streaming.

No. of pages: 30-40

1. Different types of modes

2. Data aggregation in structured streaming

3. Different types of sources

 

 

 

 

Chapter 8: Optimizing PySparkSQL

Chapter Goal: The reader will learn about optimizing PySparkSQL.

No. of pages: 20-30

Optimizing PySparkSQL

 

 

 

Chapter 9: GraphFrames

Chapter Goal: The reader will gain an understanding of graph data analysis with GraphFrames.

No. of pages: 30-40

1. GraphFrame Creat

"Synopsis" may belong to another edition of this book.

(No copies available)


Other popular editions of the same title

9781484243343: PySpark SQL Recipes: With HiveQL, Dataframe and Graphframes

Featured Edition

ISBN 10: 148424334X ISBN 13: 9781484243343
Publisher: Apress, 2019
Softcover