Hands-On Big Data Analytics with PySpark: Analyze large datasets and discover techniques for testing, immunizing, and parallelizing Spark jobs 9781838644130, 9781786463708, 9781788835367, 183864413X

Use PySpark to easily crush messy data at-scale and discover proven techniques to create testable, immutable, and easily

Language: English. Pages: 182. Year: 2019.

Authors: Rudy Lai, Bartlomiej Potaczek

Table of contents:
Installing Pyspark and Setting up Your Development Environment
Getting Your Big Data into the Spark Environment Using RDDs
Big Data Cleaning and Wrangling with Spark Notebooks
Aggregating and Summarizing Data into Useful Reports
Powerful Exploratory Data Analysis with MLlib
Putting Structure on Your Big Data with SparkSQL
Transformations and Actions
Immutable Design
Avoiding Shuffle and Reducing Operational Expenses
Saving Data in the Correct Format
Working with the Spark Key/Value API
Testing Apache Spark Jobs
Leveraging the Spark GraphX API

Recommended titles

PySpark Cookbook: Over 60 Recipes for Implementing Big Data Processing and Analytics Using Apache Spark and Python 1788835360, 9781788835367

Combine the power of Apache Spark and Python to build effective big data applications Key Features Perform effective dat

Frank Kane's Taming Big Data with Apache Spark and Python

Data Processing with Optimus: Supercharge big data preparation tasks for analytics and machine learning with Optimus using Dask and PySpark 1801079560, 9781801079563

Written by the core Optimus team, this comprehensive guide will help you to understand how Optimus improves the whole da

Scala Programming for Big Data Analytics: Get Started With Big Data Analytics Using Apache Spark [1st ed.] 978-1-4842-4809-6;978-1-4842-4810-2

Gain the key language concepts and programming techniques of Scala in the context of big data analytics and Apache Spark

Data Analysis with Python and PySpark

Machine Learning with Spark and Python: Essential Techniques for Predictive Analytics [2 ed.] 1119561930, 9781119561934

Machine Learning with Spark and Python Essential Techniques for Predictive Analytics, Second Edition simplifies ML for p

Machine Learning with Spark™ and Python®: Essential Techniques for Predictive Analytics [2nd ed.] 1119561930, 9781119561934

Data Algorithms with Spark: Recipes and Design Patterns for Scaling Up using PySpark [1 ed.] 1492082384, 9781492082385

Apache Spark's speed, ease of use, sophisticated analytics, and multilanguage support makes practical knowledge of

Scalable Data Analytics with Azure Data Explorer Modern Ways to Query, Analyze, and Perform Real-Time Data Analysis on Large Volumes of Data. 9781801078542, 9781801079426, 1801079420
