Python for Data Science For Dummies [3 ed.] 9781394213092, 9781394213146, 9781394213085

Let Python do the heavy lifting for you as you analyze large datasets. Python for Data Science For Dummies lets you get y

English Pages 464 Year 2023

Table of contents:
Cover
Table of Contents
Title Page
Copyright
Introduction
About This Book
Foolish Assumptions
Icons Used in This Book
Beyond the Book
Where to Go from Here
Part 1: Getting Started with Data Science and Python
Chapter 1: Discovering the Match between Data Science and Python
Understanding Python as a Language
Defining Data Science
Creating the Data Science Pipeline
Understanding Python’s Role in Data Science
Learning to Use Python Fast
Chapter 2: Introducing Python’s Capabilities and Wonders
Working with Python
Performing Rapid Prototyping and Experimentation
Considering Speed of Execution
Visualizing Power
Using the Python Ecosystem for Data Science
Chapter 3: Setting Up Python for Data Science
Working with Anaconda
Installing Anaconda on Windows
Installing Anaconda on Linux
Installing Anaconda on Mac OS X
Downloading the Datasets and Example Code
Chapter 4: Working with Google Colab
Defining Google Colab
Working with Notebooks
Performing Common Tasks
Using Hardware Acceleration
Executing the Code
Viewing Your Notebook
Sharing Your Notebook
Getting Help
Part 2: Getting Your Hands Dirty with Data
Chapter 5: Working with Jupyter Notebook
Using Jupyter Notebook
Performing Multimedia and Graphic Integration
Chapter 6: Working with Real Data
Uploading, Streaming, and Sampling Data
Accessing Data in Structured Flat-File Form
Sending Data in Unstructured File Form
Managing Data from Relational Databases
Interacting with Data from NoSQL Databases
Accessing Data from the Web
Chapter 7: Processing Your Data
Juggling between NumPy and pandas
Validating Your Data
Manipulating Categorical Variables
Dealing with Dates in Your Data
Dealing with Missing Data
Slicing and Dicing: Filtering and Selecting Data
Concatenating and Transforming
Aggregating Data at Any Level
Chapter 8: Reshaping Data
Using the Bag of Words Model to Tokenize Data
Working with Graph Data
Chapter 9: Putting What You Know into Action
Contextualizing Problems and Data
Considering the Art of Feature Creation
Performing Operations on Arrays
Part 3: Visualizing Information
Chapter 10: Getting a Crash Course in Matplotlib
Starting with a Graph
Setting the Axis, Ticks, and Grids
Defining the Line Appearance
Using Labels, Annotations, and Legends
Chapter 11: Visualizing the Data
Choosing the Right Graph
Creating Advanced Scatterplots
Plotting Time Series
Plotting Geographical Data
Visualizing Graphs
Part 4: Wrangling Data
Chapter 12: Stretching Python’s Capabilities
Playing with Scikit-learn
Using Transformative Functions
Considering Timing and Performance
Running in Parallel on Multiple Cores
Chapter 13: Exploring Data Analysis
The EDA Approach
Defining Descriptive Statistics for Numeric Data
Counting for Categorical Data
Creating Applied Visualization for EDA
Understanding Correlation
Working with Cramér's V
Modifying Data Distributions
Chapter 14: Reducing Dimensionality
Understanding SVD
Performing Factor Analysis and PCA
Understanding Some Applications
Chapter 15: Clustering
Clustering with K-means
Performing Hierarchical Clustering
Discovering New Groups with DBScan
Chapter 16: Detecting Outliers in Data
Considering Outlier Detection
Examining a Simple Univariate Method
Developing a Multivariate Approach
Part 5: Learning from Data
Chapter 17: Exploring Four Simple and Effective Algorithms
Guessing the Number: Linear Regression
Moving to Logistic Regression
Making Things as Simple as Naïve Bayes
Learning Lazily with Nearest Neighbors
Chapter 18: Performing Cross-Validation, Selection, and Optimization
Pondering the Problem of Fitting a Model
Cross-Validating
Selecting Variables Like a Pro
Pumping Up Your Hyperparameters
Chapter 19: Increasing Complexity with Linear and Nonlinear Tricks
Using Nonlinear Transformations
Regularizing Linear Models
Fighting with Big Data Chunk by Chunk
Understanding Support Vector Machines
Playing with Neural Networks
Chapter 20: Understanding the Power of the Many
Starting with a Plain Decision Tree
Getting Lost in a Random Forest
Boosting Predictions
Part 6: The Part of Tens
Chapter 21: Ten Essential Data Resources
Discovering the News with Reddit
Getting a Good Start with KDnuggets
Locating Free Learning Resources with Quora
Gaining Insights with Oracle’s AI & Data Science Blog
Accessing the Huge List of Resources on Data Science Central
Discovering New Beginner Data Science Methodologies at Data Science 101
Obtaining the Most Authoritative Sources at Udacity
Receiving Help with Advanced Topics at Conductrics
Obtaining the Facts of Open Source Data Science from Springboard
Zeroing In on Developer Resources with Jonathan Bower
Chapter 22: Ten Data Challenges You Should Take
Removing Personally Identifiable Information
Creating a Secure Data Environment
Working with a Multiple-Data-Source Problem
Honing Your Overfit Strategies
Trudging Through the MovieLens Dataset
Locating the Correct Data Source
Working with Handwritten Information
Working with Pictures
Identifying Data Lineage
Interacting with a Huge Graph
Index
About the Authors
Connect with Dummies
End User License Agreement
