Network Science with Python: Explore the networks around us using network science, social network analysis, and machine learning 1801073694, 9781801073691

Discover the use of graph networks to develop a new approach to data science using theoretical and practical methods wit

English Pages 414 Year 2023

Table of contents:
Cover
Title Page
Copyright and Credits
Acknowledgements
Contributors
Table of Contents
Preface
Part 1: Getting Started with Natural Language Processing and Networks
Chapter 1: Introducing Natural Language Processing
Technical requirements
What is NLP?
Why NLP in a network analysis book?
A very brief history of NLP
How has NLP helped me?
Simple text analysis
Community sentiment analysis
Answer previously unanswerable questions
Safety and security
Common uses for NLP
True/False – Presence/Absence
Regular expressions (regex)
Word counts
Sentiment analysis
Information extraction
Community detection
Clustering
Advanced uses of NLP
Chatbots and conversational agents
Language modeling
Text summarization
Topic discovery and modeling
Text-to-speech and speech-to-text conversion
MT
Personal assistants
How can a beginner get started with NLP?
Start with a simple idea
Accounts that post most frequently
Accounts mentioned most frequently
Top 10 data science hashtags
Additional questions or action items from simple analysis
Summary
Chapter 2: Network Analysis
The confusion behind networks
What is this network stuff?
Graph theory
Social network analysis
Network science
Resources for learning about network analysis
Notebook interfaces
IDEs
Network datasets
Kaggle datasets
NetworkX and scikit-network graph generators
Creating your own datasets
NetworkX and articles
Common network use cases
Mapping production dataflow
Mapping community interactions
Mapping literary social networks
Mapping historical social networks
Mapping language
Mapping dark networks
Market research
Finding specific content
Creating ML training data
Advanced network use cases
Graph ML
Recommendation systems
Getting started with networks
Example – K-pop implementation
Summary
Further reading
Chapter 3: Useful Python Libraries
Technical requirements
Using notebooks
Data analysis and processing
pandas
NumPy
Data visualization
Matplotlib
Seaborn
Plotly
NLP
Natural Language Toolkit
Setup
Starter functionality
Documentation
spaCy
Network analysis and visualization
NetworkX
scikit-network
ML
scikit-learn
Karate Club
spaCy (revisited)
Summary
Part 2: Graph Construction and Cleanup
Chapter 4: NLP and Network Synergy
Technical requirements
Why are we learning about NLP in a network book?
Asking questions to tell a story
Introducing web scraping
Introducing BeautifulSoup
Loading and scraping data with BeautifulSoup
Choosing between libraries, APIs, and source data
Using NLTK for PoS tagging
Using spaCy for PoS tagging and NER
SpaCy PoS tagging
SpaCy NER
Converting entity lists into network data
Converting network data into networks
Doing a network visualization spot check
Additional NLP and network considerations
Data cleanup
Comparing PoS tagging and NER
Scraping considerations
Summary
Chapter 5: Even Easier Scraping!
Technical requirements
Why cover Requests and BeautifulSoup?
Introducing Newspaper3k
What is Newspaper3k?
What are Newspaper3k’s uses?
Getting started with Newspaper3k
Scraping all news URLs from a website
Scraping a news story from a website
Scraping nicely and blending in
Converting text into network data
End-to-end Newspaper3k scraping and network visualization
Introducing the Twitter Python Library
What is the Twitter Python Library?
What are the Twitter Library’s uses?
What data can be harvested from Twitter?
Getting Twitter API access
Authenticating with Twitter
Scraping user tweets
Scraping user following
Scraping user followers
Scraping using search terms
Converting Twitter tweets into network data
End-to-end Twitter scraping
Summary
Chapter 6: Graph Construction and Cleaning
Technical requirements
Creating a graph from an edge list
Types of graphs
Summarizing graphs
Listing nodes
Removing nodes
Quick visual inspection
Adding nodes
Adding edges
Renaming nodes
Removing edges
Persisting the network
Simulating an attack
Summary
Part 3: Network Science and Social Network Analysis
Chapter 7: Whole Network Analysis
Technical requirements
Creating baseline WNA questions
Revised SNA questions
Social network analysis revisited
WNA in action
Loading data and creating networks
Network size and complexity
Network visualization and thoughts
Important nodes
Degrees
Degree centrality
Betweenness centrality
Closeness centrality
PageRank
Edge centralities
Comparing centralities
Visualizing subgraphs
Investigating islands and continents – connected components
Communities
Bridges
Understanding layers with k_core and k_corona
k_core
k_corona
Challenge yourself!
Summary
Chapter 8: Egocentric Network Analysis
Technical requirements
Egocentric network analysis
Uses for egocentric network analysis
Explaining the analysis methodology
Investigating ego nodes and connections
Ego 1 – Valjean
Ego 2 – Marius
Ego 3 – Gavroche
Ego 4 – Joly
Insights between egos
Identifying other research opportunities
Summary
Chapter 9: Community Detection
Technical requirements
Introducing community detection
Getting started with community detection
Exploring connected components
Using the Louvain method
How does it work?
The Louvain method in action!
Using label propagation
How does it work?
Label propagation in action!
Using the Girvan-Newman algorithm
How does it work?
Girvan-Newman algorithm in action!
Other approaches to community detection
Summary
Chapter 10: Supervised Machine Learning on Network Data
Technical requirements
Introducing ML
Beginning with ML
Data preparation and feature engineering
Degrees
Clustering
Triangles
Betweenness centrality
Closeness centrality
PageRank
Adjacency matrix
Merging DataFrames
Adding labels
Selecting a model
Preparing the data
Training and validating the model
Model insights
Other use cases
Summary
Chapter 11: Unsupervised Machine Learning on Network Data
Technical requirements
What is unsupervised ML?
Introducing Karate Club
Network science options
Uses of unsupervised ML on network data
Community detection
Graph embeddings
Constructing a graph
Community detection in action
SCD
Graph embeddings in action
FEATHER
NodeSketch
RandNE
Other models
Using embeddings in supervised ML
Pros and cons
Loss of explainability and insights
An easier workflow for classification and clustering
Summary
Index
Other Books You May Enjoy
