Data Visualization with Python and JavaScript: Scrape, Clean, Explore, and Transform Your Data, 2nd Edition
- Length: 566 pages
- Edition: 2
- Language: English
- Publisher: O'Reilly Media
- Publication Date: 2023-01-17
- ISBN-10: 1098111877
- ISBN-13: 9781098111878
- Sales Rank: #789712
How do you turn raw, unprocessed, or malformed data into dynamic, interactive web visualizations? In this practical book, author Kyran Dale shows data scientists and analysts–as well as Python and JavaScript developers–how to create the ideal toolchain for the job. By providing engaging examples and stressing hard-earned best practices, this guide teaches you how to leverage the power of best-of-breed Python and JavaScript libraries.
Python provides accessible, powerful, and mature libraries for scraping, cleaning, and processing data. And while JavaScript is the best language when it comes to programming web visualizations, its data processing abilities can’t compare with Python’s. Together, these two languages are a perfect complement for creating a modern web-visualization toolchain. This book gets you started.
You’ll learn how to:
- Obtain data you need programmatically, using scraping tools or web APIs: Requests, Scrapy, Beautiful Soup
- Clean and process data using Python’s heavyweight data processing libraries within the NumPy ecosystem: Jupyter notebooks with pandas+Matplotlib+Seaborn
- Deliver the data to a browser with static files or by using Flask, the lightweight Python server, and a RESTful API
- Pick up enough web development skills (HTML, CSS, JS) to get your visualized data on the web
- Use the data you’ve mined and refined to create web charts and visualizations with Plotly, D3, Leaflet, and other libraries
Preface: Part I: Basic Toolkit Part II: Getting Your Data Part III: Cleaning and Exploring Data with pandas Part IV: Delivering the Data Part V: Visualizing Your Data with D3 and Plotly The Second Edition Conventions Used in This Book Using Code Examples O’Reilly Online Learning How to Contact Us Acknowledgments Second Edition
Introduction: Who This Book Is For Minimal Requirements to Use This Book Why Python and JavaScript? Why Not Python in the Browser? Why Python for Data Processing Java R Others Python’s Getting Better All the Time What You’ll Learn The Choice of Libraries Preliminaries The Dataviz Toolchain 1. Scraping Data with Scrapy 2. Cleaning Data with pandas 3. Exploring Data with pandas and Matplotlib 4. Delivering Your Data with Flask 5. Transforming Data into Interactive Visualizations with Plotly and D3 Smaller Libraries Using the Book A Little Bit of Context Summary Recommended Books
I. Basic Toolkit
1. Development Setup: The Accompanying Code Python Anaconda Installing Extra Libraries Virtual Environments JavaScript Content Delivery Networks Installing Libraries Locally Databases Getting MongoDB Up and Running Easy MongoDB with Docker Integrated Development Environments Summary
2. A Language-Learning Bridge Between Python and JavaScript: Similarities and Differences Interacting with the Code Python JavaScript Basic Bridge Work Style Guidelines, PEP 8, and use strict CamelCase Versus Underscore Importing Modules, Including Scripts JavaScript Modules Keeping Your Namespaces Clean Outputting “Hello World!” Simple Data Processing String Construction Significant Whitespace Versus Curly Brackets Comments and Doc-Strings Declaring Variables Using let or var Strings and Numbers Booleans Data Containers: dicts, objects, lists, Arrays Functions Iterating: for Loops and Functional Alternatives Conditionals: if, else, elif, switch File Input and Output Classes and Prototypes Differences in Practice Method Chaining Enumerating a List Tuple Unpacking Collections Underscore Functional Array Methods and List Comprehensions Map, Reduce, and Filter with Python’s Lambdas JavaScript Closures and the Module Pattern A Cheat Sheet Summary
3. Reading and Writing Data with Python: Easy Does It Passing Data Around Working with System Files CSV, TSV, and Row-Column Data Formats JSON Dealing with Dates and Times SQL Creating the Database Engine Defining the Database Tables Adding Instances with a Session Querying the Database Easier SQL with Dataset MongoDB Dealing with Dates, Times, and Complex Data Summary
4. Webdev 101: The Big Picture Single-Page Apps Tooling Up The Myth of IDEs, Frameworks, and Tools A Text-Editing Workhorse Browser with Development Tools Terminal or Command Prompt Building a Web Page Serving Pages with HTTP The DOM The HTML Skeleton Marking Up Content CSS JavaScript Data Chrome DevTools The Elements Tab The Sources Tab Other Tools A Basic Page with Placeholders Positioning and Sizing Containers with Flex Filling the Placeholders with Content Scalable Vector Graphics The <g> Element Circles Applying CSS Styles Lines, Rectangles, and Polygons Text Paths Scaling and Rotating Working with Groups Layering and Transparency JavaScripted SVG Summary
II. Getting Your Data
5. Getting Data Off the Web with Python: Getting Web Data with the Requests Library Getting Data Files with Requests Using Python to Consume Data from a Web API Consuming a RESTful Web API with Requests Getting Country Data for the Nobel Dataviz Using Libraries to Access Web APIs Using Google Spreadsheets Using the Twitter API with Tweepy Scraping Data Why We Need to Scrape Beautiful Soup and lxml A First Scraping Foray Getting the Soup Selecting Tags Crafting Selection Patterns Caching the Web Pages Scraping the Winners’ Nationalities Summary
6. Heavyweight Scraping with Scrapy: Setting Up Scrapy Establishing the Targets Targeting HTML with Xpaths Testing Xpaths with the Scrapy Shell Selecting with Relative Xpaths A First Scrapy Spider Scraping the Individual Biography Pages Chaining Requests and Yielding Data Caching Pages Yielding Requests Scrapy Pipelines Scraping Text and Images with a Pipeline Specifying Pipelines with Multiple Spiders Summary
III. Cleaning and Exploring Data with pandas
7. Introduction to NumPy: The NumPy Array Creating Arrays Array Indexing and Slicing A Few Basic Operations Creating Array Functions Calculating a Moving Average Summary
8. Introduction to pandas: Why pandas Is Tailor-Made for Dataviz Why pandas Was Developed Categorizing Data and Measurements The DataFrame Indices Rows and Columns Selecting Groups Creating and Saving DataFrames JSON CSV Excel Files SQL MongoDB Series into DataFrames Summary
9. Cleaning Data with pandas: Coming Clean About Dirty Data Inspecting the Data Indices and pandas Data Selection Selecting Multiple Rows Cleaning the Data Finding Mixed Types Replacing Strings Removing Rows Finding Duplicates Sorting Data Removing Duplicates Dealing with Missing Fields Dealing with Times and Dates The Full clean_data Function Adding the born_in column Merging DataFrames Saving the Cleaned Datasets Summary
10. Visualizing Data with Matplotlib: pyplot and Object-Oriented Matplotlib Starting an Interactive Session Interactive Plotting with pyplot’s Global State Configuring Matplotlib Setting the Figure’s Size Points, Not Pixels Labels and Legends Titles and Axes Labels Saving Your Charts Figures and Object-Oriented Matplotlib Axes and Subplots Plot Types Bar Charts Scatter Plots Adding a regression line seaborn FacetGrids PairGrids Summary
11. Exploring Data with pandas: Starting to Explore Plotting with pandas Gender Disparities Unstacking Groups Historical Trends National Trends Prize Winners Per Capita Prizes by Category Historical Trends in Prize Distribution Age and Life Expectancy of Winners Age at Time of Award Life Expectancy of Winners Increasing Life Expectancies over Time The Nobel Diaspora Summary
IV. Delivering the Data
12. Delivering the Data: Serving the Data Organizing Your Flask Files Serving Data with Flask Delivering Data Files Dynamic Data with Flask APIs A Simple Data API with Flask Using Static or Dynamic Delivery Summary
13. RESTful Data with Flask: The Tools for a RESTful Job Creating the Database A Flask RESTful Data Server Serializing with marshmallow Adding our RESTful API Routes Posting Data to the API Extending the API with MethodViews Paginating the Data Returns Deploying the API Remotely with Heroku CORS Consuming the API Using JavaScript Summary
V. Visualizing Your Data with D3 and Plotly
14. Bringing Your Charts to the Web with Matplotlib and Plotly: Static Charts with Matplotlib Adapting to Screen Sizes Using Remote Images or Assets Charting with Plotly Basic Charts Plotly Express Plotly Graph-Objects Mapping with Plotly Adding Custom Controls with Plotly From Notebook to Web with Plotly Native JavaScript Charts with Plotly Fetching JSON Files User-Driven Plotly with JavaScript and HTML Summary
15. Imagining a Nobel Visualization: Who Is It For? Choosing Visual Elements Menu Bar Prizes by Year A Map Showing Selected Nobel Countries A Bar Chart Showing Number of Winners by Country A List of the Selected Winners A Mini-Biography Box with Picture The Complete Visualization Summary
16. Building a Visualization: Preliminaries Core Components Organizing Your Files Serving the Data The HTML Skeleton CSS Styling The JavaScript Engine Importing the Scripts Modular JS with Imports Basic Data Flow The Core Code Initializing the Nobel Prize Visualization Ready to Go Data-Driven Updates Filtering Data with Crossfilter Creating the filter Running the Nobel Prize Visualization App Summary
17. Introducing D3—The Story of a Bar Chart: Framing the Problem Working with Selections Adding DOM Elements Leveraging D3 Measuring Up with D3’s Scales Quantitative Scales Ordinal Scales Unleashing the Power of D3 with Data Binding/Joining Updating the DOM with Data Putting the Bar Chart Together Axes and Labels Transitions Updating the Bar Chart Summary
18. Visualizing Individual Prizes: Building the Framework Scales Axes Category Labels Nesting the Data Adding the Winners with a Nested Data-Join A Little Transitional Sparkle Updating the Bar Chart Summary
19. Mapping with D3: Available Maps D3’s Mapping Data Formats GeoJSON TopoJSON Converting Maps to TopoJSON D3 Geo, Projections, and Paths Projections Paths graticules Putting the Elements Together Updating the Map Adding Value Indicators Our Completed Map Building a Simple Tooltip Updating the Map Summary
20. Visualizing Individual Winners: Building the List Building the Bio-Box Updating the Winners List Summary
21. The Menu Bar: Creating HTML Elements with D3 Building the Menu Bar Building the Category Selector Adding the Gender Selector Adding the Country Selector Wiring Up the Metric Radio Button Summary
22. Conclusion: Recap Part I: Basic Toolkit Part II: Getting Your Data Part III: Cleaning and Exploring Data with pandas Part IV: Delivering the Data Part V: Visualizing Your Data with D3 and Plotly Future Progress Visualizing Social Media Networks Machine-Learning Visualizations Final Thoughts
A. D3’s enter/exit Pattern: The enter Method Accessing the Bound Data
Index
How to download source code?
1. Go to: https://www.oreilly.com/
2. Search for the book title: Data Visualization with Python and JavaScript: Scrape, Clean, Explore, and Transform Your Data, 2nd Edition. If the full title returns no results, search for the main title only.
3. Click the book title in the search results.
4. In the Publisher resources section, click Download Example Code.