Comparing API Access and Web Scraping for Python Data Retrieval

Learn and practice both APIs and web scraping to access structured and unstructured data.

Gain clarity on two essential Python techniques for data retrieval. Understand when to leverage structured APIs and when web scraping becomes necessary.

Key Insights

  • Using an API involves obtaining an API key, understanding request types, and navigating structured data responses, providing a clear, user-friendly way to access data.
  • Web scraping requires analyzing HTML page structures, identifying relevant data across multiple pages, and manually structuring extracted information into usable formats, making it considerably more complex than using a structured API.
  • Mastering both structured API data retrieval and web scraping broadens your Python skill set, allowing access to almost any type of publicly available data.

Note: These materials offer prospective students a preview of how our classes are structured. Students enrolled in this course will receive access to the full set of materials, including video lectures, project-based assignments, and instructor feedback.

Let's compare the two ways we've gone over today for accessing data with Python. First, the API. To use an API, we had to understand the API itself.

We had to get set up with it, get an API key, and work out what kind of request we wanted to make. Then we had to understand the shape of the data we got back so that we could pull out specific elements of it: the time series, the date, the closing value. But this came with an amazing advantage: the data was structured.

That structure was already there. We had to understand it, and we had to understand how to use the API, but it was intended to be used by people.
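As a rough illustration of those API steps, here is a minimal sketch. The endpoint URL, the parameter names, and the response keys are all hypothetical stand-ins rather than the actual API used in the lesson, and the response is a hand-written sample so the code runs without a network call or a real key.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only.
BASE_URL = "https://api.example.com/query"


def build_request_url(symbol: str, api_key: str) -> str:
    """Build a GET URL carrying the request type, the symbol, and the API key."""
    # These parameter names are assumptions; a real API documents its own.
    params = {"function": "TIME_SERIES_DAILY", "symbol": symbol, "apikey": api_key}
    return f"{BASE_URL}?{urlencode(params)}"


# Hand-written sample shaped like a dictionary within dictionaries.
# With a real API this would come from something like requests.get(url).json().
sample_response = {
    "Meta Data": {"Symbol": "XYZ"},
    "Time Series (Daily)": {
        "2024-01-03": {"open": "101.0", "close": "102.5"},
        "2024-01-02": {"open": "100.0", "close": "101.0"},
    },
}


def closing_values(response: dict) -> list[tuple[str, float]]:
    """Walk the nested dictionaries and return (date, closing value) pairs."""
    time_series = response["Time Series (Daily)"]
    return [(date, float(day["close"])) for date, day in sorted(time_series.items())]


url = build_request_url("XYZ", "YOUR_API_KEY")
closes = closing_values(sample_response)
print(url)
print(closes)
```

The point of the sketch is that once the shape of the response is understood, getting from the time series to a date and its closing value takes only a couple of dictionary lookups.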

It was designed carefully, and a lot of resources were put into that. A page like the one we just looked at does not have that work put in. Instead, to do something like what we just did, we needed to examine the page structure itself to see how we could get the data we wanted out of it.

Instead of arriving as a somewhat challenging dictionary within dictionaries, the data was just HTML, and we had to provide the structure ourselves to make it into a data frame, which involved quite a lot of work. We even had to find out how much data there was (50 pages of it), then scrape very specific parts of each page, and then do some more work to get it into the right format. So, a similar kind of structure overall, but a very different approach.
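The scraping side can be sketched in the same spirit. This example makes several assumptions: the HTML snippets, the table layout, and the column names are invented for illustration, two in-memory strings stand in for the 50 fetched pages, and the standard library's `html.parser` is used so the sketch runs without extra packages (the lesson itself may have used a different scraping library).

```python
from html.parser import HTMLParser


class RowParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows hold only <th> cells, so they stay empty
                self.rows.append([cell.strip() for cell in self._row])
            self._row = None


# Hypothetical pages; a real scraper would download each page's HTML first.
sample_pages = {
    1: "<table><tr><th>Title</th><th>Price</th></tr>"
       "<tr><td>Book A</td><td>$10.50</td></tr>"
       "<tr><td>Book B</td><td>$7.25</td></tr></table>",
    2: "<table><tr><th>Title</th><th>Price</th></tr>"
       "<tr><td>Book C</td><td>$12.00</td></tr></table>",
}


def scrape_page(html: str) -> list[dict]:
    """Turn one page of raw HTML into structured records."""
    parser = RowParser()
    parser.feed(html)
    return [{"title": title, "price": float(price.lstrip("$"))}
            for title, price in parser.rows]


# Loop over every page and accumulate the records into one list.
records = []
for page_number in sorted(sample_pages):
    records.extend(scrape_page(sample_pages[page_number]))
print(records)
```

A list of dictionaries like `records` can be passed to `pandas.DataFrame(records)` to get a data frame. Note how much of the code exists only to impose structure that an API response would already have.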


Scraping data is fun in its way. It's a fun challenge, but it's an extra challenge added on top, because nobody has taken the time to make the data presentable, put it in the right format, and make sure it's there for everybody to access, as people typically have with APIs. So you'd almost always prefer the API, but scraping is a great tool to have in your toolbox for those times when the data you want isn't available through an API, or is publicly available but not in an easy-to-get way.

So, I highly recommend you learn and practice both of these methods. Using these two tools, you can access pretty much any data that's out there. All you need to do is get it.

Colin Jaffe

Colin Jaffe is a programmer, writer, and teacher with a passion for creative code, customizable computing environments, and simple puns. He loves teaching code, from the fundamentals of algorithmic thinking to the business logic and user flow of application building—he particularly enjoys teaching JavaScript, Python, API design, and front-end frameworks.

Colin has taught code to a diverse group of students since learning to code himself, including young men of color at All-Star Code, elementary school kids at The Coding Space, and marginalized groups at Pursuit. He also works as an instructor for Noble Desktop, where he teaches classes in the Full-Stack Web Development Certificate and the Data Science & AI Certificate.

Colin lives in Brooklyn with his wife, two kids, and many intricate board games.
