Scraping the Top 5 Tech Company Job Boards

In this project, I wanted to scrape the job search results from Apple, Amazon, Facebook, Google, and Netflix to speed up my job search. Visiting each site to collect the job results for different cities is tedious, so I figured I would automate it. The catch: I had never scraped data off a site before, but I really wanted to try the tools available for it. I do realize that these sites probably don’t want you scraping their data, so do this at your own risk. This is the long but fun journey of figuring out how to do it.

Analyzing CIA Factbook Data Using SQLite and Python

Introduction

In this project, I’ll be working with data from the CIA World Factbook, a collection of statistics about all of the countries on Earth. The Factbook contains demographic information such as:

  • population – The population as of 2015.
  • population growth – The annual population growth rate, as a percentage.
  • area – The total land and water area.

I’ll be using pandas to run SQL queries and display the results neatly as a DataFrame object.
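As a rough idea of what running SQL through pandas looks like, here is a minimal sketch. It assumes a table named `facts` with columns `name`, `population`, `population_growth`, and `area`; those names, the in-memory database, and the placeholder rows are illustrative assumptions, not the actual Factbook file or its figures.

```python
import sqlite3

import pandas as pd

# In-memory database standing in for the Factbook SQLite file (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE facts ("
    "name TEXT, population INTEGER, population_growth REAL, area INTEGER)"
)

# Placeholder rows for illustration only; these are not real Factbook values.
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?, ?)",
    [
        ("Country A", 1000, 1.5, 200),
        ("Country B", 5000, 0.2, 100),
        ("Country C", 300, 2.1, 900),
    ],
)
conn.commit()

# pandas runs the query and returns the result set as a DataFrame.
query = "SELECT name, population, area FROM facts ORDER BY population DESC"
top = pd.read_sql_query(query, conn)
print(top)
```

With a real database file, the only change should be passing its path to `sqlite3.connect` instead of `":memory:"`.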

Visualizing The Gender Gap In College Degrees

In this project, I continue exploring the data set used in my post on employment rates of college graduates. Here, I’m looking at how gender gaps in college degrees changed between 1968 and 2011. At the end of the post are grouped line charts comparing how the gap between genders has changed over time across all degree categories.

US Births by Days of the Week

This was my first simple project from the Dataquest.io Data Science program. It counts all the US births for each day of the week between 1994 and 2003. It was a quick little project, but it was still interesting to see that the totals were roughly equal from Monday through Friday and dropped quite a bit on the weekends. The reason for that drop-off would be interesting to look into. The data can be found here.
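The counting itself can be sketched with nothing but the standard library. This sketch assumes a CSV with `day_of_week` (1 through 7) and `births` columns; the column names and the sample rows are placeholders for illustration, not the real data set.

```python
import csv
import io
from collections import Counter

# Placeholder rows in an assumed layout; these are not real birth counts.
SAMPLE_CSV = """year,month,date_of_month,day_of_week,births
1994,1,1,6,10
1994,1,2,7,8
1994,1,3,1,12
1994,1,8,6,9
"""


def births_by_day_of_week(csv_text):
    """Sum the births column for each day_of_week value."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[int(row["day_of_week"])] += int(row["births"])
    return dict(totals)


totals = births_by_day_of_week(SAMPLE_CSV)
print(totals)
```

For a file on disk, the same loop would work over `csv.DictReader(open(path, newline=""))`, where `path` is wherever the data was saved.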

Visualizing Earnings Based on College Majors

This project is a demonstration of using the plotting tools built into pandas to explore data on job outcomes for recent college graduates. Using visualizations, we can start to explore questions about the data set like:

  • Do students in more popular majors make more money?
      • Using scatter plots
  • How many majors are predominantly male? Predominantly female?
      • Using histograms
  • Which category of majors has the most students?
      • Using bar plots

The data, released by the American Community Survey and cleaned by FiveThirtyEight, can be found here.
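The three plot types mentioned above map onto pandas' built-in plotting roughly as follows. This is a sketch under assumptions: the column names `Major_category`, `Total`, `Median`, and `ShareWomen` and every value in the tiny DataFrame are placeholders, not the actual data set.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data; column names and values are assumptions for illustration.
grads = pd.DataFrame(
    {
        "Major_category": ["Engineering", "Engineering", "Arts", "Education"],
        "Total": [1000, 3000, 2500, 4000],
        "Median": [60000, 55000, 30000, 32000],
        "ShareWomen": [0.2, 0.3, 0.7, 0.8],
    }
)

fig, (ax_scatter, ax_hist, ax_bar) = plt.subplots(1, 3, figsize=(15, 4))

# Scatter plot: do more popular majors (Total) earn more (Median)?
grads.plot(x="Total", y="Median", kind="scatter", ax=ax_scatter)

# Histogram: how is the share of women distributed across majors?
grads["ShareWomen"].plot(kind="hist", bins=4, ax=ax_hist)

# Bar plot: total students per major category.
students_per_category = grads.groupby("Major_category")["Total"].sum()
students_per_category.plot(kind="bar", ax=ax_bar)

fig.tight_layout()
```

Passing an explicit `ax` to each call keeps the three charts on separate axes; calling `fig.savefig(...)` afterwards would write them to a file.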