
huntnguyen/PGA-Tour-Data-Science-Project

 
 


This repository contains Jupyter notebooks for several data science projects built on a PGA Tour dataset that I created. Please read the file descriptions below.

  1. PGAtour.com Web Scraper - Contains the code used to scrape PGA Tour player statistics for 2007-2017 from the pgatour.com website.
  2. pgatour_raw.db - SQLite database file containing the PGA Tour player data scraped by file 1.
  3. pgatour_raw.csv - CSV file containing the raw data from file 2.
  4. pgatour_cleaned.csv - CSV file containing the cleaned version of pgatour_raw.csv. The cleaning process is documented in the PGA Tour - EDA notebook in this repository (file 6).
  5. PGA Tour Machine Learning Project - Classification.ipynb - Contains a machine learning project focused on classifying players as tournament winners or non-winners.
  6. PGA Tour - EDA - Contains exploratory data analysis of the dataset in the pgatour_raw.db database file. The EDA covers data cleaning and formatting, feature investigation, and a thorough analysis of the PGA Tour statistics collected over time.
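
To explore pgatour_raw.db outside the notebooks, the database can be read with Python's standard library. The sketch below is not necessarily how pgatour_raw.csv was produced. The table names inside the database are not documented in this README, so the code discovers them at runtime rather than assuming one. The helper names `list_tables` and `export_table_to_csv` are hypothetical and chosen only for illustration.

```python
import csv
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the names of all tables in a SQLite database file."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def export_table_to_csv(db_path, table, csv_path):
    """Write every row of `table` to `csv_path` and return the row count.

    The table name is quoted as an identifier because SQL parameters
    cannot be used for table names.
    """
    quoted = '"' + table.replace('"', '""') + '"'
    with closing(sqlite3.connect(db_path)) as conn:
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        header = [column[0] for column in cursor.description]
        count = 0
        with open(csv_path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            for row in cursor:
                writer.writerow(row)
                count += 1
    return count
```

For example, `list_tables("pgatour_raw.db")` shows which tables the file contains, and any one of them can then be passed to `export_table_to_csv` along with an output path of your choosing.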

More files will be added to this repository as I develop further project ideas based on this dataset.
