A repository of materials for the data science workshop at https://www.datafestak.cz.
The workshop will be held from 08:30 to 10:00 in room XXX. Please arrive on time; we will not wait.
- Viktor Sohajek (Mall Group)
- Hynek Walner (Workday)
- Viktor Brada (Workday)
The aim of this workshop is to give you a starter kit for modeling using machine learning techniques. After the end of the workshop, you should be able to:
- Understand the concept of train-test dataset split for predictive modeling.
- Build a model on your training data with the scikit-learn library [Py].
- Use the model to predict values on new predictor data.
- Build some of the following models: Decision Tree, Random Forest, Gradient Boosting, Neural Networks.
- Know where to find further information about DS.
The workshop will not:
- Make you an instant DS guru.
- Help you master hyperparameter selection for your Kaggle competition.
- Fill the gaps in your college math and stats.
- Cover clustering methods.
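As a preview of the workflow listed in the goals above (train-test split, fitting a scikit-learn model, predicting on new data), here is a minimal sketch. The iris dataset, the choice of a decision tree, and all parameter values are assumptions made for illustration; they are not necessarily what the workshop notebooks use.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Small built-in dataset, used here only as a stand-in for workshop data.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the rows; the model never sees them during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the model on the training part only.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Predict on predictor data the model has not seen, then score the result.
y_pred = model.predict(X_test)
test_accuracy = model.score(X_test, y_test)
print(f"accuracy on held-out data: {test_accuracy:.2f}")
```

Other scikit-learn estimators, such as `RandomForestClassifier`, expose the same `fit`/`predict` interface, so they can be swapped in without changing the rest of the code.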
Before you attend the workshop, make sure you have a grasp of:
- basic understanding of these concepts: linear regression, derivatives, extrema of analytic functions
- pandas library [Py]: loading data from csv, selecting/altering columns/rows
- knowledge of how to use Jupyter notebooks (how to set up a new notebook, what a cell is, how to run or alter it, ...)
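To check yourself against the pandas prerequisite above, the following sketch shows the kind of operations meant: loading CSV data, selecting columns and rows, and altering them. The column names and values are made up for illustration, and the CSV is read from an in-memory string so the snippet runs without any file.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice you would pass a file path to read_csv.
csv_text = """name,height_cm,weight_kg
Ann,170,65
Bob,182,80
Eve,165,54
"""
df = pd.read_csv(io.StringIO(csv_text))

# Select a single column (a Series) and a subset of rows by condition.
heights = df["height_cm"]
tall = df[df["height_cm"] > 168]

# Add a new column derived from an existing one.
df["height_m"] = df["height_cm"] / 100

# Alter a value in the rows matching a condition.
df.loc[df["name"] == "Eve", "weight_kg"] = 55

# Select rows by position.
first_two = df.iloc[:2]
print(df)
```

If every line here reads as familiar, the pandas part of the workshop should pose no problem.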
Come with a PC and do these steps before the beginning of the workshop:
- have Python3 installed and ready to use (preferably with a fresh virtual environment, using e.g. `virtualenv`).
- install python packages from the `requirements.txt` file (`pip3 install -r requirements.txt`)
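Once the steps above are done, a quick way to confirm the environment is ready is to check the Python version and whether the key packages can be found. This is only a sketch: the helper `missing_packages` is hypothetical, and the package list is an assumption based on the libraries named in this README; the authoritative list is whatever `requirements.txt` contains.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names (scikit-learn is imported as `sklearn`);
# adjust to match the actual contents of requirements.txt.
REQUIRED = ["pandas", "sklearn"]

if sys.version_info < (3,):
    print("Python 3 is required")

missing = missing_packages(REQUIRED)
print("missing packages:", ", ".join(missing) if missing else "none")
```

Run it inside the activated virtual environment; anything reported as missing can be fixed by re-running the `pip3` command above.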
If you are not sure whether the workshop level is right for you, just message us on Slack and we will be happy to discuss it with you ;-].