Posts

Showing posts from March, 2021

Bike Sharing Demand Prediction

Regression to Predict Bike Sharing Demand

Introduction

In this blog post we will use hourly data from a bike sharing program to predict subsequent demand. The dataset can be found here on Kaggle. The data was collected by a bike share program in the city of Washington, D.C. The dataset contains training data up to the 20th of every month for 2 years (2011-2012). Our task is to predict the usage for every hour over the remaining days of each month. Our analysis can be found here. The dataframe looks as follows. There are no null values, so no rows need to be removed and no values need to be imputed.

Feature Engineering and Data Visualization

The only feature engineering we perform is to transform the datetime stamp into [time of day (hour), day of week, month, year]. The transformed dataframe looks as follows. Note that in the training data there are three target variables: registered users, casual users, and the sum of the two, total users (count). For the final submission to Kaggle we on...
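The datetime transformation described in the excerpt above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the code from the original analysis. The function name `split_timestamp`, the feature names, and the `YYYY-MM-DD HH:MM:SS` timestamp format are assumptions made for this example.

```python
from datetime import datetime


def split_timestamp(stamp: str) -> dict:
    """Split a timestamp string into hour, day of week, month and year.

    Assumes the 'YYYY-MM-DD HH:MM:SS' format (an assumption of this sketch).
    """
    ts = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return {
        "hour": ts.hour,              # time of day, 0-23
        "day_of_week": ts.weekday(),  # Monday = 0, Sunday = 6
        "month": ts.month,            # 1-12
        "year": ts.year,
    }


# Hypothetical example row: 5 a.m. on 1 January 2011 (a Saturday).
features = split_timestamp("2011-01-01 05:00:00")
print(features)
```

With a pandas dataframe, the same four features could be read from a parsed datetime column through the `.dt` accessor (`.dt.hour`, `.dt.dayofweek`, `.dt.month`, `.dt.year`).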

MNIST Digit Recognition

MNIST Digit Recognition without Neural Networks

Introduction

The MNIST dataset is a set of 70,000 handwritten digits. Of these, 42,000 are used for training while 28,000 are used for testing our trained models. The dataset can be found here on Kaggle. MNIST is an excellent standard benchmark dataset and can be used as a playground to test and learn more about standard classification algorithms. Using neural networks, it is reasonably easy to achieve an accuracy greater than 98%, as is evidenced by the submitted notebooks on Kaggle. In this blog post we will try to see what the best accuracy is that we can get without using deep neural networks. We will use Random Forest, Gradient Boosting, SVM, stacking, and combinations of the above to perform our analysis. We will try different tricks and feature engineering, not all of which will give us better accuracy compared to out-of-the-box classification methods. However, since the goal of this blog is to explore and learn, we will report  The anal...