Blog

Here you can find our latest posts


Meetup #1: Microsoft Data Science Azure Machine Learning Workshop

Microsoft Data Science Azure Machine Learning Workshop: Lab Setup and Instruction Guide. In this first Microsoft Data Science meetup, hosted by Infi with guest speaker Jeroen ter Heerdt from Microsoft, we also organized a workshop covering the basics of machine learning on the Azure platform. Overview: In this […]


Human Resources Analytics – Why employees are leaving

In this blog post about human resources analytics, we build a model to predict whether an employee will leave, and we use the data to explore why they leave. We use a simulated dataset from Kaggle, which can be found here: https://www.kaggle.com/ludobenistant/hr-analytics […]


Research Methodology – Descriptive statistics and exploring graphs with SPSS

This blog post is part of a lecture about descriptive statistics and exploring graphs with SPSS. Some of the data comes from the students themselves; for the other graphs, I used the datasets from Andy Field’s Discovering Statistics Using IBM SPSS Statistics, a book I highly recommend! Part 1: Opening a […]


Optimizing prediction models on Azure – pruning the trees

This is a simple example of optimizing prediction models on Azure. In this case we use a Boosted Decision Tree model and show how the Permutation Feature Importance module can be used to prune your trees. We start with the Student Performance Classifier from a previous blog post. We […]


Binary classification: heart disease prediction – 7 ideas how to start and improve your model

This experiment is based on the original Heart Disease Prediction experiment created by Weehyong Tok from Microsoft, which is one of the Healthcare Industry solutions. It uses the data set from the UCI Machine […]