Feature Engineering
Images from Unsplash

Disclaimer: This article is my learning note from the courses I took on Kaggle.

In this course, we will learn how to:

- determine which features are the most important with mutual information
- invent new features in several real-world problem domains
- encode high-cardinality categoricals with a target encoding
- create segmentation features with k-means clustering
- decompose a dataset's variation into features with principal component analysis

1. Introduction

We perform feature engineering to make our data better suited to the problem at hand.

Consider "apparent temperature" measures like the heat index and the wind chill. These quantities attempt to capture the temperature as humans perceive it, based on air temperature, humidity, and wind speed, all of which we can measure directly. You could think of an apparent temperature as the result of a kind of feature engineering: an attempt to make the observed data more relevant to what we actually care about, namely how it feels outside!

...
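To make the apparent-temperature idea concrete, here is a minimal sketch that derives a wind chill feature from two directly measured quantities. It uses the wind chill formula published by the US National Weather Service (temperature in °F, wind speed in mph), which is my choice for illustration and not something taken from the course. The column names and sample readings are made up, and returning the plain air temperature outside the formula's stated range (above 50 °F or wind at 3 mph or less) is my own assumption.

```python
def wind_chill_f(temp_f, wind_mph):
    """Wind chill in °F from air temperature (°F) and wind speed (mph).

    Uses the US National Weather Service formula. Outside its stated
    range we fall back to the air temperature (an assumption made for
    this sketch, not part of the formula).
    """
    if temp_f > 50 or wind_mph <= 3:
        return temp_f
    v = wind_mph ** 0.16
    return 35.74 + 0.6215 * temp_f - 35.75 * v + 0.4275 * temp_f * v


# Hypothetical raw observations: values chosen only for illustration.
observations = [
    {"temp_f": 30.0, "wind_mph": 10.0},
    {"temp_f": 0.0, "wind_mph": 15.0},
    {"temp_f": 60.0, "wind_mph": 20.0},  # outside the formula's range
]

# Feature engineering step: add a new column derived from existing ones.
for row in observations:
    row["wind_chill_f"] = wind_chill_f(row["temp_f"], row["wind_mph"])

for row in observations:
    print(row)
```

The new `wind_chill_f` column carries no information that was not already in the raw measurements, but it expresses that information in a form closer to what we care about, which is the point of feature engineering.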