When performing classification, you often want to predict not only the class label but also the probability of that label. This probability gives you a measure of confidence in the prediction. Some models give poor estimates of the class probabilities, and some do not support probability prediction at all. The calibration module allows you to better calibrate the probabilities of a given model.
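As a hedged sketch of the idea: scikit-learn's `CalibratedClassifierCV` can wrap a model that only outputs decision scores (such as `LinearSVC`, which has no `predict_proba`) and turn those scores into calibrated probabilities. The dataset and model choice below are illustrative assumptions, not taken from the original note.

```python
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV
from sklearn.model_selection import train_test_split

# Toy binary classification problem (illustrative data).
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# LinearSVC itself cannot output probabilities; wrapping it in
# CalibratedClassifierCV fits a sigmoid (Platt) calibration on top.
base = LinearSVC()
clf = CalibratedClassifierCV(base, method="sigmoid", cv=3)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)
print(proba[:3])  # each row is [P(class 0), P(class 1)], summing to 1
```

The `method="isotonic"` option is a non-parametric alternative that usually needs more data than sigmoid calibration.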
Very complex, but... there is a good explanation of data entropy. Roughly, it measures the loss of information when you encode/compress information. The main goal is to minimize information loss when simplifying problems, and KL divergence measures that loss. And... you can use this to evaluate unsupervised learning algorithms.
Kullback–Leibler divergence is a very useful way to measure the difference between two probability distributions. In this post we'll go over a simple example to help you better grasp this interesting concept.
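A minimal sketch of the quantity these two notes describe, computed directly from its definition D_KL(P || Q) = Σ p·log(p/q). The two distributions are made-up toy numbers, not from either post.

```python
import math

p = [0.5, 0.3, 0.2]   # "true" distribution P (illustrative values)
q = [0.4, 0.4, 0.2]   # approximating distribution Q

# KL divergence: expected extra information cost of encoding samples
# from P using a code optimized for Q (in nats, since we use ln).
kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
print(kl)  # small positive number; 0 only when P == Q
```

Note that KL divergence is not symmetric: swapping `p` and `q` generally gives a different value, so it is not a true distance metric.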
Feature scaling is a method used to standardize the range of independent variables or features of data.
This paper aims to clarify how and why data are normalized or standardized. These two processes are used in the data preprocessing stage, in which the data are prepared for later processing by data mining and machine learning techniques.
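To make the two preprocessing steps concrete, here is a plain-Python sketch of both on toy data: min-max normalization rescales features to [0, 1], while z-score standardization rescales them to zero mean and unit variance. The data values are illustrative assumptions.

```python
data = [2.0, 4.0, 6.0, 8.0]  # one toy feature column

# Min-max normalization: (x - min) / (max - min), range becomes [0, 1].
lo, hi = min(data), max(data)
normalized = [(x - lo) / (hi - lo) for x in data]

# Z-score standardization: (x - mean) / std, result has mean 0, std 1.
mean = sum(data) / len(data)
std = (sum((x - mean) ** 2 for x in data) / len(data)) ** 0.5
standardized = [(x - mean) / std for x in data]

print(normalized)    # all values in [0, 1]
print(standardized)  # centered on 0
```

Normalization is sensitive to outliers (they define `min`/`max`), which is one reason standardization is often preferred for algorithms assuming roughly Gaussian inputs.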
A Cambridge University course with lecture notes, providing an introduction to string theory and conformal field theory.