Dataview:

```dataview
list from [[]] and !outgoing([[]])
```

- Medium Standardization
- Medium Normalization
## Normalization and Standardization
![[Normalization_vs_Standardization.avif]]
> [!info]
> Both are key elements of data preprocessing.
| Normalization | Standardization |
|---|---|
| Scales the data using the minimum and maximum values. | Scales the data using the mean and standard deviation. |
| Useful when features are on different scales. | Useful when you want a variable's mean to be 0 and its standard deviation to be 1. |
| Scaled values fall within [0, 1] or [-1, 1]. | Scaled values are not constrained to a particular range. |
| Also known as min-max scaling. | Also known as Z-score normalization. |
| Helpful when the feature distribution is unknown. | Helpful when the feature distribution is Gaussian (or assumed to be). |
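The contrast in the table can be seen on a toy vector (the values here are hypothetical):

```python
import numpy as np

x = np.array([1.0, 5.0, 10.0])  # toy data, hypothetical values

# Normalization (min-max): scales with the minimum and maximum -> [0, 1]
normalized = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): scales with the mean and standard deviation
standardized = (x - x.mean()) / x.std()

print(normalized)    # values fall inside [0, 1]
print(standardized)  # mean 0, standard deviation 1, no fixed range
```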
## Methodology
### Normalization
Data normalization transforms the dataset onto a common scale, typically fitting values between 0 and 1. Its goal is to eliminate the biases and distortions caused by features having different scales: each feature is rescaled to a common range, making features directly comparable.
Methods:
- Min-max scaling
	- Scales data proportionally onto a 0-1 interval
- Z-score standardization (what scikit-learn's `StandardScaler` implements)
- Log transformation
- Quantile normalization
```python
from sklearn.preprocessing import minmax_scale
```
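A minimal sketch of `minmax_scale` on a toy two-feature array (the numbers are made up); each column is rescaled independently to [0, 1]:

```python
import numpy as np
from sklearn.preprocessing import minmax_scale

# Two features on very different scales (hypothetical values)
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])

# Each column is rescaled to the [0, 1] interval
X_scaled = minmax_scale(X)
print(X_scaled)
```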
### Standardization
Rescales data to have zero mean and unit variance (i.e., a z-score).
Two main benefits:
- Mitigates outliers skewing the distribution
- Allows direct comparison of model coefficients by putting features on a standardized scale
```python
from sklearn.preprocessing import StandardScaler
```
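A minimal `StandardScaler` sketch on toy data (hypothetical values); after `fit_transform`, each column has zero mean and unit variance:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data with two features (hypothetical values)
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()
X_std = scaler.fit_transform(X)  # per column: subtract mean, divide by std
print(X_std.mean(axis=0))  # ~0 for each feature
print(X_std.std(axis=0))   # ~1 for each feature
```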
## Usage
Both techniques are critical in data processing. They can help an ML model make better inferences, or make data easier to draw inferences from (data science).
Standardization Comments:
Standardization is more commonly used for machine learning algorithms, such as linear regression, that assume a normal distribution. If you are using neural networks, consider normalization instead: normalization maps inputs into a fixed bounded range, while standardization preserves the relative spread of the original distribution.
Some advanced ML models are becoming more powerful and better at learning directly from raw, unmodified data. These models can automatically identify patterns in the raw data distribution, reducing the need for explicit preprocessing steps like standardization.
Normalization Comments:
Normalization also aids convergence during gradient descent, by rescaling the gradients to be more uniform.
Fit the min-max bounds on the training data only, so that information from the test set does not leak into preprocessing; note that test values may then fall outside the [0, 1] range.
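The train-only fit can be sketched with `MinMaxScaler` (toy values assumed); a test value above the training maximum maps above 1.0 rather than being clipped, since `MinMaxScaler` does not clip by default:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[1.0], [5.0], [10.0]])  # hypothetical training values
X_test = np.array([[12.0]])                 # larger than any training value

scaler = MinMaxScaler()          # clip=False by default
scaler.fit(X_train)              # bounds come from the training data only
X_test_scaled = scaler.transform(X_test)
print(X_test_scaled)             # (12 - 1) / (10 - 1) > 1.0
```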
Works well with sigmoid activation functions, which already expect inputs in the 0-1 range.
May not help models that are scale-invariant, like linear regression: linear models do not require normalization, since the constant bias term absorbs offsets in the inputs and the learned weights absorb differences in scale.
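The bias-absorbs-offsets claim can be sketched with a plain least-squares fit (synthetic data with an assumed slope and intercept): shifting the input by a constant leaves the fitted slope unchanged, and only the intercept moves.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 3.0 * x + 2.0 + rng.normal(scale=0.1, size=50)  # synthetic data

# Fit y = a*x + b before and after shifting x by a constant offset
a1, b1 = np.polyfit(x, y, 1)
a2, b2 = np.polyfit(x + 10.0, y, 1)

print(a1, a2)  # slopes are (numerically) identical
print(b1, b2)  # the intercept absorbs the offset: b2 = b1 - 10*a1
```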
Class Connection:
Normalization
- If there are large values, we prefer to normalize the vector
- Different norm methods:
	- Min-max normalization:
		- Given a vector, find the minimum and maximum values in the vector (two scalars).
		- Here min = 0.1, max = 100
		- Normalizes x: x' = (x - min) / (max - min)
	- Mean/STD normalization:
		- Find the mean and STD; both are scalars.
		- Normalizes x: x' = (x - mean) / STD
- Layer norm does mean/STD normalization across the features of each instance
	- Layer norm normalizes instances (rows)
- Batch norm does mean/STD normalization across the batch, per feature column
	- Batch norm normalizes features (columns)
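The layer-norm vs batch-norm distinction can be sketched in plain NumPy (no learned scale/shift parameters; the batch values are hypothetical):

```python
import numpy as np

# Toy batch: 4 instances (rows) x 3 features (columns), hypothetical values
X = np.array([[1.0,  2.0,  3.0],
              [4.0,  5.0,  6.0],
              [7.0,  8.0,  9.0],
              [10.0, 11.0, 12.0]])

# Batch norm: mean/STD across the batch, one pair per feature column
bn = (X - X.mean(axis=0)) / X.std(axis=0)

# Layer norm: mean/STD across the features, one pair per instance (row)
ln = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

print(bn.mean(axis=0), bn.std(axis=0))  # each column: mean 0, std 1
print(ln.mean(axis=1), ln.std(axis=1))  # each row: mean 0, std 1
```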
## Example
- Define examples where it can be used
## Related Words
- Link all related words