Dataview:

list from [[]] and !outgoing([[]])

Medium Article

📗 -> One-Hot Encoding

❗ Information

A data-preprocessing method.

One-hot encoding is a common preprocessing technique used when working with categorical data in machine learning.

It serves to:

  • Convert categorical values into processable numeric vectors
  • Capture the uniqueness of each category without implying any order between them

📄 -> Methodology

When processing the three primary colors:

  • “red” becomes [1, 0, 0], “green” becomes [0, 1, 0], and “blue” becomes [0, 0, 1].
    One new column is created for each distinct category, and exactly one column is 1 in each row.
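The mapping above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the helper name `one_hot` and the choice to order columns by first appearance are assumptions made for this example.

```python
def one_hot(values, categories=None):
    """One-hot encode a list of categorical values (illustrative helper).

    Returns the category order used for the columns and the encoded rows.
    """
    if categories is None:
        # Assumption: columns follow the order in which categories first appear.
        categories = list(dict.fromkeys(values))
    index = {category: i for i, category in enumerate(categories)}

    rows = []
    for value in values:
        row = [0] * len(categories)  # one column per distinct category
        row[index[value]] = 1        # mark only the column for this value
        rows.append(row)
    return categories, rows


categories, encoded = one_hot(["red", "green", "blue", "green"])
print(categories)  # column order
print(encoded)     # one row per input value
```

Each row contains a single 1, so the vectors for different categories never overlap.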

✒️ -> Usage

  • May be a poor fit when there are many distinct categories.

    • It creates a new column for each distinct category, so the number of features can grow very large.
  • If the categorical variable has a logical ordering (disagree, neutral, agree), ordinal encoding is an alternative that maps categories to integers rather than binary vectors.

  • In scikit-learn: from sklearn.preprocessing import OneHotEncoder
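A short sketch of how these points might look with scikit-learn, assuming it is installed. The sample data and variable names are made up for illustration. `handle_unknown="ignore"` is one possible setting, not a requirement. Note that scikit-learn sorts categories alphabetically by default, so the column order differs from the red/green/blue order used in the Methodology example.

```python
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Hypothetical sample data: one categorical feature per row.
colors = [["red"], ["green"], ["blue"], ["green"]]

# Unordered categories: one binary column per distinct value.
encoder = OneHotEncoder(handle_unknown="ignore")
one_hot_matrix = encoder.fit_transform(colors).toarray()  # densify the sparse result

print(encoder.categories_[0])  # column order chosen by the encoder
print(one_hot_matrix)

# Ordered categories: map each value to an integer position instead.
# The order is passed explicitly so that disagree < neutral < agree.
answers = [["disagree"], ["agree"], ["neutral"]]
ordinal = OrdinalEncoder(categories=[["disagree", "neutral", "agree"]])
ordinal_codes = ordinal.fit_transform(answers)

print(ordinal_codes.ravel())
```

The one-hot result has as many columns as there are distinct colors, while the ordinal result stays a single column. That difference is the trade-off when the number of categories is large.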

🧪 -> Example

  • Define examples where it can be used
  • Link all related words