๐Ÿ“— -> 12/03/24: GAN


Slides

๐ŸŽค Vocab

โ— Unit and Larger Context

Small summary

โœ’๏ธ -> Scratch Notes

Video GAN

In relevant situations, we donโ€™t just want to deepfake random images. We want to condition our model on existing images, to โ€˜extendโ€™ existing pictures/videos.

  • Also a very difficult question. What should the model move? We as people now that the laws of physics apply, and that certain things do and donโ€™t move (trains vs train tracks), but the model does not.

GAN has a latent space:

  • The GAN serves as a transformation from noise to image
  • Interestingly: you can do vector addition of noise inputs to get fun results
  • Manipulating the noise that creates these images, not the images themselves
  • Ties to Explainable-AI, and similar work being done on LLMs.
  • For actual applications, you can imagine wanting to translate an image from daytime to night-time
    • Image to Image Translation

Image-to-Image Translation with Conditional Adversarial Networks

  • Looks intersting (creating a city view from satelite view but I donโ€™t understand)

Other interesting things with GAN

  • Text-to-Image Translation (text2image)
  • Learning What and Where to Draw
  • Semantic-Image-to-Photo Translation
  • Generate New Human Poses
  • Photos to Emojis
  • Photograph Editing (+- bangs, smiling, etc)
  • Image De-raining
  • Photo Inpainting
  • Face Image Inpainting
  • Clothing Translation

Resources

  • Put useful links here

Connections

  • Link all related words