Functions are vectors! This perspective lets us apply the tools of linear algebra to computational problems from image and geometry processing to machine learning and light transport—and provides a natural explanation for Fourier series. Let's explore:
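One way to make the "functions are vectors" view concrete is to define an inner product between functions and then project a function onto a set of orthogonal basis functions, which is how Fourier coefficients arise. The sketch below is only an illustration under stated assumptions: the helper names `inner`, `sine_basis`, and `project` are hypothetical, the example function f(x) = x and the interval [-π, π] are chosen for convenience, and the integral is approximated with a simple midpoint rule.

```python
import math

def inner(f, g, a=-math.pi, b=math.pi, n=4000):
    """Approximate the inner product <f, g> = integral of f(x) * g(x) over [a, b].

    Uses the midpoint rule with n sub-intervals (an assumption made for
    simplicity; any quadrature rule would do).
    """
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h

def sine_basis(k):
    """Return the basis function x -> sin(k x)."""
    return lambda x: math.sin(k * x)

def project(f, e):
    """Coefficient of f along basis function e: <f, e> / <e, e>."""
    return inner(f, e) / inner(e, e)

# Example function (hypothetical choice): f(x) = x on [-pi, pi].
f = lambda x: x

# Projecting f onto sin(kx) for k = 1..5 yields its first Fourier sine coefficients.
coeffs = [project(f, sine_basis(k)) for k in range(1, 6)]
print(coeffs)
```

For this choice of f, the computed coefficients should be close to the known closed form 2(-1)^(k+1)/k, and distinct sine basis functions have an inner product of approximately zero, which is the function-space analogue of perpendicular vectors.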

30,424 views • 2 years ago • via X (Twitter)

Related videos

[CLIP] by Hand ✍️

The CLIP (Contrastive Language–Image Pre-training) model, a groundbreaking work by OpenAI, redefines the intersection of computer vision and natural language processing. It is the basis of many of the multi-modal foundation models we see today.

How does CLIP work?

Goal: 🟨 Learn a shared embedding space for text and image

[1] Given
↳ A mini batch of 3 text-image pairs
↳ OpenAI used 400 million text-image pairs to train its original CLIP model.

Process 1st pair: "big table"

[2] 🟪 Text → 2 Vectors (3D)
↳ Look up word embedding vectors using word2vec.

[3] 🟩 Image → 2 Vectors (4D)
↳ Divide the image into two patches.
↳ Flatten each patch.

[4] Process other pairs
↳ Repeat [2]-[3]

[5] 🟪 Text Encoder & 🟩 Image Encoder
↳ Encode input vectors into feature vectors.
↳ Here, both encoders are simple one-layer perceptrons (linear + ReLU).
↳ In practice, the encoders are usually transformer models.

[6] 🟪 🟩 Mean Pooling: 2 → 1 vector
↳ Average 2 feature vectors into a single vector by averaging across the columns.
↳ The goal is to have one vector to represent each image or text.

[7] 🟪 🟩 → 🟨 Projection
↳ Note that the text and image feature vectors from the encoders have different dimensions (3D vs. 4D).
↳ Use a linear layer to project image and text vectors to a 2D shared embedding space.

🏋️ Contrastive Pre-training 🏋️

[8] Prepare for MatMul
↳ Copy text vectors (T1, T2, T3)
↳ Copy the transpose of image vectors (I1, I2, I3)
↳ They are all in the 2D shared embedding space.

[9] 🟦 MatMul
↳ Multiply T and I matrices.
↳ This is equivalent to taking the dot product between every pair of image and text vectors.
↳ The purpose is to use the dot product to estimate the similarity between the image and the text in each pair.

[10] 🟦 Softmax: e^x
↳ Raise e to the power of the number in each cell.
↳ To simplify hand calculation, we approximate e^□ with 3^□.

[11] 🟦 Softmax: ∑
↳ Sum each row for 🟩 image→🟪 text
↳ Sum each column for 🟪 text→🟩 image

[12] 🟦 Softmax: 1 / sum
↳ Divide each element by the column sum to obtain a similarity matrix for 🟪 text→🟩 image
↳ Divide each element by the row sum to obtain a similarity matrix for 🟩 image→🟪 text

[13] 🟥 Loss Gradients
↳ The "Targets" for the similarity matrices are Identity Matrices.
↳ Why? If I and T come from the same pair (i=j), we want the highest value, which is 1, and 0 otherwise.
↳ Apply the simple equation of [Similarity - Target] to compute gradients for both directions.
↳ Why so simple? Because when Softmax and Cross-Entropy Loss are used together, the gradient of the loss with respect to the softmax inputs reduces to exactly this difference.
↳ These gradients kick off the backpropagation process to update weights and biases of the encoders and projection layers (red borders).
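Steps [8] to [13] can be sketched in a few lines of plain Python. This is a minimal illustration, not the original CLIP implementation: the 2D embedding values in `T` and `I` are made up, the helper names are hypothetical, the true exponential is used instead of the 3^□ shortcut, and details of the real model such as embedding normalization and a learned temperature are omitted. The two normalizations are labeled by axis (row and column) rather than by direction, since which one corresponds to image→text depends on how the matrix is laid out.

```python
import math

def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax_rows(m):
    """Apply softmax to each row (max is subtracted for numerical stability)."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        s = sum(exps)
        out.append([e / s for e in exps])
    return out

def subtract(a, b):
    """Element-wise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical 2D shared-space embeddings for a mini batch of 3 pairs.
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # text vectors T1, T2, T3
I = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # image vectors I1, I2, I3

# [8]-[9] MatMul: logits[i][j] is the dot product of T_i and I_j.
logits = matmul(T, transpose(I))

# [10]-[12] Softmax along each axis gives the two similarity matrices.
sim_rows = softmax_rows(logits)                        # each row sums to 1
sim_cols = transpose(softmax_rows(transpose(logits)))  # each column sums to 1

# [13] Targets are identity matrices; gradient w.r.t. logits is Similarity - Target.
target = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
grad_rows = subtract(sim_rows, target)
grad_cols = subtract(sim_cols, target)

print(grad_rows)
print(grad_cols)
```

The diagonal entries of both gradient matrices are negative, which pushes matching pairs toward higher similarity, while the off-diagonal entries are positive, which pushes mismatched pairs apart.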

Tom Yeh

67,790 views • 2 years ago

A playlist of 30 YouTube videos to learn machine learning fundamentals from scratch.

If you're struggling with where to start learning ML, "Machine Learning: Teach by Doing" is a solid choice to learn both theory and code.

(1) Introduction to Machine Learning Teach by Doing
(2) What is Machine Learning? History of Machine Learning
(3) Types of ML Models
(4) 6 steps of any ML project
(5) Install Python and VSCode and run your first code
(6) Linear Classifiers Part 1
(7) Linear Classifiers Part 2
(8) Jupyter Notebook, Numpy and Scikit-Learn
(9) Running the Random Linear Classifier Algorithm in Python
(10) The oldest ML model - Perceptron
(11) Coding the Perceptron
(12) Perceptron Convergence Theorem
(13) Magic of features in Machine Learning
(14) One hot encoding
(15) Logistic Regression Part 1
(16) Cross Entropy Loss
(17) How gradient descent works
(18) Logistic Regression from scratch in Python
(19) Introduction to Regularization
(20) Implementing Regularization in Python
(21) Linear Regression Introduction
(22) Ordinary Least Squares step by step implementation
(23) Ridge regression fundamentals and intuition
(24) Regression recap for interviews
(25) Neural network architecture in 30 minutes
(26) Backpropagation intuition
(27) Neural network activation functions
(28) Momentum in gradient descent
(29) Hands on neural network training in Python
(30) Introduction to Convolutional Neural Networks (CNNs)

ℏεsam

117,570 views • 1 year ago

If you're struggling with where to start learning ML, here's a playlist of 30 YouTube videos to learn machine learning fundamentals from scratch.

"Machine Learning: Teach by Doing" is a solid choice to learn both theory and code.

(1) Introduction to Machine Learning Teach by Doing
(2) What is Machine Learning? History of Machine Learning
(3) Types of ML Models
(4) 6 steps of any ML project
(5) Install Python and VSCode and run your first code
(6) Linear Classifiers Part 1
(7) Linear Classifiers Part 2
(8) Jupyter Notebook, Numpy and Scikit-Learn
(9) Running the Random Linear Classifier Algorithm in Python
(10) The oldest ML model - Perceptron
(11) Coding the Perceptron
(12) Perceptron Convergence Theorem
(13) Magic of features in Machine Learning
(14) One hot encoding
(15) Logistic Regression Part 1
(16) Cross Entropy Loss
(17) How gradient descent works
(18) Logistic Regression from scratch in Python
(19) Introduction to Regularization
(20) Implementing Regularization in Python
(21) Linear Regression Introduction
(22) Ordinary Least Squares step by step implementation
(23) Ridge regression fundamentals and intuition
(24) Regression recap for interviews
(25) Neural network architecture in 30 minutes
(26) Backpropagation intuition
(27) Neural network activation functions
(28) Momentum in gradient descent
(29) Hands on neural network training in Python
(30) Introduction to Convolutional Neural Networks (CNNs)

ℏεsam

108,861 views • 1 year ago