Uploaded: 2024-08-27T12:20:30.000Z
Duration: PT12.276S
Channel: Tom Yeh

[Graph Convolutional Network] by hand ✍️
Graph Convolutional Networks (GCNs), introduced by Thomas Kipf and Max Welling in 2017, have emerged as a powerful tool in the analysis and interpretation of data structured as graphs. This exercise demonstrates how a GCN works in a simple application: binary classification.
-- Goal --
Predict if a node in a graph is X.
-- Architecture --
🟪 Graph Convolutional Network (GCN)
1. GCN1(4,3)
2. GCN2(3,3)
🟦 Fully Connected Network (FCN)
1. Linear1(3,5)
2. ReLU
3. Linear2(5,1)
4. Sigmoid
Simplifications:
• Adjacency matrices are not normalized.
• ReLU is applied to messages directly.
-- Walkthrough --
[1] Given
↳ A graph with five nodes A, B, C, D, E
[2] 🟩 Adjacency Matrix: Neighbors
↳ Add 1 for each edge to neighbors
↳ Repeat in both directions (e.g., A->C, C->A)
↳ Repeat for both GCN layers
[3] 🟩 Adjacency Matrix: Self
↳ Add 1's for each self loop
↳ Equivalent to adding the identity matrix
↳ Repeat for both GCN layers
[4] 🟪 GCN1: Messages
↳ Multiply the node embeddings 🟨 with weights and biases
↳ Apply ReLU (negatives → 0)
↳ The result is one message per node
[5] 🟪 GCN1: Pooling
↳ Multiply the messages with the adjacency matrix
↳ The purpose is to pool messages from each node's neighbors as well as from the node itself.
↳ The result is a new feature per node
[6] 🟪 GCN1: Visualize
↳ For node 1, visualize how messages are pooled to obtain a new feature for better understanding
↳ [3,0,1] + [1,0,0] = [4,0,1]
[7] 🟪 GCN2: Messages
↳ Multiply the node features with weights and biases
↳ Apply ReLU (negatives → 0)
↳ The result is one message per node
[8] 🟪 GCN2: Pooling
↳ Multiply the messages with the adjacency matrix
↳ The result is a new feature per node
[9] 🟪 GCN2: Visualize
↳ For node 3, visualize how messages are pooled to obtain a new feature for better understanding
↳ [1,2,4] + [1,3,5] + [0,0,1] = [2,5,10]
[10] 🟦 FCN: Linear 1 + ReLU
↳ Multiply node features with weights and biases
↳ Apply ReLU (negatives → 0)
↳ The result is a new feature per node
↳ Unlike in GCN layers, no messages from other nodes are included.
[11] 🟦 FCN: Linear 2
↳ Multiply node features with weights and biases
[12] 🟦 FCN: Sigmoid
↳ Apply the Sigmoid activation function
↳ The purpose is to obtain a probability value for each node
↳ One way to calculate Sigmoid by hand ✍️ is to use the approximation below:
• >= 3 → 1
• 0 → 0.5
• <= -3 → 0
-- Outputs --
A: 0 (Very unlikely)
B: 1 (Very likely)
C: 1 (Very likely)
D: 1 (Very likely)
E: 0.5 (Neutral)

Tom Yeh

46,499 views • 1 year ago
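
The GCN walkthrough above can be sketched in plain Python. This is a minimal illustration, not the post's actual worksheet: it assumes a hypothetical 3-node graph (A-B connected, C isolated) because the five-node graph's edges and weights are not listed in the caption. The helper names (`add_self_loops`, `gcn_layer`, `sigmoid_by_hand`) and all numeric inputs are made up for the example; the inputs are chosen so that node A's pooled feature reproduces the [3,0,1] + [1,0,0] sum from step [6]. As in the post's simplifications, the adjacency matrix is not normalized and ReLU is applied to the messages before pooling.

```python
import math

def relu(vec):
    """Replace negatives with 0."""
    return [max(0, x) for x in vec]

def matmul(a, b):
    """Plain-list matrix product: (n x k) times (k x m)."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add_self_loops(adj):
    """Steps [2]-[3]: neighbor adjacency plus the identity matrix."""
    n = len(adj)
    return [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
            for i in range(n)]

def gcn_layer(adj_self, features, weights, bias):
    """Messages = ReLU(X W + b); new features = (A + I) @ messages."""
    linear = matmul(features, weights)
    messages = [relu([v + b for v, b in zip(row, bias)]) for row in linear]
    return matmul(adj_self, messages)

def sigmoid_by_hand(z):
    """Saturate at the post's cut-offs (+/-3); exact sigmoid in between
    (the in-between behavior is an assumption of this sketch)."""
    if z >= 3:
        return 1.0
    if z <= -3:
        return 0.0
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical graph: A-B connected, C isolated.
adj = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
adj_self = add_self_loops(adj)

# Hypothetical inputs: with identity weights and zero bias, ReLU turns
# node A's row into the message [3, 0, 1] and node B's into [1, 0, 0].
features = [[3, -2, 1], [1, -1, 0], [0, 2, 0]]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
pooled = gcn_layer(adj_self, features, identity, [0, 0, 0])
print(pooled[0])  # node A: its own message plus B's message
```

Calling `gcn_layer` twice and then applying the per-node linear layers and `sigmoid_by_hand` would mirror the GCN1 → GCN2 → FCN architecture; the FCN part needs no adjacency matrix because it uses no messages from other nodes.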

[LSTM] by Hand ✍️
LSTMs have been the most effective architecture to process long sequences of data, until our world was taken over by the Transformers. LSTMs belong to the broader family of recurrent neural networks (RNNs) that process data sequentially in a recurrent manner. Transformers, on the other hand, abandon recurrence and use self-attention instead to process data concurrently in parallel. Recently, there has been renewed interest in recurrence as people realized self-attention doesn’t scale to extremely long sequences, like hundreds of thousands of tokens. Mamba is a good example of bringing back recurrence. All of a sudden, it is cool to study LSTMs. How do LSTMs work?
[1] Given
↳ 🟨 Input sequence X1, X2, X3 (d = 3)
↳ 🟩 Hidden state h (d = 2)
↳ 🟦 Memory C (d = 2)
↳ Weight matrices Wf, Wc, Wi, Wo
Process t = 1
[2] Initialize
↳ Randomly set the previous hidden state h0 to [1, 1] and memory cells C0 to [0.3, -0.5]
[3] Linear Transform
↳ Multiply the four weight matrices with the concatenation of current input (X1) and the previous hidden state (h0).
↳ The results are feature values, each of which is a linear combination of the current input and hidden state.
[4] Non-linear Transform
↳ Apply sigmoid σ to obtain gate values (between 0 and 1).
• Forget gate (f1): [-4, -6] → [0, 0]
• Input gate (i1): [6, 4] → [1, 1]
• Output gate (o1): [4, -5] → [1, 0]
↳ Apply tanh to obtain candidate memory values (between -1 and 1)
• Candidate memory (C’1): [1, -6] → [0.8, -1]
[5] Update Memory
↳ Forget (C0 .* f1): Element-wise multiply the current memory with forget gate values.
↳ Input (C’1 .* i1): Element-wise multiply the “candidate” memory with input gate values.
↳ Update the memory to C1 by adding the two terms above: C0 .* f1 + C’1 .* i1 = C1
[6] Candidate Output
↳ Apply tanh to the new memory C1 to obtain candidate output o’1. [0.8, -1] → [0.7, -0.8]
[7] Update Hidden State
↳ Output (o’1 .* o1 → h1): Element-wise multiply the candidate output with the output gate.
↳ The result is updated hidden state h1
↳ Also, it is the first output.
Process t = 2
[8] Initialize
↳ Copy previous hidden state h1 and memory C1
[9] Linear Transform
↳ Repeat [3]
[10] Update Memory (C2)
↳ Repeat [4] and [5]
[11] Update Hidden State (h2)
↳ Repeat [6] and [7]
Process t = 3
[12] Initialize
↳ Copy previous hidden state h2 and memory C2
[13] Linear Transform
↳ Repeat [3]
[14] Update Memory (C3)
↳ Repeat [4] and [5]
[15] Update Hidden State (h3)
↳ Repeat [6] and [7]

Tom Yeh

72,891 views • 1 year ago
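
Steps [4] to [7] of the LSTM walkthrough above can be checked with a short plain-Python sketch. It starts from the pre-activation values quoted in the post (the outputs of the linear transform in step [3]) because the weight matrices and X1 are not given in the caption. The function name `lstm_cell_update` is hypothetical. The sketch uses exact sigmoid and tanh rather than the hand-rounded gate values, so later numbers can drift slightly from the by-hand ones. Following the post's prose, the input gate scales the candidate memory.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_update(c_prev, f_pre, i_pre, o_pre, g_pre):
    """One LSTM step from gate pre-activations.

    c_prev: previous memory; f_pre, i_pre, o_pre: forget, input and output
    gate pre-activations; g_pre: candidate-memory pre-activation.
    """
    f = [sigmoid(z) for z in f_pre]        # forget gate, in (0, 1)
    i = [sigmoid(z) for z in i_pre]        # input gate, in (0, 1)
    o = [sigmoid(z) for z in o_pre]        # output gate, in (0, 1)
    g = [math.tanh(z) for z in g_pre]      # candidate memory, in (-1, 1)
    # Memory update: keep part of the old memory, add gated candidate.
    c = [cp * fk + gk * ik for cp, fk, gk, ik in zip(c_prev, f, g, i)]
    # Hidden state: tanh of the new memory, scaled by the output gate.
    h = [math.tanh(ck) * ok for ck, ok in zip(c, o)]
    return c, h

# Values from the post for t = 1.
c0 = [0.3, -0.5]
c1, h1 = lstm_cell_update(c0, f_pre=[-4, -6], i_pre=[6, 4],
                          o_pre=[4, -5], g_pre=[1, -6])
print([round(v, 1) for v in c1])
print(h1)
```

Rounded to one decimal, `c1` agrees with the post's C1 of [0.8, -1]. The second component of `h1` is close to 0 because the output gate for that unit is nearly closed.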

[Backpropagation] by Hand ✍️
[1] Forward Pass
↳ Given a multi-layer perceptron (3 layers), an input vector X, predictions Y^{Pred} = [0.5, 0.5, 0], and ground truth label Y^{Target} = [0, 1, 0].
[2] Backpropagation
↳ Insert cells to hold our calculations.
[3] Layer 3 - Softmax (blue)
↳ Calculate ∂L / ∂z3 directly using the simple equation: Y^{Pred} - Y^{Target} = [0.5, -0.5, 0].
↳ This simple equation is the benefit of using Softmax and Cross Entropy Loss together.
[4] Layer 3 - Weights (orange) & Biases (black)
↳ Calculate ∂L / ∂W3 and ∂L / ∂b3 by multiplying ∂L / ∂z3 and [ a2 | 1 ].
[5] Layer 2 - Activations (green)
↳ Calculate ∂L / ∂a2 by multiplying ∂L / ∂z3 and W3.
[6] Layer 2 - ReLU (blue)
↳ Calculate ∂L / ∂z2 by multiplying ∂L / ∂a2 with 1 for positive values and 0 otherwise.
[7] Layer 2 - Weights (orange) & Biases (black)
↳ Calculate ∂L / ∂W2 and ∂L / ∂b2 by multiplying ∂L / ∂z2 and [ a1 | 1 ].
[8] Layer 1 - Activations (green)
↳ Calculate ∂L / ∂a1 by multiplying ∂L / ∂z2 and W2.
[9] Layer 1 - ReLU (blue)
↳ Calculate ∂L / ∂z1 by multiplying ∂L / ∂a1 with 1 for positive values and 0 otherwise.
[10] Layer 1 - Weights (orange) & Biases (black)
↳ Calculate ∂L / ∂W1 and ∂L / ∂b1 by multiplying ∂L / ∂z1 and [ x | 1 ].
[11] Gradient Descent
↳ Update weights and biases (typically a learning rate is applied here).
💡 Matrix Multiplication is All You Need: Just like in the forward pass, backpropagation is all about matrix multiplications. You can definitely do everything by hand as I demonstrated in this exercise, albeit slowly and imperfectly. This is why GPUs' ability to multiply matrices efficiently plays such an important role in the deep learning evolution. This is why NVIDIA is now close to $1 trillion in valuation.
💡 Exploding Gradients: We can already see the gradients getting larger as we back-propagate up, even in this simple 3-layer network. This motivates using methods like skip connections to handle exploding (or diminishing) gradients, as in ResNet.
I did the calculations entirely by hand. Please let me know if you spot any errors or have any questions!

Tom Yeh

64,645 views • 1 year ago
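
The backward pass described above can be sketched in plain Python for the last layer and the ReLU before it. Only `y_pred` and `y_target` come from the post; the layer-3 weights `w3`, the layer-2 activations `a2`, the learning rate, and the helper names (`linear_backward`, `relu_backward`, `sgd_step`) are hypothetical values chosen for illustration, since the caption does not list the network's numbers.

```python
def outer(u, v):
    """Outer product of two vectors as a list of rows."""
    return [[ui * vj for vj in v] for ui in u]

def linear_backward(dz, a_prev, weights):
    """Gradients for z = W @ a_prev + b, given dL/dz.

    Returns dL/dW, dL/db and dL/da_prev (steps [4] and [5])."""
    d_weights = outer(dz, a_prev)
    d_bias = list(dz)
    # dL/da_prev = W^T @ dz
    d_a_prev = [sum(weights[i][j] * dz[i] for i in range(len(dz)))
                for j in range(len(a_prev))]
    return d_weights, d_bias, d_a_prev

def relu_backward(d_a, a):
    """Step [6]: pass the gradient where the activation was positive."""
    return [g if x > 0 else 0.0 for g, x in zip(d_a, a)]

def sgd_step(weights, d_weights, lr):
    """Step [11]: plain gradient descent update."""
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, d_weights)]

# From the post: softmax + cross-entropy gives dL/dz3 = pred - target.
y_pred = [0.5, 0.5, 0.0]
y_target = [0, 1, 0]
dz3 = [p - t for p, t in zip(y_pred, y_target)]

# Hypothetical layer-3 weights (3 outputs x 2 inputs) and layer-2 activations.
w3 = [[1, 0], [0, 2], [1, -1]]
a2 = [2, 0]

d_w3, d_b3, d_a2 = linear_backward(dz3, a2, w3)
dz2 = relu_backward(d_a2, a2)
w3_new = sgd_step(w3, d_w3, lr=0.1)
print(dz3, d_a2, dz2)
```

Repeating `linear_backward` and `relu_backward` for layers 2 and 1 would complete the chain in steps [7] to [10]; each call is a small matrix product, which is the point the post makes about matrix multiplication.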

Released: major updates to the network dashboard, including a...

Hyperspace

10,836 views • 10 months ago

Tension Map / Dynamic Squish 1. Geo Node to...

snek3d

183,159 views • 29 days ago

We are excited to unveil the latest version of the AIOZ Node: The Version 4.0 update! This update includes a new user interface and brings substantial functional improvements, enhancing your overall experience for increased productivity and efficiency. More information below: The standout feature of AIOZ Node v4.0 is the introduction of the Transcoding functionality, which is currently available in beta. This functionality enables your node to participate in video transcoding, which converts video files into different formats for various digital devices and media platforms. By enabling transcoding, your node can contribute more significantly to the AIOZ Network, expanding the network's capabilities and potential $AIOZ token rewards. While the transcoding functionality is currently in beta, the upcoming AIOZ W3Stream integration, a DePIN Video Infrastructure due for release in Q3 2024, will unlock the full potential of your node and enable seamless video transcoding tasks. To get started with AIOZ Node v4.0, you simply need to visit our official website to download the latest version of the AIOZ Node: The download process is very straightforward, and with a one-click installation, you can set up AIOZ Node v4.0 to start running on your device within a few minutes. If you are already running an AIOZ Node on your device, the version 4.0 update will be applied automatically, ensuring you have the latest features and improvements without hassle! With the Node v4.0 update running on your device, you can proceed to familiarize yourself with the new layout, check out the performance improvements, and start transcoding to see how it enhances your contributions to the network! Learn More: $AIOZ

AIOZ Network

20,428 views • 1 year ago

The best new feature in Blender 4 is NODE...

passivestar

180,326 views • 2 years ago

I came up with a very hacky use for...

redjam9

13,517 views • 1 year ago

The design of Geoff is pure genius. For anyone...

jack

21,402 views • 2 months ago

Vector Database by Hand ✍️
Vector databases are revolutionizing how we search and analyze complex data. They have become the backbone of Retrieval Augmented Generation (#RAG). How do vector databases work?
[1] Given
↳ A dataset of three sentences, each with 3 words (or tokens)
↳ In practice, a dataset may contain millions or billions of sentences. The max number of tokens may be tens of thousands (e.g., 32,768 for mistral-7b).
Process "how are you"
[2] 🟨 Word Embeddings
↳ For each word, look up the corresponding word embedding vector from a table of 22 vectors, where 22 is the vocabulary size.
↳ In practice, the vocabulary size can be tens of thousands. The word embedding dimensions are in the thousands (e.g., 1024, 4096)
[3] 🟩 Encoding
↳ Feed the sequence of word embeddings to an encoder to obtain a sequence of feature vectors, one per word.
↳ Here, the encoder is a simple one-layer perceptron (linear layer + ReLU)
↳ In practice, the encoder is a transformer or one of its many variants.
[4] 🟩 Mean Pooling
↳ Merge the sequence of feature vectors into a single vector using "mean pooling," which is to average across the columns.
↳ The result is a single vector. We often call it "text embeddings" or "sentence embeddings."
↳ Other pooling techniques are possible, such as CLS. But mean pooling is the most common.
[5] 🟦 Indexing
↳ Reduce the dimensions of the text embedding vector by a projection matrix. The reduction rate is 50% (4->2).
↳ In practice, the values in this projection matrix are much more random.
↳ The purpose is similar to that of hashing, which is to obtain a short representation to allow faster comparison and retrieval.
↳ The resulting dimension-reduced index vector is saved in the vector storage.
[6] Process "who are you"
↳ Repeat [2]-[5]
[7] Process "who am I"
↳ Repeat [2]-[5]
Now we have indexed our dataset in the vector database.
[8] 🟥 Query: "am I you"
↳ Repeat [2]-[5]
↳ The result is a 2-d query vector.
[9] 🟥 Dot Products
↳ Take the dot product between the query vector and database vectors. They are all 2-d.
↳ The purpose is to use the dot product to estimate similarity.
↳ By transposing the query vector, this step becomes a matrix multiplication.
[10] 🟥 Nearest Neighbor
↳ Find the largest dot product by linear scan.
↳ The sentence with the highest dot product is "who am I"
↳ In practice, because scanning billions of vectors is slow, we use an Approximate Nearest Neighbor (ANN) algorithm like Hierarchical Navigable Small Worlds (HNSW).

Tom Yeh

191,919 views • 2 years ago
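
The pooling, indexing and retrieval steps of the vector database walkthrough above ([4], [5], [9], [10]) can be sketched in plain Python. All numbers here are hypothetical: the per-word feature vectors, the 4->2 projection matrix, the stored index vectors and the query vector are invented for the example, and the function names (`mean_pool`, `project`, `nearest`) are illustrative. The encoder from step [3] is skipped; the sketch starts from its output.

```python
def mean_pool(vectors):
    """Step [4]: average the per-word feature vectors column by column."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def project(vec, matrix):
    """Step [5]: reduce dimensions; matrix is (out_dim x in_dim)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def dot(u, v):
    """Step [9]: dot product as a similarity estimate."""
    return sum(a * b for a, b in zip(u, v))

def nearest(query, index):
    """Step [10]: linear scan for the entry with the largest dot product."""
    return max(index, key=lambda sentence: dot(query, index[sentence]))

# Hypothetical 4-d feature vectors for one 3-word sentence.
features = [[1, 0, 2, 0],
            [3, 0, 0, 2],
            [2, 3, 1, 1]]
pooled = mean_pool(features)

# Hypothetical 4->2 projection matrix (a 50% reduction, as in the post).
projection = [[1, 0, 1, 0],
              [0, 1, 0, 1]]
index_vector = project(pooled, projection)

# Hypothetical stored index vectors and 2-d query vector.
index = {
    "how are you": [3.0, 2.0],
    "who are you": [1.0, 4.0],
    "who am I": [0.0, 5.0],
}
query = [1.0, 3.0]
print(pooled, index_vector, nearest(query, index))
```

The linear scan in `nearest` costs one dot product per stored vector, which is why the post points to approximate nearest neighbor methods such as HNSW for large collections.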

The new Index Switch Node available in #blender 4.1...

Isaac

29,559 views • 2 years ago

JUST IN: 🇺🇸 US Admiral Paparo says the United...

Watcher.Guru

1,667,331 views • 1 month ago

Here's a random release cause I'm never gonna use...

360FNAF

20,885 views • 9 months ago

I added a node to my custom node that...

toyxyz

117,354 views • 1 year ago

DecentralizedAI: The AI Layer2 for Google Chrome Use the...

DecentralizedAI

293,220 views • 2 years ago

Messiah Partners with IoTeX to Power Next-Gen Node Infrastructure
Messiah is proud to team up with IoTeX, a leading force in connecting the real world to Web3. Together, we’re bringing IoTeX’s infrastructure to NodeHub, making it easier than ever for the community to run and earn from nodes. We’re also excited to share that Messiah is now live on DepinScan, further solidifying our position in the decentralized infrastructure space. Explore our listing here:
What’s next?
IoTeX RPC Nodes on NodeHub: Deploying an IoTeX RPC Node will soon be as simple as a few clicks. Whether you’re bootstrapping a full node or configuring a gateway, NodeHub will handle the complexity so you can focus on building and transacting.
Delegate Node Support: For those looking to play a bigger role in IoTeX’s governance and consensus, NodeHub will also support Delegate Node deployment.
This partnership isn’t just about infrastructure; it’s about empowering builders, validators, and stakers to contribute to the growth of the IoTeX ecosystem with ease and security. The future of decentralized infrastructure is simple, powerful, and community-driven. With IoTeX on NodeHub, that future is closer than ever.

Messiah

34,101 views • 10 months ago

⚡️ Final date for the Node Sale is here! We're thrilled to announce the mint day, set for 5th April. 🗓️ We realised we are on par with many of the other projects in terms of product & traction but need more time to grow the community. So, we have added special access for KOLs and carved out the entire Tier 5, 906 nodes, that they can get in presale. Note: our node sale is FCFS. It starts at Tier 1 for the community, and in fact VCs get a Tier 7 allocation in presale. Public sale starts at 8am UTC 🕗

GPU.NET

93,896 views • 2 years ago

Release 1.23.1 The team continues to optimize the Drosera...

Drosera

13,511 views • 8 months ago
