Vector Database by Hand ✍️
Vector databases are revolutionizing how we search and analyze complex data. They have become the backbone of Retrieval Augmented Generation (#RAG).
How do vector databases work?
[1] Given
↳ A dataset of three sentences, each with 3 words (or tokens)
↳ In practice, a dataset may contain millions or billions of sentences. The max number of tokens may be tens of thousands (e.g., 32,768 for mistral-7b).
Process "how are you"
[2] 🟨 Word Embeddings
↳ For each word, look up the corresponding word embedding vector from a table of 22 vectors, where 22 is the vocabulary size.
↳ In practice, the vocabulary size can be tens of thousands. The word embedding dimensions are in the thousands (e.g., 1024, 4096).
[3] 🟩 Encoding
↳ Feed the sequence of word embeddings to an encoder to obtain a sequence of feature vectors, one per word.
↳ Here, the encoder is a simple one-layer perceptron (linear layer + ReLU).
↳ In practice, the encoder is a transformer or one of its many variants.
[4] 🟩 Mean Pooling
↳ Merge the sequence of feature vectors into a single vector using "mean pooling," which averages across the columns.
↳ The result is a single vector. We often call it a "text embedding" or "sentence embedding."
↳ Other pooling techniques are possible, such as CLS. But mean pooling is the most common.
[5] 🟦 Indexing
↳ Reduce the dimensions of the text embedding vector with a projection matrix. The reduction rate is 50% (4->2).
↳ In practice, the values in this projection matrix are much more random.
↳ The purpose is similar to that of hashing, which is to obtain a short representation to allow faster comparison and retrieval.
↳ The resulting dimension-reduced index vector is saved in the vector storage.
[6] Process "who are you"
↳ Repeat [2]-[5]
[7] Process "who am I"
↳ Repeat [2]-[5]
Now we have indexed our dataset in the vector database.
[8] 🟥 Query: "am I you"
↳ Repeat [2]-[5]
↳ The result is a 2-d query vector.
[9] 🟥 Dot Products
↳ Take the dot product between the query vector and each database vector. They are all 2-d.
↳ The purpose is to use the dot product to estimate similarity.
↳ By transposing the query vector, this step becomes a matrix multiplication.
[10] 🟥 Nearest Neighbor
↳ Find the largest dot product by linear scan.
↳ The sentence with the highest dot product is "who am I."
↳ In practice, because scanning billions of vectors is slow, we use an Approximate Nearest Neighbor (ANN) algorithm such as Hierarchical Navigable Small World (HNSW).
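Steps [2] through [10] above can be sketched in a few lines of NumPy. This is only an illustration, not the post's worksheet: the embedding table, encoder weights, and projection matrix are random placeholders (the post does not list its numbers), and the 4-d word embedding size is an assumption, so the winning sentence here need not match the post's answer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters: the post's actual values are not given in the text.
token_id = {w: i for i, w in enumerate(["how", "are", "you", "who", "am", "i"])}
E = rng.integers(-1, 2, size=(22, 4)).astype(float)  # embedding table, vocabulary size 22
W = rng.integers(-1, 2, size=(4, 4)).astype(float)   # encoder weights (linear layer)
b = rng.integers(-1, 2, size=4).astype(float)        # encoder bias
P = rng.integers(-1, 2, size=(4, 2)).astype(float)   # projection matrix, 4 -> 2


def index_vector(sentence: str) -> np.ndarray:
    """Steps [2]-[5]: embed, encode, mean-pool, project."""
    ids = [token_id[w] for w in sentence.lower().split()]
    x = E[ids]                          # [2] one embedding per word, shape (3, 4)
    h = np.maximum(x @ W + b, 0.0)      # [3] linear layer + ReLU
    pooled = h.mean(axis=0)             # [4] mean pooling over the words
    return pooled @ P                   # [5] dimension reduction to 2-d


sentences = ["how are you", "who are you", "who am I"]
index = np.stack([index_vector(s) for s in sentences])  # vector storage, shape (3, 2)

query = index_vector("am I you")        # [8] same pipeline for the query
scores = index @ query                  # [9] all dot products as one matrix product
best = int(np.argmax(scores))           # [10] linear scan for the largest score
print(sentences[best], scores)
```

A production system would replace the final `argmax` with an approximate nearest neighbor index, as the post notes.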

Tom Yeh
191,919 views • 2 years ago
Stokes' Theorem is a classic result from vector calculus. It tells us that the line integral of a vector field around a closed loop is equal to the surface integral of the curl of the vector field over any surface bounded by that loop.
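Written out, the statement reads as follows. The symbols are not from the post: $S$ is assumed to be an oriented surface with boundary loop $\partial S$, and $\mathbf{F}$ a smooth vector field. The second line is a small sanity check on the unit disk.

```latex
\oint_{\partial S} \mathbf{F} \cdot d\mathbf{r}
  \;=\; \iint_{S} \left( \nabla \times \mathbf{F} \right) \cdot d\mathbf{S}

% Check with F = (-y, x, 0) on the unit disk in the xy-plane:
% curl F = (0, 0, 2), so the surface integral is 2 * (area) = 2\pi.
% On the boundary r(t) = (\cos t, \sin t, 0):
\oint_{\partial S} \mathbf{F} \cdot d\mathbf{r}
  = \int_{0}^{2\pi} \left( \sin^2 t + \cos^2 t \right) dt = 2\pi
```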

Alec Helbling
354,621 views • 4 months ago
[Graph Convolutional Network] by hand ✍️
Graph Convolutional Networks (GCNs), introduced by Thomas Kipf and Max Welling in 2017, have emerged as a powerful tool in the analysis and interpretation of data structured as graphs. This exercise demonstrates how a GCN works in a simple application: binary classification.
-- Goal --
Predict if a node in a graph is X.
-- Architecture --
🟪 Graph Convolutional Network (GCN)
1. GCN1(4,3)
2. GCN2(3,3)
🟦 Fully Connected Network (FCN)
1. Linear1(3,5)
2. ReLU
3. Linear2(5,1)
4. Sigmoid
Simplifications:
• Adjacency matrices are not normalized.
• ReLU is applied to messages directly.
-- Walkthrough --
[1] Given
↳ A graph with five nodes A, B, C, D, E
[2] 🟩 Adjacency Matrix: Neighbors
↳ Add 1 for each edge to neighbors
↳ Repeat in both directions (e.g., A->C, C->A)
↳ Repeat for both GCN layers
[3] 🟩 Adjacency Matrix: Self
↳ Add 1's for each self loop
↳ Equivalent to adding the identity matrix
↳ Repeat for both GCN layers
[4] 🟪 GCN1: Messages
↳ Multiply the node embeddings 🟨 with weights and biases
↳ Apply ReLU (negatives → 0)
↳ The result is one message per node
[5] 🟪 GCN1: Pooling
↳ Multiply the messages with the adjacency matrix
↳ The purpose is to pool messages from each node's neighbors as well as from the node itself.
↳ The result is a new feature per node
[6] 🟪 GCN1: Visualize
↳ For node 1, visualize how messages are pooled to obtain a new feature for better understanding
↳ [3,0,1] + [1,0,0] = [4,0,1]
[7] 🟪 GCN2: Messages
↳ Multiply the node features with weights and biases
↳ Apply ReLU (negatives → 0)
↳ The result is one message per node
[8] 🟪 GCN2: Pooling
↳ Multiply the messages with the adjacency matrix
↳ The result is a new feature per node
[9] 🟪 GCN2: Visualize
↳ For node 3, visualize how messages are pooled to obtain a new feature for better understanding
↳ [1,2,4] + [1,3,5] + [0,0,1] = [2,5,10]
[10] 🟦 FCN: Linear 1 + ReLU
↳ Multiply node features with weights and biases
↳ Apply ReLU (negatives → 0)
↳ The result is a new feature per node
↳ Unlike in GCN layers, no messages from other nodes are included.
[11] 🟦 FCN: Linear 2
↳ Multiply node features with weights and biases
[12] 🟦 FCN: Sigmoid
↳ Apply the Sigmoid activation function
↳ The purpose is to obtain a probability value for each node
↳ One way to calculate Sigmoid by hand ✍️ is to use the approximation below:
• >= 3 → 1
• 0 → 0.5
• <= -3 → 0
-- Outputs --
A: 0 (Very unlikely)
B: 1 (Very likely)
C: 1 (Very likely)
D: 1 (Very likely)
E: 0.5 (Neutral)
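The walkthrough above maps onto a short NumPy forward pass. This is a sketch under stated assumptions: the edge list, node embeddings, and all weights are made-up placeholders (the post's figures are not reproduced in the text), so the printed probabilities will not match the post's outputs. Only the layer sizes and the two simplifications (unnormalized adjacency, ReLU on messages before pooling) come from the post.

```python
import numpy as np

rng = np.random.default_rng(0)
nodes = ["A", "B", "C", "D", "E"]
pos = {name: i for i, name in enumerate(nodes)}

# Hypothetical edges; the post's actual graph is not listed in the text.
edges = [("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

A = np.zeros((5, 5))
for u, v in edges:                  # [2] add 1 in both directions
    A[pos[u], pos[v]] = 1.0
    A[pos[v], pos[u]] = 1.0
A_hat = A + np.eye(5)               # [3] self loops = adding the identity


def small_ints(*shape):
    """Placeholder weights in {-1, 0, 1}, easy to follow by hand."""
    return rng.integers(-1, 2, size=shape).astype(float)


def gcn_layer(X, W, b):
    messages = np.maximum(X @ W + b, 0.0)  # [4]/[7] ReLU applied to messages
    return A_hat @ messages                # [5]/[8] pool self + neighbors (unnormalized)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


X = small_ints(5, 4)                                   # one 4-d embedding per node
H1 = gcn_layer(X, small_ints(4, 3), small_ints(3))     # GCN1(4,3)
H2 = gcn_layer(H1, small_ints(3, 3), small_ints(3))    # GCN2(3,3)
Z = np.maximum(H2 @ small_ints(3, 5) + small_ints(5), 0.0)  # [10] Linear1(3,5) + ReLU
logits = Z @ small_ints(5, 1) + small_ints(1)          # [11] Linear2(5,1)
probs = sigmoid(logits)                                # [12] one probability per node

for name, p in zip(nodes, probs[:, 0]):
    print(name, round(float(p), 2))
```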

Tom Yeh
46,499 views • 1 year ago
[Discrete Fourier Transform] by Hand ✍️
In signal processing, the Discrete Fourier Transform (DFT) is no doubt the most important method. But the math involved is extremely complex, literally, involving a summation over a complex number term e^(-iwt).
I developed this exercise to demonstrate that underneath such complexity, DFT is just a series of matrix multiplications you can calculate by hand. ✍️
Once you see that, it should not surprise you that a deep neural network, which is also a series of matrix multiplications with activation functions in between, can learn to perform DFT to process and analyze signals so effectively.
How does DFT work?
[1] Given
↳ Signals A, B, and C in the 🟧 frequency domain:
◦ A = cos(w) + 2cos(2w)
◦ B = cos(w) + cos(3w) + cos(4w)
◦ C = -cos(2w) + cos(3w)
◦ Each signal is a weighted sum of four cosine waves at frequencies 1w, 2w, 3w, and 4w.
◦ We will apply Inverse DFT to convert the signals to time domain representations, and then demonstrate that DFT can convert them back to their original frequency domain representations.
↳ Signal X in the 🟩 time domain. X is sampled at 10 time points 1t, 2t, …, 10t:
◦ X = [-2.5, -1.8, 3, -0.7, -1.0, -0.7, 3, -1.8, -2.5, 5]
◦ Suppose X is also a weighted sum of the same four cosine waves, but we don’t already know their weights. We will apply DFT to discover them.
[2] 🟧 Frequency Matrix (F)
↳ Write the coefficients of A, B, C as a matrix F. Each signal is a row. Each frequency is a column.
↳ A → [1, 2, 0, 0]
↳ B → [1, 0, 1, 1]
↳ C → [0, -1, 1, 0]
[3] Cosine → Discrete
↳ Sample from the continuous cosine waves at discrete time points 1t, 2t, 3t, to 10t.
[4] Cosine Matrix (W)
↳ Write the samples as a matrix. Each frequency is a row. Each time point is a column.
[5] Inverse DFT: 🟧 Frequency → 🟩 Time
↳ Multiply the frequency matrix F and the cosine matrix W.
↳ The meaning of this multiplication is to linearly combine the four cosine waves (rows in W) into time-domain signals (rows in T) using the weights specified in F.
↳ The result is matrix T, which holds signals A, B, C converted to the time domain. Each signal is a row. Each time point is a column.
[6] Transpose
↳ Transpose T, converting each signal’s time domain representation from a row to a column.
[7] DFT: 🟩 Time → 🟧 Frequency
↳ Multiply the cosine matrix W with the transpose of matrix T.
↳ The purpose of this multiplication is to take a dot product between each time-domain signal (columns in the transpose of T) and each cosine wave (rows in W), which has the effect of projecting the signal onto a cosine wave to determine how much they are correlated. Zero means not correlated at all.
↳ The result is an intermediate version of the “recovered” frequency matrix where each column corresponds to a signal and each row corresponds to a frequency.
↳ Compared to the original frequency matrix F, this intermediate matrix has non-zero weights in the correct places, but scaled up by a factor of 5 (n/2, n=10). For example, signal A, originally [1,2,0,0], is recovered as [5,10,0,0].
[8] Scale
↳ Multiply each value by 2/n = 1/5 to scale down the intermediate matrix to match the magnitude of the original frequency matrix F.
[9] Transpose
↳ Transpose the recovered frequency matrix back to the same orientation as the original frequency matrix F.
↳ Like magic 🪄, the result is identical to the original F, which means DFT successfully recovered the frequency components of signals A, B, C.
[10] Apply DFT to X: 🟩 Time → 🟧 Frequency
↳ Now that we have some confidence in DFT’s ability to recover frequency components, we apply DFT to X’s time-domain representation by multiplying W with X.
↳ The result is an intermediate matrix.
[11] Scale
↳ Similarly, we scale down by a factor of 5 to obtain the recovered frequency components of X (a column).
[12] Transpose
↳ Similarly, we transpose the recovered column to a row to match the orientation of the frequency matrix.
↳ Using the coefficients [0,0,3,2], we can write the equation of X as 3cos(3w) + 2cos(4w).
Notes: I hope this by-hand exercise helps you understand the essence of DFT. But there are more technical details, such as:
• Sine: The complete DFT math also includes sine waves that follow a similar calculation process.
• Phase: Here, we assume all the cosine waves are aligned at the origin, namely, phase is 0. If a phase p is added, for example, cos(w+p), we will need to calculate the sine component and use their ratio to figure out what p is.
• Magnitude: If phase is not zero, the magnitude will need to be calculated by combining both cosine and sine terms.
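The whole exercise fits in a few NumPy lines. One assumption is made here that the post does not state explicitly: the base frequency is taken as w = 2π/10, so that frequency 1w completes exactly one cycle over the 10 samples. That choice is consistent with the post's n/2 scaling factor and with X's last sample being 5 = 3 + 2.

```python
import numpy as np

n = 10                                   # number of time samples
t = np.arange(1, n + 1)                  # time points 1t .. 10t
k = np.arange(1, 5)                      # frequencies 1w .. 4w
w = 2 * np.pi / n                        # assumed base frequency (see lead-in)

# [3]-[4] Cosine matrix: one row per frequency, one column per time point.
W = np.cos(np.outer(k, w * t))           # shape (4, 10)

# [2] Frequency matrix: rows are signals A, B, C.
F = np.array([[1, 2, 0, 0],
              [1, 0, 1, 1],
              [0, -1, 1, 0]], dtype=float)

T = F @ W                                # [5] inverse DFT: frequency -> time, shape (3, 10)
intermediate = W @ T.T                   # [6]-[7] DFT: project each signal on each cosine
recovered = (intermediate * 2 / n).T     # [8]-[9] scale by 2/n, transpose back

# [10]-[12] Apply the same projection to X (values as rounded in the post).
X = np.array([-2.5, -1.8, 3, -0.7, -1.0, -0.7, 3, -1.8, -2.5, 5])
x_coeffs = (W @ X) * 2 / n

print(np.round(recovered, 6))
print(np.round(x_coeffs, 1))
```

Because the post's X is rounded to one decimal, the recovered weights for X are close to, rather than exactly, [0, 0, 3, 2].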

Tom Yeh
116,622 views • 1 year ago
The Helmholtz decomposition is one of the fundamental results of vector calculus. It says any well-behaved vector field can be split into two parts, one capturing sources and sinks through divergence, and one capturing rotation through curl.
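In symbols, one common way to write the split is shown below. The notation is an assumption, not taken from the post: $\varphi$ is a scalar potential, $\mathbf{A}$ a vector potential, and the minus sign is a convention.

```latex
\mathbf{F} \;=\; -\nabla \varphi \;+\; \nabla \times \mathbf{A}

% The first part is curl-free and carries all the divergence (sources and sinks):
\nabla \times (\nabla \varphi) = \mathbf{0},
\qquad \nabla \cdot \mathbf{F} = -\nabla^{2} \varphi

% The second part is divergence-free and carries all the curl (rotation):
\nabla \cdot (\nabla \times \mathbf{A}) = 0,
\qquad \nabla \times \mathbf{F} = \nabla \times (\nabla \times \mathbf{A})
```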

Alec Helbling
220,625 views • 22 days ago
One cool thing about ColBERT-based search compared to cosine-based vector retrieval is that you get interpretability for free as a byproduct of the MaxSim computation. It's kind of like the Lucene highlighter, letting you grab the most relevant snippets from a long document to show users where their query matches. With Jina-ColBERT-v1, which supports up to 8K token length and which we released earlier this Feb., the visualization of the late interaction between a query and a document is almost... artistic. The video shows the late interaction between the query "Elephants eat 150 kg of food per day." and the Wikipedia article about "Indian Elephant". Darker colors indicate stronger semantic matches. The darkest area corresponds to "The species is classified as a megaherbivore and consume up to 150 kg (330 lb) of plant matter per day." from the original article.
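To make the "interpretability for free" point concrete, here is a minimal MaxSim sketch in NumPy. It does not call Jina-ColBERT-v1; the token embeddings are tiny hand-written placeholders, and using cosine similarity on L2-normalized token vectors is an assumption about the scoring. The similarity matrix that MaxSim builds on the way to the score is the same matrix one could render as a heatmap.

```python
import numpy as np


def maxsim(Q: np.ndarray, D: np.ndarray):
    """Late-interaction score between query tokens Q (q, dim) and document tokens D (d, dim).

    Returns the total score, the best-matching document token index per query
    token, and the full (q, d) similarity matrix.
    """
    Qn = Q / np.linalg.norm(Q, axis=1, keepdims=True)   # normalize each token vector
    Dn = D / np.linalg.norm(D, axis=1, keepdims=True)
    sim = Qn @ Dn.T                                     # every query token vs every doc token
    best_doc_token = sim.argmax(axis=1)                 # where each query token matches best
    score = float(sim.max(axis=1).sum())                # MaxSim: sum of per-token maxima
    return score, best_doc_token, sim


# Hypothetical 2-d token embeddings, chosen so the result is easy to check by hand.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0]])
D = np.array([[0.0, 2.0],
              [3.0, 0.0],
              [1.0, 1.0]])

score, best_doc_token, sim = maxsim(Q, D)
print(score, best_doc_token)
```

Here `best_doc_token` plays the role of the highlighter: for each query token it points at the document token that contributed to the score.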

Jina AI
22,254 views • 1 year ago
The power of QuiverAI's newest vector image models is in a workflow. Connect Arrow 1.1 to your LLMs and image models in FLORA, and it becomes a full creative system. This tutorial covers logo ideation, fashion design, and a lot more. Here are the use cases:

FLORA ©
104,344 views • 1 month ago
more experiments in shapes of text: mapping word embeddings of a paragraph using different sliding windows (1, 2, 3, 5, 8 and 13 words) to 3d space. unsurprisingly, as the sliding window size gets larger the whole shape gets “smoother”.

Kat ⊷ the Poet Engineer
59,024 views • 1 year ago
INDIAN DRONE MOTORS !!! Yes, you read that right: Vector Technics from Hyderabad is making drone motors in India from scratch, from steel to the finished product. These motors are also exported globally. I got to do a rare tour of their factory, video linked in reply

Gareeb Scientist
432,895 views • 9 months ago
Kedarnath is a tremendous space. The utterance of the sound “Shiva” attains a completely new dimension and significance in Kedar. It is a space which has been specially prepared for this particular sound. When we utter the word “Shiva,” it is the freedom of the uncreated, the liberation of one who is not created. It is almost like on this planet, the sound “Shiva” emanates from this place. For thousands of years, people have experienced that space as a reverberation of that sound. This is also a place that has witnessed thousands of Yogis and mystics of every kind. When I say every kind, you cannot imagine those kinds. These are people who made no attempt to teach anything to anyone. Their way of making an offering to the world was by leaving their energies, their path, their work – everything – in a certain way in these spaces.

Sadhguru
57,561 views • 1 month ago
We dropped a new build just in time for the holidays. One of my favorite additions is symmetry for tube and solid shapes in vector layers. This is great for greebling up sci-fi and hard surface stuff.

Joe Wilson
23,388 views • 1 year ago
I vibe coded a visual PDF search app with ColQwen2. This is how it works:
- Store PDF files as images in a Weaviate vector database
- Embed images and text with a multimodal late-interaction model (ColQwen2)
- Generate token-wise (and summed) similarity maps to highlight image patches with high similarity
Now I need to refactor the messy vibe-coded project. In the meantime, you can check out the Notebook this demo is based on to try it out yourself:

Leonie
34,474 views • 9 months ago
Land doesn't vote, people do - German edition. Each municipality is represented by a dot, with the size of the dot proportional to the number of voters.

Xavi Ruiz
69,370 views • 2 years ago
Turso is an incredible technical feat. A Rust rewrite of sqlite, with an async-first architecture, incoming support for concurrent writes, vector search, and browser / wasm support out of the box. I think this has a very good chance of being a foundational piece of infrastructure of the vibe-coding age. On-demand, sqlite-compatible global databases that can also run in-browser and on-device. The pace at which the project is evolving is most definitely *not normal*. Pekka Enberg and Glauber Costa are built different. Demo:

Guillermo Rauch
242,000 views • 8 months ago
The UFO/UAP phenomenon is ancient. It has been here long before us, and it is connected to a grand hierarchy created by God. They are coming from higher dimensions, something that religions would describe as the Tree of Life. The phenomenon is part of us, and our soul is part of this grand system. We are currently in the lowest dimension, chakra, sefirot, or part of the soul, known as Malkhut in Kabbalah, Muladhara in Hinduism, and Khat in ancient Egypt. We are made of matter, but our soul is made of fire, and it is time for an upgrade. The Geophysical Event is connected to this transformation.

Open Minded Approach
136,661 views • 20 days ago
We ran quickly from the intensity of the bombing to the stairwell because it is considered safer. We are living in a genocide that has not stopped for two years. The situation is difficult here. I am afraid for my family and myself. If you’re scrolling, PLEASE leave a dot. It's just a dot.

Hasan alrabay
19,507 views • 9 months ago
As a gynecologist, one of the questions I am asked most frequently is about the "hymen". The hymen is not a structure located deep inside the vagina. It is a thin, flexible fold of tissue situated very close to the vaginal opening, anatomically at the entrance of the vaginal canal. It is typically found about 1–2 cm inside the vaginal opening, but in many women, it is almost at the surface. Therefore, the common belief that it is located deep inside is incorrect. One of the most important points is this: The hymen is not a closed membrane. It naturally has openings that allow menstrual blood to pass. Additionally, the structure of the hymen varies from person to person. Its thickness, elasticity, and shape are not standard. In some women, it may be very thin and elastic, while in others, it may be minimal or barely noticeable. This is a completely normal anatomical variation. A common misconception is that changes in the hymen occur only due to sexual intercourse. In reality, factors such as certain physical activities or tampon use can also lead to stretching or changes in this tissue. Therefore, the hymen is not a reliable indicator of a person’s sexual history or “virginity.” In summary, the hymen is a small and variable anatomical structure. The meanings attributed to it are largely shaped by sociocultural beliefs rather than medical facts.

Op. Dr. Mehmet Bekir Şen
113,262 views • 1 month ago
[LSTM] by Hand ✍️
LSTMs were the most effective architecture for processing long sequences of data until our world was taken over by the Transformers.
LSTMs belong to the broader family of recurrent neural networks (RNNs) that process data sequentially in a recurrent manner. Transformers, on the other hand, abandon recurrence and use self-attention instead to process data concurrently in parallel.
Recently, there has been renewed interest in recurrence as people realized self-attention doesn’t scale to extremely long sequences, like hundreds of thousands of tokens. Mamba is a good example of bringing back recurrence. All of a sudden, it is cool to study LSTMs.
How do LSTMs work?
[1] Given
↳ 🟨 Input sequence X1, X2, X3 (d = 3)
↳ 🟩 Hidden state h (d = 2)
↳ 🟦 Memory C (d = 2)
↳ Weight matrices Wf, Wc, Wi, Wo
Process t = 1
[2] Initialize
↳ Randomly set the previous hidden state h0 to [1, 1] and memory cells C0 to [0.3, -0.5]
[3] Linear Transform
↳ Multiply the four weight matrices with the concatenation of the current input (X1) and the previous hidden state (h0).
↳ The results are feature values, each a linear combination of the current input and hidden state.
[4] Non-linear Transform
↳ Apply sigmoid σ to obtain gate values (between 0 and 1).
• Forget gate (f1): [-4, -6] → [0, 0]
• Input gate (i1): [6, 4] → [1, 1]
• Output gate (o1): [4, -5] → [1, 0]
↳ Apply tanh to obtain candidate memory values (between -1 and 1)
• Candidate memory (C’1): [1, -6] → [0.8, -1]
[5] Update Memory
↳ Forget (C0 .* f1): Element-wise multiply the current memory with forget gate values.
↳ Input (C’1 .* i1): Element-wise multiply the “candidate” memory with input gate values.
↳ Update the memory to C1 by adding the two terms above: C0 .* f1 + C’1 .* i1 = C1
[6] Candidate Output
↳ Apply tanh to the new memory C1 to obtain candidate output o’1. [0.8, -1] → [0.7, -0.8]
[7] Update Hidden State
↳ Output (o’1 .* o1 → h1): Element-wise multiply the candidate output with the output gate.
↳ The result is the updated hidden state h1
↳ It is also the first output.
Process t = 2
[8] Initialize
↳ Copy previous hidden state h1 and memory C1
[9] Linear Transform
↳ Repeat [3]
[10] Update Memory (C2)
↳ Repeat [4] and [5]
[11] Update Hidden State (h2)
↳ Repeat [6] and [7]
Process t = 3
[12] Initialize
↳ Copy previous hidden state h2 and memory C2
[13] Linear Transform
↳ Repeat [3]
[14] Update Memory (C3)
↳ Repeat [4] and [5]
[15] Update Hidden State (h3)
↳ Repeat [6] and [7]
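The t = 1 arithmetic can be replayed in NumPy from the pre-activation values quoted in step [4]. This is a sketch: the function names are made up for illustration, and the weight matrices and inputs used in the three-step loop at the end are random placeholders because the post's text does not list them. Exact sigmoid and tanh values differ slightly from the hand-rounded ones, for example h1 comes out near [0.63, 0] instead of the [0.7, 0] obtained with rounded gates.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_update(pre_f, pre_i, pre_o, pre_c, C_prev):
    """Steps [4]-[7], starting from the four linear-transform outputs."""
    f = sigmoid(pre_f)                 # forget gate
    i = sigmoid(pre_i)                 # input gate
    o = sigmoid(pre_o)                 # output gate
    c_cand = np.tanh(pre_c)            # candidate memory
    C = C_prev * f + c_cand * i        # [5] forget old memory, add gated candidate
    h = np.tanh(C) * o                 # [6]-[7] candidate output gated by o
    return h, C


def lstm_step(x, h_prev, C_prev, Wf, Wi, Wo, Wc):
    """Step [3] followed by the update: weights act on concat(x, h_prev)."""
    z = np.concatenate([x, h_prev])
    return lstm_update(Wf @ z, Wi @ z, Wo @ z, Wc @ z, C_prev)


# Replay t = 1 with the pre-activations given in the post.
C0 = np.array([0.3, -0.5])
h1, C1 = lstm_update(np.array([-4.0, -6.0]), np.array([6.0, 4.0]),
                     np.array([4.0, -5.0]), np.array([1.0, -6.0]), C0)
print(np.round(C1, 2), np.round(h1, 2))

# Full three-step pass with hypothetical weights (2 x 5) and inputs (d = 3).
rng = np.random.default_rng(0)
Wf, Wi, Wo, Wc = (rng.integers(-1, 2, size=(2, 5)).astype(float) for _ in range(4))
h, C = np.array([1.0, 1.0]), C0
for x in rng.integers(-1, 2, size=(3, 3)).astype(float):
    h, C = lstm_step(x, h, C, Wf, Wi, Wo, Wc)
```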

Tom Yeh
72,891 views • 1 year ago
Also! You can now also simplify your complex vector paths and reduce nodes with the newly added simplify vector tool in Figma Draw! I was super stoked to demo this yesterday on the Release Notes live stream!

miggi from figgi
53,090 views • 10 months ago
Retrieval Weighting RAG by hand ✍️ + Langflow. I am designing a series of exercises to teach advanced RAG techniques. This is No. 5. Previous exercises are in the comment.

Tom Yeh
48,408 views • 1 year ago
I built MatmulFlow, an interactive tool that makes matrix multiplication dimensions visual, part of my AI by Hand ✍️ series. Matrix multiplication dimensions are confusing. Which is the inner dimension? Columns of the first or rows of the second? And when you chain five multiplications together, it gets worse. The idea: represent matrices as rectangles. Shift the second matrix up and to the right. The edges that must align become obvious. The result fills in the remaining space. No memorization. You can see it. It extends to chains. Stack vertically for left-multiplication. Stack horizontally for right-multiplication. Resize any matrix and watch the dimensions "flow" through the entire chain. Give it a try!
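The rule the rectangles make visible (columns of the left matrix must equal rows of the right one, and the result keeps the outer two dimensions) can also be checked in code. The helper below is a hypothetical illustration, not part of MatmulFlow.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, int]


def chain_shape(shapes: Sequence[Shape]) -> Shape:
    """Return the shape of M1 @ M2 @ ... @ Mk, or raise if any inner dimensions disagree."""
    rows, cols = shapes[0]
    for position, (next_rows, next_cols) in enumerate(shapes[1:], start=1):
        if cols != next_rows:
            raise ValueError(
                f"matrix {position - 1} has {cols} columns but "
                f"matrix {position} has {next_rows} rows"
            )
        cols = next_cols               # the result inherits the right matrix's columns
    return rows, cols


# A chain of three multiplications: (2x3)(3x4)(4x5) -> 2x5.
print(chain_shape([(2, 3), (3, 4), (4, 5)]))
```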

Tom Yeh
25,992 views • 2 months ago