SQL has levels to it:

- Level 1
  SELECT, FROM, WHERE, GROUP BY, HAVING, LIMIT
  Master these basic keywords and you’ll be well on your way to mastering SQL.

- Level 2
  Mastering JOINs:
  Most common JOINs: INNER and LEFT
  Less common JOINs: FULL OUTER
  Joins you should avoid...

33,136 views • 26 days ago • via X (Twitter)
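The level 1 keywords listed in the post above can be tried in one query. This is a minimal sketch using Python's built-in `sqlite3` module. The `orders` table, its columns, and its rows are hypothetical and exist only for illustration. `ORDER BY` is not in the post's list; it is added here so the output order is deterministic.

```python
import sqlite3

# Hypothetical in-memory table; names and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", "EU", 120.0),
        ("ana", "EU", 80.0),
        ("bo", "EU", 30.0),
        ("cy", "US", 500.0),
        ("dee", "EU", 250.0),
    ],
)

query = """
SELECT customer, SUM(amount) AS total   -- SELECT: columns and aggregates to return
FROM orders                             -- FROM: source table
WHERE region = 'EU'                     -- WHERE: filter rows before grouping
GROUP BY customer                       -- GROUP BY: one output row per customer
HAVING SUM(amount) >= 100               -- HAVING: filter groups after aggregation
ORDER BY total DESC                     -- added for a deterministic order
LIMIT 2                                 -- LIMIT: cap the number of rows returned
"""
level_1_rows = conn.execute(query).fetchall()
print(level_1_rows)  # [('dee', 250.0), ('ana', 200.0)]
```

The pairing to notice is `WHERE` versus `HAVING`: `WHERE` drops the US row before any grouping happens, while `HAVING` drops the customer whose grouped total is under 100.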

Related videos

Dear Friend,

I wrote this book for you. For the past year, I have labored to create a product that will help you learn and master SQL.

I have been there. I have felt the frustration of trying to learn SQL and not knowing where to begin. I have lived through the struggle of setting up a platform to run SQL queries. Most platforms require sign-ups and logins that create a headache for learners. I also know the challenge of finding proper SQL exercises that mirror the real-world experience of a data analyst. Yes, I have been in your shoes.

That’s why I created SQL Essentials for Data Analysis: A 50-Day Hands-on Challenge Book (Go From Beginner to Pro). Yes, to give you a clear, practical path from beginner to confident SQL user.

✅Why SQL Still Matters

You may be wondering if SQL still matters in 2025. The answer: it has never mattered more. SQL is the lingua franca of data. Data still lives in databases, and the only language it truly understands is SQL.

Think about it, even in Python, SQL is there. You’ve probably heard about the powerful pandas library. Guess what? It also has some SQL. And don’t get me started on BigQuery, Tableau, Power BI, and Databricks; the answer is the same: they all rely on SQL. SQL is the big shadow that hovers over everything data.

This is why learning SQL is a must for data analysts, engineers, scientists, and anyone working with data. SQL connects everything: exploration, extraction, transformation, modeling, validation, and reporting.

✅Why I Wrote This Book

Dear friend, I wanted to create a resource that gives you everything you need to learn SQL for data analysis. Quite often, resources are scattered across different places. You might learn theory in one place, search for datasets in another, and hunt for questions somewhere else. More often than not, the only place you can tackle SQL challenges is online. But online platforms usually focus on syntax and don’t reflect the messiness of real-world data.

I wrote this book to give you the best of both worlds: theory and practice. I don’t want you to be worrying about where to find resources. I want you to focus only on learning SQL.

If you are new to SQL or need a refresher on the fundamentals, Part 1 of the book has you covered. If you are looking for practice, Part 2 is 49 days of hands-on SQL challenges designed to mirror real-world tasks. Each day in the book is designed to feel like a mini project, rather than isolated exercises.

Take Day 15: Standardize Climbers Data, for example: On this day, you’re not just writing a single query; you’re working with a dataset from start to finish. By combining these tasks, you experience a full data preprocessing workflow, just like a real project. You get to practice loading, transforming, cleaning, and validating data, all in one challenge.

This approach makes every day a hands-on project, not just an isolated query. You’re learning how SQL is used in real-world scenarios, not just memorizing syntax. By the end of each day, you’ve solved a problem that feels meaningful and practical: yes, something that mirrors data analysts’ and engineers’ work in real life.

In this book I use SQLite. I chose SQLite because it’s simple, lightweight, and runs on any system without complicated setups or cloud accounts. You don’t need to worry about complex configurations. SQLite allows you to focus entirely on learning SQL concepts, queries, and logic without distractions. You will just have to import it.

I also structured the book for use in Jupyter or Google Colab notebooks. These are playgrounds for data analysts, engineers, and scientists. These environments are interactive and flexible. They let you run queries, visualize results, and experiment in real time. Using notebooks ensures that you can practice SQL while documenting your work and learning at your own pace, all in one place. No need for sign-ups.

✅Why 50 Days?

I chose 50 days intentionally. Learning SQL isn’t a sprint; it’s a habit. You can’t truly master a language by cramming a few queries in one sitting. 50 days creates a commitment. You attach yourself to a goal, a tangible outcome. Every day is a small win, a step forward, and by the end of the journey, you’ve transformed your understanding of SQL.

By spreading the learning over 50 days, you build momentum, consistency, and confidence. Think of it like training for a marathon. You don’t run 26 miles on the first day. You run a little each day, gradually building strength, endurance, and skill.

By the end of the 50 days, you’ll have tackled a wide range of SQL tasks: from simple filtering to window functions, date operations, joins, and performance tuning. You’ll have not just learned SQL but truly internalized it.

The goal isn’t to overwhelm you. It’s to give you a structured, achievable path that fits into your daily routine, so learning SQL becomes natural, steady, and rewarding. Even if you don’t finish within 50 days, the 50-day structure gives you a rhythm, a habit, and a sense of accomplishment. The kind of outcome that sticks long after the book is finished.

In summary, I wrote the book to address these pain points:

🔶Not knowing where to start: The book gives you a clear roadmap that guides you day by day.
🔶Too much theory, not enough practice: Reading about SQL is not the same as doing SQL. This book includes hands-on challenges that mirror real-world scenarios, so you’re not just memorizing commands; you’re learning to think like a data analyst.
🔶Complex setup: Many learners get stuck setting up databases or configuring environments. You will not worry about complex setups; everything runs in SQLite3 inside Jupyter Notebook, so you start immediately.
🔶Disconnected learning: The challenges mirror real-world analytics problems. Every day here is like a mini project, giving you the experience of exploring, cleaning, transforming, and analyzing data.

✅What I ask of You

I wrote this book for you because I want you to succeed, but books alone don’t create mastery; your effort does. I have provided the tools. All I ask is that you show up every day. Even if it’s just 20–30 minutes, take the challenge seriously. Tackle the problems, experiment with your queries, make mistakes, and fix them. That’s how real learning happens.

I also ask that you trust the process. The book is designed to guide you from beginner to confident SQL user, step by step. Some days will feel "easy" and others "hard." Stay the course, and by the end, you’ll see how all the pieces fit together.

Finally, I ask that you bring curiosity and persistence. SQL is a language of logic and structure, but it’s also a language of insight. The more you explore, the more patterns you’ll discover, and the more confident you’ll become in solving real-world problems. Don’t be scared to experiment.

If you commit to this, I promise you’ll finish 50 days with more than just knowledge. You’ll have the skills, confidence, and habit of thinking like a data analyst.

To make starting even easier, as a subscriber to this newsletter, I’m giving you an exclusive 35% launch discount. You can grab your copy today and start the 50-day journey at a reduced price.

Grab SQL Essentials for Data Analysis here:

I can’t wait to hear about your progress, the insights you uncover, and the confidence you gain along the way. If you have any questions, feel free to reach out to me or post them in the comments section.

Let’s start this journey together: one challenge, one query, one day at a time.

Warmly,
Benjamin

PS. Please repost.

Benjamin Bennett Alexander

15,375 views • 6 months ago

Apache Spark has levels to it:

- Level 0
  If you can run spark-shell or pyspark, you can start.

- Level 1
  You understand the Spark execution model:
  • RDDs vs DataFrames vs Datasets
  • Transformations (map, filter, groupBy, join) vs Actions (collect, count, show)
  • Lazy execution & DAG (Directed Acyclic Graph)
  Master these concepts, and you’ll have a solid foundation.

- Level 2
  Optimizing Spark Queries
  • Understand Catalyst Optimizer and how it rewrites queries for efficiency.
  • Master columnar storage and Parquet vs JSON vs CSV.
  • Use broadcast joins to avoid shuffle nightmares.
  • Shuffle operations are expensive. Reduce them with partitioning and good data modeling.
  • Coalesce vs Repartition: know when to use them.
  • Avoid UDFs unless absolutely necessary (they bypass Catalyst optimization).

- Level 3
  Tuning for Performance at Scale
  • Master spark.sql.autoBroadcastJoinThreshold.
  • Understand how Task Parallelism works and set spark.sql.shuffle.partitions properly.
  • Skewed Data? Use adaptive execution!
  • Use EXPLAIN and queryExecution.debug to analyze execution plans.

- Level 4
  Deep Dive into Cluster Resource Management
  • Spark on YARN vs Kubernetes vs Standalone: know the tradeoffs.
  • Understand Executor vs Driver Memory: tune spark.executor.memory and spark.driver.memory.
  • Dynamic allocation (spark.dynamicAllocation.enabled=true) can save costs.
  • When to use RDDs over DataFrames (spoiler: almost never).

What else did I miss for mastering Spark and distributed compute?

Zach Wilson

36,123 views • 1 year ago
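The Level 1 distinction in the Spark post above, transformations versus actions under lazy execution, can be illustrated without a cluster. The sketch below is a plain-Python analogy built on generators. It is not the Spark API: `LazyDataset` is a hypothetical toy class, and its `plan` list only loosely imitates the lineage that Spark records as a DAG.

```python
from typing import Callable, Iterable, List, Optional


class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical; not a Spark class)."""

    def __init__(self, source: Iterable[int], plan: Optional[List[str]] = None):
        self._source = source
        self.plan = plan or []  # recorded steps, loosely analogous to a lineage

    def map(self, fn: Callable[[int], int]) -> "LazyDataset":
        # Transformation: wraps the source in a generator; nothing is computed yet.
        return LazyDataset((fn(x) for x in self._source), self.plan + ["map"])

    def filter(self, pred: Callable[[int], bool]) -> "LazyDataset":
        # Transformation: also lazy.
        return LazyDataset((x for x in self._source if pred(x)), self.plan + ["filter"])

    def collect(self) -> List[int]:
        # Action: consuming the generator chain is what triggers the work.
        return list(self._source)


calls = []  # records each time the mapped function actually runs


def double(x: int) -> int:
    calls.append(x)
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
calls_before_action = len(calls)  # still 0: only a plan has been built
result = ds.collect()             # the action runs the whole chain
calls_after_action = len(calls)

print(ds.plan, calls_before_action, result, calls_after_action)
# ['map', 'filter'] 0 [6, 8] 4
```

In real Spark, the same deferral is what gives the optimizer a chance to rewrite the plan before any data is touched; this toy performs no such optimization.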

Your agents can't keep up with real-time data. Especially when it's scattered across dozens of sources.

Most teams waste weeks building custom connectors for every database, API, and data warehouse. Then they build ETL pipelines to sync everything. By the time your agent retrieves the data, it's already outdated.

Picture this: Your Postgres database updated 5 minutes ago. Your MongoDB collection changed 2 minutes ago. Your agent is still pulling from yesterday's snapshot. This is why most production RAG systems fail.

There's a better approach: MindsDB is an open-source AI platform with a federated data engine that lets you query multiple data sources in real-time using SQL - without moving any data.

Here's what makes it different:
↳ Your data stays in place. No ETL pipelines or data duplication
↳ Query Postgres, MongoDB, REST APIs, and more using consistent SQL
↳ JOIN across different sources in real-time with a unified interface
↳ Works with both structured and unstructured data

And here's the best part: You don't even need to write SQL. Just describe what you want in plain English, and MindsDB converts it to SQL automatically. The system does all the heavy lifting.

The breakthrough for AI agents is simple: When data updates at the source, your agent gets fresh results immediately. No sync delays. No stale embeddings. No custom code for each integration. You can literally write a SQL query that joins a Postgres table with a MongoDB collection and gets live results. This is what production AI applications need but rarely get.

In this video, I give you a complete walkthrough of what we just discussed and how to actually do it. Make sure you watch this till the end. I've shared the link to MindsDB's GitHub repo in the next tweet!

Akshay 🚀

65,672 views • 6 months ago

Google open-sourced MCP Toolbox for Databases. I gave it access to everything else.

For context, Google's MCP Toolbox for Databases is an open-source server that lets AI agents securely query structured databases like PostgreSQL and MySQL through the MCP protocol.

However, most enterprise knowledge doesn't actually live in databases. It's scattered across emails, Slack threads, GitHub repos, Salesforce records, customer reviews, and internal docs. So agents can't see any of it, which means they're working with a fraction of the context they need.

I fixed that using MindsDB. It acts as a universal SQL layer that sits on top of all your data sources: structured, semi-structured, and unstructured. This means you can query Salesforce, Gmail, GitHub, S3 files, Jira, and 200+ more sources using SQL syntax.

The clever part is how it connects to the MCP Toolbox. MindsDB exposes everything through MySQL, so from the Agent's perspective, it's just running SQL and getting context back. It doesn't know or care that the data came from five different sources behind the scenes.

This setup unlocks some powerful capabilities:
→ One SQL interface for dozens of enterprise sources
→ Cross-datasource joins (combine GitHub and CRM data in a single query)
→ Built-in ML capabilities for working with unstructured data
→ Simple MCP tools that now have massively expanded reach

In the video below, the Agent queries GitHub data and a customer review database in one SQL query. So what used to require ETL pipelines and weeks of engineering effort now happens instantly.

At the end of the day, AI agents are only as useful as the data they can access. This gives them a lot more to work with.

I have shared the GitHub repo in the replies, where you can find more details about this.

Akshay 🚀

39,331 views • 3 months ago

A lot of you complaining about shooting in 2k25 dont understand how the shooting actually works and are just calling it bad so let me break it down real quick The shooting itself is pretty EASY, if you cant green your first few shots you're just mistiming and need to learn your timing or get better, for example you can green your first 2 jumpshots very easily. NOW LETS GET TO THE PROBLEM. But after say for example you make 2 shots in a row, MOST OF THE TIME 2k will give you NO GREEN WINDOW at all on the next shot, leaving it completely up to RNG (random) whether your next shot will go in (will most likely miss). So essentially 2k is punishing you for greening multiple shots in a row by forcing you to eventually miss your next shot most of the time. In this clip in a ranked 1v1 match, my opponent greens 2 shots in a row then i confidently leave him WIDE OPEN for the 3rd shot knowing that he will 100% MISS because of the algorithm and he does...then i do the exact same thing again after he makes 2 more shots in a row, i leave him WIDE OPEN for his 3rd shot knowing he will miss again and he does indeed miss... You should not be PUNISHED for timing your jumpshot correctly by being given no green window after you make a sequence of shots, THAT IS THE PROBLEM WITH THE SHOOTING...it is not hard to time your shot in general, the issue is after you make a few shots, you're given NO GREEN WINDOW, which is leaving it up to RNG which should not be a thing. And if there's any brainrot kids who can't understand me saying that the shooting is easy but then also saying that the way they made the algorithm to make you miss is bad, i am specifically talking about the times where it gives you no green window after making a few shots, which is not on every single shot... 2k needs to revert this because it should not be impossible to make specific jumpshots, especially after making a few shots you think you would be rewarded and it would be easier to make the next ones correct? 
Well they made it harder instead.

HankDaTank

538,309 views • 1 year ago

Building Data Pipelines has levels to it:

- level 0
  Understand the basic flow: Extract → Transform → Load (ETL) or ELT
  This is the foundation.
  - Extract: Pull data from sources (APIs, DBs, files)
  - Transform: Clean, filter, join, or enrich the data
  - Load: Store into a warehouse or lake for analysis
  You’re not a data engineer until you’ve scheduled a job to pull CSVs off an SFTP server at 3AM!

- level 1
  Master the tools:
  - Airflow for orchestration
  - dbt for transformations
  - Spark or PySpark for big data
  - Snowflake, BigQuery, Redshift for warehouses
  - Kafka or Kinesis for streaming
  Understand when to batch vs stream. Most companies think they need real-time data. They usually don’t.

- level 2
  Handle complexity with modular design:
  - DAGs should be atomic, idempotent, and parameterized
  - Use task dependencies and sensors wisely
  - Break transformations into layers (staging → clean → marts)
  - Design for failure recovery. If a step fails, how do you re-run it? From scratch or just that part?
  Learn how to backfill without breaking the world.

- level 3
  Data quality and observability:
  - Add tests for nulls, duplicates, and business logic
  - Use tools like Great Expectations, Monte Carlo, or built-in dbt tests
  - Track lineage so you know what downstream will break if upstream changes
  Know the difference between:
  - a late-arriving dimension
  - a broken SCD2
  - and a pipeline silently dropping rows
  At this level, you understand that reliability > cleverness.

- level 4
  Build for scale and maintainability:
  - Version control your pipeline configs
  - Use feature flags to toggle behavior in prod
  - Push vs pull architecture
  - Decouple compute and storage (e.g. Iceberg and Delta Lake)
  - Data mesh, data contracts, streaming joins, and CDC are words you throw around because you know how and when to use them.

What else belongs in the journey to mastering data pipelines?

Zach Wilson

16,688 views • 1 year ago
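Two ideas from the pipeline post above, an idempotent and parameterized load step (level 2) and a duplicate test (level 3), can be sketched together. This is a minimal illustration using Python's built-in `sqlite3`. The `daily_sales` table, the `load_partition` helper, and the sample rows are hypothetical, and delete-then-insert per date partition is only one common way to make a load safe to re-run.

```python
import sqlite3


def load_partition(conn, ds, rows):
    """Idempotently (re)load one date partition: delete it, then insert, in one transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM daily_sales WHERE ds = ?", (ds,))
        conn.executemany(
            "INSERT INTO daily_sales (ds, store, revenue) VALUES (?, ?, ?)",
            [(ds, store, revenue) for store, revenue in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (ds TEXT, store TEXT, revenue REAL)")

rows = [("north", 100.0), ("south", 40.0)]
load_partition(conn, "2024-01-01", rows)
load_partition(conn, "2024-01-01", rows)  # re-run or backfill of the same day
load_partition(conn, "2024-01-02", [("north", 70.0)])

# Re-running did not double the data: 2 rows for the first day, 1 for the second.
row_count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]

# Simple data-quality test: no (ds, store) pair should appear more than once.
duplicates = conn.execute(
    "SELECT ds, store, COUNT(*) FROM daily_sales GROUP BY ds, store HAVING COUNT(*) > 1"
).fetchall()

print(row_count, duplicates)  # 3 []
```

Because the date is a parameter and the step replaces its own partition, re-running a failed day or backfilling a range only means calling the same function again for those dates.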

The subject of 'owning a slave' is dense. It is something we hear a lot when we are in the FemDom Realm. Is it just fantasy? Can it actually be a lifestyle? How do we navigate this type of dynamic? How do we even get to that level of D/s? In this short clip [Excerpt from SLAVE TRAINING Part 2] I want to already bring to your attention one thing that will define if your desire for a slave (or desire as a slave) is touching more on a fantasy or... how can you actually navigate this in a realistic way. No one person 'can do it all' or should be expected to. If you want your slave to be 'the best', assign them a specific role in which they can excel... and then build upon that. Once they 'master' your housekeeping (which takes quite a bit of real training), they can move to other levels. And an important note I want to leave here... make them EARN access to certain things in your life that sometimes you just want to delegate because you don't want to manage or don't know how to manage. Entrusting them with serious tasks that can affect your life, your business, your reputation, are on top of the ladder. Are they even qualified for the thing you want them to take off your shoulders? Start small and allow them to grow in their submission, to develop their skills and to learn how to best satisfy you without setting them up for failure by expecting too much, too quick. In the end, if you want this to truly work, you have to approach it from a place that transcends the roles. As this is consensual power exchange. And you both want to be fulfilled in that relationship.

Ms. Malissia

11,931 views • 3 months ago

Q: It must be complicated, when I listen to you, to have a private life, somebody to understand your passion and to share this moment. Lewis: "It really is, especially I would say more so today than ever before, which is the way the world is, you know. I look at the other drivers and I wonder how they're doing it. You know, some are having kids and some married, some, you know, most of them girlfriends. I did that when I was in my 20s, but I took a decision to really to maximize my time that I have here because it's not as long as you think and it's limited, you know. And I don't want to look back and be like, ah, if I just gave a little bit more here, I didn't sacrifice my time because I was committed elsewhere." "So I really focused in these last, you know, particularly these last 10 years, like get everything I can out of my performance. Then when I retire, then I can do whatever I want. You know, I can dedicate my time to whatever else it is and not have to worry." "But in this competition time, focus on health, well-being, my mental health, my driving technique, being as good an engineer as I can be, and also being the best teammate that I can potentially be for the guys that I get to work with. That's my sole focus. You know, I want to win." "I've been fortunate enough to win with great teams in the past. Particularly, obviously, with Mercedes and with McLaren, which was incredible. And my dream is to win a championship with Ferrari." "And that's something that hasn't been done for a while. But they have absolutely every ingredient that's needed to win. It's just like getting all the pieces of the puzzle in the right place. And that's what I'm trying to work on in the background with Fred and the whole team." [📹 VIGNERON GAETAN]

sim

86,907 views • 10 months ago

Alex Karp, Palantir: “At a certain level of accomplishment, you’re in an artistic space where it’s very hard to explain why you have your insights.” "There’s one country in the world where you get rewarded for that.. in America, if you deliver, you can be you.” “This is a maximal freedom culture… & that self-expression—because it’s not playbook—creates an environment that is exceedingly hard to compete with & will piss off all the right people.” . . . "I think in the end, to do something important—whether it’s me, or @elianoayounes, or look at all these people here—these are among the best and most talented people in the world. At a certain level of accomplishment, you’re in an artistic space where it’s very hard to explain why you have your insights, and it goes way beyond experiences that have of course also influenced them. But I just have artistic impulses, and they shape my life, and I’ve allowed myself—or I’ve been forced to allow myself—the freedom to live that way. And there’s one country in the world where you get rewarded for that, because in America, if you deliver, you can be you. You’re your own boss, right? You decide who you want to talk to, you decide who you don’t want to talk to. You have ideas of things you’d like to advance on. And I think one of the biggest variables in my life is simply that I live in a culture where if you deliver—in this case economically—and by the way, at 18, for most investors, we were failing for at least 15 years. Many would say 18 years. Honestly, some would say until two years ago. And still, this is a culture where the financials are going to show up. That’s only possible in this culture. I guess maybe because I lived abroad so long, it’s easier for me to accept and rely on that. I think sometimes people who’ve lived here their whole life don’t always exactly understand that this is a maximal freedom culture. It’s the only culture like this in the world, and it allows you to self-express. 
And if you self-express, that self-expression—because it’s not playbook—creates an environment that is exceedingly hard to compete with and will piss off all the right people."

Molly O’Shea

52,497 views • 4 months ago

RLM is the most important foundation of my Pi Harness (other than Pi of course). It's seeded with late interaction retrieval results (thanks to @lightonai for pylate). The Agent initiates it with a query, then...

𝐒𝐞𝐭𝐮𝐩

A python REPL is created and seeded with:
1. Late interaction search to pre-filter. Instead of doing top 3/5/10, it's top hundreds of documents. This is set into a `context` variable.
2. Python functions are loaded in to do more searches if the `context` variable isn't enough. And to make llm calls with cheaper models in parallel batches.

𝐈𝐭𝐞𝐫𝐚𝐭𝐢𝐨𝐧 𝐋𝐨𝐨𝐩

From there, an LLM iterates in the REPL based on the query. It's just like exploring in a jupyter notebook. The LLM writes prose (like a markdown cell) and code to be run in the REPL each turn. This allows the LLM to sort, filter, and synthesize information. It can fan out and ask smaller models to summarize, combine, contrast, or do anything else to documents to help it understand the data. After several turns the LLM responds with the final answer. Either because it found the answer, or hit the budget limit.

Context as a Python variable, LLM as the programmer, REPL as the runtime.

𝐖𝐡𝐲 𝐃𝐨𝐞𝐬 𝐓𝐡𝐢𝐬 𝐖𝐨𝐫𝐤

1. Richer Shell. Agents (and subagents) work by intermixing code and prose/thinking. But they use static scripts or bash that run and exit and start over each tool call. That's not ideal for exploration and synthesis of data. For that, state is useful to continue building and exploring the data as you learn more. There's a reason jupyter notebooks have been popular with data scientists.
2. Keeps main agent context clean. The better context you have the better the agent will perform (duh!). This means three things: better human input, fewer missing search results, and fewer incorrect search results. Letting the agent iterate allows it to synthesize just what is needed and nothing else. All bad paths or peeks at something that turns out to be irrelevant stay out of main agent context.
3. Stack the good ideas! People often compare late interaction search vs RLM. Or static vs dynamic languages. Or agentic search vs semantic search. But... you can just use them all together, each for the area it's really great for.

Read the full post which has more detail about how and why.

Isaac Flath

40,212 views • 1 month ago
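The setup and iteration loop described in the RLM post above ("context as a Python variable, LLM as the programmer, REPL as the runtime") can be sketched in a few lines. This is a rough illustration only, not the author's harness. The documents are invented, `scripted_turns` is a fixed list standing in for model-generated code, `run_cell` and `max_turns` are hypothetical names, and no retrieval library or LLM is called.

```python
import contextlib
import io


def run_cell(namespace, code):
    """Execute one code cell in the shared namespace and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)  # state persists in `namespace` between turns
    return buffer.getvalue()


# Seed: stand-in for pre-filtered retrieval results (hypothetical documents).
namespace = {
    "context": [
        "Invoice 17 was paid late in March.",
        "The March outage lasted two hours.",
        "Invoice 18 was paid on time in April.",
    ]
}

# Stand-in for the LLM: a fixed list of cells instead of model-generated code.
scripted_turns = [
    "hits = [d for d in context if 'Invoice' in d]\nprint(len(hits))",
    "late = [d for d in hits if 'late' in d]\nprint(late[0])",
    "final_answer = late[0]",
]

max_turns = 5  # budget limit
transcript = []  # printed output that would be shown back to the model each turn
for turn, code in enumerate(scripted_turns):
    if turn >= max_turns or "final_answer" in namespace:
        break
    transcript.append(run_cell(namespace, code))

final_answer = namespace.get("final_answer")
print(final_answer)  # Invoice 17 was paid late in March.
```

The second cell reuses `hits` from the first, which is the stateful-shell property the post contrasts with scripts that run, exit, and start over on every tool call. Only `final_answer` would need to go back to the main agent; the intermediate output stays in `transcript`.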