In The Loop Episode 1 | DeepSeek’s AI Breakthrough: Hype or Game-Changer? A No-Nonsense Breakdown

If you’ve been anywhere near the AI world this week, you’ve heard the name DeepSeek. You’ve seen the headlines—some calling it a paradigm shift, others saying it’s mostly smoke and mirrors. Investors are rattled. Silicon Valley is watching. Some are saying this is China’s Sputnik moment for artificial intelligence.
But here’s the thing—every time AI has one of these moments, the headlines run ahead of reality. Some of this is hype. Some of this is real. The challenge is knowing which is which.
In the latest episode of the In The Loop podcast, we break it all down:
- What is DeepSeek AI?
- What has DeepSeek done differently?
- What’s so great about DeepSeek? The four big improvements, explained simply.
- The controversy around the model’s training method
- Why is DeepSeek such a big deal?
- Is the market’s reaction justified or just overhyped?
What is China’s DeepSeek AI?
DeepSeek released an open-source AI model that, on paper, competes with the very best and latest models. But here’s what makes this super interesting: They claim to have trained it for just $5.6 million. That’s not a typo. That’s a fraction of the half a billion dollars U.S. labs spend on comparable models. And the cost to actually use it is even lower. Some developers are making 200,000 API calls for just $50.
If that’s true, this isn’t just a breakthrough—it could be a threat.
A threat to OpenAI, to Google, to the economics of AI itself. Because this isn’t just about a new model. It’s about changing the cost structure of AI from the ground up.
But of course, the simple version of this story is rarely the real one.
There are rumors that DeepSeek had access to restricted NVIDIA hardware, cut corners, and was subsidized by the Chinese government, and that the model isn’t as revolutionary as it looks. And, of course, there’s the geopolitical angle—because this isn’t just about technology.
So first off, what did they actually do and how?
What did DeepSeek do differently?
How DeepSeek made AI faster, smaller, and cheaper to run
AI models are incredibly powerful, but they’re also super resource-intensive and expensive to run. Companies like OpenAI, Google, and Meta have built large-scale models capable of generating text, answering questions, and even writing code.
But there’s a cost: these models require massive amounts of computing power and memory, making them difficult to run outside of high-end servers. DeepSeek changed this by redesigning how AI models work, making them smaller, faster, and cheaper while keeping the same level of intelligence.
Before we get into how they did it, let’s first understand how AI models are built and run, so we can see exactly where DeepSeek made improvements.
How AI models are created and run
Step 1: Training the Model (Building the AI’s Knowledge)
AI models are created by analyzing massive amounts of data—books, websites, and conversations—so they can recognize patterns in language. This process requires huge computing power because the AI must process and store billions of words and phrases.
The training phase happens on server farms—large warehouses filled with high-end computers that work together to handle the massive amount of data. Single computers aren’t powerful enough to handle the workload.
Once trained, the model doesn’t need to learn anymore; it just needs to generate responses based on what it has learned. That’s where the next step comes in.
Step 2: Running the Model (Inference – How AI Generates Text)
Once an AI model is trained, it moves into inference mode.
This is when the model responds to users, generating text based on what it has already learned. It’s different from training, where the model is still being built. Inference is the real-time part of AI—it determines how fast, responsive, and expensive AI is to run (a toy sketch follows the list below). Inference happens every time you:
- Ask ChatGPT a question;
- Generate an image using AI;
- Use AI to summarize an article.
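To make the distinction concrete, here's a toy sketch of inference in Python (the tiny vocabulary and random weight matrix are invented stand-ins; real models have billions of learned weights):

```python
import numpy as np

np.random.seed(0)
vocab = ["the", "cat", "sat", "on", "mat"]

# Training already happened: this matrix stands in for the model's
# frozen, already-learned weights.
W = np.random.randn(8, len(vocab)).astype(np.float32)

def infer_next_word(context_vector):
    logits = context_vector @ W                     # one forward pass
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax -> probabilities
    return vocab[int(probs.argmax())]               # pick the likeliest word

# No learning happens here: the weights never change, we only read them.
print(infer_next_word(np.random.randn(8).astype(np.float32)))
```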
It's this latter part, inference, that DeepSeek completely flipped on its head by redesigning models to make them faster and cheaper.
What’s so great about DeepSeek?
First of all, DeepSeek is open-source, so it is accessible to third parties, making development and future innovation easier. DeepSeek made AI faster, smaller, and cheaper to run. They managed to train the model for just $5.6 million, a fraction of the half a billion dollars U.S. labs spend on comparable models. All of this results in lower end-user costs: some developers are making 200,000 API calls for just $50.
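As a quick sanity check on those numbers (a back-of-the-envelope sketch; real pricing varies by model and by how many tokens each call uses):

```python
calls = 200_000
total_cost_usd = 50.0
print(f"${total_cost_usd / calls:.5f} per call")  # $0.00025, i.e. 1/40th of a cent
```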
There are several resourceful improvements they made.
Improvement #1: 8-bit floating-point numbers
AI models were using numbers that were bigger than necessary, wasting memory and slowing everything down. Imagine this:
You’re writing a book, but instead of using regular-sized letters, you decide to write and draw everything in 3D block letters. Each word takes up four times more space than necessary, and soon your notebook is full before you’ve even finished a chapter. You’re running out of space, even though you don’t need those oversized letters—you could write smaller and still read it just fine.
AI models had this exact problem.
How AI uses numbers
AI doesn’t “think” in words like humans do. Instead, it converts every word into numbers and runs complex calculations to predict the next word in a sentence. These numbers are called floating-point numbers, which store decimals for precision.
For a long time, AI models have used 32-bit floating-point numbers, which are highly precise but take up a lot of memory.
Why this was a problem
AI needs to process millions of these numbers at once. The bigger the numbers, the more memory it consumes, which means you need high-end hardware with lots of VRAM.
Think of VRAM as desk space while you’re working. The bigger your desk, the more room you have to spread out papers, notebooks, and a laptop (i.e., the more room the AI has to process calculations). If your desk is too small, you have to keep stacking things on top of each other, shuffling them around, and wasting time just to find what you need.
AI models need a lot of desk space (VRAM) to work smoothly—if they don’t have enough, they slow down or struggle to function.
GPUs (Graphics Processing Units) are like your hands—they do the work of writing, typing, and organizing everything on the desk. A larger desk (more VRAM) means your hands can work faster without constantly moving things around.
DeepSeek’s Solution: use 8-bit floating-point numbers.
DeepSeek reduced the size of the numbers AI uses (switching from 32-bit to 8-bit), which means:
- AI now needs less VRAM (desk space) to process the same amount of information.
- Because there’s more free space, the AI can work faster without constantly moving things around.
It’s like switching from huge textbooks to smaller notebooks—you can fit more on your desk, find things quicker, and work more efficiently without needing a bigger desk.
This means AI can run smoothly on less powerful (cheaper and more accessible) hardware, instead of requiring expensive high-end GPUs.
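Here's a minimal sketch of that memory saving in NumPy. Two caveats: NumPy has no native 8-bit floating-point type, so int8 quantization stands in for the FP8 format described above, and the weight matrix and scaling scheme are invented for illustration:

```python
import numpy as np

# A toy weight matrix, stored the old way: 32-bit floats.
weights_fp32 = np.random.randn(1024, 1024).astype(np.float32)

# Simulate 8-bit storage: scale values into the int8 range and round.
scale = np.abs(weights_fp32).max() / 127.0
weights_int8 = np.round(weights_fp32 / scale).astype(np.int8)

# To use the weights, expand them back to approximate float values.
weights_restored = weights_int8.astype(np.float32) * scale

print(f"32-bit size: {weights_fp32.nbytes / 1e6:.2f} MB")  # ~4.19 MB
print(f" 8-bit size: {weights_int8.nbytes / 1e6:.2f} MB")  # ~1.05 MB (4x smaller)
print(f"max rounding error: {np.abs(weights_fp32 - weights_restored).max():.4f}")
```

The same information fits in a quarter of the memory, at the cost of a small, controlled rounding error.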
This was a huge breakthrough—but even after shrinking the numbers, AI still wasted memory by storing too much information.
Improvement #2: Compress key-value memory by 93%
AI was storing way too much information, so it kept running out of memory. Imagine this:
You’re taking notes during a meeting, but instead of summarizing key points, you write down every single word. Your notebook fills up quickly, and when you need to find an important point, you struggle to dig through all the useless details.
This is what was happening inside AI models. When AI generates text, it remembers past words so it can stay consistent. This memory is called key-value (KV) storage.
Why this was a problem
AI stored every single detail exactly as it appeared, without compression. Compression is just the process of shrinking the data while keeping the important parts. It’s like zipping a file—it takes up less space but can still be used when needed.
But here’s the problem:
- Storing too much data fills up memory (VRAM) quickly.
- If AI ran out of memory, it had to delete old information or slow down.
This is yet another smart move by DeepSeek, which in retrospect feels obvious: compress key-value memory—by 93%!
DeepSeek developed a very clever compression technique that shrinks memory storage while keeping important details. This means that AI doesn’t need to remember everything, just the key information. So AI can now store much longer conversations without running out of memory. As a result, AI can now handle bigger tasks without needing crazy expensive hardware. Sorry NVIDIA…
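To make that concrete, here's a minimal sketch of the low-rank idea behind this kind of compression (DeepSeek's version is called Multi-head Latent Attention; the dimensions and random projection matrices below are invented stand-ins for learned weights):

```python
import numpy as np

seq_len, d_model, d_latent = 1024, 4096, 512   # latent is ~6% of the full width

# Hidden states for each past token in the conversation.
hidden = np.random.randn(seq_len, d_model).astype(np.float32)

# Standard attention caches full keys AND values for every token:
kv_full = np.random.randn(seq_len, 2 * d_model).astype(np.float32)

# Latent compression: store only a small projection of each token,
# and expand it back to keys/values on the fly when attending.
W_down = np.random.randn(d_model, d_latent).astype(np.float32)    # compressor
W_up = np.random.randn(d_latent, 2 * d_model).astype(np.float32)  # expander

kv_latent = hidden @ W_down        # this is all that gets cached
kv_expanded = kv_latent @ W_up     # reconstructed only when needed

print(f"full KV cache:   {kv_full.nbytes / 1e6:.1f} MB")
print(f"latent KV cache: {kv_latent.nbytes / 1e6:.1f} MB")
print(f"saving: {1 - kv_latent.nbytes / kv_full.nbytes:.1%}")   # ~93.8%
```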
However, even with smaller numbers and better memory storage, there was still another issue: AI was too slow.
Improvement #3: Predict multiple words at once
AI was thinking one word at a time, which slowed down response time. Imagine this:
You’re texting a friend, but your phone forces you to type one letter at a time, stopping after each one to think about what comes next. It would take forever to finish a single sentence.
That’s how AI models were generating text. Most AI models predict one token (word or part of a word) at a time. After generating each word, they stop, reprocess everything, and then generate the next word.
Why this was a problem
Well, this step-by-step approach added unnecessary delays, especially for long responses. Even though AI models process information quickly, generating words one at a time still makes responses slow.
DeepSeek’s Solution? Yet another clever decision: predict multiple words at once.
DeepSeek optimized the model to predict multiple words (tokens) in a single step, instead of one at a time. This works because the model can learn to estimate probabilities for several upcoming tokens in one pass, so it doesn’t need a full round trip for each word.
The result: the AI could now generate responses roughly twice as fast without losing accuracy, making conversations feel smoother and more natural.
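Here's a toy sketch of the bookkeeping (real multi-token prediction adds extra prediction heads and verifies the drafted tokens; the random "model" and vocabulary below are invented for illustration):

```python
import numpy as np

np.random.seed(0)
vocab = ["the", "cat", "sat", "on", "a", "mat"]
K = 2   # tokens predicted per forward pass

def forward_pass(context):
    """Stand-in for one expensive model pass: instead of one probability
    distribution over the vocabulary, it returns K of them."""
    return np.random.dirichlet(np.ones(len(vocab)), size=K)

def generate(context, n_new_tokens):
    passes = 0
    while n_new_tokens > 0:
        dists = forward_pass(context)          # ONE pass...
        for dist in dists[:n_new_tokens]:      # ...yields up to K tokens
            context.append(vocab[int(dist.argmax())])
            n_new_tokens -= 1
        passes += 1
    return context, passes

out, passes = generate(["the"], 8)
print(out)
print(f"{passes} forward passes instead of 8")   # 4 passes at K=2
```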
But there was still one last issue: AI models were too big to run on regular GPUs.
Improvement #4: Mixture of Experts (MoE)
The final challenge: get AI models to run on regular GPUs, making them less expensive and more accessible. Imagine this:
You have a laptop, but to run your favorite app, you’d need an entire server farm—huge racks of computers in a warehouse.
This was the problem with AI models. Most AI models were one massive neural network that handled everything. This meant:
- The entire model had to be loaded into memory at once;
- Only high-end GPUs with massive VRAM could handle them;
- Which meant that running AI was super expensive.
DeepSeek’s Solution? Genius! Mixture of Experts (MoE).
Instead of one gigantic model doing all the work, DeepSeek used a technique called ‘mixture of experts’, which means splitting the model into specialized components that activate only when summoned.
Without MoE, if you ask ChatGPT to "Explain gravity," the AI loads the entire model into memory, including parts that cover history, cooking, and sports, even though they aren’t relevant. This wastes a lot of effort (computing power) and space (memory), since the AI is using unnecessary resources just to process one response.
With Mixture of Experts (MoE):
- The AI model is divided into specialized "expert" components—some trained for science questions, others for history, others for coding, etc.
- If you ask "Explain gravity," the AI only activates the expert trained on physics and science, while the other experts stay inactive.
- This makes the response faster, more efficient, and less resource-intensive, because only the relevant part of the AI is running.
This was clever because not all tasks need the full AI model—just the right parts of it. AI now runs on consumer-grade GPUs instead of requiring supercomputers, making it faster, cheaper, and more accessible.
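Here's a minimal sketch of the routing idea (the weights below are random stand-ins for learned ones, and a real MoE layer routes every token inside the network rather than whole questions):

```python
import numpy as np

np.random.seed(0)
d_model, n_experts, top_k = 64, 8, 2   # activate only 2 of 8 experts per token

# Each "expert" is its own small feed-forward block with separate weights.
experts = [np.random.randn(d_model, d_model) * 0.02 for _ in range(n_experts)]
router = np.random.randn(d_model, n_experts) * 0.02   # learned gating weights

def moe_layer(token_vec):
    scores = token_vec @ router                  # how relevant is each expert?
    top = np.argsort(scores)[-top_k:]            # keep only the top-k experts
    gates = np.exp(scores[top]) / np.exp(scores[top]).sum()   # softmax weights
    # Only the chosen experts do any work; the other six stay idle.
    return sum(g * (token_vec @ experts[i]) for g, i in zip(gates, top))

output = moe_layer(np.random.randn(d_model))
print(f"ran {top_k}/{n_experts} experts = ~{top_k / n_experts:.0%} of the compute")
```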
Debate: Did DeepSeek use distillation?
There has been debate about whether DeepSeek also used a technique called distillation—a claim that appears to be false.
How distillation works
In distillation, you’d have one teacher (a master painter, standing in for a large AI model) and one student (a smaller AI model). The student wouldn’t learn by practicing on its own—instead, it would watch every brushstroke of the teacher and try to mimic it exactly. Over time, the student learns to copy the teacher’s exact style, but with fewer tools and a smaller canvas (i.e., a smaller model).
This is NOT what DeepSeek did.
What DeepSeek actually did…
Instead of training one student to copy the master, DeepSeek hired multiple new painters and gave them the same reference materials that the master painter originally used. Each painter learned from scratch using these materials, but without watching the master at work. The result? These painters aren’t perfect clones of the master, but they still produce high-quality paintings—just using a different learning process.
What this means for AI
Instead of distilling R1 into smaller models, DeepSeek trained smaller models separately using the same dataset that was used to refine R1—but without reinforcement learning (RL).
This process is called Supervised Fine-Tuning (SFT)—where AI models learn by predicting the next word based on real examples, rather than copying another AI’s predictions. That’s why DeepSeek’s smaller models aren’t just mini versions of R1—they were trained differently, making them more efficient without needing reinforcement learning.
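The difference is easiest to see in the two training objectives. Below is a minimal sketch with invented numbers: distillation pushes the student to match the teacher's entire probability distribution, while SFT only rewards the probability the student puts on the true next token from the dataset:

```python
import numpy as np

def cross_entropy(probs, target_index):
    """SFT-style loss: how much probability did we put on the right answer?"""
    return float(-np.log(probs[target_index] + 1e-9))

def kl_divergence(student, teacher):
    """Distillation-style loss: how far is the student from the teacher?"""
    return float(np.sum(teacher * np.log((teacher + 1e-9) / (student + 1e-9))))

# Student's predicted probabilities over a 5-word vocabulary (invented).
student_probs = np.array([0.10, 0.20, 0.50, 0.10, 0.10])

# Distillation: match the TEACHER model's full distribution.
teacher_probs = np.array([0.05, 0.10, 0.70, 0.10, 0.05])
print(f"distillation loss (match teacher): "
      f"{kl_divergence(student_probs, teacher_probs):.3f}")

# SFT (what the text says DeepSeek used): match only the ground-truth token.
correct_token = 2
print(f"SFT loss (match real data): "
      f"{cross_entropy(student_probs, correct_token):.3f}")
```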
Why this was misunderstood
Many people assumed that any efficient smaller model must have been distilled from a bigger one, which led to the widespread but incorrect claim that DeepSeek used distillation.
DeepSeek’s method was simpler but highly effective—they trained smaller models from scratch using the same high-quality data, rather than distilling R1 down into a mini version.
Why DeepSeek is a big deal
As expected, DeepSeek headlines are all over the web and almost everyone has heard about it, whether they use the likes of ChatGPT once a month, five times a day, or barely at all.
The short story? DeepSeek managed to create an open-source model that is on par with OpenAI and Google.
It's come right as Trump has taken office for the second time, when geopolitical tensions are high. There are plenty of skeptics who say DeepSeek is lying about having trained on only the cheaper NVIDIA GPUs, or that it is subsidized by the Chinese government. And, of course, plenty of people are hyping this innovation up as the end of OpenAI.
One thing is for sure: this has been a big slap in the face to the big players. DeepSeek has…
- Increased geopolitical tensions;
- Created a new competitive battleground for AI inference;
- Started a big discussion on open-source and commoditized reasoning models that bring down cost;
- And kicked U.S.-based Big Tech companies into hyperdrive.
I'll explain all of this and, finally, whether I think the market's reaction is justified.
Reason #1: It's come at a time of big geopolitical tension
DeepSeek’s impact is not limited to technology. It has important geopolitical ramifications. The model originates from China and it challenges the view that Silicon Valley is the sole leader in AI. Some see it as a signal of China’s rise in technological innovation. This shift shakes long-held assumptions about U.S. dominance.
A key concern is that breakthroughs like DeepSeek’s may shift global power balances. The method of cost-efficient training could allow China to export its technology at lower prices. In a competitive market, cost becomes a decisive factor. It could force a pivot in the race for AI supremacy. Many see it as the start of a new chapter in the U.S.-China rivalry.
For example, if you ask DeepSeek what happened at Tiananmen Square, it won't answer.
Investors and policymakers are taking note too. The shock of DeepSeek’s efficiency has made them rethink investment strategies. Many are arguing that excessive bureaucracy may be holding back progress. In response, calls for a leaner, more agile strategy are growing louder.
Some critics have taken a more conspiratorial view. They suggest that DeepSeek’s low cost might be a strategic ploy to destabilize American AI competitiveness. These critics argue that the low cost is an introductory offer and soon, prices might rise.
However, the technical breakthroughs with inference remain impressive.
Reason #2: A new scaling law: AI is moving from training to inference
For years, the AI race was about training bigger and better models. All of this was based on Moore's Law: the expectation of exponential growth in computing power and, with it, AI improvement.
Companies like OpenAI and Google focused on improving how models learn from massive datasets. But now, the real challenge isn’t training—it’s inference (how AI generates answers in real-time).
Inference used to be cheap and simple—it just depended on how many users were making requests. But a new shift has happened because AI is now ‘thinking through’ problems, which comes at a cost.
New AI models don’t just predict answers immediately. They now use Chain-of-Thought (CoT) reasoning: instead of jumping straight to an answer, the model breaks the problem into steps and works through it more like a human would (a rough illustration follows the list below):
- Instead of making a quick guess, the AI checks its work multiple times before giving a final answer.
- This makes responses more accurate, but also slower and more expensive to generate.
- The more complex the reasoning, the more computing power and cost is needed.
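As a rough illustration, consider the two prompting styles below (the prompts and outputs are invented, not actual model transcripts):

```python
# Direct answering: one quick pass, cheap, but error-prone on multi-step problems.
prompt_direct = "What is 17 * 24? Answer with just the number."
# Typical output: "408"

# Chain-of-Thought: the model spends extra tokens reasoning before answering.
prompt_cot = "What is 17 * 24? Think step by step."
# Typical output:
#   "17 * 24 = 17 * 20 + 17 * 4
#            = 340 + 68
#            = 408"
# The extra reasoning tokens improve accuracy on hard problems, but every
# one of them costs inference-time compute.
```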
So suddenly, providing much better, more accurate responses just got a whole lot cheaper for everyone. Before DeepSeek, access to OpenAI's o1 model was limited to a certain number of questions on the £20-per-month plan, beyond which you had to upgrade to the £200 tier.
Companies like ours, Mindset AI, will benefit greatly: we embed AI agents into SaaS platforms so we can provide an end-to-end agent infrastructure. We can now provide those models to customers instantly. The same goes for other big providers like Amazon and Microsoft.
DeepSeek’s API access is priced at around 3% of OpenAI’s rate for o1. Over the weekend of its release, developers were making hundreds of thousands of API calls at a fraction of a cent per query. This cost reduction is enormous. It shifts the focus from training expenses to inference expenses.
This means further AI model commoditization and more open-source discussions.
Reason #3: Reduced costs through open-source and commoditized models
What's crazy is that DeepSeek is open-source. This is a big part of why it's so impactful. Its design is fully public.
Researchers and developers worldwide can study the method. They can replicate the model and even improve on it. Platforms like Hugging Face are already running tests and getting good results.
Even Sam Altman suggested that OpenAI has been on the wrong side of history on this one, but that they would need to work through many steps first, and that open-sourcing their platform is not a top priority right now.
Peter Yang of Roblox noted that users care more about AI applications than the underlying models. This suggests that companies building compelling AI-driven user experiences will ultimately win, rather than those solely focusing on developing base models.
Anthropic’s Dario Amodei echoed this sentiment, explaining that DeepSeek’s cost efficiency is part of an ongoing trend rather than a singular breakthrough. U.S. labs have been reducing training costs by a factor of four annually, and China’s advancements fit within this larger pattern.
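A back-of-the-envelope version of that trend (illustrative numbers only, using the article's roughly $0.5 billion figure as the starting point):

```python
cost = 500e6                   # ~$0.5B frontier-model training run today
for year in range(1, 4):
    cost /= 4                  # costs falling roughly 4x per year
    print(f"year {year}: ${cost / 1e6:.1f}M")
# year 3: $7.8M, the same ballpark as DeepSeek's claimed $5.6M
```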
Being open source is a gift to the community. Open research allows for faster iteration and innovation, as well as a level playing field. This openness breaks down the traditional barriers seen in the AI industry. I discussed this in an article shortly before DeepSeek hit the mainstream: models are losing their moat. It's all about the application layer.
Reason #4: Big Tech in a frenzy: Is it time to go bigger and more expensive?
On one hand, you have the likes of Microsoft CEO Satya Nadella. He framed it as an opportunity, citing Jevons Paradox—the idea that as something becomes cheaper and more efficient, its demand skyrockets. In other words, AI isn't going anywhere—it’s about to become more embedded in our daily lives than ever before.
On the other hand, you have the likes of Scale AI’s CEO, Alexandr Wang. He suggests that DeepSeek is using undisclosed NVIDIA H100 clusters that would be illegal to export to China. But the company itself claims it trained the model on just 2,000 NVIDIA H800s, a version allowed for export.
A leaked post from Meta’s AI team described a “panic mode,” where engineers are urgently analyzing DeepSeek’s methods to replicate them. With AI costs plummeting, major U.S. tech firms that have spent billions on training may be wondering if their investments were justified.
But here’s the interesting part—U.S. startups aren’t hesitating to use DeepSeek’s models. Companies like Perplexity AI have integrated them, and cloud providers like Microsoft and AWS are offering DeepSeek models on their platforms. Meanwhile, tech leaders like Yann LeCun argue that inference—running AI models at scale—matters more than the initial training cost.
Despite competition from DeepSeek, OpenAI isn’t slowing down either. Reports indicate the company is in talks to raise up to $40 billion, pushing its valuation as high as $300 billion. This funding will support OpenAI’s growing AI infrastructure and Project Stargate, a major initiative to enhance AI capabilities.
At the same time, OpenAI is seeing success with its $200-per-month Pro subscription tier, which has surpassed $300 million in annualized revenue. The company is also expanding its offerings for government agencies, including partnerships with U.S. national laboratories to enhance cybersecurity and nuclear security. Whether this momentum continues after DeepSeek's reasoning model, who knows; I suspect it will. OpenAI is still more trusted on data handling, is U.S.-based, and has huge distribution and attention.
Market reaction, winners, and losers—My take
Finally, let’s discuss whether the stock market's reaction is justified or an overreaction.
First off, the success of DeepSeek is more than a technical achievement. It has implications for the global economy. Big tech companies have poured billions into AI infrastructure and they have built massive training clusters. DeepSeek suggests that a different approach is possible and training may become far cheaper. The focus will shift to delivering cost-effective inference.
This shift may force companies to rethink their investments: companies like Microsoft, Amazon, and Google will adjust their strategies. The cost savings from cheaper training could lead to broader adoption of AI services. Lower inference costs mean that more businesses can integrate AI into their products, and this democratization can drive economic growth.
There are concerns about the current investment paradigm. Critics argue that too much money has been spent on expensive, inefficient training setups. DeepSeek’s innovations show that resource constraints can drive creativity.
Ultimately, you can argue that these innovations came as a result of constraints. For example, the U.S. chip ban made it impossible for DeepSeek to use NVIDIA's best chips, so they figured out a way of working without them.
Investors are watching these developments closely. Market reactions show a mix of fear and excitement: some are worried about the long-term impact on traditional investments, while others see it as an opportunity to pivot to new business models. The message is clear: innovation in cost efficiency is here to stay.
As more companies adopt similar methods, the entire AI ecosystem will benefit. Lower costs drive demand—this, in turn, accelerates development. Cheaper AI means that more services will become available to consumers. The result could be a wave of innovation across multiple sectors. DeepSeek is a harbinger of that change.
Analysis of winners and potential losers
I think there are many winners here:
- Consumers and Developers: Access to powerful AI at dramatically lower costs.
- Open-Source AI Advocates: Proof that open collaboration can match (or even surpass) proprietary AI labs.
- Tech Giants like Microsoft, Amazon, and Apple: Lower inference costs mean they can integrate AI into their products more affordably.
- Meta: They have been doubling down on open-source AI and will directly benefit from these innovations. Also, every part of their business model gets better with generative AI in a clear and simple way.
Others that I believe are still essential are energy providers. Their stocks also dropped a lot, but I don't buy the idea that energy is no longer required. I believe in Jevons Paradox—the idea that as something becomes cheaper and more efficient, its demand skyrockets. In other words, energy use will remain essential: we still need huge data centers to provide this technology to the world, which means we need alternative energy supplies.
Why did DeepSeek affect NVIDIA?
Companies like NVIDIA are particularly fascinating. The bull case for NVIDIA centers on its commanding position in both AI hardware and software, which allows it to capture a massive share of capital expenditures—money that companies spend on building and upgrading their AI training and inference systems.
Think of NVIDIA as the “engine maker” for AI: its GPUs, like the H100, are the high-performance engines powering deep learning models. Leading tech giants such as Microsoft, Apple, Amazon, Meta, and Google depend on these GPUs to stay ahead in the competitive race, much like top Formula 1 teams rely on the best engines to win races. This reliance creates a near-monopoly in a crucial part of AI infrastructure, driving enormous demand for NVIDIA’s data center products that often yield impressive gross margins of over 90%.
Moreover, NVIDIA’s proprietary CUDA ecosystem is like a specialized toolkit for writing programs that run on its GPUs. Developers write code in languages such as C, C++, or Python with CUDA extensions to tell the GPU how to perform the tasks that make LLMs work. Just like building a car around a specific engine type, many developers and companies have built their workflows around CUDA. NVIDIA has also added other features that make being part of the CUDA ecosystem more beneficial: for example, letting GPUs communicate with each other, which makes LLMs even more powerful. And once you’ve built on CUDA, it isn’t easy to move to another company’s hardware.
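As a small illustration of that stickiness, here's a sketch assuming an NVIDIA GPU and the CuPy library (neither is mentioned above): a few lines of NumPy-style Python that execute as CUDA kernels, and that would need a different stack entirely on non-NVIDIA hardware.

```python
# Requires an NVIDIA GPU with CUDA drivers plus `pip install cupy-cuda12x`.
import cupy as cp

# NumPy-like code, but every operation runs as CUDA kernels on the GPU.
a = cp.random.randn(4096, 4096).astype(cp.float32)
b = cp.random.randn(4096, 4096).astype(cp.float32)
c = a @ b                               # matrix multiply on the GPU
cp.cuda.Stream.null.synchronize()       # wait for the GPU to finish
print(float(c.sum()))

# This code is tied to CUDA: moving it to AMD or custom silicon means
# rewriting against a different toolkit. That switching cost is the moat.
```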
The bull case also argues that advanced methods like chain-of-thought (where AI "thinks through" problems step by step) and increasingly complex models will drive up the need for raw computing power, keeping NVIDIA essential.
On the flip side, the bear case points out that competitors are quickly developing alternatives that could challenge NVIDIA's dominance. For example, Cerebras has introduced new chips that promise speeds up to 57 times faster than NVIDIA's H100, meaning a single Cerebras chip can handle more work with less complexity and cost.
At the same time, major players like Google and Amazon are developing their own custom silicon—which is akin to car manufacturers designing their own engines instead of buying off-the-shelf parts. This move reduces their reliance on NVIDIA hardware, potentially eroding NVIDIA’s market share. On the software front, new developer frameworks enable developers to write code that isn’t tied to a specific type of hardware. Think of these frameworks as universal remote controls that can operate on a variety of devices, which again could start to diminish the unique advantage offered by CUDA.
And finally, startups like DeepSeek are making efficiency breakthroughs that add further pressure on NVIDIA's strong margins. DeepSeek's claims of efficiency improvements of up to 45×, with API calls up to 95% cheaper than competitors like OpenAI and Anthropic, could challenge NVIDIA's historically high gross margins and premium valuation.
In conclusion, NVIDIA stands at a bit of a crossroads. Its dominant position in AI has driven enormous growth. Yet, the next wave of innovations threatens to disrupt its business—evidenced by the recent 17% drop in share price after the release of DeepSeek. Hardware alternatives, new software frameworks, and efficiency breakthroughs could lower computing costs. This may force NVIDIA’s customers to seek other solutions. In time, the company’s high margins and rapid growth could be under pressure.
NVIDIA is a symbol of today’s AI revolution. Its stock reflects the optimism of many investors. But caution is warranted. The market is in the early days of a major transformation. Many forces will shape the future of AI computing. The ultimate winners may not be the current leaders.
Investors should watch these developments closely. The potential rewards are huge, but so are the risks. NVIDIA’s journey is far from over. The next few years will be critical. Companies will continue to fight for every advantage. The best ideas will win.