Introduction to DeepSeek
Hey friends, I hope you’re well. In case you missed it, DeepSeek R1 has been making waves in the AI and tech scene. It’s an open-source AI model that was apparently developed for less than $6 million, a fraction of the billions of dollars spent by OpenAI and Google, for example, to create their AI models.
The good news for all of us is that DeepSeek is free to use, and it has shot up to become the most downloaded app on the App Store, surpassing ChatGPT within days.
It’s now one of the most advanced free and open-source AI models we can use. I’ve been playing around with DeepSeek R1 for a couple of days, and I have to say it’s game-changing, but it’s not without its flaws. So let’s run through what DeepSeek R1 is capable of and see what all the fuss is about.
What is DeepSeek R1?
DeepSeek is the name given to a family of open-source large language models (LLMs) developed by the Chinese artificial intelligence company Hangzhou DeepSeek Artificial Intelligence Co., Ltd.
On 20 November 2024, DeepSeek-R1-Lite-Preview became accessible via DeepSeek’s API and the DeepSeek chat interface. It was trained for logical inference, mathematical reasoning, and real-time problem-solving. DeepSeek claimed that it exceeded the performance of OpenAI o1 on benchmarks such as the American Invitational Mathematics Examination (AIME) and MATH.
One of the main reasons DeepSeek R1 is so hyped is that it doesn’t rely on expensive human-labeled datasets or supervised fine-tuning, which is how most AI models are trained and costs millions, if not billions, of dollars.
Instead, DeepSeek R1 uses a reinforcement learning approach that needs far less human supervision and labeling effort.
You can think of supervised fine-tuning like teaching a child to cook by writing up a long, precise recipe and then showing them each step, while reinforcement learning is letting the child experiment in the kitchen and gently guiding them when dishes don’t turn out well, so they learn through trial and error. That’s exactly how DeepSeek was trained, and the benchmark results are incredible:

On the AIME 2024 mathematics benchmark, DeepSeek R1 achieves 71% accuracy, while GPT o1-mini achieves 63.6%, and on the MATH-500 benchmark it beats both o1-mini and o1-0912. It performs worse on coding tasks in the Codeforces and LiveCode benchmarks, but of course there’s much more to a model than benchmarks.
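Before moving on, here’s a toy Python sketch to make the supervised-versus-reinforcement distinction above concrete. It’s purely illustrative and nothing like DeepSeek’s real training code: a “model” that is just a preference table over digits learns the answer to 2 + 3 from an automatic reward signal alone, with no labeled dataset.

import random

# Toy illustration only (not DeepSeek's training code): the "model"
# samples a digit from a preference table and learns purely from a
# rule-based reward -- trial and error, no human-labeled answers.
prefs = {str(n): 1.0 for n in range(10)}  # start with no preference

def sample_answer():
    total = sum(prefs.values())
    r = random.uniform(0, total)
    for ans, weight in prefs.items():
        r -= weight
        if r <= 0:
            return ans
    return ans  # floating-point edge-case fallback

def reward(ans):
    return 1.0 if ans == "5" else 0.0  # automatic check, no labels needed

random.seed(0)
for _ in range(2000):
    ans = sample_answer()
    # Crude REINFORCE-style update: boost rewarded answers, decay the rest.
    prefs[ans] *= 1.1 if reward(ans) > 0 else 0.99

print(max(prefs, key=prefs.get))  # -> "5", learned without a recipe

The “recipe” version of this would simply train on the labeled pair (2 + 3, 5); here the model only ever sees a pass/fail signal, which is the trial-and-error idea from the cooking analogy.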
Now I’ll show you what I found while playing with DeepSeek over the past couple of days, so let’s jump onto DeepSeek.com.
DeepSeek is a Chinese company focused on AGI research, with a range of AI models designed for various applications. Below is a summary of their key models as of the latest public updates (July 2024).
Different types of DeepSeek models
1. DeepSeek-R1
- Type: Reasoning-focused large language model
- Key Features:
- Trained with large-scale reinforcement learning for long, goal-driven reasoning tasks.
- Capable of self-improvement through iterative learning.
- Shows its chain-of-thought planning transparently in its responses.
- Use Cases: Complex problem-solving, automation of multi-step workflows.
- Availability: Open weights (MIT license), also accessible via API.
2. DeepSeek LLM (Large Language Models)
a. DeepSeek-7B
DeepSeek-7B is an open-source large language model (LLM) developed by the Chinese AI company DeepSeek. Released in 2023, it is part of the company’s efforts to advance AI accessibility and research. Here are key details about the model:
- Parameters: 7 billion
- Key Features:
- General-purpose language model optimized for reasoning and coding.
- Trained on 2 trillion tokens.
- Supports 4K context length (extendable via techniques like RoPE).
- Use Cases: Code generation, mathematical reasoning, creative writing.
- Availability: Open-source (check license terms).
b. DeepSeek-67B
- Parameters: 67 billion
- Key Features:
- Enhanced performance for complex tasks (reasoning, STEM, coding).
- Trained on 8.1 trillion tokens.
- Supports 16K context length.
- Use Cases: Advanced code generation, technical research, enterprise solutions.
- Availability: Open weights (free for research/commercial use).
c. DeepSeek-MoE-16B
- Architecture: Mixture-of-Experts (MoE)
- Parameters: 16B total (2B active per inference)
- Key Features:
- Cost-efficient inference with near-7B model performance.
- Trained on 2 trillion tokens.
- Ideal for resource-constrained environments.
- Use Cases: Efficient deployment in chatbots, real-time applications.
- Availability: Open-source (see the routing sketch after the technical highlights below).
3. DeepSeek-R1-Lite-Preview
- Type: Lightweight reasoning model (public preview)
- Key Features:
- Simplified version of DeepSeek-R1 for public access.
- Focuses on basic task automation and user interaction.
- Use Cases: Personal assistants, simple workflow automation.
- Availability: Limited public preview via API.
4. Domain-Specific Models
- DeepSeek-Math: Specialized in solving mathematical problems (supports Olympiad-level tasks).
- DeepSeek-Coder: Code-focused model with repository-level understanding (supports Python, Java, etc.).
- DeepSeek-Finance: Tailored for financial analysis, market prediction, and risk assessment.
Key Technical Highlights
- Training Data: Diverse multilingual corpus (Chinese/English focus) with heavy emphasis on technical content (code, math, science).
- Efficiency: Models use techniques like FlashAttention and grouped-query attention (GQA) for faster inference.
- Openness: Many models are released under permissive licenses (Apache 2.0 or similar) for community use.
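Since the Mixture-of-Experts entry above is the least familiar architecture in this list, here’s a minimal NumPy sketch of the general top-k routing idea it relies on. This illustrates MoE routing in general, not DeepSeek’s implementation; the dimensions and weights are made up for the example.

import numpy as np

# Toy Mixture-of-Experts layer: a router picks the top-k experts per
# token, so only a fraction of the parameters runs on each inference.
rng = np.random.default_rng(0)
d_model, n_experts, top_k = 8, 16, 2

experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_layer(x):
    logits = x @ router_w                    # router scores per expert
    gates = np.exp(logits - logits.max())
    gates /= gates.sum()                     # softmax gate over experts
    top = np.argsort(gates)[-top_k:]         # keep only the top-k experts
    weights = gates[top] / gates[top].sum()  # renormalize their gates
    # Weighted sum of the chosen experts: here only 2 of 16 experts run.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

token = rng.standard_normal(d_model)
print(moe_layer(token).shape)  # (8,) -- same output shape, far less compute

That routing trick is why a 16B-parameter MoE model with ~2B active parameters can approach the quality of a dense 7B model at a fraction of the inference cost.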
Setup & Getting Started with DeepSeek

Here’s where you can create an account, or you can go ahead and download the app on your phone. Currently their servers are super slow because of the crazy demand, so I recommend avoiding signing up with an email.
You’ll probably be waiting forever for an email verification code, so I suggest logging in directly through a Google account. Once you’re in, toggle on the DeepSeek R1 model. It’s an advanced reasoning model similar to GPT’s o1 model but without o1’s 50-message-per-week restriction, and R1 can also work alongside the internet search toggle simultaneously, something I believe o1 still can’t do.
DeepSeek R1’s Chain of Thought Prompting

Okay, so the R1 model uses the Chain of Thought prompting approach, which encourages the AI model to break its reasoning down into simple-to-understand steps. This isn’t new, but DeepSeek R1 does it really well, so let’s use a simple math problem as an example.

The first part here is the problem to solve, and the second is the prompt I’ve added to show its chain of thought, specifically: “Let’s solve this step by step, and for each step, explain your thinking and show your calculations.” So, hitting enter:

You can see DeepSeek thinking and reasoning with itself, and this is what makes R1 different: it transparently reasons through each step and figures the problem out within the same response in real time, whereas GPT models keep their reasoning hidden and can feel clinical by comparison.
I found DeepSeek R1 to be direct but also great at showing you its reasoning, and you can even extract that reasoning and send it to other AI models, something that’s unique to DeepSeek R1.
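If you’d rather drive this from code, here’s a minimal sketch of the same chain-of-thought prompt sent through DeepSeek’s OpenAI-compatible API. The base_url, the deepseek-reasoner model name, and the reasoning_content field follow DeepSeek’s API docs at the time of writing; treat them as assumptions and double-check the current documentation.

from openai import OpenAI

# Send a chain-of-thought prompt to DeepSeek R1 and read back both the
# visible reasoning and the final answer.
client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-reasoner",  # R1 reasoning model, per DeepSeek's docs
    messages=[{
        "role": "user",
        "content": "A train travels 120 km in 1.5 hours. What is its average "
                   "speed? Let's solve this step by step, and for each step, "
                   "explain your thinking and show your calculations.",
    }],
)

msg = response.choices[0].message
print(msg.reasoning_content)  # the model's chain of thought
print(msg.content)            # the final answer

That reasoning_content string is exactly the kind of extracted reasoning you can hand off to other AI models, as mentioned above.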
DeepSeek R1 Solving Hallucinations
The other cool thing is how DeepSeek R1 handles hallucinations. Hallucination is the term for when an AI gives you an incorrect answer, and it’s a big challenge with current AI models. But I’ve noticed that R1 is particularly good at understanding why it hallucinates, almost as if it’s truly self-aware, and it then corrects itself.

I started recording this specific clip when I noticed that it gave me an incorrect answer to the vague question “What happened to Hershey’s in 1998?” It said Hershey’s launched Almond Kisses in 1998, when in reality they were launched in 1990.
So I pointed out the mistake and asked why it made it. Because of its Chain of Thought approach, it’s fascinating to watch it run a search on the mistake, confirm why it made it, and then correct itself. Compared to other AI models, DeepSeek R1 thinks far more naturally, almost human-like, and elaborates on its mistake clearly, so I highly recommend challenging R1 when it hallucinates and giving this a go yourself. It does seem slower than GPT-4o, though.
DeepSeek R1 Creating Tetris Game
This is especially true when it comes to coding tasks. I’ve been playing around with creating games on DeepSeek: if we ask it to create a Tetris game, take the Python code, and run it in HTML so we can preview the game right from the chat, it takes longer than it would with GPT-4o.
For coding tasks, o1 and particularly Claude 3.5 still do a better job overall and go further toward removing the need to debug as a coder, but if you’re looking for a free or open-source option, R1 is definitely the way to go right now and worth checking out.
Based on my short time with R1, I suspect DeepSeek was trained at least partly on GPT-4o-generated data.
The responses from both models are extremely similar. And if you’re concerned about privacy but still want to leverage DeepSeek R1, you can run it yourself through Ollama.
Ollama DeepSeek
To use DeepSeek models with Ollama, follow these steps since they aren’t natively supported in Ollama’s default library:
1. Obtain the Model Weights
- Download the DeepSeek model (e.g., DeepSeek-R1) from its official source:
- Hugging Face Model Hub: DeepSeek-R1
- Ensure you have the model files (e.g., .bin, .safetensors, or GGUF format).
2. Convert to GGUF Format (If Needed)
Ollama uses the GGUF format. If your model isn’t already in GGUF, use llama.cpp to convert it (the conversion script’s name varies by llama.cpp version; recent versions ship convert_hf_to_gguf.py):
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
python3 convert_hf_to_gguf.py /path/to/deepseek-model --outfile deepseek-model.gguf
3. Create a Modelfile
Create a Modelfile to define the model configuration:
FROM /path/to/deepseek-model.gguf
# Set parameters (adjust based on your hardware)
PARAMETER num_ctx 4096
PARAMETER temperature 0.7
# Set a plain system prompt if needed (custom chat formats belong in a TEMPLATE block)
SYSTEM """
You are DeepSeek-R1, a helpful AI assistant.
"""
4. Build and Run the Model
ollama create deepseek -f Modelfile
ollama run deepseek
Notes
- Hardware Requirements: Ensure your machine has sufficient RAM/VRAM (e.g., 16GB+ RAM for 7B models).
- Prompt Templates: If DeepSeek uses a custom chat format (e.g., special tokens), add a matching TEMPLATE block to the Modelfile.
- Community Resources: Check if pre-converted GGUF files exist on platforms like TheBloke’s Hugging Face.
Example Chat Usage
Once running, interact like this:
>>> Hi, how can you help me?
I'm DeepSeek-R1, here to assist with questions, coding, or general knowledge!
To Run Ollama DeepSeek Locally
You can actually run it locally because it’s open source. You can download the Ollama app and run the R1 model on a local server so all your questions and interactions remain completely private rather than going to the cloud, but it is a very large model, so you’ll need a beast of a setup to run the full R1 model locally.
You’ll need roughly 1,300 GB of RAM to run it fully, but there are distilled versions of R1 that run on a single GPU. The 1.5B version, in particular, works fine on my Mac Studio M2 Ultra, for example.
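Once a distilled model is running, you can also query Ollama’s built-in HTTP API from code instead of the terminal. Here’s a minimal Python sketch, assuming Ollama’s default port and the deepseek-r1:1.5b tag from its model library (run ollama list to see what you actually have installed):

import json
import urllib.request

# Query a locally running Ollama server (default port 11434) for a
# distilled R1 model; everything stays on your machine.
payload = {
    "model": "deepseek-r1:1.5b",  # assumed tag; adjust to your install
    "prompt": "Explain chain-of-thought prompting in one sentence.",
    "stream": False,
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])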
Final Thoughts on DeepSeek R1
So that’s my first look into DeepSeek R1. Clearly, some really incredible things are happening in the AI space. When I began using DeepSeek, I was skeptical but very quickly realized it really is something special considering its low cost to build and that it’s free for users to use.
It’s a very exciting time in the AI space, and I’m keen to see how others like OpenAI respond to DeepSeek. If you made it to the end of this post, give it a like. Make sure you subscribe for the latest in tech and AI content, and as always, thanks for reading, and I’ll see you in the next post.
People Also Ask
1. Who are the DeepSeek founders?
Liang Wenfeng is the founder of DeepSeek AI. The company is owned and funded by the Chinese hedge fund High-Flyer; Liang Wenfeng established the company in 2023 and serves as its CEO.
2. How to get a DeepSeek API key?
There are a couple of ways to get a DeepSeek API key. You can go directly to platform.deepseek.com, which is not working right now because of the demand. Alternatively, you can go to OpenRouter, find DeepSeek R1, click on API, and then click Create API Key; this lets you use DeepSeek’s API through OpenRouter.
With OpenRouter you now have access to DeepSeek’s API and can build anything you want with it. If you want to build an application, for example, you can simply download Cursor AI and ask it to build a website for you. A quick code sketch of the OpenRouter route follows below.
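Here’s how that OpenRouter key could be used with the OpenAI Python SDK. The base_url and the deepseek/deepseek-r1 model slug follow OpenRouter’s conventions at the time of writing; confirm the current slug on openrouter.ai before relying on it.

from openai import OpenAI

# Call DeepSeek R1 through OpenRouter's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="YOUR_OPENROUTER_API_KEY",
    base_url="https://openrouter.ai/api/v1",
)
response = client.chat.completions.create(
    model="deepseek/deepseek-r1",  # slug assumed from OpenRouter's catalog
    messages=[{"role": "user", "content": "Write a haiku about open-source AI."}],
)
print(response.choices[0].message.content)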
3. DeepSeek vs OpenAI
DeepSeek’s Vertical Domain Specialization
- Offers tailored solutions for finance, legal, and healthcare industries.
- Focuses on fine-tuning algorithms for niche tasks.
- Prioritizes cost efficiency and lightweight deployment.
- Emphasizes data privacy with localized deployment options.
4. What industries does DeepSeek R1 serve?
DeepSeek targets industries requiring high-stakes data analysis and regulatory adherence, including:
- Finance: Risk assessment, fraud detection, and automated reporting.
- Healthcare: Medical record analysis, drug discovery support, and patient interaction automation.
- Legal: Contract review, legal research, and compliance monitoring.
- E-commerce: Personalized recommendations, sentiment analysis, and chatbots.
Its flexibility allows customization for sectors needing rapid, accurate NLP-driven insights.
5. How does DeepSeek handle data privacy and security?
DeepSeek employs strict data anonymization, encryption, and access controls to protect user data. For enterprise clients, it offers on-premises or private cloud deployment, ensuring data never leaves the client’s infrastructure. Compliance with regulations like GDPR and HIPAA is prioritized, and the platform avoids storing sensitive information post-processing. Users retain full ownership of their data, with transparency in how it’s used for model training.
6. What are the technical requirements to integrate DeepSeek?
Integration typically requires API access via RESTful endpoints or SDKs for common programming languages (Python, Java). DeepSeek supports cloud-based and on-premises deployment, with modular architecture for scalability. While basic use cases demand minimal infrastructure, advanced applications may require GPU resources for real-time processing. Documentation and developer tools are provided to streamline integration with existing systems like CRM, ERP, or databases.
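For a rough feel of what such a REST integration might look like in practice, here’s a short Python sketch. Everything in it is hypothetical: the endpoint URL, payload shape, and response format are illustrative stand-ins rather than documented DeepSeek routes, so consult the official API reference for real integrations.

import json
import urllib.request

# Hypothetical REST integration sketch: the endpoint and payload below
# are illustrative stand-ins, not real DeepSeek API routes.
def analyze_document(text: str, api_key: str) -> dict:
    payload = {"input": text, "task": "summarize"}
    req = urllib.request.Request(
        "https://api.example.com/v1/analyze",  # hypothetical endpoint
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())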
7. What use cases are most suitable for DeepSeek?
Key use cases include:
- Document Analysis: Automating extraction of insights from reports, emails, or contracts.
- Customer Service: Deploying AI chatbots for 24/7 support and query resolution.
- Predictive Analytics: Forecasting trends using historical data.
- Compliance Monitoring: Flagging regulatory violations in real time.
- R&D Acceleration: Summarizing research papers or generating hypotheses.
8. How does DeepSeek ensure accuracy and reduce bias?
DeepSeek combines rigorous pre-training on diverse datasets with continuous feedback loops. Human-in-the-loop (HITL) validation ensures outputs meet quality standards, especially in critical domains. Bias mitigation techniques include adversarial training, fairness-aware algorithms, and auditing tools. Users can also flag inaccuracies, which feed into model updates.
9. What is DeepSeek’s pricing model?
DeepSeek uses a tiered subscription model based on usage volume (e.g., API calls, processing time). Entry-level plans cater to startups, while enterprise tiers offer custom pricing with SLAs, dedicated support, and enhanced security. A pay-as-you-go option is available for small-scale projects, and free trials allow testing before commitment.
10. Does DeepSeek support non-English languages?
Yes, DeepSeek supports multiple languages, including Chinese, Spanish, French, and German. Performance varies by language complexity and available training data, but its architecture is designed for multilingual NLP tasks like translation, sentiment analysis, and content generation.