AI x Customer Research - January '25
Generate your own data for unlimited experiments, a study on sentiment analysis, DeepSeek data privacy issues, and more...

Read time: 16 minutes
Happy New Year (already a month in)!
A bunch of new people joined the newsletter since the year started—if that’s you, welcome! If you’re an “older” subscriber, thanks a lot for sticking with me into 2025. 🤗
Data privacy is a big AI topic, and one of the questions I continue hearing is this:
“How do I test AI for customer research when I can’t share my company’s PII with AI right now?”
The August edition had a guide to multiple low-risk data sources. But if you want to run tests with realistic “lookalike” data similar to your company’s, synthetic data might become your new best friend.
I created guides to generating AI user interviews and support tickets that don’t pose any data risks.
Secondly, I’ve been looking for studies that tell us what affects AI’s performance in sentiment analysis. I found one that gives us some helpful tips for getting better sentiment results.
Finally, the name DeepSeek has become unavoidable. There’s a bit on what you should know before testing DeepSeek, and also what MIT scientists have found about a better approach to training AI.
Here comes the January issue -
In this edition:
📝 Generate Synthetic User Interviews & Support Tickets with AI. A quick intro video plus step-by-step guides to creating your own data for testing.
🧑‍🔬 How Well Can AI Actually Handle Sentiment Analysis? An older study still holds valuable insights into improving how AI identifies sentiment.
📰 AI News:
DeepSeek: What you need to know about its data policies before trying it yourself.
A Better Way to Train Models? What scientists have found about training in chaos vs. predictable environments.
WORKFLOW UPGRADES
📝 Generate Synthetic User Interviews & Support Tickets with AI
Creating realistic user interviews and support tickets can be tedious, but AI makes it much easier when prompted correctly. I tested ChatGPT, Claude, and Gemini on generating synthetic interview transcripts; of the three, Claude consistently produced the most complete and natural responses.
I wanted to give you a couple of guides for creating your own synthetic data, so you can run more AI experiments, faster, without putting real data at risk.
A good place to start is the walkthrough video below on generating realistic interview transcripts.
Creating Realistic Interview Transcripts for AI Analysis Course 👩‍💻 - Watch Video
One big issue with all three models tested: they often stop generating mid-way, requiring you to repeatedly click “continue.”
I have a few tips in the Interview guide for getting them to generate complete responses in one go, but there seems to be no 100%-certain way to get a full set of transcripts (without paying hefty fees for specific synthetic data platforms - something that’s on my long list of future experiments).
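If you’re comfortable scripting instead of using the chat UI, you can automate those “continue” clicks yourself. Here’s a minimal sketch using the OpenAI Python SDK; the model name and interview topic are placeholder assumptions, not recommendations:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment

messages = [{
    "role": "user",
    "content": (
        "Generate a realistic 30-minute user interview transcript about a "
        "meal-planning app. Two speakers, labeled Interviewer and Participant."
    ),
}]

transcript = ""
while True:
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    choice = response.choices[0]
    transcript += choice.message.content
    # finish_reason == "length" means the model hit its token limit mid-answer
    if choice.finish_reason != "length":
        break
    # Feed the partial answer back and ask for the rest
    messages.append({"role": "assistant", "content": choice.message.content})
    messages.append({"role": "user", "content": "Continue exactly where you left off."})

print(transcript)
```

The same loop works with Anthropic’s and Google’s SDKs; only the stop-reason field names differ.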
Access the guides on generating two types of synthetic data below -
✔️ Generate full interview transcripts
✔️ Create realistic support tickets in table format
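For the support tickets, the main trick is pinning down the columns and the output format in the prompt, so every row comes back consistent. A sketch of that kind of prompt, with made-up fields you’d swap for your own ticketing system’s:

```python
# Illustrative prompt for table-formatted synthetic tickets; the columns and
# product details are invented stand-ins for your own ticketing fields.
ticket_prompt = """
Generate 20 realistic customer support tickets for a subscription meal-planning app.
Return them as a markdown table with exactly these columns:
ticket_id | date | channel | customer_tone | issue_category | subject | description | priority

Rules:
- Vary the tone (frustrated, confused, polite) and the length of the description.
- Include a few edge cases: billing disputes, bug reports, feature requests.
- Do not include any real names, emails, or account numbers.
"""
```

A markdown table pastes cleanly into a spreadsheet; CSV works too, but commas inside the description field tend to break the columns.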
AI STUDIES
💔 How Well Can AI Actually Handle Sentiment Analysis?
The TL;DR -
Some AI can classify sentiment with high accuracy, but the way you prompt it—and the type of content you’re analyzing—makes a huge difference.
A study from a year ago tested GPT-3.5, GPT-4, and Llama 2 against specialized sentiment analysis models across 3,900+ text samples from 20 datasets.
Newer models exist today, of course, but the study still teaches some evergreen lessons about how AI processes sentiment, and how you can get better results.
Key Takeaways from the Study
AI performs well on structured, detailed text
✅ Sentiment classification is most reliable in structured content like product reviews (Amazon, Yelp).
✅ Longer, more detailed text improves accuracy—single-sentence inputs often lead to errors.
—
Social media sentiment is still a challenge
❌ AI struggles with sarcasm, slang, and ambiguous phrases, leading to lower accuracy.
❌ Short-form, casual writing (like tweets) produces less reliable sentiment predictions.
—
Providing examples dramatically improves AI accuracy
Few-shot prompting significantly improves performance—especially for older or smaller models.
For three-class classification (positive/neutral/negative): GPT-3.5 improved by 4.7 percentage points when using few-shot prompting.
What this means for you:
If you’re limited to using an outdated model, few-shot prompting can give you an accuracy boost.
Instead of just asking, "Is this positive or negative?", show AI a few labeled examples first to get better results.
〰️
🧠 Quick Explainer: Zero-shot vs. Few-shot Prompting
Zero-shot (giving no examples) → Imagine I hand you a random piece of food and say, “Is this salty, sweet, or sour?” You have to guess.
Few-shot (giving examples first) → Instead, I first show you:
🍟 Fries: “This is salty.”
🍬 Candy: “This is sweet.”
🍋 Lemon: “This is sour.”
Then, I give you a new food and ask, “What type of taste is this?”
AI works the same way: when we show it examples of what we consider salty, sweet, or sour foods, it does a better job guessing which taste applies to the next food we present.
As the study found, few-shot prompting can make AI noticeably more accurate, particularly with models that aren’t the latest, state-of-the-art ones.
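Translated back to sentiment, a few-shot prompt just stacks a handful of labeled reviews in front of the new one. A minimal sketch, with examples invented for illustration:

```python
# Few-shot sentiment prompt: labeled examples first, the new text last.
# All reviews below are invented; in practice, use real labeled samples
# from your own data.
few_shot_prompt = """Classify the sentiment of the final review as positive, neutral, or negative.

Review: "Arrived two days early and works perfectly." -> positive
Review: "It does the job. Nothing special." -> neutral
Review: "Broke after a week and support never replied." -> negative

Review: "The app keeps logging me out, but the new dashboard looks great." ->"""
```

That last review is deliberately mixed; short, ambiguous inputs like it are exactly where the study saw zero-shot accuracy drop.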
What This Means for You
✅ If your content is short, vague, or sarcastic, AI may misclassify sentiment.
This is especially true for social media, customer service chats, and ambiguous statements.
✅ If you're using an outdated model, few-shot prompting may improve accuracy.
Provide clear, labeled examples before asking AI to classify new text.
✅ For best results, choose the right tool for the job.
If you only need sentiment analysis, specialized tools will likely outperform general LLMs.
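As a concrete example of the specialized route, a small fine-tuned sentiment classifier runs locally in a few lines. Here’s a sketch using Hugging Face’s transformers pipeline; keep in mind the default model is just one binary (positive/negative) English option among many:

```python
from transformers import pipeline

# Loads a small sentiment classifier; by default this is an English
# DistilBERT model fine-tuned on movie-review sentences (SST-2).
classifier = pipeline("sentiment-analysis")

results = classifier([
    "The checkout flow kept crashing on my phone.",
    "Support resolved my issue in under an hour. Impressed!",
])
for r in results:
    print(r["label"], round(r["score"], 3))  # e.g. NEGATIVE 0.999
```

If you need a neutral class like the study’s three-way setup, you’d pass a three-class model name to pipeline() instead of relying on the default.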
AI NEWS
📰 DeepSeek’s R1 model - is testing it worth the risk?
DeepSeek’s been making waves in the news, thanks to a technical report showing its R1 model outperforming models from heavyweights like OpenAI, and it’s open source.
But it's crucial to consider the issues it poses to data privacy and security. I haven’t tested it yet myself, because I want to ensure I'm not risking anyone’s data—including my own—in the process.
What You Need to Know About DeepSeek's Data Policies:
Data Storage: User data, including personal information and chat histories, is stored on servers located in China. This is significant because Chinese laws may require companies to provide data access to the government upon request.
Data Collection: The platform collects extensive user data, such as device information, IP addresses, and interaction logs. Notably, there was a recent security lapse that exposed over a million lines of sensitive internal data.
Content Moderation: Reports indicate that the AI avoids discussions on topics sensitive to the Chinese government, such as the Tiananmen Square incident and Taiwan's political status. This built-in moderation could affect the objectivity of the information provided.
Implications for Customer Research:
If you're considering testing DeepSeek for any customer research tasks, it's important to be aware of potential challenges:
Confidentiality: Sensitive customer data could be subject to foreign government access, leading to potential breaches of confidentiality agreements and legal implications.
Compliance: Using a platform that stores data in China may conflict with data protection regulations like the General Data Protection Regulation (GDPR) in the European Union, which mandates strict controls over data transfer to non-EU countries.
Bias: The AI's avoidance of certain topics will likely result in incomplete or skewed data analysis, leading to inaccurate conclusions in customer research.
What If You Still Really Want to Test It? Steps to Take:
Access DeepSeek via Web Interface:
Use DeepSeek's official web platform for your interactions.
DeepSeek’s privacy policy indicates that the mobile app gets far more access to your device and its contents than the web version does.
Use Options for Anonymity:
VPN Usage: Employ a reputable Virtual Private Network (VPN) to encrypt your internet connection, masking your IP address and location.
Incognito Mode: Engage with DeepSeek using your browser's incognito or private browsing mode to prevent the storage of cookies and browsing history.
Fake Your Data:
Synthetic Data: Conduct your tests using synthetic data—you can use the walkthroughs in the first section of this month’s newsletter ☝️
Avoid Personal Information: Refrain from inputting any real personal or sensitive information during your interactions with DeepSeek.
Anonymous Accounts: If account creation is necessary, consider using a dedicated email address that isn't linked to your personal or professional identity or data.
Post-Session Protection:
Clear Data: After completing your testing sessions, clear your browser's cookies and cache to remove any residual data that could be used to track your activity.
〰️
📰 A New AI Training Approach That Could Change How We Analyze Customer Data
MIT researchers recently found that AI performs better in unpredictable real-world situations when it is first trained in controlled, high-certainty environments.
That’s not what most people assume.
Usually, we throw AI into messy, high-variability data and hope it gets smarter. But this research suggests AI learns foundational skills better when it starts with cleaner inputs—then adapts more effectively when faced with real-world chaos.
Why it matters -
A lot of AI tools for customer insights get trained on messy, inconsistent datasets full of noise. Some of our teams are training their own local model instances the same way.
But if MIT is right, that might be slowing them down and leading to worse results.
Instead, this research suggests a smarter way to train AI for customer research:
✔ First, train AI on clean, structured data so it learns the fundamentals without distractions.
✔ Then, introduce the messy data piles—but only after it has a solid foundation.
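For teams training their own instances, the recipe amounts to curriculum-style training: the same loop, run twice, clean data first. A toy PyTorch sketch, where the model and both datasets are synthetic stand-ins rather than anything from the MIT paper:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-ins: in practice these would be your curated (clean) and
# raw (noisy) customer feedback datasets, already featurized.
clean_ds = TensorDataset(torch.randn(256, 16), torch.randint(0, 3, (256,)))
noisy_ds = TensorDataset(torch.randn(256, 16) * 3, torch.randint(0, 3, (256,)))

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train(dataset, epochs):
    for _ in range(epochs):
        for x, y in DataLoader(dataset, batch_size=32, shuffle=True):
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

train(clean_ds, epochs=5)  # Phase 1: fundamentals on low-noise data
train(noisy_ds, epochs=5)  # Phase 2: adapt to messier, higher-variance data
```

The MIT work studied training environments more broadly, not this exact loop, so treat the sketch as an illustration of the ordering idea rather than the paper’s method.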
For us, this could mean better AI for analyzing customer feedback, spotting trends, and making accurate predictions, without getting thrown off by bad data.
It’s still early days, but definitely something to watch.
WHAT’S COMING NEXT?
Here’s what’s coming in the next few editions -
Testing “video mode” in LLMs to potentially enhance in-person testing
A snippet from my AI for Analysis, Insights + Reporting course
AI tools for prototyping - does one generate the best wireframes and views for user testing?
and more 🤓
See you in February!
-Caitlin