AI x Customer Research - April '24

Miro vs. FigJam AI for synthesis, creating your team's prompt library, and a replacement for LLM prompting skills. Plus: How well can we trust AI text summaries?

Read time: < 12 minutes

Welcome to the first edition

👋 Hey, I’m Caitlin and welcome to the very first edition of this monthly newsletter! I’ll start with an intro to the mission here, and then it’s on to the AI updates…

As 2023 wrapped up, I finally acknowledged that AI is unavoidable. I started getting serious about testing how to use AI in daily customer research work.

Through my own tests and challenges, I realized the best thing I could do right now is to unravel the tricky bits about AI in customer research in a way we can all understand - no machine learning expertise required. Wouldn’t it be great to feel like you don’t need 40 extra hours each week to figure out AI?

My promise to you: I want you to feel knowledgeable about AI for research in 20 minutes per month. By reading this newsletter, I hope you’ll feel like you’ve saved the time and effort of testing everything on your own. And if it’s in this newsletter, then I’ve personally tested it myself (unless otherwise stated).

How it’s organized: After a skimmable list of inclusions, we get straight to practical tips for work (because you’re busy!). Deeper dives into AI studies and thought-provoking topics are further down to skip or read later.

Hygiene stuff: This started with a simple demand test. All you early-bird subscribers joined before I even had a newsletter. 🥹🙏 If you subscribed via Stripe, then I’ll be handling any requested account changes personally. If you want to cancel at any time, just email me and I’ll do it for you right away.

Thanks a ton for supporting this new project, and special shoutout to Estelle for her constant feedback on this edition. I look forward to hearing from all of you about your AI challenges at work.

Now let’s get started! 👇

In this edition:

  1. 🖼️ Miro and FigJam AIs are in Beta. I’ve tested their sticky-note and synthesis automations for you and determined what works (and what doesn’t).

  2. 📚 Your team might need a Prompt Library. Below is a starting point for standardizing your team’s prompts.

  3. 🧑‍💻 Do you even need prompting skills? Expert Ethan Mollick’s perspective + recommended listening.

  4. 🔌 ChatGPT Plugins for better sources, diagrams and prompts. No prompt or service design skills required; let ChatGPT do it all.

  5. 🙅 Setting up ChatGPT for better data privacy. Settings and changes to know.

  6. 🧑‍🔬 How reliably do LLMs summarize data? A recent study on book summaries + fact-checking that’s relevant to customer research.

Plus…a few extra links I found interesting this month.

WORKFLOW UPGRADES

🖼️ Miro & Figjam AI tools in Beta

No more manually copy-pasting user quotes onto sticky notes? 👀 I was hopeful. Miro and FigJam both have AI tools that seemed like they could help with synthesis. My take: Miro saves more time by automating sticky notes, but FigJam summarizes the content better.

Watch the video for details on how they each work, what works best, and my optimal workflow so far.

〰️

📚 A Team Prompt Library for repeatability

You know that standardizing processes leads to reliability and repeatability. Creating a prompt library for your team matters when team members don’t all have the same prompting skills but still need reliable output regularly. Here’s a starting point for your own prompt library, plus a small sketch below of what one could look like in code.
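For anyone who likes keeping prompts next to other research assets in version control, here is a minimal sketch of what a shared library file might look like. The template names, placeholders, and wording below are purely illustrative assumptions, not tested prompts:

```python
# Minimal sketch of a shared team prompt library (illustrative only).
# Template names and wording are placeholders; replace them with prompts
# your team has actually tested and agreed on.

PROMPT_LIBRARY = {
    "interview_summary": (
        "You are a customer research assistant. Summarize the interview "
        "transcript below in five bullet points, quoting the participant "
        "verbatim where possible.\n\nTranscript:\n{transcript}"
    ),
    "theme_extraction": (
        "List the recurring themes in these user quotes. For each theme, "
        "include the supporting quotes word for word.\n\nQuotes:\n{quotes}"
    ),
}

def build_prompt(name: str, **inputs: str) -> str:
    """Fill a library template with this study's inputs."""
    return PROMPT_LIBRARY[name].format(**inputs)

# Everyone on the team gets the same prompt for the same task:
print(build_prompt("interview_summary", transcript="[paste transcript here]"))
```

The point isn’t the code itself; a shared doc works just as well. What matters is that the whole team pulls from the same tested wording instead of improvising.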

PROMPTING PLUS

🧑‍💻 Do you really need to learn how to prompt? A perspective.

AI expert Ethan Mollick hypothesizes that prompting skills will become unnecessary as LLMs (large language models) shift toward understanding your intent and handling the prompting for you. How soon that will happen, though, is unknown.
For the moment, the people who can prompt really well do get a lot more out of pre-trained models like ChatGPT, Gemini, and Claude.

“…I don’t worry about prompt crafting in the longterm…because I think that they [LLMs] will work on intent.”

Ethan Mollick

Listen to Ezra Klein interview Ethan Mollick on Apple or Spotify.

〰️

🔌 Plugins for your ChatGPT sidebar

If you haven’t yet tried ChatGPT plugins (for paid accounts), these are a few to add to your sidebar. Why these? You don’t need expert prompting skills for any of them, and they’ve made a huge time-saving impact for me.

  • Prompt Wizard - let this plugin figure out the prompt you need for the outcome you want

  • Diagrams - save time creating flowcharts, journeys, mindmaps and other diagrams to visualize your thoughts

  • Scispace - speed up desk research and get summarized scientific studies with accurate sources instead of ChatGPT’s hallucinations

DATA PRIVACY

🙅 Settings in ChatGPT to know

Taking the steps below will limit ChatGPT’s ability to train its models on your data.

  1. Find Chat History & Training in Settings > Data Controls and turn it off.

  2. Ask OpenAI to stop using your account for training.

  3. Use the OpenAI Playground instead of the ChatGPT interface - OpenAI says data submitted via the Playground (and the API behind it) is not used for training by default.
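If you’re comfortable with a little code, that same default applies to the API the Playground sits on top of. Here’s a minimal sketch using OpenAI’s Python client; the model name and prompt are placeholder assumptions, and the data reminders below still apply:

```python
# Minimal sketch: calling the OpenAI API directly instead of the ChatGPT app.
# Per OpenAI's stated policy, API (and Playground) data is not used for
# model training by default. Model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from your environment

response = client.chat.completions.create(
    model="gpt-4o",  # assumption: use whichever model your account offers
    messages=[
        {"role": "user",
         "content": "Summarize these (already anonymized) interview notes: ..."},
    ],
)
print(response.choices[0].message.content)
```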

〰️

Quick reminder: When using ChatGPT and other pre-trained models…

  • Never upload or input information that competitors or non-employees at your company shouldn’t know. This includes:

    • PII about your customers/users

    • Proprietary information about your company

    • Anything you think you should not talk openly about regarding users, your employer’s or client’s business, products, future developments…
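If you do need to run customer text through a model, one lightweight habit is to scrub the obvious identifiers first. The sketch below is a bare-bones, hypothetical example using regular expressions; it only catches simple patterns and is no substitute for a proper anonymization pass (or a dedicated tool):

```python
# Bare-bones PII scrub before sending text to an LLM (illustrative only).
import re

def scrub_pii(text: str) -> str:
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[EMAIL]", text)   # email addresses
    text = re.sub(r"\+?\d[\d\s().-]{7,}\d", "[PHONE]", text)      # phone-like numbers
    return text

notes = "Maria (maria@example.com, +1 415 555 0199) said onboarding felt slow."
print(scrub_pii(notes))
# -> "Maria ([EMAIL], [PHONE]) said onboarding felt slow."
# Note that the name "Maria" still slips through, which is exactly why
# regexes alone aren't enough for real anonymization.
```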

AI STUDIES

🧑‍🔬 How reliably do LLMs summarize documents?

A study published this month showed that LLMs aren’t yet reliable summarizers of long-form text content.

The study looked at how well LLMs (1) accurately summarized recently released fiction books that they had not been trained on, and (2) fact-checked LLM-generated claims about the books.

They tested GPT-3.5-Turbo, GPT-4, GPT-4-Turbo, Mistral, and Claude-3-Opus. Claude-3-Opus was by far the most faithful to the text.

Why should we care?

A lot of qualitative research output is long-form text that’s analyzed similarly. In both books and our research output, there are text documents that can be broken down to capture stories with contexts, events, characters, relationships, and themes.

These study findings are applicable to customer research -

1. Accuracy / Faithfulness in interpreting content

We may want to do more quality work, faster and cheaper. But only Claude-3-Opus comes even close to human accuracy in interpreting documents where “indirect reasoning” is required, and even then faithfulness is only around 90%.

2. Fact-checking claims

At best, LLMs had a 47.5% accuracy rating for fact-checking their own work in this study (and therefore won’t fact-check ours well, either).

3. Omissions of key information

Key information was frequently left out of the summaries: a noteworthy 33-65% of summaries failed to capture some key events, depending on the LLM. Of the five omission categories, events and attributes were omitted most often, while relationship information was omitted least.

If this were customer research analysis, we’d likely miss events that make a big difference in how we understand customer journeys and prioritize product decisions.

4. More focus on book endings

We, too, have a recency bias, remembering more clearly what we heard or observed toward the end of a user session. LLMs prioritized information that came later in the books, and might therefore prioritize observations that come later in a transcript, a session, or a collection of documents.

Can we rely on LLMs for summarizing? My take: We shouldn’t remove ourselves from analysis and synthesis just yet.
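One practical middle ground while we stay in the loop: ask the model to attach a verbatim quote to every summary claim, then mechanically check that each quote really appears in the transcript. The sketch below assumes a simple claim/quote output format of my own invention; anything flagged gets a human review:

```python
# Spot-check that the "verbatim" quotes an LLM cites actually appear in the
# source transcript. The claim/quote structure is an assumed output format.

def spot_check(summary_claims: list[dict], transcript: str) -> list[dict]:
    flagged = []
    normalized = " ".join(transcript.lower().split())
    for claim in summary_claims:
        quote = " ".join(claim["quote"].lower().split())
        if quote not in normalized:
            flagged.append(claim)  # quote not found verbatim; review by hand
    return flagged

claims = [
    {"claim": "Checkout felt confusing", "quote": "I got lost on the payment step"},
    {"claim": "Pricing was a blocker", "quote": "the price made me quit immediately"},
]
transcript = "...honestly I got lost on the payment step and had to start over..."
for c in spot_check(claims, transcript):
    print("Needs review:", c["claim"])
```

It won’t catch paraphrased or subtly twisted claims, but it’s a cheap first filter before you re-read the source yourself.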

Plus…

  • An AI Glossary of terms to get familiar with

  • CompliantChatGPT is a HIPAA-compliant ChatGPT that anonymizes information for you. I haven’t tested this personally yet.

  • Figure out if the EU AI Act applies to AI systems you’re using at work

WHAT’S COMING NEXT?

Here’s what’s coming in the next few editions -

  • How well can AI run customer sessions for us? Testing AI moderators.

  • Where is AI counteracting human biases, and where does it add too much data bias to be worth it?

  • So many AI tools for note-taking 🤯 Is there a winner?

  • Comparing LLMs to find the best one for specific research tasks

  • …and more!

See you next time!

-Caitlin

P.S. Stoked about this project and want to help it grow? My goal is to grow enough support to release two or more editions for you every month, instead of one.

I’d be extra grateful if you refer friends and colleagues below 🙏