Choosing The Right AI Partner

A checklist to help you approach conversations with vendors.

Saptarshi Nath

Nov 6, 2025

I was speaking to an AI founder in Australia this weekend. We exchanged notes on what we were hearing from large enterprises.

“We tried this AI thing and it didn’t work out. AI isn’t all it’s made out to be.”
“So many AI companies are getting funded these days. We can’t tell who’s good.”
“An AI SaaS company promised to make us more discoverable on ChatGPT. Then asked us to ‘make better content’. Last I saw, they pivoted to being a content agency.”

Yes, there is a lot of hype around AI. VCs are pouring money into AI companies, and that is making enterprises suffer from FOMO of a different kind.

What if we don’t adopt AI fast enough?

As a founder who has built 10 products over the last 12 years, I’ve seen this play out before.

A new technology suddenly takes off. Early adopters swarm to it, and it hits insane levels of traction quickly. VCs swarm to it and invest money with little or no understanding of the underlying dynamics. Hundreds of companies get funded; fundraising announcements take over TechCrunch.

But only a fraction succeed. We’ve seen this with crypto, we’ve seen this with Web3. We’re seeing this with AI.

Of these technologies, I think AI is the only one that need not be a bubble. I spent two years working with 300+ deep tech founders before LLMs became cool. I’ve worked with founders who designed AI algorithms to predict cancer risk, protect satellites against space debris, and forecast potential sales before you launch a CPG product.

The tech is not a bubble. The market is.

In my last article, I talked about why AI projects fail. A lot of the reasons had nothing to do with the tech.

Today, let’s look at one specific aspect of AI failure: the AI partners you work with.

AI partners are software product or services companies that help you launch AI solutions within your enterprise. But with the proliferation of AI companies, it’s tough to say who’s good and who’s faking it.

Here is a basic framework along with the questions you need to ask new partners, to help you separate the wheat from the chaff:

1. Data and training

Ask how the model learns from your data and what it keeps. Strong partners explain training data sources, fine‑tuning options, and how they isolate your data from others.

What data did you train on, and how do you keep my data separate?
- Good sign — named sources and data isolation;
- Red flag — “we mix all customer data to provide better insights.”
Who owns the tuned model and outputs, and can I export them?
- Good sign — clear customer ownership and exports;
- Red flag — “proprietary black box.”

2. Model behavior and quality

Ask how they test models before and after go‑live. Most models take time to learn, so check whether the partner can tell you how long it will take before key inferences are available. Strong partners show simple pass‑fail tests, A/B plans, and rollback steps. They monitor drift, errors, and response quality, not just accuracy in a demo.

Assess every AI partner to understand if their strengths lie in data (Machine Learning-led AI) or text (Large Language Model-led AI).

How do you test and re‑test model performance in production?
- Good sign — defined TEVV plan (Test, Evaluation, Verification, and Validation), A/B tests, rollback;
- Red flag — “we test once.”
What do you monitor weekly, and how often do you refresh the model?
- Good sign — drift, latency, safety, scheduled retrains;
- Red flag — “on demand only.”
Show one incident and the fix.
- Good sign — postmortem and shipped changes;
- Red flag — “no incidents.”
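
To make the weekly monitoring question concrete, here is a minimal sketch of what a recurring check could look like. It is illustrative only: the metric names (`drift_score`, `p95_latency_ms`, `unsafe_output_rate`) and the limits are assumptions made for this example, not values from any particular vendor or standard.

```python
from dataclasses import dataclass


@dataclass
class WeeklyMetrics:
    drift_score: float         # hypothetical 0-1 score; higher means inputs have shifted more
    p95_latency_ms: float      # 95th percentile response time in milliseconds
    unsafe_output_rate: float  # share of outputs blocked by safety filters


# Illustrative limits only; a real agreement would set these per use case.
LIMITS = {
    "drift_score": 0.2,
    "p95_latency_ms": 1500.0,
    "unsafe_output_rate": 0.01,
}


def failed_checks(metrics: WeeklyMetrics) -> list[str]:
    """Return the names of the metrics that exceed their limit this week."""
    return [name for name, limit in LIMITS.items() if getattr(metrics, name) > limit]


def needs_review(metrics: WeeklyMetrics) -> bool:
    """Any failed check should trigger a review, a retrain, or a rollback."""
    return bool(failed_checks(metrics))
```

A partner with a real monitoring practice should be able to show you their version of this list: what they measure, the limit for each item, and what happens when a limit is crossed.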

3. Human oversight and control

Ask how people stay in control of AI decisions. Strong partners design human‑in‑the‑loop steps and easy overrides. They log decisions so you can audit and improve.

Where can my team review, edit, or block outputs?
- Good sign — approvals and thresholds, internal documentation of decisions;
- Red flag — auto decisions without review.
What audit logs do we get for each decision?
- Good sign — time‑stamped, searchable logs;
- Red flag — “not available.”
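
As a rough illustration of what a time‑stamped, searchable decision log might contain, the sketch below records each AI output alongside any human override. The field names (`decision_id`, `model_version`, `reviewer`, and so on) are hypothetical choices for this example; a vendor's actual log schema will differ.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class DecisionRecord:
    decision_id: str
    timestamp: str           # ISO 8601, UTC
    model_version: str
    input_summary: str
    ai_output: str
    confidence: float
    reviewer: Optional[str]  # None when no human reviewed the decision
    final_output: str        # what was actually used after review


def record_decision(decision_id, model_version, input_summary, ai_output,
                    confidence, reviewer=None, override=None) -> DecisionRecord:
    """Build a time-stamped record; a human override replaces the AI output."""
    return DecisionRecord(
        decision_id=decision_id,
        timestamp=datetime.now(timezone.utc).isoformat(),
        model_version=model_version,
        input_summary=input_summary,
        ai_output=ai_output,
        confidence=confidence,
        reviewer=reviewer,
        final_output=override if override is not None else ai_output,
    )


def to_log_line(record: DecisionRecord) -> str:
    """Serialize as one JSON line so logs stay searchable with standard tools."""
    return json.dumps(asdict(record), sort_keys=True)
```

Keeping both `ai_output` and `final_output` is the design choice that matters here: it lets you count how often your team overrides the model, which is a useful signal on its own.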

4. Grounding and hallucinations

Ask how the system avoids making things up (especially for text/creative AI use cases). Strong partners ground answers in your data and show hallucination rates. They red‑team prompts (design adversarial prompts themselves for testing) and add guardrails against unsafe outputs.

How do you ground outputs in our catalog, inventory, and policies?
- Good sign — documented retrieval and citations;
- Red flag — generic answers.
What is your hallucination test and target, and how do you measure it?
- Good sign — simple metric and threshold;
- Red flag — “not measured.”
How do you defend against prompt injection and misuse?
- Good sign — red‑teaming and filters;
- Red flag — “user training only.”
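
A "simple metric and threshold" for hallucinations can be as plain as the share of reviewed answers that were not supported by your own data. The sketch below assumes a human reviewer (or an agreed checking process) has already labeled each sampled answer; the 2% default target is an illustrative number, not an industry benchmark.

```python
def hallucination_rate(unsupported: list[bool]) -> float:
    """unsupported[i] is True when answer i was judged not backed by the source data."""
    if not unsupported:
        raise ValueError("need at least one reviewed answer")
    return sum(unsupported) / len(unsupported)


def passes_target(unsupported: list[bool], max_rate: float = 0.02) -> bool:
    """Compare the measured rate against the agreed target (assumed 2% here)."""
    return hallucination_rate(unsupported) <= max_rate
```

For example, 3 unsupported answers in a sample of 100 gives a rate of 3%, which would fail a 2% target. The point to press a vendor on is less the exact number and more whether they sample, label, and report on a schedule.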

5. Retail workflow fit

Ask how AI shows up in the tools your teams already use. Strong partners embed into POS, CRM, or supply systems so staff do not learn a new app.

Where will staff act on AI daily?
- Good sign — inside POS or CRM;
- Red flag — separate platform entirely.
What is the minimum change to our routine?
- Good sign — “no extra clicks” plan;
- Red flag — “new portal and process.”
How will you prove weekly usage and impact?
- Good sign — usage and acceptance reports;
- Red flag — monthly summaries in CSV and PDF.

6. AI‑specific proof only

Ask for production proof of AI, not just software. Strong partners show live retail deployments and measured uplifts. They moved pilots to production and sustained value over time.

Share named retail references with similar systems and live KPIs.
- Good sign — direct calls and dashboards;
- Red flag — anonymous logos.
Run a time‑boxed pilot with pass‑fail gates tied to one KPI.
- Good sign — clear thresholds and testing;
- Red flag — open‑ended POCs.
What slipped in recent go‑lives, and what did you change?
- Good sign — candid lessons;
- Red flag — “none.”
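
A time‑boxed pilot with a pass‑fail gate can be written down in a few lines, which is a good test of whether the threshold is actually clear. This is a sketch under stated assumptions: the `PilotGate` structure, the uplift calculation, and the example numbers are hypothetical, and the baseline is assumed to be non-zero.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PilotGate:
    kpi_name: str
    baseline: float        # KPI value before the pilot; assumed non-zero
    min_uplift_pct: float  # required improvement over baseline, in percent
    deadline: date         # end of the time box


def evaluate(gate: PilotGate, measured: float, today: date) -> str:
    """Return 'pass', 'fail', or 'in_progress' for the pilot's single KPI."""
    uplift_pct = (measured - gate.baseline) / gate.baseline * 100
    if uplift_pct >= gate.min_uplift_pct:
        return "pass"
    if today > gate.deadline:
        return "fail"  # the time box ran out without hitting the threshold
    return "in_progress"
```

If you and the vendor cannot agree on the KPI name, the baseline, the uplift, and the deadline before the pilot starts, you are heading toward an open‑ended POC.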

7. Proof of speed

This is, by far, the metric that separates the doers from the rest. If you are considering a legacy provider, ask them to show you what they have already done using the data they have collected over the past few years.

If a POS or Loyalty vendor who has had access to your customer data for the last five years is unable to show you a quick proof of concept within weeks, they are not the right AI partner for you.

8. Proof of team

Every software company has developers. AI is still a new space, so there are very few university degrees in AI. Legacy software providers have invested heavily in traditional software developers who understand infrastructure, UX, and APIs, but can they convert that experience into data-driven insights that power AI platforms? Does the team have retail experts to help them understand which AI recommendations are worth pursuing?

To be fair, it is unlikely that you will identify AI-ready talent by looking at their LinkedIn profiles; it is still a new field that is changing every week. As a regional retailer, your answer is probably somewhere between a 23-year-old AI founder and a traditional retail SaaS platform or service.

—

Every regional retailer needs a slightly different solution. The good news is that it is far easier to customize solutions with AI than with old-school rule-based systems.

If you have a budget set aside for AI experiments, you should run multiple POCs across different areas of your business to see which AI partners manage to deliver. If you haven’t seen output within weeks, they are not the right fit for you (of course, results from the output may take months to show).

If you are evaluating AI partners for your business, find a time with the Goodlight AI team here. We will be happy to help identify the best fit partners for your business.
