We ran 10 buyer-intent prompts across ChatGPT, Perplexity, and Gemini, 30 queries in total. These are the prompts a developer types when choosing a vector database, an inference API, or a RAG framework. In each category, 4 to 5 products captured nearly every recommendation. Across all three categories combined, just 7 products dominated. The rest, including well-funded products with strong developer adoption, rarely or never appeared in the answers.
For SaaS products in categories where buyer discovery is moving to AI, the data offers a preview of what's already happening. Products in the default answer set get recommended. Products outside it rarely appear, regardless of funding or adoption.
Methodology
| What | 10 buyer-intent prompts × 3 AI engines = 30 total queries |
|---|---|
| When | May 19–22, 2026 |
| Engines | ChatGPT (GPT-4o, chat.openai.com, free tier), Perplexity (default model, perplexity.ai, Pro), Gemini (Gemini 2.5, gemini.google.com, free tier) |
| Categories | Vector databases, LLM inference APIs, RAG frameworks |
| Recorded per response | Products mentioned, position, #1 recommendation, source citations, organic vs. named appearance |
| Reproducibility | All 10 prompts published at the end of this article |
"Organic" means the product appeared without being named in the prompt. "Named" means it only appeared when we asked about it directly — the AI knew it, but would not have mentioned it on its own.
AI DevTools visibility scorecard — May 2026
This is the central finding. Of 40+ products across three categories, 7 dominate AI recommendations. The rest range from "barely visible" to "completely absent."
Products that dominate AI recommendations
| Product | ChatGPT (/10) | Perplexity (/10) | Gemini (/10) | #1 recommendations | Status |
|---|---|---|---|---|---|
| Pinecone | 5 | 8 | 4 | 7 | Default vector DB |
| Qdrant | 5 | 7 | 4 | 3 | Strong #2 everywhere |
| OpenAI | 2 | 2 | 2 | 4 | Default inference API |
| LlamaIndex | 3 | 3 | 2 | 5 | Default RAG framework |
| pgvector/Supabase | 4 | 2 | 4 | 3 | Rising fast (ChatGPT, Gemini) |
| Together AI | 3 | 3 | 3 | 2 | Top open-source inference |
| Groq | 3 | 2 | 2 | 3 | Cost champion |
Products AI knows but does not recommend
| Product | ChatGPT (/10) | Perplexity (/10) | Gemini (/10) | Organic mentions | Status |
|---|---|---|---|---|---|
| Modal | 2 | 1 | 3 | 3 organic total | Appears only for GPU jobs, not core inference |
| Replicate | 1 | 1 | 1 | 0 organic | Only surfaces when named in the prompt |
| Weaviate | 3 | 3 | 4 | ~10 total, 0 #1 recs | Present but never wins |
| DeepInfra | 1 | 1 | 3 | 3 organic | Gemini's cost pick, invisible elsewhere |
Products that are invisible to AI search
| Product | ChatGPT (/10) | Perplexity (/10) | Gemini (/10) | Organic mentions | Status |
|---|---|---|---|---|---|
| Turbopuffer | 0 | 1* | 0 | 0 organic (0/30) | Zero recommendations in any engine |
| LanceDB | 0 | 0 | 1 | 1 total (1/30) | One mention, Gemini only |
| Anyscale | 0 | 0 | 0 | 0 (0/30) | Does not exist in AI search |
*Turbopuffer's one Perplexity mention came with the qualifier "markets itself as cheapest at scale" — language signaling the engine is repeating marketing copy, not endorsing the product.
Turbopuffer received 0 organic recommendations across 30 prompts despite competing directly with Pinecone on cost. The gap is structural, not a function of product quality.
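The scorecard columns can be tallied from one record per response. The following is a minimal sketch, assuming a hypothetical `Response` layout with the fields listed in the methodology; the field names and the sample rows are illustrative and are not the study's dataset.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Response:
    """One engine's answer to one prompt (hypothetical record layout)."""
    engine: str
    prompt_id: int                   # 1-10, matching the appendix order
    products: tuple                  # products mentioned, in order
    named: frozenset = frozenset()   # products named in the prompt itself
    top_pick: Optional[str] = None   # the #1 recommendation, if any


def mentions_by_engine(responses, product):
    """Prompts per engine that mention the product (the "/10" columns)."""
    return Counter(r.engine for r in responses if product in r.products)


def organic_mentions(responses, product):
    """Mentions in prompts that did not name the product."""
    return sum(1 for r in responses
               if product in r.products and product not in r.named)


def top_picks(responses, product):
    """Number of responses where the product was the #1 recommendation."""
    return sum(1 for r in responses if r.top_pick == product)


# Illustrative rows only, not the study's data.
rows = [
    Response("ChatGPT", 4, ("OpenAI", "Together AI"), top_pick="OpenAI"),
    Response("Gemini", 6, ("Together AI", "Modal", "Replicate"),
             named=frozenset({"Modal", "Replicate", "Together AI"}),
             top_pick="Together AI"),
]
print(mentions_by_engine(rows, "Together AI"))
print(organic_mentions(rows, "Modal"))
```

Recording the named products by hand, rather than string-matching the prompt, avoids miscounting a product such as "Together AI" when the prompt only says "Together".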
Why AI search visibility is already a 2026 metric, not a 2027 one
The migration from Google to AI search is measured, not projected.
- ChatGPT Search: 250 to 500 million weekly search-intent queries (Similarweb 2026 AI Search Report)
- AI Overviews: appear on roughly 48% of Google searches as of March 2026, up 58% year-over-year (BrightEdge 9-industry tracker, via Search Engine Land)
- Consumer behavior: 37% of consumers start searches with AI tools instead of Google (Search Engine Land, 2026 AI and Search Behavior Study)
- Trajectory: Gartner projected in February 2024 that traditional search volume would drop 25% by end of 2026 due to AI chatbots — a prediction now in its final months of validation
For developer tools, the shift is further along than in most categories. Developers adopted AI search early. The pattern resembles mobile search adoption in 2010–2012, when products that did not restructure for mobile lost discovery share. AI search is the same inflection, only faster.
A developer types "best vector database for production RAG app" into ChatGPT before opening Google. The answer shapes the shortlist before the developer reads a single comparison post. And the data from our 30-query test shows what that answer looks like: Pinecone appeared in 8 of 10 Perplexity prompts and was the #1 pick five times. OpenAI was the default inference API in two of three engines. LlamaIndex was the #1 RAG framework in ChatGPT and Gemini, while Perplexity put LangChain first.
These products are not the only options. AI search treats them as if they are.
Products outside the default set of 4–5 per category aren't ranked lower — they don't appear in answers at all. AI responses don't have a page 2.
How AI engines build the "default AI infrastructure stack" — and what's in each one
Three engines, same prompt ("best AI infrastructure stack for early-stage startup"), three different answers. A developer who trusts one engine ends up with a different stack from a developer who trusts another, even where the engines agree on individual layers.
| Layer | ChatGPT recommends | Perplexity recommends | Gemini recommends |
|---|---|---|---|
| Inference API | OpenAI | OpenAI | OpenRouter + DeepInfra |
| Vector database | pgvector | Pinecone / Qdrant | Supabase + pgvector |
| RAG framework | LlamaIndex | LlamaIndex | — |
| Agent / orchestration | LangGraph | LangChain | LangGraph |
| GPU compute | Modal | — | Modal (optional) |
| Frontend AI | — | — | Vercel AI SDK |
| Observability | Langfuse | — | LangSmith |
| Est. monthly cost | Not specified | Not specified | $25–45/month |
Products absent from all three default stacks — Replicate, Fireworks, Weaviate, Milvus, Turbopuffer, LanceDB, Haystack, DSPy — don't appear at the moment of first discovery.
Why ChatGPT and Perplexity know a product but don't recommend it
Modal and Replicate received 0 combined organic recommendations across 30 queries, even though all three engines described them in detail when asked directly. The engines have the information; it does not surface at the buying moment.
When we asked all three engines "Modal vs Replicate vs Together AI" (prompt 6), every engine gave detailed, positive descriptions. All three ranked them the same way: Together AI #1, Modal #2, Replicate #3. Gemini called Modal the "gold standard for developer experience."
But across the 7 open-ended prompts where Modal could have surfaced organically (prompts about inference APIs, running Llama 3, building a startup stack), Modal appeared twice organically (ChatGPT for GPU jobs, Gemini for self-hosted vLLM). Replicate appeared zero times organically across all three engines.
The root cause is categorization. AI engines classify Modal as "GPU compute infrastructure," not "inference API." Replicate gets classified as "model marketplace," not "production deployment." When a developer asks about inference — the buying prompt — neither product matches the category the AI has assigned.
"The product is known. It is correctly described. It is simply filed in the wrong drawer."
AI engines build category associations from the language used to describe a product across third-party sources: Reddit threads, comparison articles, G2 reviews, technical blog posts. If those sources describe Modal primarily as "serverless GPU compute," that becomes its category in the AI's model. To shift category, a product needs co-occurrence with the target category language across multiple independent sources — not just its own site. This is structural work: it requires third-party mentions that frame the product in the right context.
A simple test for any SaaS product: run a head-to-head prompt modeled on prompt 6 (ask the AI to compare the product against named competitors), then run the generic buying question without naming it. The gap between those two answers measures the visibility problem.
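The two-answer test can be scored mechanically once the responses are pasted into text. This is a rough sketch, assuming whole-word matching on the product name is good enough; the function names, status labels, and sample answers are all hypothetical.

```python
import re


def mentions(product: str, answer: str) -> bool:
    """Whole-word, case-insensitive match, so "Modal" skips "multimodal"."""
    pattern = rf"\b{re.escape(product)}\b"
    return re.search(pattern, answer, re.IGNORECASE) is not None


def visibility_gap(product, named_answer, generic_answers):
    """Compare the named-prompt answer with the open-ended answers."""
    known = mentions(product, named_answer)
    organic = sum(mentions(product, a) for a in generic_answers)
    if organic:
        status = "organically visible"
    elif known:
        status = "known but not surfaced"
    else:
        status = "invisible"
    return {
        "known_when_named": known,
        "organic_mentions": organic,
        "generic_prompts": len(generic_answers),
        "status": status,
    }


# Made-up answer snippets for illustration.
named_answer = "Modal is a serverless GPU platform with a strong developer experience."
generic_answers = [
    "For most startups, OpenAI is the default; Together AI suits open models.",
    "Groq and DeepInfra are low-cost picks, and multimodal workloads need more.",
]
print(visibility_gap("Modal", named_answer, generic_answers))
```

A product that lands in "known but not surfaced" is in the position described above for Modal and Replicate.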
Products AI search engines never mention — and why it matters
Turbopuffer received 0 organic recommendations across all 30 queries (0/30); its single Perplexity mention came with a qualifier. LanceDB received 1 mention (1/30). These products are not "ranked lower" in AI search; in practice they do not exist in it.
Turbopuffer is a vector database built specifically to undercut Pinecone on cost at scale. In our cost-focused prompt ("cheapest vector DB for startup"), Perplexity gave it one mention with a qualifier and a disclaimer. ChatGPT and Gemini did not mention it at all — not even on the one prompt where cost was the explicit criterion.
LanceDB is an open-source embedded vector database. Gemini mentioned it once as a zero-dollar option. ChatGPT and Perplexity never surfaced it across any of the 10 prompts, including prompts about cost and open-source options where LanceDB's positioning is strongest.
Both products compete on real differentiation against products AI recommends by default. The gap is not product quality. The gap is structural: the crawlability, citation density, and third-party reference patterns that AI engines use to decide what to recommend. ConvertMate's March 2026 GEO benchmark of 12,500 AI-search queries found that 83% of AI Overview citations came from pages outside the organic top 10 — meaning domain authority is not the bottleneck. Structure and citability are.
There's a path out of this position, but it starts with measurement. The audit in the appendix runs the same methodology against any product category — about 20 minutes of work.
Where the three engines disagree — and why that's the opening
The "default" is not consistent across engines. A developer gets a different stack depending on which AI they ask.
| Category | ChatGPT #1 | Perplexity #1 | Gemini #1 |
|---|---|---|---|
| Vector DB (production) | Qdrant | Pinecone | Pinecone |
| Vector DB (cost) | pgvector / Supabase | pgvector | pgvector (Supabase/Neon) |
| Vector DB (no DevOps) | Supabase + pgvector | Pinecone | Pinecone Serverless |
| Inference API | OpenAI | OpenAI | OpenRouter |
| Cheapest Llama 3 | Groq | Groq / DeepInfra | DeepInfra |
| RAG framework | LlamaIndex | LangChain | LlamaIndex |
| LangChain status | #6, "fragmented" | #1, "huge ecosystem" | Not mentioned separately |
This cross-engine disagreement is the structural opening. The defaults are not permanently locked — they shift by engine, and they will shift again as training data updates.
Concrete example: pgvector appears as ChatGPT's #1 cost pick and Gemini's primary database recommendation, but Perplexity surfaced it in only 2 of 10 prompts. A product in this position, strong in two engines and weak in the third, has a clear action: build citation density specifically in the sources the weaker engine draws on. Products doing this structural work across all three engines stand to capture positions currently split or unoccupied.
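Finding the weaker engine for a product is a one-line comparison over its scorecard row. The sketch below copies three rows from the scorecard table; the `weakest_engines` helper is an illustrative name, not part of any tool.

```python
ENGINES = ("ChatGPT", "Perplexity", "Gemini")

# Prompts (out of 10) in which each product appeared, copied from the
# scorecard table for three of the vector-database entries.
SCORECARD = {
    "Pinecone": (5, 8, 4),
    "Qdrant": (5, 7, 4),
    "pgvector/Supabase": (4, 2, 4),
}


def weakest_engines(scores):
    """Return the engine(s) with the fewest appearances for one product."""
    low = min(scores)
    return [engine for engine, s in zip(ENGINES, scores) if s == low]


for product, scores in SCORECARD.items():
    print(product, "->", weakest_engines(scores))
```

With only 10 prompts per engine, a one-point difference is within noise, so the output is a pointer for where to look rather than a ranking.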
Three actions that follow from the data
Each one ties back to a finding above, ranked by leverage.
1. Run the buying-intent prompt in the product's category.
Open ChatGPT, Perplexity, and Gemini. Use the generic question a customer would type before finding the product — the category query, not the brand name. "Best [category] for [buyer situation]." Products absent from these answers are in the same position as Modal: described accurately when asked, invisible when not. About 5 minutes.
2. Check whether AI crawlers can reach the site.
Open the site's robots.txt and check for blocks on GPTBot, ClaudeBot, PerplexityBot, and Google-Extended. Cloudflare added a one-click option to block AI bots in 2024 and began blocking AI crawlers by default on new domains in July 2025. Many SaaS sites have inherited a block without knowing. 46% of ChatGPT bot visits begin in reading mode: plain HTML, no JavaScript (Search Engine Land, October 2025). If the bots cannot reach the site, no amount of content optimization compensates. About 10 minutes.
3. Add one citation-backed statistic to the three highest-traffic pages.
Princeton's 2024 GEO paper (presented at KDD, with researchers from Princeton, Georgia Tech, and IIT Delhi) found that adding citations and source attributions to content lifts AI visibility by 30 to 40%. Traditional keyword stuffing, the foundation of two decades of SEO, showed little to no benefit and in some tests reduced visibility. One paragraph with a named statistic per page. One afternoon.
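The crawler check in step 2 can be scripted with Python's standard `urllib.robotparser`. This is a minimal sketch that parses robots.txt text supplied as a string; the `blocked_crawlers` helper and the sample file are illustrative. It only reads robots.txt rules, so a block applied at the firewall or CDN level would not show up here.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in step 2.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended")


def blocked_crawlers(robots_txt: str, path: str = "/") -> list:
    """Return the AI crawlers that robots.txt disallows for the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, path)]


# Hypothetical robots.txt that blocks one AI crawler and allows the rest.
SAMPLE = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

print(blocked_crawlers(SAMPLE))
```

To check a live site, fetch its `/robots.txt` with any HTTP client and pass the body to `blocked_crawlers`; an empty list means none of the four tokens is disallowed for that path.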
None of this is a separate discipline from SEO. Google's own AI Optimization Guide, published May 15, 2026, explicitly rejects AEO and GEO as separate disciplines. The same document introduced a new section on browser agents and the Universal Commerce Protocol, early signals that AI-mediated discovery extends beyond text answers into agent-initiated transactions. The optimization work, however, remains the same: SEO done with the citation patterns AI rewards.
Running an AI visibility audit
For SaaS products that need to measure where they stand — and identify what to fix:
Self-serve path: the 10 prompts in the appendix can be run against any product category. Recording which products appear, which engine recommends what, and where a product is absent takes roughly 20 minutes per category.
Done-for-you path: itscool.ai runs the category audit, ships the technical foundation AI engines can cite (crawlability, schema, citation structure), and tracks visibility movement across all three engines. Technical foundation typically live within one week; measurable citation movement in 6–8 weeks.
Appendix: the 10 prompts we used
These can be adapted to any product category. Replace "vector database" / "inference API" / "RAG framework" with the relevant category.
- "best vector database for production RAG app"
- "cheapest vector DB for startup"
- "Pinecone vs Weaviate vs Qdrant — which one for a small team"
- "best LLM inference API for startups"
- "cheapest way to run Llama 3 in production"
- "Modal vs Replicate vs Together — which inference platform"
- "best open source RAG framework 2026"
- "LangChain vs LlamaIndex — which to use"
- "best AI infrastructure stack for early-stage startup"
- "vector database that scales without DevOps"
Prompts 1, 2, 4, 5, 7, 9, and 10 are open-ended buyer-intent queries — these reveal organic visibility. Prompts 3, 6, and 8 are head-to-head comparisons — these reveal whether AI knows the product even when it does not recommend it organically.
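For anyone repeating the audit, a blank run sheet keeps the 30 queries organized. The sketch below builds one row per prompt and engine and tags each prompt as open-ended or head-to-head using the split described above; the column names and the CSV layout are assumptions, not the format used in the study.

```python
import csv
import io

ENGINES = ("ChatGPT", "Perplexity", "Gemini")
HEAD_TO_HEAD = {3, 6, 8}  # comparison prompts; the other seven are open-ended


def run_sheet():
    """One blank row per (prompt, engine) pair: 10 prompts x 3 engines."""
    rows = []
    for prompt_id in range(1, 11):
        kind = "head-to-head" if prompt_id in HEAD_TO_HEAD else "open-ended"
        for engine in ENGINES:
            rows.append({
                "prompt_id": prompt_id,
                "engine": engine,
                "kind": kind,
                "products": "",   # fill in: products mentioned, in order
                "top_pick": "",   # fill in: the #1 recommendation
            })
    return rows


def to_csv(rows):
    """Serialize the run sheet so it can be opened in a spreadsheet."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


sheet = run_sheet()
print(len(sheet), "queries")
print(to_csv(sheet).splitlines()[0])
```

Only the open-ended rows count toward organic visibility; the head-to-head rows show whether the engine knows the product at all.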
*This research was conducted by itscool.ai, a marketing agency that ships SaaS websites in 24 hours with AEO setup built in. Full dataset available on request.*