The AEO Black Box: A Step-by-Step Guide to Auditing Your AI Search Visibility for Free

admin

6 days ago

The gold rush for “Answer Engine Optimization” (AEO) is here. With the integration of AI into our search habits, a new industry has exploded overnight. Self-proclaimed “AI SEO” consultants and a dizzying array of “AI visibility platforms” are all vying for a slice of your marketing budget, promising to get you ranked in AI search.

But in this chaotic new landscape, how do you know what’s real and what’s digital snake oil?

Before you spend a dime on an automated tracker, you need to understand the fundamentals. This is the “wax on, wax off” of AI search. You must first learn how these large language models (LLMs) think, work, and source information.

This guide will empower you to cut through the noise and conduct a comprehensive AI search audit for yourself. While it’s a manual and sometimes tedious exercise, it is the single best way to understand what makes these platforms valuable lead-generation channels.

By following this framework, you will learn how to track your brand’s visibility, how to spot the recurring, high-authority sources that dominate AI answers in your niche, and how to get yourself included among them.

The New Battlefield: Understanding Citations vs. Mentions

Before we dive into the audit, let’s clarify what “winning” in AI search actually means. Unlike traditional SEO, where a “win” is a blue link in the top 10, AI search visibility is more nuanced. There are two primary ways to win.

First, you can get cited as a source. This is when your website’s URL appears in the footnotes, references, or “learn more” cards of an AI-generated answer. This is a crucial signal. It means the AI, in its quest to find the best information, found your content valuable, authoritative, and relevant enough to reference as evidence for its answer. This is the new, high-value backlink.

But even better is getting your brand mentioned by name. This is when the AI, in the body of its own generative text, names your brand or product. For example, in response to “What’s the best email tool?” the AI might say, “Many users prefer [Your Brand] for its ease of use.” This is a direct endorsement. It positions your brand not just as a source, but as a solution.

The ultimate goal, of course, is to achieve both: a direct, positive brand mention in the text, with a citation linking back to your site in the footnotes. This combination is perfection—it establishes your brand as an authority and drives a direct, high-intent click. Your audit process should be designed to track both.

Why AI Audits Are a “Black Box” (And How to Pick the Lock)

Conducting an AI visibility audit is far more complex than a traditional SEO audit. In the world of Google SEO, we are spoiled. We have a wealth of data from Google Search Console, reliable third-party keyword research tools like Ahrefs or Semrush, and a relatively predictable set of ranking factors.

AI search, on the other hand, is volatile, opaque, and evolving in real time. It’s a “black box” for three main reasons:

AI is Generative: No two answers are ever likely to be 100% identical. Unlike a static list of 10 blue links, the LLM under the hood is generating the answer on the fly every time a user asks. This means visibility can fluctuate from one minute to the next.
Personalization is Deep: Factors like a user’s location, their previous search history, and their personal preferences will heavily affect the answers they see. A logged-in user who has previously visited your competitor’s site may receive a very different answer than an incognito user.
There is No Query Database: This is the biggest hurdle. Private companies like OpenAI, Perplexity, and Anthropic do not provide a “query database” or a tool like Google Search Console. We have no way of knowing what people are searching for or how often.

So, if it’s a black box, how do we audit it? Thankfully, we can tease out powerful insights. We can’t see the query database, but we can study the AI’s “thinking process,” analyze the sources it collects, and identify the patterns that lead to a citation or a mention. This audit process is the key to picking that lock.

Step 1: Assembling Your AI Audit Toolkit

For this audit, we will focus on four major AI platforms that together account for a large share of conversational AI searches.

ChatGPT: The undisputed market leader and the platform that defines the modern conversational AI experience.
Google AI Overviews: As this is integrated directly into the Google search results page, it is arguably the most critical platform for marketers to understand and influence.
Claude: A powerful competitor from Anthropic, known for its large context window, nuanced responses, and strong focus on safety and accuracy.
Perplexity: A research-first “answer engine” that is fantastic for our purposes because it is transparent about its sources, showing them in real-time as it formulates an answer.

You are free to start with just one, but to get a truly holistic view, you should tackle all four.

Since we are trying to get a neutral, repeatable baseline, it is essential to standardize your testing environment. For every query, you must open each platform in an isolated state to remove the variable of personalization.

For ChatGPT, Perplexity, and Claude, use their “temporary chat” or equivalent guest modes.
For Google AI Overviews, always use an Incognito or private browsing window.

There will still be variability (free vs. paid plans, which specific model you’re using), but our goal is to isolate as many variables as possible to create a standardized experiment we can repeat later.

Finally, your main tool will be a simple spreadsheet. Create a new file with a “Queries” column, and then a column for each AI assistant you are testing, labeled with the model where you know it (e.g., “ChatGPT (GPT-4o),” “Google AI Overview,” “Claude 3 Sonnet,” “Perplexity”).
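If you would rather generate the empty spreadsheet than build it by hand, the sketch below writes the header and one blank row per query as CSV text. The “Query Type” and “Notes” columns and the `build_template` helper are assumptions added for illustration; rename the assistant columns to match whatever you actually test.

```python
import csv
import io

# Assumed column layout: "Queries" comes from the guide; "Query Type" and
# "Notes" are illustrative additions that make later filtering easier.
ASSISTANTS = ["ChatGPT", "Google AI Overview", "Claude", "Perplexity"]
COLUMNS = ["Queries", "Query Type"] + ASSISTANTS + ["Notes"]


def build_template(queries):
    """Return CSV text with one empty row per (query, query_type) pair."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for query, query_type in queries:
        # Response and notes cells stay blank until you collect the answers.
        writer.writerow({"Queries": query, "Query Type": query_type})
    return buffer.getvalue()


template = build_template([
    ("What are the best email marketing platforms for small businesses?", "discovery"),
    ("What are the pros and cons of [Your Brand]?", "branded"),
])
print(template)
```

Save the printed text as a .csv file and open it in any spreadsheet tool to start filling in responses.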

Step 2: Crafting the Right “Key” (Query Brainstorming)

This is the most important part of the entire audit. Your insights will only be as good as the prompts you test.

In traditional SEO, marketers often focus on the “top of the funnel” (TOFU) with high-volume, low-competition informational posts to bring in traffic. For this exercise, that approach is wrong.

A TOFU query like “Why do I need an email marketing platform?” might get your blog post cited as a source, but it will almost never get your brand mentioned.

Our goal is to get both. This only happens with middle-of-funnel (MOFU) and bottom-of-funnel (BOFU) queries. A user searching for “What are the best email marketing platforms for small businesses?” has clear intent. They know their problem and are now in the evaluation phase. This is where the AI will make recommendations and mention brands by name.

You must brainstorm two distinct types of queries to test.

Type 1: Discovery Queries (Unbranded)

This is to test your organic appearance and visibility. You want to find out if the AI will recommend your brand when a user doesn’t ask for it by name.

Good Discovery Query Examples:

What are the best email marketing platforms for small businesses?
Which project management tools integrate with Slack and Google Workspace?
What CRM software works best for real estate agents under $50 a month?
Compare the top [your product category] for [your target audience].

Type 2: Brand-Specific Queries (Branded & Competitive)

This is to test the AI’s sentiment and accuracy when it talks about you and your competitors. This is where you audit your brand perception.

Good Brand-Specific Query Examples:

How does [Your Brand] compare to [Competitor 1] for [key feature]?
What are the pros and cons of [Your Brand]?
Is [Your Brand] worth the price compared to alternatives like [Competitor 2]?
What are the biggest complaints about [Your Brand]?

How do you find these queries? Do not guess. Talk to your customer support and sales teams. These folks are on the front lines, speaking to your customers every day. Ask them:

What pain points do customers always mention?
What competitors come up in sales calls?
What features do people compare us on?

If you need more inspiration, AI tools can be very helpful. You can prompt an assistant to act as your target customer and brainstorm the questions they might ask when evaluating a solution.

It’s up to you how many prompts you test, but a good start is at least five to ten queries in each category to get a meaningful baseline.
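Once your sales and support teams have given you the raw ingredients, one way to turn them into a consistent query list is to fill templates programmatically. The sketch below is only an illustration: the templates paraphrase the examples above, and “ExampleMail” and “RivalMail” are hypothetical brand names.

```python
from itertools import product

# Templates modeled on the example queries in this guide.
DISCOVERY_TEMPLATES = [
    "What are the best {category} for {audience}?",
    "Compare the top {category} for {audience}.",
]
BRANDED_TEMPLATES = [
    "How does {brand} compare to {competitor} for {feature}?",
    "Is {brand} worth the price compared to alternatives like {competitor}?",
]


def expand(templates, **options):
    """Fill each template with every combination of the option lists."""
    keys = list(options)
    queries = []
    for template in templates:
        for values in product(*(options[key] for key in keys)):
            # str.format ignores placeholders a template does not use.
            queries.append(template.format(**dict(zip(keys, values))))
    # Drop duplicates while preserving the original order.
    return list(dict.fromkeys(queries))


discovery = expand(
    DISCOVERY_TEMPLATES,
    category=["email marketing platforms"],
    audience=["small businesses", "nonprofits"],
)
branded = expand(
    BRANDED_TEMPLATES,
    brand=["ExampleMail"],          # hypothetical brand
    competitor=["RivalMail"],       # hypothetical competitor
    feature=["automation", "deliverability"],
)
print(len(discovery), len(branded))
```

Treat the generated list as a draft and rewrite anything that does not sound like a real customer question.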

Step 3: The “Wax On, Wax Off” (Manual Data Collection)

Now comes the manual, but critical, part. Once you have your spreadsheet and your list of queries, it’s time to collect the responses.

For every single query on your list, follow this exact process:

Open ChatGPT, Google (Incognito), Claude, and Perplexity in their isolated modes, each in a new tab.
Copy your first query and paste it into all four platforms.
Let each one generate its full response.
Carefully copy and paste the entire text output from each platform into the corresponding cell in your spreadsheet.
Close the tabs, and repeat the process for the next query.

But don’t just blindly copy and paste. This is where your intuition-building begins. As you gather the data, take a few extra seconds to analyze.

On ChatGPT, click the “thinking” button (it may have a different name) to study how it retrieved its answer. What search queries did it run on its end?
On Perplexity, hover your mouse over each source referenced in the answer to see which page it points to.

Inevitably, you will start seeing patterns. You’ll notice the same one or two websites being referenced multiple times for “best of” queries. You’ll see a specific competitor mentioned for a key feature. Make a “notes” column in your spreadsheet and write these observations down. This manual exercise is what separates a data collector from an analyst.

Step 4: Scaling Your Insights with AI Analysis

When you have completed your entire spreadsheet, this is where the magic happens. Export it as a CSV file, and you now possess a valuable, proprietary set of raw data. You can analyze this manually, which is a useful exercise. But to get powerful insights at scale, you can feed this CSV back into a large language model.

Platforms like Claude are excellent for this, as they have large context windows that can handle entire file uploads.

Here are the four key metrics you want to surface. Attach your CSV to your prompt and ask the AI to analyze it for the following:

Metric 1: Mention Volume

This metric tracks how frequently your brand and your competitors appear across all AI responses. This is your baseline AI visibility. The key here is to have the AI isolate your unbranded discovery queries for this analysis, since your branded queries will obviously mention your brand. These raw counts on discovery queries are also the input for your “Competitive Share of Voice” (Metric 4).
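To sanity-check the AI’s counts, you can compute the same metric with a few lines of code. The sketch below counts, for each brand, how many discovery responses mention it at least once. The inline sample data, the “Query Type” column, and the brand names are hypothetical stand-ins for your own export.

```python
import csv
import io
import re

# Hypothetical sample standing in for your exported audit CSV.
SAMPLE_CSV = """Queries,Query Type,ChatGPT,Perplexity
best email tools?,discovery,ExampleMail and RivalMail are popular.,RivalMail is a common pick.
pros and cons of ExampleMail?,branded,ExampleMail is easy to use.,ExampleMail has limited templates.
"""


def mention_volume(csv_text, brands, assistants):
    """Count discovery responses that mention each brand at least once."""
    counts = {brand: 0 for brand in brands}
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Skip branded queries, which mention the brand by construction.
        if row["Query Type"].strip().lower() != "discovery":
            continue
        for assistant in assistants:
            response = row.get(assistant) or ""
            for brand in brands:
                # Whole-word, case-insensitive match on the brand name.
                pattern = r"\b" + re.escape(brand) + r"\b"
                if re.search(pattern, response, re.IGNORECASE):
                    counts[brand] += 1
    return counts


volume = mention_volume(
    SAMPLE_CSV,
    brands=["ExampleMail", "RivalMail"],
    assistants=["ChatGPT", "Perplexity"],
)
print(volume)
```

Counting one mention per response, rather than every occurrence of the name, keeps a single long answer from inflating a brand’s total; switch to counting occurrences if that suits your reporting better.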

Metric 2: Sentiment Analysis

This evaluates whether the AI’s mentions of your brand are positive, negative, or neutral. This helps you understand not just if you’re visible, but what your brand perception is. You’ll want to include all queries (branded and unbranded) for this. Ask the AI to not only categorize the sentiment but also to reveal the context behind it (e.g., “Positive for features, Negative for pricing, Neutral on support”).

Metric 3: Recurring Source Analysis

This is arguably the most actionable insight you will get. Ask the AI to identify all the source URLs cited across all responses and rank them by how many times they were cited. This experiment will reveal the specific websites, review sites, and publications that the AI assistants trust over and over. These represent high-authority sources that you must either target for coverage (e.g., get your product reviewed on TechRadar or G2 if they are cited often) or compete against by creating superior content.
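A rough way to cross-check this ranking yourself is to pull every URL out of the collected responses and tally the domains. The sketch below assumes the citations appear as plain http(s) links in the pasted text; the sample responses and URL paths are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches plain http(s) links; stops at whitespace and common brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")


def rank_sources(responses):
    """Rank domains by the number of responses that cite them."""
    domains = Counter()
    for text in responses:
        seen = set()  # count each domain once per response
        for url in URL_PATTERN.findall(text):
            cleaned = url.rstrip(".,;:'\"")  # drop trailing punctuation
            host = urlparse(cleaned).hostname or ""
            if host.startswith("www."):
                host = host[4:]
            if host:
                seen.add(host)
        domains.update(seen)
    return domains.most_common()


# Hypothetical pasted responses; the paths are made up for illustration.
responses = [
    "Top picks (https://www.g2.com/categories/email). See https://www.techradar.com/best/email.",
    "Sources: https://g2.com/products/examplemail/reviews, https://example.org/blog/post",
    "According to https://www.g2.com/categories/email and https://www.g2.com/compare",
]
ranked = rank_sources(responses)
print(ranked)
```

If a platform shows its sources only as cards or footnote numbers, you will need to paste the underlying links into the cell for this kind of tally to work.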

Metric 4: Competitive Share of Voice

This metric provides the full context for your performance by identifying all brands mentioned (yours and competitors’) and calculating each brand’s total mentions as a percentage of all brand mentions. This will highlight your biggest competitive threats and show you exactly where you can gain share, either by platform or by query type.
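The arithmetic behind this metric is simple enough to verify by hand or with a short script. The sketch below turns raw mention counts into percentages; the brand names and counts are hypothetical.

```python
def share_of_voice(mention_counts):
    """Return each brand's mentions as a percentage of all brand mentions."""
    total = sum(mention_counts.values())
    if total == 0:
        # No mentions at all: avoid dividing by zero.
        return {brand: 0.0 for brand in mention_counts}
    return {
        brand: round(100 * count / total, 1)
        for brand, count in mention_counts.items()
    }


# Hypothetical counts from discovery queries across all platforms.
counts = {"ExampleMail": 6, "RivalMail": 9, "OtherMail": 5}
shares = share_of_voice(counts)
print(shares)  # {'ExampleMail': 30.0, 'RivalMail': 45.0, 'OtherMail': 25.0}
```

Run it once per platform or per query type, using only the matching rows, to see where your share is weakest.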

When using AI for data analysis, always remember the “Trust, but Verify” rule. Spot-check the counts against your spreadsheet yourself before presenting them to stakeholders. LLMs can miscount or misunderstand. Use the AI’s output as a powerful, time-saving first pass, not as infallible gospel.

Step 5: From Data to a 90-Day Action Plan

Once you have these raw insights, you can use AI to format them as a step-by-step action plan. In the same chat where you uploaded your CSV and got your analysis, you can now prompt the AI to create a prioritized 90-day action plan with specific tactics to improve your AI search performance.

You’ll definitely want to edit this before showing it to anyone, but it provides a fantastic starting point. For example:

Insight: The AI analysis reveals that TechRadar.com and G2.com are cited in 75% of “best [your product category]” queries.
Action Plan Item: “Week 1: Identify and contact the editors at TechRadar and G2 responsible for our category. Begin campaign to get [Your Brand] included in their next review update.”
Insight: Sentiment analysis shows [Competitor Brand] is consistently mentioned for its “Slack integration,” while [Your Brand] is not, even though your integration is superior.
Action Plan Item: “Week 3: Publish a new, in-depth blog post and technical guide titled ‘The Ultimate Guide to [Your Brand]’s Slack Integration’ and create a comparison page for ‘[Your Brand] vs. [Competitor] for Slack.’”

Once you’ve taken action on your plan, wait a month or two and rerun this entire process. Use the same prompts and the same assistants, and track the same metrics to see if your tasks are actually making an impact.
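When you rerun the audit, comparing the two sets of numbers side by side makes the change easy to read. The sketch below subtracts baseline share-of-voice percentages from the rerun’s; all figures and brand names are hypothetical.

```python
def compare_audits(baseline, rerun):
    """Return the percentage-point change per brand between two audits."""
    brands = sorted(set(baseline) | set(rerun))
    return {
        # A brand missing from one audit is treated as 0% in that audit.
        brand: round(rerun.get(brand, 0.0) - baseline.get(brand, 0.0), 1)
        for brand in brands
    }


# Hypothetical share-of-voice percentages from two audit runs.
baseline = {"ExampleMail": 30.0, "RivalMail": 45.0, "OtherMail": 25.0}
rerun = {"ExampleMail": 36.0, "RivalMail": 42.0, "OtherMail": 22.0}
changes = compare_audits(baseline, rerun)
print(changes)  # {'ExampleMail': 6.0, 'OtherMail': -3.0, 'RivalMail': -3.0}
```

Because AI answers fluctuate from run to run, read small shifts with caution and look for changes that persist across several reruns.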

Conclusion: Mastering the Foundation of AI Search

This manual audit process might seem tedious in an age of automation, but it is the true foundation for understanding how AI search really works. It’s the “wax on, wax off” that builds the muscle memory and intuition you need.

Once you’ve mastered this systematic approach, you’ll have the knowledge to properly evaluate any AI tracking tool that comes to market. More importantly, you’ll understand exactly what levers to pull to improve your visibility.

Whether you’re a solopreneur trying to get mentioned alongside industry giants or an established company looking to defend your market position in the AEO era, this audit framework gives you the data you need to stop guessing and start making informed, strategic decisions. The “black box” isn’t so scary when you’re the one who has learned how to pick the lock.