How AI Search Is Recommending Clean Makeup Brands

This analysis is based on the source benchmark: Clean Makeup Brands: 2026 AI Market Discovery Index

By Mark Huntley, Founder and CEO

Key Takeaways

AI search is turning clean makeup discovery into a shortlist process, not just a brand-awareness exercise.
Recommendation strength varies by buyer intent, with brands surfacing differently for sensitive skin, acne-prone skin, mature skin, and skincare-first makeup.
e.l.f. had the broadest structured visibility, while Tower 28 captured the highest modeled recommendation value in the dataset.
ILIA showed a narrower specialist profile, and the benchmark suggests that citation-backed product authority matters more than clean positioning alone.

Clean makeup discovery is becoming an AI-generated shortlist market. Buyers are not only asking which brands are “clean.” They are asking which foundation works for acne-prone skin, which concealer is safest for mature skin, which makeup brands are best for sensitive skin, which products combine skincare benefits with coverage, and which brands are actually worth buying.

The LLM Authority Index benchmark shows recommendation power concentrating around Rare Beauty, Kosas, ILIA, Tower 28, and e.l.f. Cosmetics. The category is no longer won by legacy “clean beauty” positioning alone. AI systems appear to reward brands that combine product-specific authority, editorial reinforcement, skincare framing, sensitive-skin trust, and broad cross-platform familiarity.

The structured ILIA Beauty dataset adds a more product-level view. Across the 1,173 tracked observations, e.l.f. Cosmetics had the broadest raw visibility and valid recommendation coverage, Tower 28 captured the highest modeled monthly recommendation value, and ILIA showed a narrower but meaningful specialist profile around clean complexion, sensitive-skin, natural-finish, and skincare-first makeup prompts.

Methodology

Market studied: Clean makeup brands, skincare-infused cosmetics, sensitive-skin beauty, acne-compatible makeup, complexion products, concealers, brow products, mascara, blush, bronzer, lip products, pricing, and brand comparison prompts.
Brands/entities included: ILIA Beauty, Beautycounter, e.l.f. Cosmetics, Glossier, Kosas, Milk Makeup, Rare Beauty, Susan Posnick Cosmetics, Tarte Cosmetics, Thrive Causemetics, and Tower 28. The structured metrics include ten core tracked brands; Susan Posnick appears in the competitor setup but not in the aggregate metric table surfaced in the dataset.
Data collection date/window: May 2026 reporting window. The ILIA Beauty structured extraction was loaded on May 20, 2026.
AI platforms tested: ChatGPT, Gemini, Perplexity, Copilot, Google AI Mode, and Google AI Overviews.
Number of prompts tested: The structured ILIA dataset contains 1,173 platform-prompt observations. The public LLM Authority Index benchmark describes a broader directional model of 20,000 modeled prompt interactions across the clean makeup category.
Prompt categories: The structured dataset uses three clusters: Best Clean Beauty Discovery, Clean Beauty Comparisons, and Clean Beauty Pricing. The public benchmark also discusses 20+ high-intent buying environments, including sensitive skin, acne compatibility, makeup + skincare hybrids, best overall beauty prompts, mature skin, texture-safe makeup, and affordable alternatives.
Definition of a mention: A company counted as mentioned when it appeared in an AI answer, regardless of whether the answer framed it positively, neutrally, comparatively, or as a recommendation.
Definition of a valid recommendation: A valid recommendation required positive, shortlist-quality recommendation framing. Neutral mentions, product references, comparison-anchor appearances, factual mentions, and extraction-failed rows were not treated as recommendation credit unless the dataset marked them as valid recommendations.
Ranking/scoring metrics used: Raw mention presence, valid recommendation coverage, recommended top-three rate, recommended rank-one rate, average recommended rank, positive/neutral/negative visibility, net sentiment score by mentions, citation/source patterns, and modeled monthly captured recommendation value. Modeled value is a benchmark estimate, not realized revenue.
Limitations: This is a point-in-time benchmark. AI outputs vary across prompts, platforms, product categories, skin concerns, retailer availability, source retrieval, and time. The structured dataset includes 225 extraction-failed fallback observations, about 19.2% of the file. It also includes many product-level prompts that are broader than clean makeup alone, so this analysis gives priority to brand-level patterns, tracked competitors, and the public benchmark’s category interpretation. No Ahrefs export was supplied, so this draft does not make organic traffic, keyword ranking, DR, UR, or backlink claims.
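
The mention and valid-recommendation definitions above can be made concrete with a short sketch. The Python below is illustrative only: the `Observation` fields and the `summarize` helper are hypothetical names, not the benchmark's actual schema or pipeline. It assumes every rate uses the full set of observations as its denominator, which is consistent with the percentages reported in this analysis (for example, 277 of 1,173).

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Observation:
    """One platform-prompt observation for a single brand (hypothetical schema)."""
    mentioned: bool                         # brand appeared anywhere in the AI answer
    valid_recommendation: bool              # marked as positive, shortlist-quality framing
    recommended_rank: Optional[int] = None  # 1-based position, only for valid recommendations


def summarize(observations: Sequence[Observation]) -> dict:
    """Tally benchmark-style metrics for one brand.

    Assumption: every rate is divided by the total number of observations,
    including rows where the brand never appeared.
    """
    total = len(observations)
    if total == 0:
        raise ValueError("at least one observation is required")

    mentions = sum(o.mentioned for o in observations)
    recommended = [o for o in observations if o.valid_recommendation]
    ranks = [o.recommended_rank for o in recommended if o.recommended_rank is not None]

    def pct(count: int) -> float:
        return round(100 * count / total, 2)

    return {
        "raw_mention_presence_pct": pct(mentions),
        "valid_recommendation_coverage_pct": pct(len(recommended)),
        "recommended_top_three_pct": pct(sum(r <= 3 for r in ranks)),
        "recommended_rank_one_pct": pct(sum(r == 1 for r in ranks)),
        "average_recommended_rank": round(sum(ranks) / len(ranks), 2) if ranks else None,
    }


# Toy data: ten observations, four mentions, two of them valid recommendations.
sample = (
    [Observation(True, True, 1), Observation(True, True, 4)]
    + [Observation(True, False)] * 2   # neutral or comparison-anchor mentions
    + [Observation(False, False)] * 6  # brand absent from the answer
)
summary = summarize(sample)
```

In this toy sample the brand is mentioned in four of ten answers but earns recommendation credit in only two, which is the distinction the two definitions are meant to draw.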

Key findings

e.l.f. Cosmetics had the broadest structured visibility and recommendation coverage. Across 1,173 observations, e.l.f. Cosmetics appeared in 277 observations, a 23.61% raw mention presence rate, and received 169 valid recommendations, or 14.41% valid recommendation coverage. It also had the highest top-three count at 126 and the highest rank-one count at 92 in the structured dataset.

Tower 28 captured the highest modeled recommendation value. Tower 28 had 164 raw mentions, 90 valid recommendations, a 5.80% top-three rate, and $108,125.39 in modeled monthly captured recommendation value. Its value-weighted strength was especially concentrated in comparison prompts, where it captured more modeled value than any other tracked brand.

Kosas was the strongest “makeup + skincare hybrid” contender by value and recommendation depth. Kosas had 184 raw mentions, 109 valid recommendations, a 6.65% top-three rate, and $99,258.83 in modeled monthly captured recommendation value. The public benchmark also identifies Kosas as especially strong in mature-skin, hydration, concealer, corrector, brow, and skincare-infused makeup prompts.

Rare Beauty had broad cross-category visibility. Rare Beauty had 264 raw mentions, 153 valid recommendations, and a 13.04% valid recommendation coverage rate. The public benchmark describes Rare Beauty as unusually resilient across blush, highlighter, lip oils, under-eye brighteners, brow products, and complexion prompts.

ILIA showed a narrower but strategically important specialist profile. ILIA Beauty had 94 raw mentions, 58 valid recommendations, a 3.84% top-three rate, a 2.98% rank-one rate, and $55,308.58 in modeled monthly captured recommendation value. Its strongest role was not broad mass-market dominance; it was contextual relevance in clean complexion, sensitive-skin, natural-finish, and skincare-first makeup environments.
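
The rates in these findings follow directly from the reported counts. The short Python check below reproduces them, assuming (as the figures imply) that each rate is the count divided by all 1,173 observations. The dictionary layout and the `rate` helper are illustrative names; the counts themselves are taken from the findings above.

```python
TOTAL_OBSERVATIONS = 1173

# (raw mentions, valid recommendations) as reported in the key findings
counts = {
    "e.l.f. Cosmetics": (277, 169),
    "Rare Beauty": (264, 153),
    "Kosas": (184, 109),
    "Tower 28": (164, 90),
    "ILIA Beauty": (94, 58),
}


def rate(count: int, total: int = TOTAL_OBSERVATIONS) -> float:
    """Share of all observations, as a percentage rounded to two decimals."""
    return round(100 * count / total, 2)


for brand, (mentions, valid) in counts.items():
    print(f"{brand}: mention presence {rate(mentions)}%, "
          f"valid recommendation coverage {rate(valid)}%")

elf_mention_rate = rate(277)
elf_coverage = rate(169)
rare_coverage = rate(153)
```

For e.l.f. Cosmetics this yields roughly 23.61% and 14.41%, and for Rare Beauty roughly 13.04% coverage, matching the figures stated above.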

What changed in the market

Clean makeup used to be discovered through Sephora, Ulta, influencers, TikTok, YouTube, beauty editors, retailer merchandising, and search rankings. Those channels still matter. But AI systems now sit directly inside product consideration.

A buyer asking “best foundation for acne-prone skin” is not simply researching ingredients. A buyer asking “best concealer for fine lines” is already narrowing the product set. A buyer asking “which clean beauty brands are worth it?” is asking AI to filter trust, performance, price, and product fit into a shortlist.

That changes the category.

In traditional beauty discovery, a brand could win through awareness, retail placement, creator buzz, or strong product pages. In AI-led discovery, the brand also needs to be retrievable, comparable, well-cited, and easy for AI systems to justify recommending.

The public benchmark describes this as a shift from visibility to shortlist advancement. Being present in an AI answer is no longer enough. The commercial question is whether the brand is advanced into the recommendation set when the buyer is ready to choose.

What the benchmark found

The benchmark found a category where recommendation power is concentrating around brands with clear product roles.

Rare Beauty appears to own broad consumer familiarity and natural-looking makeup authority. It repeatedly surfaces across product categories where AI systems reward lightweight, easy-to-wear, emotionally familiar, and broadly accessible beauty framing.

Kosas appears to own skincare-infused makeup relevance. Its strongest AI advantage is semantic clarity: “makeup + skincare hybrid.” That makes the brand easier for AI systems to retrieve in prompts about mature skin, hydration, concealer, under-eye products, brow products, and texture-safe makeup.

ILIA appears to be a specialist leader in clean complexion and sensitive-skin contexts. The public benchmark identifies ILIA’s strongest signals around clean, non-comedogenic, sensitive skin, natural finish, and serum foundation prompts. The structured dataset supports that view: ILIA does not dominate broad raw visibility, but it remains recommendation-relevant in trust-oriented clean beauty moments.

Tower 28 appears to be a safety-oriented specialist with strong value-weighted performance. It repeatedly appears in acne-safe, sensitive-skin, serum concealer, lightweight complexion, and cream bronzer contexts. In the structured metrics, Tower 28’s modeled recommendation value was the highest among the tracked brands.

e.l.f. Cosmetics appears to be the value disruptor. e.l.f. is not only present in budget beauty prompts. It appears alongside prestige and clean-adjacent brands in primers, lip products, brow products, eye brighteners, complexion recommendations, and pricing contexts. Its high visibility and recommendation coverage suggest that affordability, availability, and strong cross-source validation can create meaningful AI recommendation power.

Legacy clean beauty positioning was not enough. Beautycounter had only 4 raw mentions and 2 valid recommendations in the structured dataset. That does not mean the brand lacks consumer awareness, but it does show that legacy clean-beauty association alone may not translate into AI shortlist strength in this prompt set.

Why visibility is not enough

Clean makeup is especially vulnerable to the gap between presence and recommendation power.

A brand can appear in an AI answer because it is known, sold at major retailers, included in editorial roundups, or mentioned in product comparisons. But that does not mean AI systems are selecting it as a preferred recommendation.

The structured dataset shows different types of AI strength. e.l.f. had the broadest raw visibility and recommendation coverage. Tower 28 captured the highest modeled value. Kosas had deep hybrid/skincare relevance. Rare Beauty had broad cross-category familiarity. ILIA had narrower but meaningful specialist authority.

Those are not the same kind of win.

For clean makeup brands, the operating question is no longer “Do AI systems know us?” It is “Which buyer moments do AI systems trust us to answer?”

A brand may win “best affordable primer” but lose “best clean foundation for acne-prone skin.” A brand may win “best concealer for mature skin” but lose “best overall clean makeup brand.” A brand may be visible in blush or lip products but absent from sensitive-skin complexion prompts.

AI discovery is product-specific, prompt-specific, and source-dependent.
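
One way to quantify the gap between presence and recommendation is to divide valid recommendations by raw mentions. This ratio is not a metric reported in the benchmark; it is a derived illustration computed here from the structured counts cited in this analysis, and the function name is hypothetical.

```python
# (raw mentions, valid recommendations) from the structured dataset counts
structured_counts = {
    "e.l.f. Cosmetics": (277, 169),
    "Rare Beauty": (264, 153),
    "Kosas": (184, 109),
    "Tower 28": (164, 90),
    "ILIA Beauty": (94, 58),
    "Beautycounter": (4, 2),
}


def mention_to_recommendation_pct(mentions: int, valid: int) -> float:
    """Percentage of a brand's mentions that earned valid-recommendation credit."""
    if mentions <= 0:
        raise ValueError("mentions must be positive")
    return round(100 * valid / mentions, 1)


conversion = {
    brand: mention_to_recommendation_pct(m, v)
    for brand, (m, v) in structured_counts.items()
}

# Sort from highest to lowest share of mentions converted into recommendations.
for brand, pct in sorted(conversion.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{brand}: {pct}% of mentions were valid recommendations")
```

On these counts, every tracked brand converts between roughly half and just over 60% of its mentions into valid recommendations, and ILIA's smaller mention base converts at about the same rate as e.l.f.'s much larger one. With only four mentions, the Beautycounter ratio is too thin to read much into.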

The citation layer

The citation layer is shaping which clean makeup brands AI systems trust enough to recommend.

The public benchmark identifies repeated source environments such as Vogue, Allure, Marie Claire, Ulta, Sephora, Good Housekeeping, Forbes, and beauty review ecosystems. These sources reinforce product-specific authority, comparison framing, skin-sensitivity narratives, and broad beauty consensus.

The structured dataset supports that pattern. Frequently cited domains included Sephora, Allure, Vogue, Reddit, Ulta, Byrdie, Who What Wear, InStyle, YouTube, Cosmopolitan, Rank & Style, Forbes, Healthline, Marie Claire, Good Housekeeping, NewBeauty, PureWow, Tower 28, Rare Beauty, Target, Amazon, and TikTok.

This matters because AI systems do not appear to rely only on brand websites. They synthesize editorial lists, retailer pages, product reviews, community sentiment, social/video content, dermatologist-adjacent framing, and comparison-style beauty content.

Citation frequency is not endorsement. But citation patterns show the public evidence layer that AI systems can retrieve and summarize when constructing clean makeup recommendations.

For beauty brands, the implication is direct: product pages alone are not enough. The brand needs credible, repeated, third-party evidence around the exact product use cases it wants to own.

What brands need to fix

Clean makeup brands need stronger recommendation-stage architecture.

First, brands need clearer product-use-case ownership. “Clean makeup” is too broad. AI systems are segmenting by acne-prone skin, sensitive skin, mature skin, serum foundation, concealer, brow products, tubing mascara, blush, lip oil, primer, SPF complexion, and affordable alternatives.

Second, brands need stronger third-party reinforcement. Editorial beauty publications, retailer pages, dermatologist-adjacent content, Reddit/community discussions, YouTube reviews, product awards, and comparison roundups all appear to shape AI recommendation confidence.

Third, brands need to connect product authority back to brand authority. AI answers often recommend specific products rather than parent brands. A clean makeup brand can lose brand-level credit if hero products are not consistently tied to the brand entity across public sources.

Fourth, brands need to watch affordability and value framing. e.l.f.’s structured performance shows that value brands can win AI recommendation moments when they are well-supported by public evidence. Premium clean brands need sharper justification for why the product is worth the price.

Finally, brands need to separate clean positioning from safe recommendation proof. “Clean” is not always enough. AI systems appear to reward more specific evidence: non-comedogenic claims, sensitive-skin suitability, fragrance concerns, dermatologist compatibility, hydration, texture safety, and performance under makeup-wear conditions.

How CiteWorks Studio helps

Map AI recommendation visibility. Track prompts, platforms, company presence, valid recommendations, top-three and rank-one performance, framing, and citation sources.
Identify the sources shaping AI answers. Find the editorial, review, forum, government, directory, owned, and search-visible sources that influence brand framing.
Build the citation architecture plan. Strengthen the public evidence layer so AI systems have more accurate, consistent, and persuasive source material to synthesize.

Commercial takeaway

Clean makeup discovery is moving from brand awareness into AI-mediated product selection.

Rare Beauty, Kosas, ILIA, Tower 28, and e.l.f. Cosmetics appear to be the key directional leaders in the public benchmark. The structured ILIA dataset adds a more granular view: e.l.f. had the broadest measured visibility and recommendation coverage, Tower 28 captured the highest modeled recommendation value, Kosas showed strong skincare-hybrid depth, Rare Beauty maintained broad category relevance, and ILIA held a specialist role in clean, sensitive-skin, natural-finish, and skincare-first prompts.

The next competitive advantage in clean makeup will not come from “clean” claims alone. It will come from source-backed recommendation confidence across the product moments buyers actually ask AI systems to resolve.

For beauty brands, the work is now citation architecture: strengthening the public evidence layer so AI systems can understand when the brand should be recommended, why it is safe to recommend, and which product use cases it deserves to own.

See Where Your Brand Stands

Want to know how AI systems are recommending your clean makeup brand?

CiteWorks Studio can map where your brand appears, where competitors are recommended instead, which prompts carry the most commercial risk, and which sources are shaping AI-generated beauty shortlists.

Request an AI Visibility Audit or Citation Architecture Review to see how your brand performs across recommendation-stage visibility, clean complexion prompts, sensitive-skin prompts, makeup + skincare hybrid prompts, product comparisons, pricing prompts, and the public evidence layer AI systems use to form clean makeup recommendations.

About The Author

Mark Huntley

Founder and CEO

Mark Huntley, J.D. is founder of CiteWorks Studio, a strategic advisory focused on visibility, authority, and recommendation presence in AI-shaped search environments. His work centers on embedding-level GEO, vector optimization, and cosine gap engineering — helping brands align their digital presence with the retrieval systems that increasingly shape discovery, interpretation, and choice.