How Notion Powers Precise AI Search for Millions with Cohere Rerank

Notion, the connected workspace platform used by millions worldwide, integrated Cohere Rerank into its search pipeline to improve the accuracy of Notion AI search across multilingual enterprise workspaces. Every search and Notion AI interaction now routes through Cohere Rerank, which improves result relevance and removes the cost and complexity of embedding-based retrieval for smaller workspaces.

Impact

Notion AI users reached: Millions

Retrieval precision improvement: From 100,000 to 200 documents

Cost reduction: Eliminated embedding and vector storage for most workspaces

Challenge

Notion needed a search reranker that could improve answer precision across large, multilingual enterprise workspaces without the cost and infrastructure burden of embedding-based retrieval, while supporting diverse global user bases in EMEA and APAC.

Solution

Cohere Rerank was integrated into Notion’s search pipeline via Amazon SageMaker, placed before the generative model to boost result relevance — enabling Notion to skip costly embeddings for smaller workspaces while combining cross-source search results from Slack, GitHub, and other connected tools.

What Leaders Say

Cohere is a key part of what makes Notion AI work. Cohere Rerank gives us both the speed and quality we need, and it’s consistently improving. It’s been essential for getting our AI Connectors out the door quickly.

Simon Last, Cofounder and CTO, Notion

Cohere Rerank is one of the only high quality and fast multilingual rerankers on the market and that’s why we use it. Every single search and Notion AI interaction goes through Cohere Rerank.

Abhishek Modi, Software Engineer, Notion

One big part of the search pipeline is precision. With Cohere Rerank, we no longer have to worry about it and can focus on other parts like recall.

Abhishek Modi, Software Engineer, Notion

Full Story

Notion operates at a scale that makes search infrastructure a core product concern rather than a technical detail. The platform hosts collaborative wikis, project management systems, and connected knowledge bases for companies ranging from small startups to global enterprises — and as those organizations grow, so does the volume of documents, databases, and integrations they manage in Notion. When Notion AI launched, delivering accurate, fast answers to user queries across this heterogeneous body of content became a critical product requirement. The technical challenge was finding a reranking solution that could match the precision users expected from AI-native features.

Before integrating Cohere Rerank, Notion’s search pipeline faced a structural tension: embedding models were expensive to run at scale and required vector storage infrastructure, but simpler retrieval methods lacked the precision needed to surface the most relevant documents from large workspaces. The challenge was compounded by Notion’s global user base — more than half of whom work with multilingual content in EMEA and APAC markets — requiring a solution that could operate accurately across languages without separate models or additional engineering overhead.

Notion integrated Cohere Rerank directly into the search pipeline, so that retrieved results are reranked before the generative model processes a user query. The implementation used Amazon SageMaker for auto-scaling, allowing Notion to handle variable traffic loads without over-provisioning. One design choice was particularly important: for workspaces with fewer than 1,000 documents, which make up the majority of Notion’s customer base, Cohere Rerank allows Notion to bypass traditional embedding and vector search entirely, reducing both cost and system complexity while maintaining high answer quality. The same Rerank model also handles cross-source result combination, merging outputs from Slack, GitHub, and other connected tools into coherent, ranked answers.
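The routing described above can be sketched roughly as follows. This is a minimal illustration and not Notion's implementation: the names `Doc`, `overlap_scorer`, `select_candidates`, `search`, and `SMALL_WORKSPACE_LIMIT` are hypothetical, the `vector_search` callable is an assumed interface for whatever retrieval a large workspace would use, and the toy word-overlap scorer merely stands in for a call to a hosted reranker.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Assumption: threshold taken from the "fewer than 1,000 documents" figure in the text.
SMALL_WORKSPACE_LIMIT = 1000


@dataclass(frozen=True)
class Doc:
    source: str  # e.g. "notion", "slack", "github"
    text: str


def overlap_scorer(query: str, text: str) -> float:
    """Toy stand-in for a reranker: fraction of query words found in the text."""
    query_words = set(query.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & set(text.lower().split())) / len(query_words)


def select_candidates(
    workspace_docs: Sequence[Doc],
    vector_search: Callable[[str, int], List[Doc]],
    query: str,
    pool_size: int,
) -> List[Doc]:
    """Small workspaces skip vector search; every document becomes a candidate."""
    if len(workspace_docs) < SMALL_WORKSPACE_LIMIT:
        return list(workspace_docs)
    return vector_search(query, pool_size)


def search(
    query: str,
    workspace_docs: Sequence[Doc],
    connector_results: Sequence[Doc],
    vector_search: Callable[[str, int], List[Doc]],
    scorer: Callable[[str, str], float] = overlap_scorer,
    top_n: int = 3,
    pool_size: int = 200,
) -> List[Doc]:
    """Pool workspace and connector results, then rank them with one scorer."""
    candidates = select_candidates(workspace_docs, vector_search, query, pool_size)
    candidates = candidates + list(connector_results)
    ranked = sorted(candidates, key=lambda d: scorer(query, d.text), reverse=True)
    return ranked[:top_n]
```

In a pipeline shaped like this, the list returned by `search` is what would be handed to the generative model as context. Because one scorer ranks every candidate, results from different sources are compared on the same scale.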

The precision improvement over embedding models has been substantial. As software engineer Abhishek Modi explained, Cohere Rerank is dramatically more accurate than embedding approaches at narrowing a large document set to its most relevant subset, reducing a retrieval pool of 100,000 documents to 200 with much higher fidelity. Cofounder and CTO Simon Last described Rerank as essential to getting Notion AI Connectors out the door quickly, citing its speed, quality, and ongoing improvement. As a result, millions of Notion users have engaged with Notion AI features, with rapid month-on-month growth in usage and a meaningful contribution to company revenue.
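To make the 100,000-to-200 narrowing concrete, here is a minimal sketch of keeping a fixed-size shortlist while scoring a large pool in batches. It does not describe how Notion or Cohere do this. The names `batched`, `shortlist`, and `toy_score_batch` are hypothetical, the batch size of 1,000 is an arbitrary assumption, and the word-count scorer is only a placeholder for a real relevance model.

```python
import heapq
from typing import Callable, Iterable, Iterator, List, Tuple


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def toy_score_batch(query: str, batch: List[str]) -> List[float]:
    """Placeholder scorer: counts occurrences of query words in each document."""
    words = query.lower().split()
    return [float(sum(doc.lower().split().count(w) for w in words)) for doc in batch]


def shortlist(
    query: str,
    documents: Iterable[str],
    score_batch: Callable[[str, List[str]], List[float]],
    keep: int = 200,
    batch_size: int = 1000,  # assumption: arbitrary batch size for illustration
) -> List[Tuple[float, str]]:
    """Return the `keep` highest-scoring (score, document) pairs, best first."""
    # Min-heap of (score, -index, doc); -index makes earlier documents win ties.
    best: List[Tuple[float, int, str]] = []
    index = 0
    for batch in batched(documents, batch_size):
        for doc, score in zip(batch, score_batch(query, batch)):
            entry = (score, -index, doc)
            if len(best) < keep:
                heapq.heappush(best, entry)
            elif entry > best[0]:
                heapq.heappushpop(best, entry)
            index += 1
    return [(score, doc) for score, _, doc in sorted(best, reverse=True)]
```

The heap never holds more than `keep` entries, so memory stays bounded by the shortlist size rather than the pool size.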

Notion’s implementation illustrates how a consumer-grade AI experience depends on enterprise-grade retrieval infrastructure underneath. The decision to partner closely with Cohere — rather than relying on open-source alternatives that would require self-maintenance — reflects a strategic bet that rapid model improvement and collaborative support would be a durable competitive advantage as Notion AI scales.
