Financial Services, Customer Service

How Vanguard Uses Pinecone to Boost Customer Support with 12% More Accurate Responses

Vanguard partnered with Pinecone to build Agent Assist, an internal RAG-powered AI chat tool that helps customer support representatives find answers faster and more accurately. After replacing keyword-based search with hybrid vector retrieval, Vanguard measured over 12% more accurate search results than dense-only retrieval and reduced call times, even during high-demand periods like tax season.

Impact

12%: Search result accuracy improvement
Reduced: Customer call times
Reduced: Operational overhead during peak seasons

Challenge

Vanguard's customer support teams relied on keyword-based search that returned links to lengthy documents, forcing agents to hunt manually for answers. This drove up call times, lowered customer satisfaction, and required costly seasonal hiring surges. The team needed a scalable, real-time retrieval solution capable of handling a highly dynamic financial document dataset.

Solution

Vanguard's CAI team built Agent Assist, an internal RAG-powered chat assistant that uses Pinecone Serverless as its vector database. The system combines BM25 sparse embeddings with dense embeddings for hybrid retrieval and uses metadata filtering so that agents are served only current documents.

What Leaders Say

One of the reasons we chose Pinecone beyond functionality is because Pinecone was willing to work with Vanguard, specifically to meet our security control and performance requirements by creating a dedicated AWS account and cluster for us.

Hung Pham, ML Engineer at Vanguard

Full Story

Vanguard, one of the world's largest investment management firms, has long prioritized delivering exceptional client experiences — including responsive, knowledgeable customer support. With millions of clients relying on Vanguard for retirement planning, investments, and financial advice, the quality and speed of support interactions carry real financial consequences. The company's Center for Analytics and Insights (CAI) team, operating within the Chief Data Analytics office, was tasked with modernizing how customer service representatives access information during live calls.

The core challenge was a retrieval problem. Vanguard's support teams were using keyword-based search to locate relevant financial documents, but this approach only surfaced links to lengthy source files — leaving agents to manually sift through dense content to find specific answers. This inefficiency drove up call times and eroded customer satisfaction. During peak periods like tax season, Vanguard's traditional workaround was to hire additional representatives to absorb the volume, adding significant operational cost without addressing the root cause.

To move beyond keyword search, the CAI team first experimented with JSON-based storage and cosine-similarity retrieval. These early solutions proved too slow, struggled to scale with growing datasets, and frequently returned results that lacked contextual relevance. The team then evaluated a range of vector database options, including pgvector, Faiss, and Redis, before selecting Pinecone. Key decision factors included Pinecone's support for hybrid search (combining BM25 sparse embeddings with dense embeddings), real-time indexing, advanced metadata filtering for compliance, and enterprise-grade security features such as AWS PrivateLink. Pinecone also worked directly with Vanguard to provision a dedicated AWS account and cluster tailored to their security and performance requirements.
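To illustrate why the early approach struggled to scale, here is a minimal sketch of brute-force cosine-similarity retrieval over embeddings kept in a JSON document. It is not Vanguard's code: the record layout, the document ids, the three-dimensional vectors, and the function names are all assumptions made for illustration. The point it shows is that every query has to score every stored vector, so latency grows linearly with the size of the dataset.

```python
import json
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def brute_force_search(query, records, top_k=2):
    """Score the query against every record: O(N) work per query."""
    scored = [(cosine_similarity(query, r["embedding"]), r["id"]) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


# Hypothetical JSON store; real embeddings would have hundreds of dimensions.
STORE = json.loads("""
[
  {"id": "ira-rollover", "embedding": [0.9, 0.1, 0.0]},
  {"id": "tax-forms",    "embedding": [0.1, 0.9, 0.1]},
  {"id": "fund-fees",    "embedding": [0.0, 0.2, 0.9]}
]
""")

results = brute_force_search([1.0, 0.2, 0.0], STORE)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

A vector database replaces the full scan with an index, which is the scaling property the team was looking for when it evaluated pgvector, Faiss, Redis, and Pinecone.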

The resulting system, called Agent Assist, is an internal RAG-powered chat assistant built on top of Pinecone Serverless. Financial documents stored as HTML pages are scraped, preprocessed with a custom chunking strategy, and encoded into both dense and sparse embeddings, with the sparse embeddings produced by a BM25 model fit in-house. Hybrid retrieval is configured with an alpha value of 0.5, which weights the dense and sparse scores equally so that semantic matches and exact matches on domain-specific financial terminology both contribute to ranking. To ensure agents always access current information, documents are tagged daily as "live" or "stale" in their metadata, and queries filter on that tag. Outdated documents are archived to DynamoDB for regulatory compliance.
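The alpha weighting and the live/stale filter can be sketched in a few lines of plain Python. This is an in-memory illustration under stated assumptions, not Agent Assist's implementation and not the Pinecone client API: the `Chunk` class, the `status` field name, the document ids, and the toy vectors are hypothetical. It assumes the common convention in which alpha scales the dense score and (1 - alpha) scales the sparse score, so alpha = 0.5 weights both equally.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    dense: list[float]        # dense embedding (toy 2-d vector here)
    sparse: dict[int, float]  # sparse BM25-style weights keyed by term index
    status: str               # hypothetical metadata tag: "live" or "stale"


def dense_score(query, doc):
    """Dot product of two dense vectors."""
    return sum(q * d for q, d in zip(query, doc))


def sparse_score(query, doc):
    """Dot product over the term indices the two sparse vectors share."""
    return sum(weight * doc[idx] for idx, weight in query.items() if idx in doc)


def hybrid_search(query_dense, query_sparse, chunks, alpha=0.5, top_k=3):
    """Rank live chunks by alpha * dense + (1 - alpha) * sparse."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    live = [c for c in chunks if c.status == "live"]  # metadata filter
    scored = [
        (alpha * dense_score(query_dense, c.dense)
         + (1.0 - alpha) * sparse_score(query_sparse, c.sparse), c.doc_id)
        for c in live
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:top_k]]


# Hypothetical corpus: the 2019 document is tagged stale and must not surface.
CHUNKS = [
    Chunk("rmd-rules-2024", [0.8, 0.2], {7: 1.2, 42: 0.5}, "live"),
    Chunk("rmd-rules-2019", [0.9, 0.1], {7: 1.5, 42: 0.6}, "stale"),
    Chunk("wire-transfer", [0.1, 0.9], {13: 0.9}, "live"),
]

results = hybrid_search([1.0, 0.0], {7: 1.0}, CHUNKS, alpha=0.5)
for doc_id, score in results:
    print(f"{doc_id}: {score:.2f}")
```

In this toy corpus the stale 2019 document would score highest on both signals, yet it never appears because the filter runs before ranking. In a Pinecone query the same restriction would be expressed as a metadata filter such as `{"status": {"$eq": "live"}}`, where the field name is again an assumption.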

Since deploying Agent Assist, Vanguard has seen measurable gains across accuracy, efficiency, and compliance. Hybrid retrieval improved search result accuracy by over 12% compared to dense-only retrieval. Call times dropped as agents could surface precise answers in real time, and the team no longer needs to scale headcount during peak seasons to manage volume. Metadata tagging also introduced stronger audit traceability, supporting Vanguard's compliance obligations. Looking ahead, Vanguard plans to expand its use of RAG and Contextual-Aware Generation (CAG) systems, with Pinecone serving as a foundational layer in its broader AI knowledge ecosystem.
