How Assembled Cuts Support Response Time 95% with Pinecone RAG
Assembled is a workforce management and customer support optimization platform serving enterprises like Stripe, Etsy, and DoorDash. To power Assembled Assist, the company built a hybrid RAG pipeline combining Pinecone vector search with Algolia keyword retrieval and LLMs from OpenAI and Anthropic. Support tasks that previously took 40 minutes now complete in 2 minutes—a 95% reduction in handling time.
Impact
~95%
Ticket handling time reduction
2 minutes
Post-AI task completion time
Challenge
Support agents lacked fast access to accurate answers, requiring up to 40 minutes per ticket to search knowledge bases and draft responses manually—a process that couldn’t scale as client support volumes grew.
Solution
Assembled built Assembled Assist, a RAG pipeline powered by Pinecone for semantic vector retrieval and Algolia for keyword search, fused via Reciprocal Rank Fusion and completed by OpenAI and Anthropic LLMs to generate ticket responses in seconds.
Tools & Technologies
What Leaders Say
“Pinecone was a no-brainer for us. We needed to move quickly, and Pinecone was the leader in the vector database space. Its cost-effectiveness and ease of integration have been significant advantages, allowing us to focus on delivering value rather than managing infrastructure. We can test and adjust on the fly, which is crucial for maintaining high search quality and continuously enhancing our support solutions.”
Sign up to read complete case studies, access detailed metrics, and unlock all use cases.
Full Story
Assembled helps enterprise support teams at companies like Stripe, Etsy, and DoorDash run more efficiently, providing tools for workforce management, performance tracking, and ticket resolution. As AI began reshaping customer service expectations, Assembled saw an opportunity to close a persistent gap: support agents were spending too much time searching for accurate answers rather than delivering them.
Before Assembled Assist, agents navigated knowledge bases, past tickets, and product documentation manually before drafting responses. This slow, inconsistent process stretched routine tasks to 40 minutes per ticket. As support volumes grew, so did the inefficiency—agents were drowning in lookup work rather than focusing on the judgment calls that actually required human expertise.
Assembled built Assembled Assist as an AI automation engine that analyzes incoming tickets and generates high-quality contextual responses. The retrieval layer combines Pinecone’s vector database for semantic search with Algolia for keyword matching, fused via Reciprocal Rank Fusion to surface the most relevant results regardless of how a question is phrased. OpenAI and Anthropic LLMs then generate the response. The team chose a RAG architecture over fine-tuning, prioritizing prompt flexibility and rapid iteration as foundation models continue to improve.
The impact was stark: tasks that took 40 minutes now complete in 2 minutes, a roughly 95% reduction in handling time. Agents can process more tickets with greater consistency, and Assembled’s engineering team focuses on prompt quality and data curation rather than model infrastructure.
Assembled’s architecture reflects where enterprise AI is heading—composing specialized components rather than training from scratch. As foundation models improve, their bet on flexible, prompt-driven RAG over fine-tuned models positions Assembled to absorb those gains automatically, compounding the value delivered to every support team on the platform.