How Melange Uses Pinecone to Power 600M-Vector Patent Search

Melange is a patent analytics company that automates prior art discovery for litigation teams by embedding and retrieving hundreds of millions of patent and academic documents. The company replaced a self-hosted Milvus cluster with Pinecone’s serverless vector database after the original deployment began crashing under memory pressure once the corpus grew beyond roughly 40 million records. With Pinecone, Melange now runs a production system spanning more than 600 million vectors, saves an estimated $75,000 per year, and has cut model-to-market cycle time by roughly 10%.

Impact

Vectors stored in production: >600M
Annual cost savings: >$75,000
Model-to-market cycle time reduction: ~10%

Challenge

Melange’s self-hosted Milvus vector database crashed repeatedly under memory pressure once the patent corpus grew beyond roughly 40 million records, making it impossible to serve the full global patent dataset that litigation clients required. Operating an always-on cluster without dedicated infrastructure staff was unsustainable at the scale the business needed to reach.

Solution

Melange replaced its self-hosted Milvus deployment with Pinecone’s serverless vector database, whose slab architecture decouples storage from compute to support hundreds of millions of vectors at low cost. Parquet-based bulk ingestion pipelines let the team test new embedding models and add namespaces without infrastructure work, while maintaining high recall across a corpus that now exceeds 600 million vectors.

What Leaders Say

Our KPI for Pinecone is straightforward: how little time we spend thinking about it. Since we started, we haven’t had a single reliability issue, and that’s exactly what we need as a small team focused on building the product—not maintaining infrastructure.

Joshua Beck, CEO, Melange

Our entire search operation relies on Pinecone as the first step in the process. When we evaluated the developer experience and time saved, Pinecone was far and away the clear winner. Being able to generate files in the cloud and import them seamlessly with guaranteed enterprise support was exactly what we wanted.

Joshua Beck, CEO, Melange

Full Story

Patent litigation is one of the highest-stakes environments in legal services. A single case can cost millions of dollars and turn on whether attorneys can locate a handful of obscure historical documents — buried among hundreds of millions of global patents and billions of academic papers — that establish prior art. Melange was built to solve this problem through large-scale semantic search, automating the most labor-intensive phases of prior art discovery so that litigators receive the precise set of documents their case depends on.

For Melange to deliver on that promise, the company needed an infrastructure foundation capable of indexing and querying at a scale that few applications demand. Its initial approach, a self-hosted Milvus cluster, worked at smaller dataset sizes but became unstable as the corpus grew. Memory bottlenecks caused repeated crashes, the always-on cluster could deliver neither reliable uptime nor serverless-style costs, and maintaining a production-grade vector database without dedicated infrastructure staff consumed engineering time that should have gone to search quality and model development.

When clients began demanding that Melange expand its coverage to the full global patent corpus — roughly 450 million records at the time — the self-hosted solution could not run at all. The team evaluated commercial and open-source alternatives, requiring any solution to offer a serverless option, high recall at scale, and hands-on support. Pinecone’s serverless slab architecture met every criterion. Its decoupled storage and compute model meant Melange could store hundreds of millions of embeddings at low cost while only paying for compute during active queries. The slab design also removed the memory bottlenecks and cluster management overhead that had been blocking their growth.
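To make the memory pressure concrete, here is a back-of-envelope sketch of the raw embedding payload at the corpus sizes mentioned above. The source does not state Melange's embedding dimension or numeric precision, so the 768-dimension float32 figures below are assumptions chosen only for illustration, and the estimate ignores index overhead and metadata.

```python
def raw_vector_gb(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw embedding payload in decimal gigabytes, excluding index overhead."""
    return num_vectors * dim * bytes_per_value / 1e9


# Hypothetical dimension: the source does not say which embedding size is used.
ASSUMED_DIM = 768

# Corpus sizes taken from the case study text.
CORPUS_SIZES = {
    "self-hosted crash threshold": 40_000_000,
    "global patent corpus": 450_000_000,
    "current production": 600_000_000,
}

for label, count in CORPUS_SIZES.items():
    print(f"{label}: {raw_vector_gb(count, ASSUMED_DIM):,.1f} GB")
```

Under these assumptions the raw vectors alone grow from about 123 GB at 40 million records to roughly 1.4 TB at 450 million. That is far more than a small memory-resident cluster can comfortably hold, and it is the kind of growth that makes separating storage from compute attractive.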

Melange migrated to Pinecone and built their ingestion pipeline around Parquet-based bulk imports, enabling them to introduce new embedding models and stand up new namespaces for side-by-side testing without rebuilding indexes or managing cluster configurations. The result was a production system that crossed 600 million vectors across multiple model generations while maintaining the high recall that prior art search requires. For a small team whose value lies in domain expertise rather than infrastructure, the system’s reliability became a competitive differentiator — the search pipeline runs without interruption, and engineers focus on building better models rather than keeping the database running.
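One way to picture the side-by-side testing described above is to score each candidate model's namespace against the same labelled queries. The sketch below computes mean recall@k in plain Python. The namespace names, query IDs, and document IDs are all hypothetical, and the source does not describe how Melange actually measures recall.

```python
from typing import Dict, List, Set


def recall_at_k(retrieved: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(
    results: Dict[str, List[str]], judgments: Dict[str, Set[str]], k: int
) -> float:
    """Average recall@k over every labelled query; a missing query scores 0."""
    scores = [
        recall_at_k(results.get(query_id, []), relevant, k)
        for query_id, relevant in judgments.items()
    ]
    return sum(scores) / len(scores)


# Hypothetical relevance judgments: query ID -> known prior-art document IDs.
judgments = {"q1": {"US-1", "EP-2"}, "q2": {"JP-3"}}

# Hypothetical ranked results, one set per namespace (one namespace per model).
results_by_namespace = {
    "model-v1": {"q1": ["US-1", "US-9", "US-8"], "q2": ["US-7", "JP-3", "US-6"]},
    "model-v2": {"q1": ["US-1", "EP-2", "US-8"], "q2": ["JP-3", "US-7", "US-6"]},
}

for namespace, results in results_by_namespace.items():
    print(namespace, mean_recall_at_k(results, judgments, k=3))
```

Because each model lives in its own namespace, the same query set can be run against both and compared on one metric before any traffic is switched over.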

Melange estimates that Pinecone saves the company approximately $75,000 per year across avoided infrastructure maintenance, reduced operational burden, and less engineering time spent troubleshooting. Model-to-market cycle time has shortened by weeks. The team expects to keep scaling beyond current vector counts, to add new data modalities such as patent drawings, and to expand into ensemble embedding approaches that combine representations from multiple models.
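The source does not say how Melange plans to combine representations from multiple models. As a purely illustrative sketch, one simple option is to L2-normalize each model's embedding, concatenate the results, and rescale. The dot product of two such ensemble vectors then equals the average of the per-model cosine similarities. All vectors below are made-up toy values.

```python
import math
from typing import List, Sequence


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def ensemble_embedding(parts: Sequence[Sequence[float]]) -> List[float]:
    """Concatenate unit-normalized per-model embeddings into one unit vector."""
    scale = 1 / math.sqrt(len(parts))
    combined: List[float] = []
    for part in parts:
        combined.extend(x * scale for x in l2_normalize(part))
    return combined


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy embeddings from two hypothetical models with different dimensions.
doc = ensemble_embedding([[3.0, 4.0], [1.0, 0.0, 0.0]])
query = ensemble_embedding([[4.0, 3.0], [0.0, 1.0, 0.0]])

# Per-model cosines are 0.96 and 0.0, so the ensemble score is their mean.
print(round(dot(doc, query), 4))
```

The trade-off is that the concatenated vector has the combined dimension of all models, so storage grows with every model added to the ensemble.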
