How Melange Uses Pinecone to Power 600M-Vector Patent Search
Melange is a patent analytics company that automates prior art discovery for litigation teams by embedding and retrieving hundreds of millions of patent and academic documents. The company replaced a self-hosted Milvus cluster with Pinecone’s serverless vector database after the original deployment began crashing under memory pressure once the corpus passed roughly 40 million records. With Pinecone, Melange now runs a production system spanning more than 600 million vectors, saving an estimated $75,000 per year and cutting model-to-market cycle time by roughly 10%.
Impact
>600M
Vectors stored in production
>$75,000
Annual cost savings
~10%
Model-to-market cycle time reduction
Challenge
Melange’s self-hosted Milvus vector database crashed repeatedly under memory pressure once the patent corpus grew beyond roughly 40 million records, making it impossible to serve the full global patent dataset that litigation clients required. Operating an always-on cluster without dedicated infrastructure staff was unsustainable at the scale the business needed to reach.
Solution
Melange replaced their self-hosted Milvus deployment with Pinecone’s serverless vector database, whose slab architecture decouples storage from compute to support hundreds of millions of vectors at low cost. Parquet-based bulk ingestion pipelines allow the team to test new embedding models and expand namespaces without infrastructure work, maintaining high recall across a corpus that now exceeds 600 million vectors.
What Leaders Say
“Our KPI for Pinecone is straightforward: how little time we spend thinking about it. Since we started, we haven’t had a single reliability issue, and that’s exactly what we need as a small team focused on building the product—not maintaining infrastructure.”
“Our entire search operation relies on Pinecone as the first step in the process. When we evaluated the developer experience and time saved, Pinecone was far and away the clear winner. Being able to generate files in the cloud and import them seamlessly with guaranteed enterprise support was exactly what we wanted.”
Full Story
Patent litigation is one of the highest-stakes environments in legal services. A single case can cost millions of dollars and turn on whether attorneys can locate a handful of obscure historical documents — buried among hundreds of millions of global patents and billions of academic papers — that establish prior art. Melange was built to solve this problem through large-scale semantic search, automating the most labor-intensive phases of prior art discovery so that litigators receive the precise set of documents their case depends on.
For Melange to deliver on that promise, the company needed an infrastructure foundation capable of indexing and querying at a scale that few applications demand. Their initial approach, a self-hosted Milvus cluster, worked at smaller dataset sizes but became unstable as the corpus grew. Memory bottlenecks caused repeated crashes, the always-on cluster could not combine reliable uptime with serverless-style, pay-per-use costs, and maintaining a production-grade vector database without dedicated infrastructure staff was consuming engineering time that should have been spent on search quality and model development.
When clients began demanding that Melange expand its coverage to the full global patent corpus — roughly 450 million records at the time — the self-hosted solution could not run at all. The team evaluated commercial and open-source alternatives, requiring any solution to offer a serverless option, high recall at scale, and hands-on support. Pinecone’s serverless slab architecture met every criterion. Its decoupled storage and compute model meant Melange could store hundreds of millions of embeddings at low cost while only paying for compute during active queries. The slab design also removed the memory bottlenecks and cluster management overhead that had been blocking their growth.
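Because "high recall" is the criterion the evaluation hinged on, it helps to see how such a figure is typically computed. The following is a minimal sketch, not Melange's actual evaluation code: the function name, document IDs, and relevance labels are all hypothetical, and recall@k is assumed here as the metric of interest.

```python
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical data: ranked IDs returned for one query, and the references
# an expert marked as true prior art for that same query.
retrieved_ids = ["US-1", "EP-7", "US-4", "JP-2", "US-9"]
relevant_ids = {"US-4", "US-9", "WO-3"}

print(recall_at_k(retrieved_ids, relevant_ids, k=3))  # one of three relevant docs found
print(recall_at_k(retrieved_ids, relevant_ids, k=5))  # two of three relevant docs found
```

Averaging this value over a labeled query set gives a single number that can be compared across candidate databases or index configurations.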
Melange migrated to Pinecone and built their ingestion pipeline around Parquet-based bulk imports, enabling them to introduce new embedding models and stand up new namespaces for side-by-side testing without rebuilding indexes or managing cluster configurations. The result was a production system that crossed 600 million vectors across multiple model generations while maintaining the high recall that prior art search requires. For a small team whose value lies in domain expertise rather than infrastructure, the system’s reliability became a competitive differentiator — the search pipeline runs without interruption, and engineers focus on building better models rather than keeping the database running.
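The namespace-per-model pattern described above can be sketched as a small planning step that decides which records go into which Parquet file. This is an illustrative sketch under stated assumptions, not Melange's pipeline: the helper names, the `imports` prefix, the `model-v2` namespace, and the shard size are hypothetical, and the one-directory-per-namespace layout should be checked against Pinecone's import documentation. The code only plans object keys; actually writing Parquet would need a library such as `pyarrow`.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Hypothetical record shape: {"id": str, "values": list of floats}
Record = Dict[str, object]


def shard_records(records: Sequence[Record], shard_size: int) -> Iterator[List[Record]]:
    """Yield consecutive slices of at most shard_size records."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    for start in range(0, len(records), shard_size):
        yield list(records[start:start + shard_size])


def plan_import(
    prefix: str, namespace: str, records: Sequence[Record], shard_size: int
) -> List[Tuple[str, List[Record]]]:
    """Map each shard to an object key under <prefix>/<namespace>/ (assumed layout)."""
    plan = []
    for i, shard in enumerate(shard_records(records, shard_size)):
        key = f"{prefix.rstrip('/')}/{namespace}/{i:05d}.parquet"
        plan.append((key, shard))
    return plan


# Hypothetical usage: five embeddings from a new model, two records per file.
records = [{"id": f"doc-{n}", "values": [0.1 * n, 0.2 * n]} for n in range(5)]
plan = plan_import("imports/", "model-v2", records, shard_size=2)

for key, shard in plan:
    print(key, [r["id"] for r in shard])
```

Keeping each model generation under its own namespace directory means a new model can be imported and queried alongside the old one, and removed later, without touching the existing vectors.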
Melange estimates that Pinecone saves the company approximately $75,000 per year through avoided infrastructure maintenance, reduced operational burden, and less engineering time spent troubleshooting. Model-to-market cycle time has shortened by weeks. The team expects to keep scaling beyond current vector counts, to add new data modalities such as patent drawings, and to explore ensemble embedding approaches that combine representations from multiple models.