Media & Entertainment · Product Development

How Scribd Cut GenAI Costs 90% and Boosted Sign-Ups with Databricks

Scribd, Inc. operates three content brands (Scribd, SlideShare, and Everand) and manages a global library of more than 250 million documents, audiobooks, and eBooks. After its AI development became fragmented across multiple disconnected tools, the company consolidated onto the Databricks Data Intelligence Platform to run the full lifecycle of data and AI in one environment. The result: a 90% reduction in generative AI costs, a 7% lift in new user sign-ups, and the ability to move from prototype to production in weeks rather than months.

Outcomes

90%: Reduction in GenAI operating costs
7%: Increase in new user sign-ups
7%: Reduction in user churn
250M+ documents: Content library processed

Tools & Technologies

1. Databricks
Unified data analytics and AI platform built on Apache Spark for lakehouse architecture, ML, and generative AI workloads.
2. Databricks Agent Bricks
Framework for building, evaluating, and deploying domain-specific AI agents on a lakehouse platform.
3. Databricks Unity Catalog
Unified governance layer for managing access, lineage, and quality of data and AI assets across a lakehouse.
4. Delta Lake
Open-source storage layer that brings ACID transactions and scalable metadata handling to data lakes.
5. Lakeflow Jobs (Databricks)

Challenge

Scribd’s data infrastructure was fragmented across separate tools for ETL, LLM experimentation, and model serving, creating slow feedback loops between data scientists and production teams and making it nearly impossible to operationalize AI efficiently across the business.

Solution

Scribd consolidated onto the Databricks Data Intelligence Platform, using Databricks Notebooks, Mosaic AI Model Serving, Delta Lake, Unity Catalog, and Lakeflow Jobs to run the full data and AI lifecycle in one environment. This let the team go from prototype to production in weeks while cutting generative AI operating costs by 90%.

Full Story

Scribd, Inc. connects millions of users to knowledge across three distinct platforms: Scribd, a user-powered library; SlideShare, a repository of presentations; and Everand, a subscription service for audiobooks and eBooks. The company’s content library has grown to more than 250 million pieces that are multilingual, media-rich, and highly varied in format and quality. At that scale, keeping content discoverable, properly tagged, and free of low-quality material cannot be done manually.

Source

DATABRICKS
June 2025
Original case study
