Announced May 2025: Dataproc Serverless is now Google Cloud Serverless for Apache Spark
On-demand Spark: quick startup, zero ops, faster query performance, and Gemini-assisted productivity. Up to 60% lower TCO for Spark workloads.
Features
Eliminate the complexities of cluster management and avoid paying for idle, underutilized resources. Google Cloud Serverless for Apache Spark offers quick VM startup and dynamic autoscaling for your interactive, batch, and AI workloads. Spend your time building features, not managing infrastructure. There are no charges during VM startup and shutdown.
Experience industry-leading price-performance. Google Cloud Serverless for Apache Spark is powered by our next-generation native query engine, Lightning Engine, in Preview. It delivers Spark query and data processing performance over 3.6x faster** than open source Apache Spark through advanced vectorized execution, built-in intelligent caching, and optimized storage I/O, helping you get insights faster and reduce costs.
** The queries are derived from the TPC-DS standard and TPC-H standard and as such are not comparable to published TPC-DS standard and TPC-H standard results, as these runs do not comply with all requirements of the TPC-DS standard and TPC-H standard specification.
Run your production Spark workloads with confidence. Google Cloud Serverless for Apache Spark optimizes resources, provides job isolation, and supports Google Cloud’s enterprise security capabilities (including VPC-SC, CMEK, personal authentication, and custom organization policies). It ensures a secure execution environment with capabilities like secure subnets, encryption by default for data at rest and in transit, and no direct VM or root access, minimizing your operational security burden. While built for automation, expert users retain full access to Spark configurations for fine-grained control.
Infuse generative AI into your Spark development life cycle. Leverage Gemini for PySpark code generation in notebooks that is aware of the context of your data, to supercharge productivity. Get AI-assisted troubleshooting recommendations with Gemini Cloud Assist Investigate to quickly resolve issues, gain deeper operational insights, and optimize performance.
Seamlessly run distributed training or batch inference workloads. Google Cloud Serverless for Apache Spark offers built-in support for GPU acceleration and comes with pre-packaged popular ML libraries like XGBoost, PyTorch, and Transformers. This leads to significantly faster startup times for AI/ML environments and improves reliability since the images are Google-certified.
Maintain full flexibility. Google Cloud Serverless for Apache Spark is fully OSS-compatible, so you can bring your existing Spark code and libraries without modification. Develop in your language of choice (Python, Java, Scala, R) using your preferred IDE (BigQuery Studio, Vertex AI Workbench, Jupyter, VS Code) and orchestrate with tools like Apache Airflow/Cloud Composer or BigQuery pipelines. Process all data formats, both Google-native and open source, such as Apache Iceberg.
Experience the power of Apache Spark directly within BigQuery. Write and run PySpark code alongside SQL in unified Colab Enterprise notebooks, leveraging common metadata through BigLake Metastore, shared security, and consistent governance through Dataplex Universal Catalog.
Common Uses
Lightning-fast Serverless ETL/ELT
Rapidly ingest, transform, and load massive datasets from diverse sources into BigQuery or Google Cloud Storage. With the unmatched performance of the Lightning Engine and zero operational burden, streamline your data pipelines and ensure fresh data for analytics.
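As a rough illustration of the ingest, transform, and load flow described above, the following PySpark sketch reads Parquet files from Cloud Storage, applies a trivial cleanup, and appends the result to a BigQuery table. It assumes the open source spark-bigquery connector is available in the runtime; the function names, the `ingested_at` column, and the example URIs are hypothetical and chosen only for illustration.

```python
def bigquery_write_options(temp_bucket=None):
    """Build writer options for the spark-bigquery connector.

    With no bucket, use the connector's direct write method; otherwise use
    the indirect method, which stages data in a temporary Cloud Storage bucket.
    """
    if temp_bucket is None:
        return {"writeMethod": "direct"}
    return {"writeMethod": "indirect", "temporaryGcsBucket": temp_bucket}


def run_etl(source_uri, target_table, temp_bucket=None):
    """Read Parquet from source_uri, clean it, and append it to target_table."""
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("serverless-etl-sketch").getOrCreate()

    # Extract: source_uri is a hypothetical path such as "gs://example-bucket/events/".
    raw = spark.read.parquet(source_uri)

    # Transform: drop exact duplicate rows and stamp each row with a load time.
    cleaned = raw.dropDuplicates().withColumn("ingested_at", F.current_timestamp())

    # Load: target_table is a hypothetical "dataset.table" name.
    (
        cleaned.write.format("bigquery")
        .options(**bigquery_write_options(temp_bucket))
        .mode("append")
        .save(target_table)
    )
```

To run this as a batch workload, a call such as `run_etl("gs://example-bucket/events/", "example_dataset.events")` would be added at the bottom of the script. Real pipelines would replace the cleanup step with their own transformations.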
Interactive analytics and rapid prototyping
Empower your data scientists and analysts with a flexible, high-performance serverless Spark environment. Whether you're performing ad-hoc data exploration, rapid prototyping, or building sophisticated machine learning models, Google Cloud Serverless for Apache Spark provides the speed and tools you need. Develop PySpark and SQL code in BigQuery Studio for a unified experience, or connect from your preferred tools like Jupyter notebooks and VS Code with Google Cloud extensions. Leverage Gemini for code assistance and troubleshooting, the Lightning Engine for rapid query results, and Vertex AI integration for MLOps. From quick data discovery to training complex models with GPUs and pre-packaged libraries, accelerate your entire data science life cycle.
Pricing
Transparent, value-driven pricing

Google Cloud Serverless Spark pricing is based on per-second usage of compute (DCUs), GPUs, and shuffle storage.

Services and usage | Subscription type | Price (USD) |
---|---|---|
Data Compute Unit (DCU) | Standard | Starting at $0.06 per hour |
Data Compute Unit (DCU) | Premium | Starting at $0.089 per hour |
Shuffle storage | Standard | Starting at $0.04 per GB/month |
Shuffle storage | Premium | Starting at $0.1 per GB/month |
Accelerator pricing | A100 40 GB | Starting at $3.52069 per hour |
Accelerator pricing | A100 80 GB | Starting at $4.713696 per hour |
Accelerator pricing | L4 | Starting at $0.672048 per hour |

View pricing details for Google Cloud Serverless for Apache Spark.
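To make the per-second billing model concrete, here is a minimal cost-estimate sketch that uses the "starting at" rates listed above. It is not an official calculator: actual rates vary by region, the 730-hour month used to prorate shuffle storage is an assumption, and the `Workload` class and `estimate_cost` function are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Optional

# "Starting at" list prices from the pricing table (USD).
# Actual rates vary by region; treat these as illustrative lower bounds.
DCU_PER_HOUR = {"standard": 0.06, "premium": 0.089}
SHUFFLE_PER_GB_MONTH = {"standard": 0.04, "premium": 0.1}
GPU_PER_HOUR = {"a100-40gb": 3.52069, "a100-80gb": 4.713696, "l4": 0.672048}

# Assumption: an average month of 730 hours is used to prorate storage.
HOURS_PER_MONTH = 730


@dataclass
class Workload:
    dcus: float                      # Data Compute Units in use
    runtime_seconds: float           # billed runtime, per-second granularity
    shuffle_gb: float = 0.0          # shuffle storage provisioned during the run
    gpu_type: Optional[str] = None   # key of GPU_PER_HOUR, or None for no GPUs
    gpu_count: int = 0
    compute_tier: str = "standard"
    shuffle_tier: str = "standard"


def estimate_cost(w: Workload) -> float:
    """Return an estimated cost in USD for a single workload run."""
    hours = w.runtime_seconds / 3600
    cost = w.dcus * hours * DCU_PER_HOUR[w.compute_tier]
    cost += w.shuffle_gb * (hours / HOURS_PER_MONTH) * SHUFFLE_PER_GB_MONTH[w.shuffle_tier]
    if w.gpu_type is not None:
        cost += w.gpu_count * hours * GPU_PER_HOUR[w.gpu_type]
    return cost
```

Under these assumptions, an 8-DCU Standard batch that runs for 30 minutes with no GPUs and no shuffle storage comes to 8 × 0.5 × $0.06 = $0.24 in compute charges.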