Spark runtime version 2.2 components
Notes:
1. The 2.2 runtime uses the UTF-8 default character encoding.
Spark runtime 2.2 libraries
Spark runtime 2.2 includes machine learning libraries, such as TensorFlow, PyTorch, and XGBoost, and offers a ready-to-use environment for machine learning and data science applications.
The following sections list the library versions that are available in Dataproc Serverless for Spark runtime version 2.2.
GPU-specific libraries
For Dataproc Serverless batch workloads that use GPU VMs, the following NVIDIA driver and libraries are available in the Dataproc Serverless container. You can use them to accomplish the following tasks:
- Accelerate Spark batch workloads with the NVIDIA Spark Rapids library
- Train machine learning workloads
- Run distributed batch inference using Spark
Package Name | Version |
---|---|
Spark Rapids | 24.04.0 |
NVIDIA Driver | 550.127.05 |
CUDA | 12.6.2 |
cublas | 12.8.4.1 |
cusolver | 11.7.3.90 |
cupti | 12.8.90 |
cusparse | 12.5.8.93 |
cuDNN | 9.2 |
NCCL | 2.22 |
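As an illustration of the first task, the following minimal sketch assembles Spark properties that turn on the RAPIDS Accelerator and joins them into comma-separated `key=value` pairs, the form accepted by the `--properties` flag of `gcloud dataproc batches submit`. The property names `spark.plugins` and `spark.rapids.sql.enabled` come from the RAPIDS Accelerator for Apache Spark documentation and are assumptions here. This page does not say whether the runtime already sets them. The helper names `rapids_properties` and `to_properties_flag` are hypothetical.

```python
from __future__ import annotations


def rapids_properties(enabled: bool = True) -> dict[str, str]:
    """Spark properties for the RAPIDS Accelerator.

    Assumption: these keys are taken from the RAPIDS Accelerator
    documentation, not from this page.
    """
    return {
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": str(enabled).lower(),
    }


def to_properties_flag(props: dict[str, str]) -> str:
    """Join properties as comma-separated key=value pairs."""
    return ",".join(f"{key}={value}" for key, value in props.items())
```

Calling `to_properties_flag(rapids_properties())` yields a single string that can be appended to any other batch properties you already pass.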
XGBoost libraries
The following Maven package versions are available in Dataproc Serverless for Spark runtime version 2.2 to use XGBoost with Spark in Java or Scala.
Group ID | Package Name | Version |
---|---|---|
ml.dmlc | xgboost4j-gpu_2.13 | 2.1.1 |
ml.dmlc | xgboost4j-spark-gpu_2.13 | 2.1.1 |
Python libraries
The following Python library versions are included in Dataproc Serverless for Spark runtime version 2.2.
Package Name | Version |
---|---|
accelerate | 0.33 |
bigframes | 1.7 |
cookiecutter | 2.6 |
cython | 3.0 |
dask | 2024.5 |
deepspeed | 0.14 |
evaluate | 0.4 |
fastavro | 1.9 |
fastparquet | 2024.2 |
gcsfs | 2024.5 |
git | 2.45 |
google-auth-oauthlib | 1.2 |
google-cloud-aiplatform | 1.60 |
google-cloud-bigquery | 3.23 |
google-cloud-bigquery-storage | 2.25 |
google-cloud-bigtable | 2.23 |
google-cloud-container | 2.45 |
google-cloud-datacatalog | 3.19 |
google-cloud-dataproc | 5.9 |
google-cloud-datastore | 2.19 |
google-cloud-dlp | 3.22 |
google-cloud-language | 2.13 |
google-cloud-logging | 3.10 |
google-cloud-monitoring | 2.21 |
google-cloud-pubsub | 2.21 |
google-cloud-redis | 2.15 |
google-cloud-secret-manager | 2.20 |
google-cloud-spanner | 3.46 |
google-cloud-speech | 2.26 |
google-cloud-storage | 2.16 |
google-cloud-texttospeech | 2.16 |
google-cloud-translate | 3.15 |
google-cloud-vision | 3.7 |
httplib2 | 0.22 |
huggingface_hub | 0.27 |
ipyparallel | 8.8 |
ipython-sql | 0.3 |
ipywidgets | 8.1 |
jupyter_http_over_ws | 0.0 |
jupyterlab | 4.1 |
jupyterlab-git | 0.50 |
keyrings.google-artifactregistry-auth | 1.1 |
langchain | 0.2 |
lightgbm | 4.5 |
markdown | 3.6 |
matplotlib | 3.8 |
nbclassic | 1.0 |
nbconvert | 7.16 |
nbdime | 4.0 |
nltk | 3.8 |
nodejs | 20.12 |
numba | 0.59 |
numpy | 1.26 |
oauth2client | 4.1 |
onnx | 1.16 |
openblas | 0.3 |
opencv | 4.9 |
orc | 2.0 |
pandas | 2.2 |
papermill | 2.6 |
pyarrow | 15.0 |
pydot | 2.0 |
pyhive | 0.7 |
pymongo | 4.7 |
pynvml | 11.5 |
pytables | 3.9 |
pytorch-cpu | 2.3 |
regex | 2024.5 |
requests | 2.31 |
rtree | 1.2 |
scikit-image | 0.22 |
scikit-learn | 1.5 |
scipy | 1.11 |
seaborn | 0.12 |
sentence-transformers | 3.0 |
shap | 0.45 |
sqlalchemy | 2.0 |
sympy | 1.12 |
tokenizers | 0.19 |
torcheval | 0.0.7 |
torchvision | 0.18 |
tornado | 6.4 |
transformers | 4.43 |
uritemplate | 4.1 |
virtualenv | 20.26 |
wordcloud | 1.9 |
xgboost | 2.0 |
ydata-profiling | 4.8 |
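To confirm from inside a workload that an installed package matches the table, a check along the following lines may help. It is a minimal sketch that uses only the standard library `importlib.metadata` module. The function names and the `EXPECTED` sample are hypothetical. Some table entries (for example `git`, `nodejs`, `opencv`, or `pytorch-cpu`) may not correspond to a Python distribution name, so they can be reported as missing even when the tool is present.

```python
from __future__ import annotations

from importlib import metadata


def matches_documented(installed: str, documented: str) -> bool:
    """Compare only the leading version components listed in the table.

    The table abbreviates versions (for example "2.2"), so "2.2.2"
    matches "2.2" while "2.20.0" does not.
    """
    want = documented.split(".")
    have = installed.split(".")
    return have[: len(want)] == want


def check_packages(expected: dict[str, str]) -> dict[str, str]:
    """Return "ok", "missing", or the mismatching installed version per package."""
    report = {}
    for name, documented in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if matches_documented(installed, documented) else installed
    return report


# A few rows copied from the table above.
EXPECTED = {"numpy": "1.26", "pandas": "2.2", "pyarrow": "15.0"}
```

Running `check_packages(EXPECTED)` in a batch or interactive session returns one status per package, which makes drift between the documented and installed versions easy to spot.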
R libraries
The following R library versions are included in Dataproc Serverless for Spark runtime version 2.2.
Package Name | Version |
---|---|
askpass | 1.2 |
assertthat | 0.2 |
backports | 1.5 |
bit | 4.0 |
bit64 | 4.0 |
blob | 1.2 |
boot | 1.3_30 |
brew | 1.0_10 |
broom | 1.0 |
callr | 3.7 |
caret | 6.0_94 |
cellranger | 1.1 |
chron | 2.3_61 |
class | 7.3_22 |
cli | 3.6 |
clipr | 0.8 |
cluster | 2.1 |
codetools | 0.2_20 |
colorspace | 2.1_0 |
commonmark | 1.9 |
cpp11 | 0.4 |
crayon | 1.5 |
curl | 5.1 |
data.table | 1.15 |
dbi | 1.2 |
dbplyr | 2.5 |
desc | 1.4 |
devtools | 2.4 |
digest | 0.6 |
dplyr | 1.1 |
ellipsis | 0.3 |
evaluate | 0.23 |
fansi | 1.0 |
fastmap | 1.2 |
forcats | 1.0 |
foreach | 1.5 |
foreign | 0.8_86 |
fs | 1.6 |
future | 1.33 |
generics | 0.1 |
ggplot2 | 3.5 |
gh | 1.4 |
glmnet | 4.1_8 |
globals | 0.16 |
glue | 1.7 |
gower | 1.0 |
gtable | 0.3 |
haven | 2.5 |
highr | 0.10 |
hms | 1.1 |
htmltools | 0.5.8 |
htmlwidgets | 1.6 |
httpuv | 1.6 |
httr | 1.4 |
hwriter | 1.3.2 |
ini | 0.3 |
ipred | 0.9_14 |
isoband | 0.2 |
iterators | 1.0 |
jsonlite | 1.8 |
kernsmooth | 2.23_24 |
knitr | 1.46 |
labeling | 0.4 |
later | 1.3 |
lattice | 0.22_6 |
lava | 1.7 |
lifecycle | 1.0 |
listenv | 0.9 |
lubridate | 1.9 |
magrittr | 2.0 |
markdown | 1.12 |
mass | 7.3_60 |
matrix | 1.6_5 |
memoise | 2.0 |
mgcv | 1.9_1 |
mime | 0.12 |
modelmetrics | 1.2.2 |
modelr | 0.1 |
munsell | 0.5 |
nlme | 3.1_164 |
nnet | 7.3_19 |
numderiv | 2016.8_1 |
openssl | 2.2 |
pillar | 1.9 |
pkgbuild | 1.4 |
pkgconfig | 2.0 |
pkgload | 1.3 |
plogr | 0.2 |
plyr | 1.8 |
praise | 1.0 |
prettyunits | 1.2 |
processx | 3.8 |
prodlim | 2023.08 |
progress | 1.2 |
promises | 1.3 |
proto | 1.0 |
ps | 1.7 |
purrr | 1.0 |
r6 | 2.5 |
randomforest | 4.7_1 |
rappdirs | 0.3 |
rcmdcheck | 1.4 |
rcolorbrewer | 1.1_3 |
rcpp | 1.0 |
rcurl | 1.98_1 |
readr | 2.1 |
readxl | 1.4 |
recipes | 1.0 |
rematch | 2.0 |
remotes | 2.5 |
reprex | 2.1 |
reshape2 | 1.4 |
rlang | 1.1 |
rmarkdown | 2.27 |
rodbc | 1.3_23 |
roxygen2 | 7.3 |
rpart | 4.1 |
rprojroot | 2.0 |
rserve | 1.8_7 |
rsqlite | 2.3 |
rstudioapi | 0.16 |
rvest | 1.0 |
scales | 1.3 |
selectr | 0.4_2 |
sessioninfo | 1.2 |
shape | 1.4.6 |
shiny | 1.8.1 |
sourcetools | 0.1 |
spatial | 7.3_17 |
squarem | 2021.1 |
stringi | 1.8 |
stringr | 1.5 |
survival | 3.6_4 |
sys | 3.4 |
teachingdemos | 2.12 |
testthat | 3.2.1 |
tibble | 3.2 |
tidyr | 1.3 |
tidyselect | 1.2 |
tidyverse | 2.0 |
timedate | 4032.109 |
tinytex | 0.51 |
usethis | 2.2 |
utf8 | 1.2 |
uuid | 1.2_0 |
vctrs | 0.6 |
whisker | 0.4 |
withr | 3.0 |
xfun | 0.44 |
xml2 | 1.3 |
xopen | 1.0 |
xtable | 1.8_4 |
yaml | 2.3 |
zip | 2.3 |