chore: bump spark to 3.5.5 #217

Merged
merged 1 commit into from
May 19, 2025
2 changes: 1 addition & 1 deletion docs/modules/demos/pages/jupyterhub-keycloak.adoc
@@ -103,7 +103,7 @@ This setup is ideal for interactive data processing.

=== Spark Configuration

-* **Executor Image**: Uses a custom image `oci.stackable.tech/sandbox/spark:3.5.2-python311` (built on the standard Spark image) for the executors, matching the Python version of the notebook.
+* **Executor Image**: Uses a custom image `oci.stackable.tech/sandbox/spark:3.5.5-python311` (built on the standard Spark image) for the executors, matching the Python version of the notebook.
* **Resource Allocation**: Configures Spark executor instances, memory, and cores through settings defined in the notebook.
* **Hadoop and AWS Libraries**: Includes necessary Hadoop and AWS libraries for S3 operations, matching the notebook image version.

2 changes: 1 addition & 1 deletion stacks/airflow/airflow.yaml
@@ -274,7 +274,7 @@ data:
spec:
version: "1.0"
sparkImage:
-productVersion: 3.5.2
+productVersion: 3.5.5
mode: cluster
mainApplicationFile: local:///stackable/spark/examples/src/main/python/pi.py
job:
8 changes: 4 additions & 4 deletions stacks/jupyterhub-keycloak/Dockerfile
@@ -1,9 +1,9 @@
-# docker build -t oci.stackable.tech/sandbox/spark:3.5.2-python311 -f Dockerfile .
-# kind load docker-image oci.stackable.tech/sandbox/spark:3.5.2-python311 -n stackable-data-platform
+# docker build -t oci.stackable.tech/sandbox/spark:3.5.5-python311 -f Dockerfile .
+# kind load docker-image oci.stackable.tech/sandbox/spark:3.5.5-python311 -n stackable-data-platform
# or:
-# docker push oci.stackable.tech/sandbox/spark:3.5.2-python311
+# docker push oci.stackable.tech/sandbox/spark:3.5.5-python311

-FROM spark:3.5.2-scala2.12-java17-ubuntu
+FROM spark:3.5.5-scala2.12-java17-ubuntu

USER root

2 changes: 1 addition & 1 deletion stacks/jupyterhub-keycloak/jupyterhub.yaml
@@ -209,7 +209,7 @@ options:
image:
display_name: Image
choices:
-{% for image in ["quay.io/jupyter/pyspark-notebook:python-3.11.9", "quay.io/jupyter/pyspark-notebook:spark-3.5.2"] %}
+{% for image in ["quay.io/jupyter/pyspark-notebook:python-3.11.9", "quay.io/jupyter/pyspark-notebook:spark-3.5.5"] %}
"{{image}}":
display_name: "{{image}}"
kubespawner_override:
6 changes: 3 additions & 3 deletions stacks/jupyterhub-keycloak/process-s3.ipynb
@@ -49,8 +49,8 @@
"acts as the driver. It is important that the versions of spark and python match across the driver (running in the juypyterhub image)\n",
"and the executor(s) (running in a separate image, specified below with the `spark.kubernetes.container.image` setting).\n",
"\n",
-"The jupyterhub image `quay.io/jupyter/pyspark-notebook:spark-3.5.2` uses a base ubuntu image (like the spark images).\n",
-"The versions of java match exactly. Python versions can differ at patch level, and the image used below `oci.stackable.tech/sandbox/spark:3.5.2-python311` is built from a `spark:3.5.2-scala2.12-java17-ubuntu` base image with python 3.11 (the same major/minor version as the notebook) installed.\n",
+"The jupyterhub image `quay.io/jupyter/pyspark-notebook:spark-3.5.5` uses a base ubuntu image (like the spark images).\n",
+"The versions of java match exactly. Python versions can differ at patch level, and the image used below `oci.stackable.tech/sandbox/spark:3.5.5-python311` is built from a `spark:3.5.2-scala2.12-java17-ubuntu` base image with python 3.11 (the same major/minor version as the notebook) installed.\n",
"\n",
"## S3\n",
"As we will be reading data from an S3 bucket, we need to add the necessary `hadoop` and `aws` libraries in the same hadoop version as the\n",
@@ -69,7 +69,7 @@
"NAMESPACE = os.environ.get(\"NAMESPACE\", \"default\")\n",
"POD_NAME = os.environ.get(\"HOSTNAME\", f\"jupyter-{os.environ.get('USER', 'default')}-{NAMESPACE}\")\n",
"\n",
-"EXECUTOR_IMAGE = \"oci.stackable.tech/sandbox/spark:3.5.2-python311\" \n",
+"EXECUTOR_IMAGE = \"oci.stackable.tech/sandbox/spark:3.5.5-python311\" \n",
"\n",
"spark = (\n",
" SparkSession.builder\n",
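
The notebook text in the `process-s3.ipynb` hunk states that the Spark and Python versions must match between the driver (the notebook image) and the executors (the image in `EXECUTOR_IMAGE`), with Python allowed to differ at patch level. The following is a minimal sketch of how that rule could be checked before building the session. It is not part of this PR: `parse_executor_tag` and `versions_match` are hypothetical helpers, and the tag layout `<spark-version>-python<major><minor>` is an assumption inferred from the image names in the diff.

```python
import re
import sys
from typing import NamedTuple, Sequence, Tuple


class ExecutorTag(NamedTuple):
    """Versions encoded in an executor image tag (hypothetical helper type)."""
    spark: str
    python: Tuple[int, int]


# Assumed tag layout: "<repo>:<spark-version>-python<major><minor>",
# with a single-digit Python major version, e.g. "spark:3.5.5-python311".
_TAG_RE = re.compile(r":(\d+\.\d+\.\d+)-python(\d)(\d+)$")


def parse_executor_tag(image: str) -> ExecutorTag:
    """Extract the Spark version and Python (major, minor) from the image tag."""
    match = _TAG_RE.search(image)
    if match is None:
        raise ValueError(f"unrecognised executor image tag: {image!r}")
    spark, py_major, py_minor = match.groups()
    return ExecutorTag(spark, (int(py_major), int(py_minor)))


def versions_match(image: str, driver_spark: str, driver_python: Sequence[int]) -> bool:
    """True if Spark matches exactly and Python matches at major/minor level."""
    tag = parse_executor_tag(image)
    return tag.spark == driver_spark and tag.python == tuple(driver_python[:2])


if __name__ == "__main__":
    EXECUTOR_IMAGE = "oci.stackable.tech/sandbox/spark:3.5.5-python311"
    # In the notebook the driver's Spark version would come from
    # pyspark.__version__; it is hard-coded here to keep the sketch standalone.
    print(versions_match(EXECUTOR_IMAGE, "3.5.5", sys.version_info[:2]))
```

Running a check like this in the notebook before `SparkSession.builder` would surface a driver/executor mismatch early, rather than as a failure inside an executor pod.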