dynamically create ingress for running sparkapplication to access spark-ui outside of the k8s cluster #454

Open
maxgruber19 opened this issue Aug 16, 2024 · 2 comments

Comments

@maxgruber19

When a sparkapplication resource is submitted, the driver exposes port 4040, where the spark-ui can be accessed. Unfortunately the ui is only reachable from within the cluster, not from a web browser on a user's pc outside of it.

Of course I could create an ingress myself for every spark application I submit, but since sparkapplications are ephemeral, after a couple of weeks there would be lots of dead ingresses in the cluster whose spark applications have already terminated.

I think the operator should create an ingress / route whenever a sparkapplication is submitted. One possible configuration option would be a value set in the operator itself. I'd prefer that over setting the ingress config with every application.

@sbernauer feel free to add

@maxgruber19
Author

we now have two workarounds that make it possible to see the running spark applications without the operator provisioning an ingress itself

1. deploying a spark history server and enabling rolling event logs by setting spark.eventLog.rolling.enabled to true. The clear disadvantage is the delay of the log roll: it's not as live as a real spark ui would be, but it works pretty well and without the need to deploy an ingress for every spark application running in the cluster. A further disadvantage is the additional spark history config in the spark application, where the whole s3 connection has to be set instead of just a url or something similar (#415 tracks that already, could have been me creating it)
2. deploying an ingress with every spark app manually. This is not as trivial as one might think because of the random alphanumeric string in the name of the driver service exposing 4040. We created a separate pseudo service that selects the driver by its labels and works as the target of our ingress. This leads to problems / chaos as soon as another driver comes up from the same application: because the labels are equal, traffic from the ui will be routed randomly to either of the drivers
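
For readers who want to try the two workarounds above, here is a rough sketch of what they might look like. It is not taken from the original setup. The bucket path, resource names, host name and the `app-name` label are hypothetical placeholders, and the sketch assumes that the SparkApplication accepts plain Spark properties under `spec.sparkConf` and that the driver pod carries Spark's `spark-role: driver` label.

```yaml
# Workaround 1 (sketch): fragment of a SparkApplication, event-log settings only.
spec:
  sparkConf:
    spark.eventLog.enabled: "true"
    spark.eventLog.dir: "s3a://my-bucket/eventlogs/"   # hypothetical location
    spark.eventLog.rolling.enabled: "true"
    spark.eventLog.rolling.maxFileSize: "16m"          # smaller files roll sooner
```

```yaml
# Workaround 2 (sketch): a label-selecting service plus an ingress pointing at it.
apiVersion: v1
kind: Service
metadata:
  name: my-spark-app-ui            # hypothetical name
spec:
  selector:
    spark-role: driver             # assumed to be set on the driver pod
    app-name: my-spark-app         # hypothetical custom label on the driver
  ports:
    - name: spark-ui
      port: 4040
      targetPort: 4040
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-spark-app-ui            # hypothetical name
spec:
  rules:
    - host: my-spark-app.example.com   # hypothetical host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-spark-app-ui
                port:
                  number: 4040
```

Because the service selects pods purely by label, every driver pod carrying the same labels becomes an endpoint, which is exactly the random routing problem described in the second workaround.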

I really recommend adding a way to deploy an ingress automatically with every spark application the operator submits, maybe with an ingress template that's defined once for the operator via helm values or something like that

@razvan
Member

razvan commented Sep 4, 2024

Thank you for your report, and sorry for the late response.

We discussed this briefly and it will hopefully be prioritized for the next release, but I cannot guarantee it.

Until then, a somewhat better workaround is to use the listener operator together with the podOverrides property to automatically create a service for the application's driver pod.

I tested this snippet and it worked:

---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: spark-app-with-ui-service
spec:
  driver:
    podOverrides:
      spec:
        containers:
          - name: spark
            volumeMounts:
              - name: listener
                mountPath: /listener 
        volumes:
          - name: listener
            ephemeral: 
              volumeClaimTemplate:
                metadata:
                  annotations:
                    listeners.stackable.tech/listener-class: external-unstable 
                spec:
                  storageClassName: listeners.stackable.tech
                  accessModes:
                    - ReadWriteMany
                  resources:
                    requests:
                      storage: "1"

Hope this helps.
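
For context on the snippet above: the `listeners.stackable.tech/listener-class` annotation refers to a ListenerClass object that tells the listener operator what kind of service to create for the pod. The definition below is only a sketch of what the `external-unstable` class might look like, assuming it maps to a NodePort service. The class shipped with your listener operator version may differ, so check the installed object rather than relying on this.

```yaml
# Sketch of a ListenerClass; field values are assumptions, not copied from a cluster.
apiVersion: listeners.stackable.tech/v1alpha1
kind: ListenerClass
metadata:
  name: external-unstable
spec:
  serviceType: NodePort   # assumed: exposes the pod's ports on a node port
```

If that assumption holds, the spark-ui should be reachable on a node address and the allocated node port for as long as the driver pod lives, and the service goes away with the pod, so no dead objects accumulate.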
