PrimeHub Apps
Allows third-party application integrated into PrimeHub platform.
Features
- Shared domain: The installed application can be accessed in the sub-path of PrimeHub's domain. We don't need an additional domain for this application.
- Authorization: Allows to restrict the applications only accessible to group members, PrimeHub logged-in users, or public users.
- Data Persistence: Allows applications to persist data in group volume and access data in other persistent storage, like datasets and PHFS.
- Resource constraint: Enforce the resource CPU, memory, GPU quota limitation in a group.
Concepts
Application
We introduce a new concept: Application in PrimeHub. An application is an instance of an integrated third-party application (e.g. MLflow). We install it as a group resource and can install multiple instances for a kind of application within a group.
Application Template
Application template describes how an application is installed. It contains this information
- podTemplate: used to create the deployment of the application
- Service Ports: used to create the service
- HTTP Port: The HTTP port if the application has a web interface
- defaultEnvs: The default env variables are used when creating the application. When an application is created, the values would be put in enviornment variables of the target application.
- ENV Name
- Description
- Default value
- Optional
 
Create an Application
Users can create an application from an application template.
- Select an application template
- Fill the default envs provided by the template
- Select the instance type
- Choose the scope (Public / PrimeHub users only/ Group members only). The scope only affects the web interface of the application.
- Create
Preset Environment Variables
The preset environment variables can be used in the value field of the environment variables. Here is the list of preset environment variables
- PRIMEHUB_APP_ID: The PhApplication k8s resource name- <app-id>
- PRIMEHUB_APP_ROOT: The root of persistence storage for the application.- <group-volume>/phapplications/<app-id>(if group volume available)
- /phapplications/<app-id>(if group volume not available)
 
- PRIMEHUB_APP_BASE_URL: The url prefix for the application- /console/apps/<app-id>
- PRIMEHUB_URL: The external url of PrimeHub
- PRIMEHUB_GROUP: The group name
Connect to Application
There are two ways to connect to the application
- Connect to a web interface from sub-path of PrimeHub: Most of the applications are web-based applications. To access this kind of application, users can access it from https://<primehub>/console/apps/<app-id>from the browser.
- Connect to a TCP endpoint from the host name and port: For some applications, they provide non-HTTP service. We can access it by the service endpoint <my-app-svc>:<my-app-port>. The endpoint can be only accessed in the PrimeHub cluster internally. (like notebooks and jobs)
Application Management
- Users can start/stop an application
- Users can get the basic information of the application
- Users can get the application log
- Users can delete an application
Implementation
Principles
- Introduce a new CRD PhApplicationthat represents the app to install. It will derive a deployment and service for this app.
- We use PhAppTemplateto createPhApplication. However,PhApplicationcan work standalone without PhAppTemplate. The controller ofPhApplicationshould not knowPhAppTemplate.
- GraphQL uses PhAppTemplateto createPhApplication. And the template and template data are stored in thePhApplication's annotation for update use thereafter.
CRDs
PhApplication
apiVersion: primehub.io/v1alpha1
kind: PhApplication
metadata:
  name: mlflow-xyzab
  namespace: hub
  annotations:
      phapplication.primehub.io/template: '<json of app template>'
      phapplication.primehub.io/template-data: '<json of app template>'
spec:
  stop: false
  displayName: "My MLflow"
  groupName: "phusers"
  instanceType: ""
  scope: "group"
  podTemplate:
    spec:
      containers:
      - name: mlflow
        image: larribas/mlflow:1.9.1
        command:
        - mlflow
        args:
        - server
        - --host
        - 0.0.0.0
        - --default-artifact-root
        - $(DEFAULT_ARTIFACT_ROOT)
        - --backend-store-uri
        - $(BACKEND_STORE_URI)
        env:
        - name: FOO
          value: bar
        - name: BACKEND_STORE_URI
          value: "sqlite:///$(PRIMEHUB_APP_ROOT)/mlflow.db"
        - name: DEFAULT_ARTIFACT_ROOT
          value: "$(PRIMEHUB_APP_ROOT)/mlruns"
        ports:
        - containerPort: 5000
          name: http
          protocol: TCP
  svcTemplate:
    spec:
      ports:
          - name: http
            port: 5000
            protocol: TCP
            targetPort: 5000
  httpPort: 5000
status:
  phase: "Ready"
  message: "Error message"
  serviceName: "app-mlflow-xyzab"
- annotations:
- phapplication.primehub.io/template: template content used to create this- PhApplication
- phapplication.primehub.io/template-data: template data used to create this- PhApplication. It is the POST data of the graphql- PhApplicationcreate.
 
- spec:
- podTemplate: the template of pods
- svcTemplate: the template of service
- scope: "group", "primehub", "public"
- httpPort: the backend service port that the proxy should forward to
 
- status:
- phase: "Starting", "Ready", "Updating", "Stopping", "Stopped", "Error"
- message: human readable message
- serviceName: name of service  (used for graphql serviceNamefield)
 
PhAppTemplate
apiVersion: primehub.io/v1alpha1
kind: PhAppTemplate
metadata:
  name: mlflow
  namespace: hub
spec:
  name: MLFlow
  description:
  version:
  docLink:
  icon: <url to icon or data-uri>
    defaultEnvs:
  - name: BACKEND_STORE_URI
    description: ""
    defaultValue: "sqlite://$(PRIMEHUB_APP_ROOT)/mlflow.db"
    optional: false
  - name: DEFAULT_ARTIFACT_ROOT
    description: ""
    defaultValue: "$(PRIMEHUB_APP_ROOT)/mlruns"
    optional: false
  template:
    # The template of phApplication
    spec:
      podTemplate:
        spec:
          containers:
          - name: mlflow
            image: larribas/mlflow:1.9.1
            command:
            - mlflow
            args:
            - server
            - --host
            - 0.0.0.0
            - --default-artifact-root
            - $(DEFAULT_ARTIFACT_ROOT)
            - --backend-store-uri
            - $(BACKEND_STORE_URI)
            env:
            - name: FOO
              value: bar
            ports:
            - containerPort: 5000
              name: http
              protocol: TCP
      svcTemplate:
        spec:
          ports:
          - name: http
            port: 5000
            protocol: TCP
            targetPort: 5000
      httpPort: 5000
- spec:
- version: the version string of the template
- description: free form description of this applicatin
- docLink: the document url
- defaultEnvs: used for create the additional envs
- name: the name of the environment variable
- descsription: description of the variable
- defaultValue: the default value of the variable
- optional: if the environment is optional
 
- template (the content of phApplication). See the phApplication
 
NOTE:
- Why we copy the content of PhAppTemplate to PhApplication instead of use name ref is we want to decouple the created app from the template.
- phapplication.primehub.io/templateis used for GraphQL to use. We keep the template so we can use it thereafter while updating the app.
Control Plane
- GraphQL create - PhApplicationresource from- PhAppTemplate
- The controller of - PhApplicationreconciles the- PhApplication. The hierarchy is- PhApplication ├── Deployment ├── Service └── NetworkPolicy
Create
- Console get the template list from GraphQL
- Console select one template and list the defaultEnvs to the UI's variables
- Console call GraphQL to create phapplication
- Get the phapptemplate content
- Append env variables to the end of the container's env
- Set the scope
 
Update
- Console gets the PhApplication from GraphQL.
- GraphQL returns PhApplication user data phapplication.primehub.io/template-dataand the PhApplication default envs fromphapplication.primehub.io/template
- Console can reset the variables
- Console can add/remove/update the env vars
- Console call update to the GraphQL
- GraphQL gets the current PhApplication, and modify container env, scope, instance type.
- GraphQL cannot change the appId and template
Controller
- Deployment - name app-<app-id>
- Use spec.podTemplate.spec
- Volumes
- Add group, dataset volumes
- Add empty dir if no group volume available
 
- Init Container
- run as root
- mkdir -p $(PRIMEHUB_APP_ROOT)
 
- Container
- Keep only the first container
- Set resources from instanceType
- Prepend (not append) the primehub required envs.  PRIMEHUB_APP_ID, ,PRIMEHUB_APP_ROOTandPRIMEHUB_APP_BASE_URL
- Mount group, dataset volumes
- Mount empty dir if no group volume available (/phapplications/<app-id>)
 
- The created pod should have label
- app=primehub-app
- primehub.io/phapplication=<appid>,
- primehub.io/group: <escaped group>
 
 
- name 
- Service - name app-<app-id>
- Use spec.svcTemplates.spec
 
- name 
- NetworkPolicy - Allows the ingress traffic from
- pod label with primehub.io/group=<escape-group>
- primehub-console (for proxy)
 
- pod label with 
 
- Allows the ingress traffic from
- status.Phase- The phase of PhApplication - Starting: App is starting, no ready pod and service still not available
deployment.status.readyReplicas==0
- Ready: App is ready to use
deployment.status.readyReplicas==1anddeployment.status.replicas==1
- Updating: App is updating, old pod is still ready to use, but new version of app is starting.
deployment.status.readyReplicas==1anddeployment.status.replicas>1
- Stopping: App is stopping, the pod is terminating but resource has not been freed.
spec.stop=trueanddeployment.status.replicas>0
- Stopped: App is stopped, the pod is delated. No resource is used.
spec.stop=trueanddeployment.status.replicas==0
 - We can check by startingandupdatingby deployment status
 
- Starting: App is starting, no ready pod and service still not available
Data Plane
Http Proxy
App path is under https://<primehub>/console/apps/<app-id>. We validate if the user can access app by server session. The first solution we come out is to validate the traffic by access token. The flow is as follows:

- Get the app information of <app-id>in the path
- If app is with scope group only, it will check if the owner of this access token has permission of the group
- If yes, accept the request and proxy to upstream service
Performance issue:
- Access Token is expired about 5 mins. Too short to cache.
- To refresh the access token, we need to request Keycloak token endpoint to ask for a new access token.
- If the refresh token is expired, we need to go through the OIDC process.
- The cache miss rate would be high because we keep changing the access token.
The solution is to implement the "session" concept

- If a new connection to the app, it will use the access token to authorize the request by the access token. If the traffic is accepted, create a session.
- When the session is created, the console sets a cookie with key phapplication-session-idunder path/console/apps/<app>, expired in 30mins, and maintains the session cache on the server side.
- If the request contains the session cookie and it is found in the session cache. Allows the request to the backend. And it will extend the expiration time to 30mins.
- If the session id is not found in the server, authorize the request as step 1
Performance issue:
- Because the session can be easily extended, there would be much fewer Keycloak token endpoint requests.
- The cache miss rate would be low because the session id is only expired if it is not used for 30 mins.
Log Traffic
Log API
- Add a new endpoint (generic log endpoints authorized by group label)
/api/logs/pods/<pod>
- GraphQL get the pod and find if there are primehub.io/grouplabel and unescaped the group. If not found, rejected.
- Check the user of the token is the group member of the app
- Get the pod log from k8s API
Console & GraphQL
- Get the pod list from GraphQL API (reference - PhDeployment)- phApplication(...) { pods { name // app-mlflow-xyzab-xxxx-yyyyy logEndpoint // https://hub.a.demo.primehub.io/api/logs/pods/app-mlflow-xyzab-xxxx-yyyyy } }
- GraphQL get the pods from the label - primehub.io/phapplication=<appid>
