Job Artifact
Allows users to store job output as artifacts, which can then be downloaded from the job UI.
Prerequisites
PrimeHub store and PHFS must both be enabled.
User Journey
Create artifacts from a job
Create a job with the following command:
mkdir -p artifacts/sub
echo "hello" > artifacts/test.txt
echo "hello" > artifacts/sub/test.txt
Go to the newly created job detail page.
Wait for the job to complete.
Go to the Artifacts tab. It lists the two artifacts we just created.
Artifact retention
- Wait until 7 days (the default artifact retention period) have passed since the job completed.
- Go to the job detail page.
- Go to the Artifacts tab.
- No artifacts are listed anymore. They have been deleted automatically.
Design
Artifact Copy
- Use a new run-job.sh script to run the job pod.
- The script runs the command and then copies the files from /home/jovyan/artifacts to /phfs/jobArtifacts/<jobname>/.
- At the same time, it checks whether the artifacts folder exceeds the maximum size or the maximum file count.
- The phjob resource in the GraphQL API provides a new artifact field. The resolver iterates over all the files under /groups/<group>/jobArtifacts/<jobname>.
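The copy-and-check step described above can be sketched in shell as follows. This is only an illustration under stated assumptions, not the actual run-job.sh: the function name copy_artifacts, its argument order, and the limit units (megabytes and a file count) are all hypothetical, and the real script may report or handle an exceeded limit differently.

```shell
#!/bin/sh
# Hypothetical sketch of the artifact-copy step; not the real run-job.sh.

# copy_artifacts SRC DEST MAX_MB MAX_FILES
# Copies the contents of SRC into DEST unless SRC exceeds either limit.
copy_artifacts() {
  src=$1; dest=$2; max_mb=$3; max_files=$4

  # Nothing to copy when the job produced no artifacts folder.
  [ -d "$src" ] || return 0

  # -L follows symbolic links, so SRC may be a link or a real folder.
  file_count=$(($(find -L "$src" -type f | wc -l)))
  size_kb=$(du -skL "$src" | cut -f1)

  if [ "$file_count" -gt "$max_files" ]; then
    echo "artifacts: $file_count files exceeds the limit of $max_files" >&2
    return 1
  fi
  if [ "$size_kb" -gt $((max_mb * 1024)) ]; then
    echo "artifacts: ${size_kb} KB exceeds the limit of ${max_mb} MB" >&2
    return 1
  fi

  mkdir -p "$dest"
  # "SRC/." copies the folder contents, including hidden files.
  cp -RL "$src/." "$dest/"
}
```

A wrapper script would typically run the job command first, remember its exit code, call copy_artifacts /home/jovyan/artifacts "/phfs/jobArtifacts/<jobname>" with the configured limits, and then exit with the saved code so that a failed job still uploads its artifacts.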
Artifact Retention
In the run-job.sh script, store the expiration time at /phfs/jobArtifacts/<jobname>/.primehub/metadata/expiredAt.
In the GraphQL component, run the cleanup script periodically (once per day):
for group in store groups folder
  for job in `groups/${group}/jobArtifacts`
    expiredAt = get(`${job}/.primehub/metadata/expiredAt`)
    if (now > expiredAt)
      removeAll(job)
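For illustration, the same cleanup loop can be written as a small shell function. This is a sketch under assumptions rather than the code that runs in the GraphQL component: the function name cleanup_artifacts and its arguments are hypothetical, and it reads the marker from .primehub/metadata/expiredAt, the location that run-job.sh is described as writing. Jobs without a readable, numeric expiration time are kept.

```shell
#!/bin/sh
# Hypothetical sketch of the daily cleanup; not the actual implementation.

# cleanup_artifacts GROUPS_ROOT [NOW]
# Removes every job artifact folder whose expiredAt lies before NOW.
# NOW defaults to the current UNIX time in seconds.
cleanup_artifacts() {
  groups_root=$1
  now=${2:-$(date +%s)}

  for job in "$groups_root"/*/jobArtifacts/*/; do
    [ -d "$job" ] || continue          # the glob matched nothing

    marker="${job}.primehub/metadata/expiredAt"
    [ -f "$marker" ] || continue       # no expiration recorded: keep the job

    expired_at=$(cat "$marker")
    case $expired_at in
      ''|*[!0-9]*) continue ;;         # not a plain integer: keep the job
    esac

    if [ "$now" -gt "$expired_at" ]; then
      rm -rf -- "$job"
    fi
  done
}
```

Passing NOW explicitly makes the function easy to exercise against a scratch directory before pointing it at real data.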
In the GraphQL component, also provide a mutation that runs the cleanup on demand.
Components
Controller
PhJob Reconciliation
- Use the configmap to mount the run-job.sh script.
- Add new env vars to the job pod for artifact settings.
- Run the job with the new run-job.sh script.
Artifact copy script
- The user copies the artifacts to /home/jovyan/artifacts (or creates a symbolic link at that path).
- Once the job is completed (or failed), the script copies the files from /home/jovyan/artifacts (which can be a symbolic link or a real folder) to /phfs/jobArtifacts/<jobname>/.
- The script also checks whether the file size exceeds the limit.
- The script also stores the expiration time (UNIX time in seconds) at /phfs/jobArtifacts/<jobname>/.primehub/metadata/expiredAt.
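Writing the expiration time amounts to adding the retention period to the current UNIX time. The sketch below shows one way to do it; the function name write_expiration and its arguments are hypothetical, and it assumes the retention is configured in whole days, as the 7-day default suggests.

```shell
#!/bin/sh
# Hypothetical sketch of the expiration step; not the real run-job.sh.

# write_expiration DEST RETENTION_DAYS [NOW]
# Stores NOW + RETENTION_DAYS (as UNIX seconds) in DEST's metadata folder.
# NOW defaults to the current UNIX time in seconds.
write_expiration() {
  dest=$1; retention_days=$2
  now=${3:-$(date +%s)}

  mkdir -p "$dest/.primehub/metadata"
  # One day is 86400 seconds.
  echo $((now + retention_days * 86400)) > "$dest/.primehub/metadata/expiredAt"
}
```

For example, write_expiration "/phfs/jobArtifacts/<jobname>" 7 records a timestamp 604800 seconds after the moment the script runs.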
GraphQL
In the original phjob type, add an artifact field:
query {
  phJob(where: {id: "job-202009250153-34s0ti"}) {
    id
    artifact {
      prefix
      items {
        name
        size
        lastModified
      }
    }
  }
}
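The items returned by this field come from walking the job's artifact folder. As a rough shell imitation of that walk (not the resolver itself), the hypothetical function below prints each file's path relative to the folder together with its size in bytes. Skipping the .primehub metadata folder is an assumption made here, and lastModified is left out because reading it portably from shell is awkward.

```shell
#!/bin/sh
# Hypothetical imitation of the resolver's directory walk.

# list_artifacts ROOT
# Prints "relative/path<TAB>size-in-bytes" for every file under ROOT,
# sorted by path. Prints nothing if ROOT does not exist.
list_artifacts() {
  root=$1
  [ -d "$root" ] || return 0
  (
    cd "$root" || exit 1
    # Assumption: metadata under .primehub is not shown to the user.
    find . -type f ! -path './.primehub/*' | sed 's|^\./||' | sort |
      while IFS= read -r name; do
        size=$(($(wc -c < "$name")))
        printf '%s\t%s\n' "$name" "$size"
      done
  )
}
```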
Delete outdated artifacts folder manually
Provide a GraphQL API to delete outdated artifact folders manually:
mutation { cleanupPhJobArtifact }
Client
- Add an Artifacts tab to the job detail page.
- The tab is only enabled when the job is completed.
- When the tab is clicked, query the GraphQL artifact resource.
- Show the artifact list. It should include the path and file size.
- Click the path to download the artifact.