Jobs

Job

DFControl.Jobs.Job - Type
Job(name::String, structure::Structure;
      calculations      ::Vector{Calculation} = Calculation[],
      dir               ::String = pwd(),
      header            ::Vector{String} = getdefault_jobheader(),
      metadata          ::Dict = Dict(),
      version           ::Int = last_job_version(dir),
      copy_temp_folders ::Bool = false, 
      server            ::String = getdefault_server(),
      environment       ::String = "")

A Job embodies a set of Calculations to be run in directory dir, with the Structure as the subject.

Keywords/further attributes

  • calculations: the Calculations that will be run sequentially.
  • dir: the directory where the calculations will be run.
  • header: lines that will be pasted at the head of the job script, e.g. exports such as export OMP_NUM_THREADS=1, Slurm settings such as #SBATCH lines, etc.
  • metadata: various additional information, will be saved in .metadata.jld2 in the dir.
  • version: the current version of the job.
  • copy_temp_folders: whether or not the temporary directory associated with intermediate calculation results should be copied when storing a job version. CAUTION: these can be quite large.
  • server: Server where to run the Job.
  • environment: Environment to be used for running the Job.

Job(job_name::String, structure::Structure, calculations::Vector{<:Calculation}, common_flags::Pair{Symbol, <:Any}...; kwargs...)

Creates a new job. An attempt will be made to set the common flags in each of the calculations. The kwargs... are passed to the Job constructor.

Job(job_dir::String, job_script="job.sh"; version=nothing, kwargs...)

Loads the job in job_dir. If job_dir is not a valid job path, the previously saved jobs will be scanned for a job whose dir partly includes job_dir. If version is specified, the corresponding job version will be returned if it exists. The kwargs... will be passed to the Job constructor.

DFControl.Database.load - Method
load(server::Server, j::Job)

Tries to load the Job from server at directory j.dir. If no exactly matching directory is found, a list of job directories that contain j.dir will be returned.

Interacting with calculations

Base.getindex - Method
getindex(job::Job, name::String)

Returns the Calculation with the specified name.

getindex(job::Job, i::Integer)

Returns the i'th Calculation in the job.

Example:

julia> job["scf"]
Calculation{QE}:
name  = scf
exec = pw.x
run   = false
data  = [:k_points]
flags:
  &control
    calculation => scf

  &system
    ecutwfc => 20.0

  &electrons
    conv_thr => 1.0e-6

julia> job[2]
Calculation{QE}:
name  = bands
exec = pw.x
run   = true
data  = [:k_points]
flags:
  &control
    verbosity   => high
    calculation => bands

  &system
    ecutwfc => 20.0

  &electrons
    conv_thr => 1.0e-6

Base.push! - Method
push!(job::Job, calculation::Calculation) = push!(job.calculations, calculation)

Base.append! - Method
append!(job::Job, args...) = append!(job.calculations, args...)

Base.insert! - Method
insert!(job::Job, i::Int, calculation::Calculation) = insert!(job.calculations, i, calculation)
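
The three definitions above simply forward to the job's calculations vector. A minimal sketch of that forwarding pattern, using a hypothetical ToyJob stand-in that holds plain strings instead of real Calculation objects (neither ToyJob nor the calculation names below come from DFControl):

```julia
# Hypothetical stand-in for a job: only the `calculations` field matters here.
struct ToyJob
    calculations::Vector{String}
end

# Forward the Base collection methods to the wrapped vector,
# mirroring the one-line definitions listed above.
Base.push!(job::ToyJob, calc::String) = push!(job.calculations, calc)
Base.append!(job::ToyJob, args...) = append!(job.calculations, args...)
Base.insert!(job::ToyJob, i::Int, calc::String) = insert!(job.calculations, i, calc)

job = ToyJob(["scf"])
push!(job, "bands")             # add one entry at the end
append!(job, ["nscf", "dos"])   # add several entries at once
insert!(job, 2, "vc-relax")     # place before the former second entry

println(job.calculations)
```

Because the methods forward directly, they return the underlying vector rather than the job itself.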

Scheduling, submission and monitoring

DFControl.Jobs.set_flow! - Function
set_flow!(job::Job, should_runs::Pair{String, Bool}...)

Sets whether or not calculations should be scheduled to run. The pairs in should_runs are applied in order: for each pair, every calculation in the job whose name contains the pair's string gets its calculation.run set to the pair's Bool.

Example:

set_flow!(job, "" => false, "scf" => true)

would un-schedule all calculations in the job, and schedule the "scf" and "nscf" calculations to run.
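
The outcome of this example is consistent with a substring test on the calculation names: the empty string is contained in every name, and "scf" is contained in both "scf" and "nscf". A minimal sketch under that assumption, with a hypothetical ToyCalc type and toy_set_flow! function standing in for the real ones:

```julia
# Hypothetical stand-in for a calculation: a name and a run flag.
mutable struct ToyCalc
    name::String
    run::Bool
end

# Apply each pair in order. `occursin("", name)` is true for every name,
# so an empty pattern acts as "all calculations".
function toy_set_flow!(calcs::Vector{ToyCalc}, should_runs::Pair{String,Bool}...)
    for (pattern, should_run) in should_runs
        for c in calcs
            occursin(pattern, c.name) && (c.run = should_run)
        end
    end
    return calcs
end

calcs = [ToyCalc(n, true) for n in ["scf", "bands", "nscf"]]
toy_set_flow!(calcs, "" => false, "scf" => true)

for c in calcs
    println(c.name, " => ", c.run)
end
```

Since later pairs overwrite earlier ones, the order of the pairs matters: reversing the two pairs here would leave everything un-scheduled.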

DFControl.Database.save - Method
save(job::Job)

Saves the job's calculations and job.sh submission script in job.dir. Some sanity checks will be performed on the validity of flags, execs, pseudopotentials, etc. The job will also be registered for easy retrieval at a later stage.

If a previous job is present in the job directory (indicated by a valid job script), it will be copied to the .versions subdirectory as the previous version of job, and the version of job will be incremented.

DFControl.Client.isrunning - Function
isrunning(job::Job)
isrunning(s::Server, jobdir::String)

Returns whether a job is running or not. If the job was submitted using Slurm, a QUEUED status also counts as running.

DFControl.Client.abort - Function
abort(job::Job)
abort(server::Server, dir::String)

Will try to remove the job from the scheduler's queue. If the last running calculation happened to be a Calculation{QE}, the correct abort file will be written. For other codes the process is not smooth, and restarting is not guaranteed.

Directories

Base.Filesystem.joinpath - Method
joinpath(job::Job, args...)

If the job is local this is joinpath(job.dir, args...), otherwise it will resolve the path using the Server rootdir.

Base.Filesystem.abspath - Method
abspath(job::Job, args...)

If the job is local this is abspath(job.dir), otherwise it will resolve the abspath using the Server rootdir.

Registry

All Jobs are stored in an internal registry the first time save(job) is called. This means that finding all previously worked-on Jobs is as straightforward as calling load(server, Job(fuzzy)), where fuzzy is a part of the previously saved Job dir. This will then return a list of Jobs with similar directories.

Versioning

As previously mentioned, a rudimentary Job versioning system is implemented. Upon calling save on a Job, if a valid job script is already present in job.dir, it is assumed to belong to a previous version of the job, and the script together with all other files in job.dir will be copied to a subdirectory of the .versions directory named after that previous job version. After this, job.version will be incremented by 1, signalling the new version of the current Job.

The virtue of this system is that it is possible to roll back to a previous version after possibly making breaking changes, or to pull out previous results after further experimentation was performed.
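
The copy-then-increment scheme can be sketched with the standard library alone. This is an illustration of the idea under simplifying assumptions, not DFControl's implementation: toy_save_version and the file names are hypothetical, and every entry in the directory is copied regardless of copy_temp_folders.

```julia
# Hypothetical helper: copy the current contents of `dir` into
# `.versions/<version>` and return the next version number.
function toy_save_version(dir::String, version::Int; script::String = "job.sh")
    # Without an existing job script there is no previous version to keep.
    isfile(joinpath(dir, script)) || return version
    target = joinpath(dir, ".versions", string(version))
    mkpath(target)
    for entry in readdir(dir)
        entry == ".versions" && continue   # never copy the store into itself
        cp(joinpath(dir, entry), joinpath(target, entry); force = true)
    end
    return version + 1
end

# Build a throwaway job directory with a script and one input file.
dir = mktempdir()
write(joinpath(dir, "job.sh"), "#!/bin/bash\n")
write(joinpath(dir, "scf.in"), "&control\n/\n")

version = toy_save_version(dir, 1)
println(version)
println(sort(readdir(joinpath(dir, ".versions", "1"))))
```

Rolling back then amounts to copying the contents of the chosen version directory back into the job directory.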

Note

If job.copy_temp_folders=true all possible intermediate files inside the temporary calculation directory (i.e. "job_dir/outputs") will be copied every time the job is saved. These can be quite large and can quickly create very large job directories. Handle with care!

Archiving

After a Job is completed, or an interesting result is achieved, it makes sense to store it for future reference. This can be achieved through the archive function. This will take the current job, and copy it to a subdirectory (specified by the second argument to archive) of the jobs/archived directory inside the DFControl config directory. The third argument is a description of this job's result.

Note

To avoid huge file transfers, all temporary directories are removed before archiving.

Example:

archive(job, "test_archived_job", "This is a test archived job")

To query previously archived jobs one can use load(Server("localhost"), Job("archived")).

DFControl.Client.archive - Function
archive(job::Job, archive_directory::AbstractString, description::String=""; present = nothing, version=job.version)

Archives job by copying its contents to archive_directory alongside a results.jld2 file with all the parseable results as a Dict. description will be saved in a description.txt file in the archive_directory.

Output

DFControl.Client.outputdata - Function
outputdata(job::Job; server = job.server, calcs::Vector{String}=String[])

Finds the output files for each of the calculations of a Job, and groups all the parsed data into a dictionary.

DFControl.Client.readbands - Function
readbands(job::Job, outdat=outputdata(job))

Tries to read the bands from a bands calculation that is present in job.

DFControl.bandgap - Function
bandgap(bands::AbstractVector{Band}, fermi=0.0)

Calculates the bandgap (possibly indirect) around the Fermi level.

bandgap(job::Job, fermi=nothing)

Calculates the bandgap (possibly indirect) around the Fermi level. Uses the first bands calculation found; if there is none, it uses the first nscf calculation found.

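To make "possibly indirect" concrete: the gap is the distance between the highest energy at or below the Fermi level and the lowest energy above it, and these two extrema may sit at different k-points. The sketch below works on plain vectors of eigenvalues rather than DFControl's Band type; toy_bandgap and the numbers are hypothetical, and returning 0.0 for a band that crosses the Fermi level is a choice made for this illustration only.

```julia
# Each inner vector holds one band's eigenvalues (in eV) along the k-path.
function toy_bandgap(bands::Vector{Vector{Float64}}, fermi::Float64 = 0.0)
    valence    = filter(b -> maximum(b) <= fermi, bands)
    conduction = filter(b -> minimum(b) >  fermi, bands)
    # A band crossing the Fermi level, or a missing side, means no gap here.
    if length(valence) + length(conduction) != length(bands) ||
       isempty(valence) || isempty(conduction)
        return 0.0
    end
    vbm = maximum(maximum, valence)      # valence band maximum
    cbm = minimum(minimum, conduction)   # conduction band minimum
    return cbm - vbm
end

bands = [[-1.25, -0.75, -0.5],   # valence band, maximum at the third k-point
         [ 1.5,   1.0,   1.25]]  # conduction band, minimum at the second k-point

println(toy_bandgap(bands))
```

Here the valence maximum and conduction minimum lie at different k-points, so the value printed is an indirect gap.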

Environments

Environments specify the skeleton of the job script, i.e. which environment variables need to be set, which scheduler flags, etc.

DFControl.Jobs.Environment - Type
Environment(MPI_command::String, scheduler_flags::Vector{String}, exports::Vector{String})

Environment to run a Job in. When running on a server with a scheduler, scheduler_flags holds the flags to pass to it, e.g. #SBATCH -N 2. MPI_command will be used to prefix executables that should be run in parallel, e.g. if MPI_command = "mpirun -np 4 --bind-to core", a parallel executable will be translated into a script line as mpirun -np 4 --bind-to core exec.

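As a rough picture of how the three fields could end up in a job script, the sketch below assembles the lines with plain string handling. toy_job_script is hypothetical, and two details are assumptions rather than documented behaviour: that exports are stored without the export keyword, and that the script starts with a bash shebang.

```julia
# Hypothetical helper: assemble job script lines from the Environment fields.
function toy_job_script(mpi_command::String, scheduler_flags::Vector{String},
                        exports::Vector{String}, execs::Vector{String})
    lines = ["#!/bin/bash"]
    append!(lines, scheduler_flags)                      # e.g. "#SBATCH -N 2"
    append!(lines, ["export $e" for e in exports])       # environment variables
    append!(lines, ["$mpi_command $x" for x in execs])   # prefix parallel executables
    return join(lines, "\n")
end

script = toy_job_script("mpirun -np 4 --bind-to core",
                        ["#SBATCH -N 2"],
                        ["OMP_NUM_THREADS=1"],
                        ["pw.x"])
println(script)
```

A real job script line would also carry the input and output redirection for each executable, which is left out here.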