Commit 90bda02b authored by Michael Langguth

Merge branch 'michael_issue#028_finetune_usage_template_runscripts' into develop

parents 9d1a404f 25db567e
Pipeline #47423 failed
...@@ -64,19 +64,26 @@ whose order is the following: ...@@ -64,19 +64,26 @@ whose order is the following:
``` ```
### Create additional runscripts ### ### Create additional runscripts ###
In case that you want to perform experiments with varying configuration (e.g. another region of interest), In case that you want to perform experiments with varying configuration (e.g. another set of hyperparameters, but still the same input dataset for training), it is convenient to create individual runscripts from the templates.
it is convenient to create individual runscripts from the templates.
This can be done with the help of `generate_workflow_runscripts.sh`. This can be done with the help of `generate_workflow_runscripts.sh`.
The first argument `<runscript_name>` defines the (relative) path to the template runscript The first argument `<runscript_name>` defines the (relative) path to the template runscript
which should be converted to an executable one. Note that only the suffix of the which should be converted to an executable one. Note that only the suffix of the
template's name must be passed, e.g. `../HPC_scripts/train_era5` in order to create template's name must be passed, e.g. `../HPC_scripts/train_era5` in order to create
a runscript for the training substep. a runscript for the training substep.
The second argument `<exp_id>` denotes again the experiment identifier. If this argument is omitted, The second argument `<venv_name>` denotes the name of the virtual environment which has to be set up in advance and which should be used by the runscript.
the default value `exp1` is used which might conflict the step where the virtual environment itself
is set up. Additional optional arguments can be passed to control the experimental identifier and to manually set the relative path to the
directory where the preprocessed data is stored (used for the training and postprocessing substep). These optional arguments have to follow a naming convention in order to be identified by `generate_workflow_runscripts.sh`.
The experimental identifier can be passed by adding `-exp_id=<id>`, while the path to the preprocessed data requires passing `-exp_dir=<relative_path_to_dir>`. Note that the default value `exp1` is used as experimental identifier if `-exp_id=<id>` is omitted.
``` bash
./generate_workflow_runscripts.sh <runscript_name> <venv_name> [-exp_id=<id>] [-exp_dir=<relative_path_to_dir>]
```
*Specific example:*
``` bash ``` bash
./generate_workflow_runscripts.sh <runscript_name> [<exp_id>] ./generate_workflow_runscripts.sh train_era5 venv_juwels -exp_id=exp_test -exp_dir=era5-Y2010toY2222M01to12-160x128-2970N1500W-T2_MSL_gph500
``` ```
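The argument convention described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: `parse_runscript_args` and its variable names are hypothetical, and the only behaviour taken from the commit is the default identifier `exp1` and the target name `<runscript_name>_<exp_id>.sh`.

```bash
#!/usr/bin/env bash
# Sketch only: parse_runscript_args is a hypothetical helper that mimics the
# documented calling convention of generate_workflow_runscripts.sh.
parse_runscript_args() {
  if [ "$#" -lt 2 ]; then
    echo "ERROR: pass <runscript_name> and <venv_name>" >&2
    return 1
  fi
  runscript_name=$1   # path to the template without '_template.sh'
  venv_name=$2        # name of the virtual environment
  exp_id="exp1"       # documented default if -exp_id=<id> is omitted
  exp_dir=""          # empty means: leave the data paths untouched
  shift 2
  # the remaining arguments are treated as non-positional options
  for arg in "$@"; do
    case "$arg" in
      -exp_id=*)  exp_id=${arg#-exp_id=} ;;
      -exp_dir=*) exp_dir=${arg#-exp_dir=} ;;
    esac
  done
  target_script="${runscript_name}_${exp_id}.sh"
}

parse_runscript_args ../HPC_scripts/train_era5 venv_juwels -exp_id=exp_test
echo "${target_script}"   # prints ../HPC_scripts/train_era5_exp_test.sh
```

Because the options are matched by their prefix, `-exp_id=` and `-exp_dir=` may appear in either order after the two positional arguments.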
### Output folder structure and naming convention ### Output folder structure and naming convention
......
...@@ -35,9 +35,9 @@ if [ -z ${VIRTUAL_ENV} ]; then ...@@ -35,9 +35,9 @@ if [ -z ${VIRTUAL_ENV} ]; then
fi fi
# declare directory-variables which will be modified appropriately during Preprocessing (invoked by mpi_split_data_multi_years.py) # declare directory-variables which will be modified appropriately during Preprocessing (invoked by mpi_split_data_multi_years.py)
source_dir=/p/scratch/deepacf/video_prediction_shared_folder/preprocessedData/ source_dir=/p/project/deepacf/deeprain/video_prediction_shared_folder/preprocessedData/
checkpoint_dir=/p/scratch/deepacf/video_prediction_shared_folder/models/ checkpoint_dir=/p/project/deepacf/deeprain/video_prediction_shared_folder/models/
results_dir=/p/scratch/deepacf/video_prediction_shared_folder/results/ results_dir=/p/project/deepacf/deeprain/video_prediction_shared_folder/results/
# name of model # name of model
model=convLSTM model=convLSTM
......
...@@ -34,18 +34,16 @@ if [ -z ${VIRTUAL_ENV} ]; then ...@@ -34,18 +34,16 @@ if [ -z ${VIRTUAL_ENV} ]; then
fi fi
fi fi
# declare directory-variables which will be modified appropriately during Preprocessing (invoked by mpi_split_data_multi_years.py) # declare directory-variables which will be modified appropriately during Preprocessing (invoked by mpi_split_data_multi_years.py)
source_dir=/p/project/deepacf/deeprain/video_prediction_shared_folder/preprocessedData/ source_dir=/p/project/deepacf/deeprain/video_prediction_shared_folder/preprocessedData/
destination_dir=/p/project/deepacf/deeprain/video_prediction_shared_folder/models/ destination_dir=/p/project/deepacf/deeprain/video_prediction_shared_folder/models/
# for choosing the model for choosing the model, convLSTM,savp, mcnet,vae # valid identifiers for model-argument are: convLSTM, savp, mcnet and vae
model=convLSTM model=convLSTM
model_hparams=../hparams/era5/${model}/model_hparams.json model_hparams=../hparams/era5/${model}/model_hparams.json
destination_dir=${destination_dir}/${model}/"$(date +"%Y%m%dT%H%M")_"$USER"/"
# rund training # run training
srun python ../scripts/train_dummy.py --input_dir ${source_dir}/tfrecords/ --dataset era5 --model ${model} --model_hparams_dict ${model_hparams} --output_dir ${destination_dir}/${model}/ srun python ../scripts/train_dummy.py --input_dir ${source_dir}/tfrecords/ --dataset era5 --model ${model} --model_hparams_dict ${model_hparams} --output_dir ${destination_dir}
...@@ -9,10 +9,11 @@ exit 99 ...@@ -9,10 +9,11 @@ exit 99
source_dir=/home/${USER}/preprocessedData/ source_dir=/home/${USER}/preprocessedData/
destination_dir=/home/${USER}/models/ destination_dir=/home/${USER}/models/
# for choosing the model # valid identifiers for model-argument are: convLSTM, savp, mcnet and vae
model=mcnet model=mcnet
model_hparams=../hparams/era5/model_hparams.json model_hparams=../hparams/era5/model_hparams.json
destination_dir=${destination_dir}/${model}/"$(date +"%Y%m%dT%H%M")_"$USER"/"
# run training
python ../scripts/train_dummy.py --input_dir ${source_dir}/tfrecords/ --dataset era5 --model ${model} --model_hparams_dict ${model_hparams} --output_dir ${destination_dir}
# execute respective Python-script
python ../scripts/train_dummy.py --input_dir ${source_dir}/tfrecords/ --dataset era5 --model ${model} --model_hparams_dict ${model_hparams} --output_dir ${destination_dir}/${model}/
#srun python scripts/train.py --input_dir data/era5 --dataset era5 --model savp --model_hparams_dict hparams/kth/ours_savp/model_hparams.json --output_dir logs/era5/ours_savp
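The training templates above now append the model name plus a time stamp and the user name to `destination_dir`. A minimal sketch of that path composition is given below; `build_output_dir` is a hypothetical helper, and the stripping of the trailing slash and the fallback for an unset `USER` are assumptions added for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: build_output_dir is a hypothetical helper that reproduces the
# pattern <base>/<model>/<YYYYmmddTHHMM>_<user>/ used in the templates.
build_output_dir() {
  local base_dir=${1%/}   # assumption: drop one trailing slash to avoid '//'
  local model=$2
  local stamp
  stamp=$(date +"%Y%m%dT%H%M")
  # assumption: fall back to 'unknown' if USER is not set in the environment
  printf '%s/%s/%s_%s/\n' "$base_dir" "$model" "$stamp" "${USER:-unknown}"
}

build_output_dir /home/someuser/models/ convLSTM
```

Since the time stamp has minute resolution, two runs of the same user and model started within the same minute would share one output directory.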
...@@ -6,8 +6,8 @@ ...@@ -6,8 +6,8 @@
# **************** Description **************** # **************** Description ****************
# This script can be used for setting up the virtual environment needed for ambs-project # This script can be used for setting up the virtual environment needed for ambs-project
# or to simply activate it. # or to simply activate it.
# In the former case, it also converts the (Batch) script templates to executable runscripts. # It also converts the (Batch) runscript templates to executable runscripts.
# Note, that you may pass an experiment identifier as second argument to this runscript # Note, that you may pass an experiment identifier as second argument (default 'exp1') to this runscript
# which will also be used as suffix in the executable runscripts. # which will also be used as suffix in the executable runscripts.
# **************** Description **************** # **************** Description ****************
# #
...@@ -32,7 +32,7 @@ fi ...@@ -32,7 +32,7 @@ fi
# list of (Batch) scripts used for the steps in the workflow # list of (Batch) scripts used for the steps in the workflow
# !!! Expects that a template named [script_name]_template.sh exists!!! # !!! Expects that a template named [script_name]_template.sh exists!!!
workflow_scripts=(DataExtraction DataPreprocess DataPreprocess2tf train_era5 generate_era5 DatePreprocess2tf_movingmnist train_movingmnist generate_movingmnist) workflow_scripts=(DataExtraction DataPreprocess DataPreprocess2tf train_era5 generate_era5 DataPreprocess2tf_movingmnist train_movingmnist generate_movingmnist)
HOST_NAME=`hostname` HOST_NAME=`hostname`
ENV_NAME=$1 ENV_NAME=$1
...@@ -46,7 +46,7 @@ ENV_DIR=${WORKING_DIR}/${ENV_NAME} ...@@ -46,7 +46,7 @@ ENV_DIR=${WORKING_DIR}/${ENV_NAME}
# * check if virtual env has already been set up # * check if virtual env has already been set up
if [[ "${EXE_DIR}" != "env_setup" ]]; then if [[ "${EXE_DIR}" != "env_setup" ]]; then
echo "ERROR: The setup-script for the virtual environment from the env_setup-directory!" echo "ERROR: Execute 'create_env.sh' from the env_setup-subdirectory only!"
return return
fi fi
...@@ -119,6 +119,12 @@ if [[ "$ENV_EXIST" == 0 ]]; then ...@@ -119,6 +119,12 @@ if [[ "$ENV_EXIST" == 0 ]]; then
if [[ "${HOST_NAME}" == hdfml* || "${HOST_NAME}" == juwels* ]]; then if [[ "${HOST_NAME}" == hdfml* || "${HOST_NAME}" == juwels* ]]; then
echo "export PYTHONPATH=${ENV_DIR}/lib/python3.6/site-packages:\$PYTHONPATH" >> ${activate_virt_env} echo "export PYTHONPATH=${ENV_DIR}/lib/python3.6/site-packages:\$PYTHONPATH" >> ${activate_virt_env}
fi fi
done
elif [[ "$ENV_EXIST" == 1 ]]; then
# activating virtual env is sufficient
source ${ENV_DIR}/bin/activate
fi
# Finish by creating runscripts
# After checking and setting up the virt env, create user-specific runscripts for all steps of the workflow # After checking and setting up the virt env, create user-specific runscripts for all steps of the workflow
if [[ "${HOST_NAME}" == hdfml* || "${HOST_NAME}" == juwels* ]]; then if [[ "${HOST_NAME}" == hdfml* || "${HOST_NAME}" == juwels* ]]; then
echo "***** Creating Batch-scripts for running workflow... *****" echo "***** Creating Batch-scripts for running workflow... *****"
...@@ -131,14 +137,9 @@ if [[ "$ENV_EXIST" == 0 ]]; then ...@@ -131,14 +137,9 @@ if [[ "$ENV_EXIST" == 0 ]]; then
for wf_script in "${workflow_scripts[@]}"; do for wf_script in "${workflow_scripts[@]}"; do
curr_script=${script_dir}/${wf_script} curr_script=${script_dir}/${wf_script}
if [[ -z "${exp_id}" ]]; then if [[ -z "${exp_id}" ]]; then
./generate_workflow_runscripts.sh ${curr_script} ./generate_workflow_runscripts.sh ${curr_script} ${ENV_NAME}
else else
./generate_workflow_runscripts.sh ${curr_script} ${exp_id} ./generate_workflow_runscripts.sh ${curr_script} ${ENV_NAME} -exp_id=${exp_id}
fi
done
# *** finished ***
elif [[ "$ENV_EXIST" == 1 ]]; then
# activating virtual env is sufficient
source ${ENV_DIR}/bin/activate
fi fi
#!/usr/bin/env bash #!/usr/bin/env bash
#
# __authors__ = Michael Langguth
# __date__ = '2020_09_29'
#
# **************** Description **************** # **************** Description ****************
# Converts given template workflow script (path has to be passed as first argument) to # Converts a given template workflow script (path/name has to be passed as first argument) to
# an executable workflow (Batch) script. # an executable workflow (Batch) script.
# Note, that this first argument has to be passed with "_template.sh" omitted! # Note, that the first argument has to be passed with "_template.sh" omitted!
# A second argument can be passed to set an experiment identifier whose default is exp1. # The second argument denotes the name of the virtual environment to be used.
# Note, that the second argument can be omitted only if there are no existing (Batch) scritps # Additionally, -exp_id=[some_id] and -exp_dir=[some_dir] can be optionally passed as NON-POSITIONAL arguments.
# carrying this identifier which is added as a suffix. # -exp_id allows to set an experimental identifier explicitly (default is -exp_id=exp1) while
# Example: # -exp_dir allows setting manually the experimental directory.
# ./generate_workflow_scripts.sh ../HPC_scripts/generate exp5 # Note, that the latter is done during the preprocessing step in an end-to-end workflow.
# ... will convert generate_template.sh to generate_exp5.sh # However, if the preprocessing step can be skipped (i.e. preprocessed data already exists),
# one may wish to set the experimental directory explicitly
#
# Examples:
# ./generate_workflow_runscripts.sh ../HPC_scripts/generate_era5 venv_hdfml -exp_id=exp5
# ... will convert generate_era5_template.sh to generate_era5_exp5.sh where
# venv_hdfml is the virtual environment for operation.
#
# ./generate_workflow_runscripts.sh ../HPC_scripts/generate_era5 venv_hdfml -exp_id=exp5 -exp_dir=testdata
# ... does the same as the previous example, but additionally extends source_dir=[...]/preprocessedData/,
# checkpoint_dir=[...]/models/ and results_dir=[...]/results/ by testdata/
# **************** Description **************** # **************** Description ****************
# #
# **************** Auxiliary functions ****************
check_argin() {
# Handle input arguments and check if one of them holds -exp_id=
# or -exp_dir= to emulate them as non-positional arguments
for argin in "$@"; do
if [[ $argin == *"-exp_id="* ]]; then
exp_id=${argin#"-exp_id="}
elif [[ $argin == *"-exp_dir="* ]]; then
exp_dir=${argin#"-exp_dir="}
fi
done
}
add_exp_dir() {
# Add exp_dir to paths in <target_script> which end with /<prefix>/
prefix=$1
if [[ `grep "/${prefix}/$" ${target_script}` ]]; then
echo "Add experimental directory after '${prefix}/' in runscript '${target_script}'"
sed -i "s|/${prefix}/$|/${prefix}/${exp_dir}/|g" ${target_script}
status=1
fi
}
# **************** Auxiliary functions ****************
HOST_NAME=`hostname` HOST_NAME=`hostname`
BASE_DIR=`pwd`
WORKING_DIR="$(dirname "$BASE_DIR")"
EXE_DIR="$(basename "$BASE_DIR")"
### some sanity checks ### ### Some sanity checks ###
# ensure that the script is executed from the env_setup-subdirectory
if [[ "${EXE_DIR}" != "env_setup" ]]; then
echo "ERROR: Execute 'generate_workflow_runscripts.sh' from the env_setup-subdirectory only!"
exit 1
fi
# check input arguments # check input arguments
if [[ "$#" -lt 1 ]]; then if [[ "$#" -lt 2 ]]; then
echo "ERROR: Pass path to workflow runscript (without '_template.sh') to be generated..." echo "ERROR: Pass path to workflow runscript (without '_template.sh') and pass name of virtual environment..."
echo "Example: ./generate_workflow_runscripts.sh ../HPC_scripts/DataExtraction venv_hdfml"
exit 1 exit 1
else else
curr_script=$1 curr_script=$1
if [[ "$#" -gt 1 ]]; then curr_script_loc="$(basename "$curr_script")"
exp_id=$2 curr_venv=$2
else # check if any known non-positional argument is present...
if [[ "$#" -gt 2 ]]; then
check_argin "${@:3}"
fi
#...and ensure that exp_id is always set
if [[ -z "${exp_id}" ]]; then
exp_id="exp1" exp_id="exp1"
fi fi
fi fi
...@@ -34,19 +85,27 @@ if ! [[ -f ${curr_script}_template.sh ]]; then ...@@ -34,19 +85,27 @@ if ! [[ -f ${curr_script}_template.sh ]]; then
echo "Thus, no corresponding executable script is created!" echo "Thus, no corresponding executable script is created!"
if [[ ${curr_script} == *"template"* || ${curr_script} == *".sh"* ]]; then if [[ ${curr_script} == *"template"* || ${curr_script} == *".sh"* ]]; then
echo "ERROR: Omit '_template' and/or '.sh' from Bash script argument." echo "ERROR: Omit '_template' and/or '.sh' from Bash script argument."
exit 1 exit 2
else else
exit 0 # still ok, i.e. only a WARNING is raised exit 0 # still ok, i.e. only a WARNING is raised
fi fi
fi fi
# check if target script is unique
# Check existence of virtual environment (2nd argument)
if [[ ! -d ${WORKING_DIR}/${curr_venv} ]]; then
echo "ERROR: Could not find directory of virtual environment under ${WORKING_DIR}/${curr_venv} "
exit 3
fi
# Check if target script is unique
target_script=${curr_script}_${exp_id}.sh target_script=${curr_script}_${exp_id}.sh
if [[ -f ${target_script} ]]; then if [[ -f ${target_script} ]]; then
echo "ERROR: ${target_script} already exists." echo "ERROR: ${target_script} already exists."
echo "Set explicitly a different experiment identifier." echo "Set explicitly a different experiment identifier."
exit 1 exit 4
fi fi
### do the work ###
### Do the work ###
# create copy of template which is modified subsequently # create copy of template which is modified subsequently
cp ${curr_script}_template.sh ${target_script} cp ${curr_script}_template.sh ${target_script}
# remove template identifiers # remove template identifiers
...@@ -56,10 +115,24 @@ line_e=`echo ${num_lines} | cut -d' ' -f 2` ...@@ -56,10 +115,24 @@ line_e=`echo ${num_lines} | cut -d' ' -f 2`
if [[ ${line_s} == "" || ${line_e} == "" ]]; then if [[ ${line_s} == "" || ${line_e} == "" ]]; then
echo "ERROR: ${curr_script}_template.sh exists, but does not seem to be a valid template script." echo "ERROR: ${curr_script}_template.sh exists, but does not seem to be a valid template script."
rm ${target_script} # remove copy again rm ${target_script} # remove copy again
exit 1 exit 5
else else
sed -i "${line_s},${line_e}d" ${target_script} sed -i "${line_s},${line_e}d" ${target_script}
fi fi
# set virtual environment to be used in Batch scripts
if [[ `grep "VIRT_ENV_NAME=" ${target_script}` ]]; then
sed -i 's/VIRT_ENV_NAME=.*/VIRT_ENV_NAME="'${curr_venv}'"/g' ${target_script}
fi
# also adapt name output- and error-files of submitted job with exp_id (if we are on Juwels or HDF-ML)
if [[ `grep "#SBATCH --output=" ${target_script}` ]]; then
sed -i "s|#SBATCH --output=.*|#SBATCH --output=${curr_script_loc}_${exp_id}-out\.%j|g" ${target_script}
fi
if [[ `grep "#SBATCH --error=" ${target_script}` ]]; then
sed -i "s|#SBATCH --error=.*|#SBATCH --error=${curr_script_loc}_${exp_id}-err\.%j|g" ${target_script}
fi
# set exp_id in (Batch) script if present # set exp_id in (Batch) script if present
if [[ `grep "exp_id=" ${target_script}` ]]; then if [[ `grep "exp_id=" ${target_script}` ]]; then
sed -i "s/exp_id=.*/exp_id=$exp_id/g" ${target_script} sed -i "s/exp_id=.*/exp_id=$exp_id/g" ${target_script}
...@@ -75,4 +148,18 @@ if [[ "${HOST_NAME}" == hdfml* || "${HOST_NAME}" == juwels* ]]; then ...@@ -75,4 +148,18 @@ if [[ "${HOST_NAME}" == hdfml* || "${HOST_NAME}" == juwels* ]]; then
sed -i "s/--mail-user=.*/--mail-user=$USER_EMAIL/g" ${target_script} sed -i "s/--mail-user=.*/--mail-user=$USER_EMAIL/g" ${target_script}
fi fi
# finally set experimental directory if exp_dir is present
if [[ ! -z "${exp_dir}" ]]; then
status=0 # status to check if exp_dir is added to the runscript at hand
# -> will be set to one by add_exp_dir if modification takes place
add_exp_dir preprocessedData
add_exp_dir models
add_exp_dir results
if [[ ${status} == 0 ]]; then
echo "WARNING: -exp_dir has been passed, but no addition to any path in runscript at hand done..."
fi
fi
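The effect of `-exp_dir` on the paths of a runscript can be tried out in isolation with the sketch below. It is an illustration under assumptions rather than the committed code: `append_exp_dir` is a hypothetical stand-alone variant of `add_exp_dir` that takes the file, prefix and directory as explicit arguments and avoids `sed -i` so that it does not depend on GNU sed.

```bash
#!/usr/bin/env bash
# Sketch only: append_exp_dir is a hypothetical variant of add_exp_dir.
# It extends every line of <file> that ends with /<prefix>/ by <exp_dir>/
# and returns 0 if such a line was found, 1 otherwise.
append_exp_dir() {
  local file=$1 prefix=$2 exp_dir=$3
  if grep -q "/${prefix}/\$" "$file"; then
    # the trailing '$' anchors the match at the end of the line
    sed "s|/${prefix}/\$|/${prefix}/${exp_dir}/|" "$file" > "${file}.tmp" \
      && cat "${file}.tmp" > "$file" \
      && rm -f "${file}.tmp"
    return 0
  fi
  return 1
}

# usage on a throw-away file that mimics two lines of a runscript
demo_script=$(mktemp)
printf 'source_dir=/data/preprocessedData/\nmodel=convLSTM\n' > "$demo_script"
append_exp_dir "$demo_script" preprocessedData testdata
cat "$demo_script"
rm -f "$demo_script"
```

Only paths that end exactly with `/<prefix>/` are changed, so running the function a second time with the same prefix leaves the file untouched and returns 1.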