Implementation of gradient accumulation for SAVP

To allow a larger effective batch size during SAVP training, which we hope will stabilize training, this working branch will implement gradient accumulation with Horovod.
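For reference, the idea behind gradient accumulation can be sketched in plain Python. This is only an illustration on a toy one-parameter regression, not the SAVP or Horovod code: the names `mse_grad`, `accumulated_grad` and `sgd_step` are hypothetical. It assumes equally sized micro-batches, so that averaging their gradients gives the same result as one large batch. With Horovod, the accumulated gradient would additionally be averaged across workers before the update, which is not shown here.

```python
def mse_grad(w, xs, ys):
    """Gradient of the mean squared error w.r.t. w for the toy model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average the gradients of consecutive, equally sized micro-batches.

    No weight update happens between micro-batches, so the result matches
    the gradient of one batch containing all samples.
    """
    if len(xs) % micro_batch_size != 0:
        raise ValueError("batch size must be a multiple of micro_batch_size")
    num_steps = len(xs) // micro_batch_size
    acc = 0.0
    for i in range(num_steps):
        lo = i * micro_batch_size
        hi = lo + micro_batch_size
        acc += mse_grad(w, xs[lo:hi], ys[lo:hi])  # accumulate, do not update w yet
    return acc / num_steps


def sgd_step(w, grad, lr):
    """Single plain SGD update, applied once per accumulated gradient."""
    return w - lr * grad


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
    w = 0.5
    g_full = mse_grad(w, xs, ys)
    g_acc = accumulated_grad(w, xs, ys, micro_batch_size=2)
    print(g_full, g_acc)
    print(sgd_step(w, g_acc, lr=0.01))
```

The memory cost per step is that of one micro-batch, while the update corresponds to the full batch; the trade-off is that each weight update needs several forward and backward passes.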