Hi,
Thanks for your interest in our work.
In our case, we developed the accelerator in RTL. The drivers that handle communication over PCIe between the TensorFlow program on the host CPU and the FPGA accelerator had to be developed manually, then wrapped in Cython so they could be called from the Python (TF) program. On the Xilinx U200 side, I used the integrated PCIe IP block in Vivado.
However, I am aware that Xilinx Vitis and the PYNQ platform provide some high-level support for automating this process. It would definitely be interesting to try those out.
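
To give a rough idea of the host-side flow described above, here is a minimal Python sketch. It is not our actual driver code. It assumes the kernel driver exposes the card as file-like device nodes (one for host-to-card writes, one for card-to-host reads); the class name `PcieChannel`, the method names, and the device paths are all hypothetical. In the real setup this layer was C code wrapped in Cython rather than pure Python.

```python
import os
from array import array

FLOAT_SIZE = array("f").itemsize  # bytes per single-precision float


class PcieChannel:
    """Hypothetical host-side wrapper around PCIe DMA device nodes.

    Assumption: the driver exposes one node for host-to-card (h2c)
    transfers and one for card-to-host (c2h) transfers. The paths are
    placeholders supplied by the caller.
    """

    def __init__(self, h2c_path, c2h_path):
        self.h2c_fd = os.open(h2c_path, os.O_WRONLY)
        self.c2h_fd = os.open(c2h_path, os.O_RDONLY)

    def write(self, values, addr=0):
        """Send floats to the card at byte offset `addr`; return bytes written."""
        payload = array("f", values).tobytes()
        return os.pwrite(self.h2c_fd, payload, addr)

    def read(self, count, addr=0):
        """Read `count` floats back from the card at byte offset `addr`."""
        raw = os.pread(self.c2h_fd, count * FLOAT_SIZE, addr)
        out = array("f")
        out.frombytes(raw)
        return out.tolist()

    def close(self):
        os.close(self.h2c_fd)
        os.close(self.c2h_fd)
```

The TF program would then call something like `ch.write(inputs)` before starting the accelerator and `ch.read(n)` to fetch the results. Passing the same regular file for both paths is enough to try the sketch without any hardware.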