DLTK: State of the Art Reference Implementations for Deep Learning on Medical Images

We present DLTK, a toolkit providing baseline implementations for efficient experimentation with deep learning methods on biomedical images. It builds on TensorFlow, and its high modularity and easy-to-use examples lower the barrier to using state-of-the-art implementations for typical medical imaging problems. A comparison of DLTK's reference implementations of popular network architectures for image segmentation demonstrates new top performance on the publicly available challenge data "Multi-Atlas Labeling Beyond the Cranial Vault". The average test Dice similarity coefficient of 81.5 exceeds both that of the previously best-performing CNN (75.7) and that of the challenge-winning method (79.0).
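
For readers unfamiliar with the reported metric, the Dice similarity coefficient between a predicted and a reference segmentation of one label is 2|P ∩ T| / (|P| + |T|), and an average over labels gives a single score. The following is a minimal NumPy sketch of that computation, not DLTK's own evaluation code: the function names `dice_coefficient` and `mean_dice`, the toy arrays, and the choice to skip labels absent from both volumes are all assumptions made for illustration. The scores above are presumably reported on a 0 to 100 scale, whereas this sketch returns values in [0, 1].

```python
import numpy as np


def dice_coefficient(pred, truth, label):
    """Dice overlap for one label: 2|P & T| / (|P| + |T|)."""
    pred_mask = pred == label
    truth_mask = truth == label
    denom = pred_mask.sum() + truth_mask.sum()
    if denom == 0:
        # Label absent from both volumes: Dice is undefined (assumed convention).
        return float("nan")
    return 2.0 * np.logical_and(pred_mask, truth_mask).sum() / denom


def mean_dice(pred, truth, labels):
    """Average Dice over the given foreground labels, ignoring undefined ones."""
    scores = [dice_coefficient(pred, truth, label) for label in labels]
    return float(np.nanmean(scores))


# Hypothetical 2D label maps (0 = background, 1 and 2 = two organs).
pred = np.array([[0, 1, 1],
                 [0, 2, 2],
                 [0, 0, 2]])
truth = np.array([[0, 1, 1],
                  [0, 1, 2],
                  [0, 0, 2]])

print(mean_dice(pred, truth, labels=[1, 2]))
```

In this toy case each label has two overlapping voxels out of five counted in total, so both per-label scores and their mean are 0.8.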