Imitation Learning (IL) and Reinforcement Learning (RL) each come with distinct strengths and weaknesses. IL is typically sample-efficient in terms of environment interaction and straightforward to implement, but it requires large amounts of expert data and often suffers from distributional shift: small prediction errors compound and drive the policy into states unseen in the demonstrations. In contrast, RL does not rely on expert demonstrations and can learn robust policies through interaction, but it faces challenges such as unstable training dynamics and the difficulty of designing appropriate reward functions. IL has long been applied to the problem of autonomous driving. More recently, RL has also been gaining traction, partly due to the availability of extremely fast simulators and large-scale compute. The goal of this thesis is to investigate the emerging topic of combining both approaches.
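One common way to combine the two approaches is to pretrain a policy by imitation and then fine-tune it with RL. The following is a minimal, self-contained sketch of that idea on a hypothetical one-dimensional task; the task, the expert rule a* = -0.5·s, and the reward r = -(a + s)² are illustrative assumptions, not taken from any driving benchmark. Behavior cloning recovers the expert's weight, and REINFORCE fine-tuning then moves it toward the reward optimum at w = -1, which imitation alone cannot reach.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D task: state s ~ N(0, 1), expert acts as a* = -0.5 * s.
# Linear policy a = w * s with a single learnable weight w.

# --- Stage 1: imitation learning (behavior cloning) on expert demos ---
states = rng.normal(size=200)
expert_actions = -0.5 * states
w = 0.0
for _ in range(100):
    grad = np.mean(2 * (w * states - expert_actions) * states)  # d/dw of MSE loss
    w -= 0.1 * grad
w_bc = w  # ends near the expert's weight, -0.5

# --- Stage 2: RL fine-tuning via REINFORCE ---
# Assumed reward r(s, a) = -(a + s)^2: the reward-optimal weight is w = -1,
# so the cloned policy is suboptimal and interaction must correct it.
sigma = 0.2  # std of the Gaussian exploration policy a ~ N(w * s, sigma^2)
for _ in range(2000):
    s = rng.normal(size=64)
    a = w * s + sigma * rng.normal(size=64)
    r = -(a + s) ** 2
    # Score-function gradient with a batch-mean baseline to reduce variance;
    # d log pi(a|s) / dw = (a - w * s) * s / sigma^2 for the Gaussian policy.
    grad = np.mean((r - r.mean()) * (a - w * s) * s / sigma**2)
    w += 0.05 * grad
# w has now moved from the imitation solution toward the reward optimum -1.
```

The sketch also illustrates the weaknesses mentioned above: stage 1 needs expert data and inherits the expert's suboptimality, while stage 2 needs a hand-designed reward and noisy gradient estimates, but can improve beyond the demonstrations.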