Reinforcement Learning In Partially Observable Markov Decision Processes: Learning To Remember The Past By Learning To Predict The Future