Solving the taxi problem using Q-learning