Reinforcement Learning in a Grid System to Catch a Robber