Autopong

Untrained model moving randomly.

DQN Model after 10,000,000 training frames.

Autopong was my final project for Machine Learning FA2024. I implemented and trained two reinforcement learning (RL) agents on the Atari Pong environment from Gymnasium (the maintained fork of OpenAI Gym). The first agent used a prebuilt Deep Q-Network (DQN) from stable-baselines3, while the second was a custom DQN built from scratch using TensorFlow and OpenCV. The goal was to compare the performance and training dynamics of off-the-shelf vs. homemade models, and to better understand the challenges of developing RL agents. Ultimately, both agents learned to defeat the hardcoded AI opponent, although the stable-baselines3 model outperformed my custom model.

Technical Components:

Architecture: Autopong was built on Gymnasium's ALE/Pong-v5 environment. Both agents were developed and trained in a single Jupyter Notebook, using a CUDA-enabled GPU to accelerate training. The notebook served as an all-in-one interface for model development, training, and evaluation, and eventually supplied the code for the technical essay. All training runs were conducted locally, with each model training for 28-36 hours.

Prebuilt Agent: The prebuilt agent was implemented using stable-baselines3's DQN class with the CnnPolicy, allowing the model to learn directly from raw game frames. The Pong environment was wrapped with Monitor and DummyVecEnv for compatibility and logging.
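A minimal sketch of what a setup like this might look like, not the notebook's actual code. The hyperparameters in DQN_KWARGS, the placeholder timestep count, the save path, and the helper names make_env and train are all assumptions for illustration. The heavy imports are deferred into the functions so that nothing is downloaded or trained just by loading the file.

```python
# Hypothetical hyperparameters; the values used in the project are not stated.
DQN_KWARGS = {
    "policy": "CnnPolicy",     # CNN feature extractor that reads raw frames
    "buffer_size": 10_000,     # kept small: raw 210x160 RGB frames are memory-hungry
    "learning_starts": 1_000,  # collect some experience before the first update
    "verbose": 1,
}


def make_env():
    """Create one Pong environment wrapped with Monitor for episode logging."""
    import gymnasium as gym
    import ale_py  # provides the ALE/* environments
    from stable_baselines3.common.monitor import Monitor

    # Newer Gymnasium releases need explicit registration; older ones
    # register the ALE environments when ale_py is imported.
    if hasattr(gym, "register_envs"):
        gym.register_envs(ale_py)
    return Monitor(gym.make("ALE/Pong-v5"))


def train(total_timesteps=100_000, save_path="dqn_pong"):
    """Train a DQN agent; total_timesteps here is a placeholder value."""
    from stable_baselines3 import DQN
    from stable_baselines3.common.vec_env import DummyVecEnv

    # DummyVecEnv takes a list of environment factories, not environments.
    env = DummyVecEnv([make_env])
    model = DQN(env=env, **DQN_KWARGS)
    model.learn(total_timesteps=total_timesteps)
    model.save(save_path)
    return model


if __name__ == "__main__":
    train()
```

Passing the factory function itself (make_env, not make_env()) lets DummyVecEnv construct the environment internally, which is the form stable-baselines3 expects.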

Custom Agent: The homemade DQN was built from scratch in TensorFlow and took stacked 84x84 grayscale frames, preprocessed with OpenCV, as input.
  • Architecture: Three convolutional layers followed by dense layers
  • Learning Strategy: Experience replay, target network syncing, and epsilon-greedy exploration
  • Loss & Optimizer: Huber loss with Adam
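The three bullets above can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the filter sizes follow the commonly used DQN layout, and the learning rate, epsilon schedule, frame-stack depth, and all function and class names are hypothetical. TensorFlow is imported inside build_q_network so the exploration and replay helpers can run on their own.

```python
import random
from collections import deque

NUM_ACTIONS = 6  # ALE/Pong-v5 exposes six discrete actions by default
FRAME_STACK = 4  # assumed number of stacked frames


def build_q_network(num_actions=NUM_ACTIONS, frame_stack=FRAME_STACK):
    """Three conv layers followed by dense layers; sizes are assumed."""
    import tensorflow as tf  # deferred so the helpers below work without it

    layers = tf.keras.layers
    model = tf.keras.Sequential([
        layers.Input(shape=(84, 84, frame_stack)),
        layers.Rescaling(1.0 / 255.0),  # uint8 pixels -> [0, 1]
        layers.Conv2D(32, 8, strides=4, activation="relu"),
        layers.Conv2D(64, 4, strides=2, activation="relu"),
        layers.Conv2D(64, 3, strides=1, activation="relu"),
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(num_actions),  # one Q-value per action
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
                  loss=tf.keras.losses.Huber())
    return model


def sync_target(online, target):
    """Target network syncing: copy the online weights into the target."""
    target.set_weights(online.get_weights())


def epsilon_at(step, start=1.0, end=0.05, decay_steps=1_000_000):
    """Linearly anneal epsilon from start to end, then hold it at end."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


class ReplayBuffer:
    """Fixed-size experience replay; the oldest transitions are evicted first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        return rng.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

In a training loop, the online network would be fit on minibatches drawn from the buffer, with sync_target called every fixed number of steps so the bootstrapped targets change slowly.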
My custom agent used OpenCV to crop, grayscale, resize, and stack frames for temporal context. Though trained for fewer iterations, it steadily improved and approached the prebuilt agent's performance.

Personal Contributions:

This was an individual project! The building, evaluation, and technical report were all completed by yours truly.