
Create Custom OpenAI Gym Environments: Build Chopper Game with Coding
Introduction
Creating custom environments in OpenAI Gym is a powerful way to build interactive simulations for machine learning. In this tutorial, we’ll guide you through coding a simple game where a chopper must avoid birds and collect fuel tanks to survive. We’ll cover the essential steps, from defining the observation and action spaces to implementing the reset and step functions that drive the gameplay. Along the way, we’ll show you how to render the environment for visualization, making it easy to monitor the chopper’s performance as it learns. Let’s dive in and build your custom OpenAI Gym environment!
What Is a Custom Environment in OpenAI Gym?
A custom environment in OpenAI Gym is a reinforcement learning environment that you define yourself. It lets developers design unique tasks or games, like controlling a chopper while avoiding birds and collecting fuel. The environment is written in Python, and the agent interacts with it through the actions it takes and the observations it receives.
Prerequisites
Alright, before we dive into the fun stuff, there are a couple of things we need to set up. First, let’s talk about Python. To follow along with this tutorial, you’ll need a machine that has Python installed. Don’t worry, it’s easy enough to do! If you already have it, great. If not, a quick search will point you in the right direction.
Now, you don’t need to be a Python expert, but having a basic understanding of things like variables, loops, and functions will make everything a whole lot easier as we go along. If you’re new to Python, no stress! Just make sure you’re comfortable with the basics so you can follow along smoothly. These concepts will come in handy when we start with the environment setup and coding the actions for our Chopper.
Next up, we’ll need OpenAI Gym installed. This is a crucial tool, and it’s where the magic happens. OpenAI Gym is a toolkit that lets us build and test reinforcement learning environments. Essentially, it provides a space where we can teach our Chopper (and other agents) how to interact with the environment, make decisions, and get better over time.
To install it, use Python’s package manager, pip:
$ pip install gym
One important thing to note is that OpenAI Gym must be installed on the machine or cloud server you’re using. If you’re running everything locally, make sure your Python version is compatible with the Gym package.
If you need help with the installation, the official documentation has a detailed guide to walk you through it. Once OpenAI Gym is up and running, you’re all set to start building your custom environment, which is exactly what we’ll be doing in this tutorial.
Dependencies/Imports
Alright, before we get to building our custom environment, there are a few things we need to install. Think of these as the tools we need in our toolbox—without them, we can’t really get the job done. These dependencies will help us handle images, work with arrays, and interface smoothly with OpenAI Gym. They’re absolutely essential for tasks like rendering images, managing graphical data, and handling the OpenAI Gym environment itself.
Let’s start by installing the libraries that we’ll use for image handling and more. First up, we’ll need to install these two libraries:
!pip install opencv-python
!pip install pillow
These libraries are key to working with images. OpenCV ( opencv-python ) is an open-source computer vision library. It gives us powerful tools to manipulate images and videos. We’ll use it to render the elements in our custom environment, like the Chopper, birds, and fuel tanks. Next, we have Pillow ( pillow ), a fork of the Python Imaging Library (PIL). This one’s all about image processing—making sure we can load and work with image files and formats easily.
Once that’s set up, we move on to the next step: importing the libraries in our Python script. Here’s a list of what we’ll need:
import numpy as np # NumPy: Used for handling arrays and performing math operations
import cv2 # OpenCV: Essential for computer vision tasks like image manipulation
import matplotlib.pyplot as plt # Matplotlib: Helps us visualize images and data plots
import PIL.Image as Image # Pillow (PIL): Used for handling image files and formats
import gym # OpenAI Gym: The framework for creating and interacting with custom environments
import random # Random: Generates random numbers and decisions for the environment
from gym import Env, spaces # OpenAI Gym’s Env and spaces: Tools for creating custom environments
import time # Time: Used for controlling frame delays when rendering
And don’t forget—there’s also a specific font from OpenCV that we’ll need to display text on our images. This will help us show important details, like fuel levels or scores, right on the environment’s canvas:
font = cv2.FONT_HERSHEY_COMPLEX_SMALL
These libraries are the foundation of our environment. They let us manipulate images, define the behavior of our Chopper, handle the elements in the environment, and more. Be sure everything is installed correctly before moving on. Once we’ve got this all set up, we’ll be ready to start crafting the magic that is our custom environment.
Description of the Environment
Imagine you’re playing a game where your job is to keep a chopper flying for as long as you can. That’s the basic idea behind the environment we’re building in this tutorial. It’s inspired by the classic “Dino Run” game that pops up in Google Chrome whenever your internet decides to take a nap. You know the one—the little dinosaur that just keeps running forward, and you need to help it jump over cacti and dodge birds. The longer the dino lasts and the farther it runs, the higher the score. In reinforcement learning terms, that’s basically how the reward system works.
Now, here’s where it gets interesting: in our version of the game, the character isn’t a dinosaur. Nope, we’re switching it up with a chopper pilot. The goal? Get the chopper as far as possible without crashing into birds or running out of fuel. If the chopper hits a bird, the game ends—just like in the original Dino Run. And if the chopper runs out of fuel, that’s game over too. We’re definitely raising the stakes!
But don’t worry, we’re not just leaving the chopper stranded in the sky. To keep it flying, there are floating fuel tanks scattered around the environment. When the chopper collects these, it gets refueled—though we’re not going for total realism here. The fuel tanks will refill the chopper to a full capacity of 1000 liters, just enough to keep the game exciting.
Now, here’s the deal—this environment is a proof of concept. It’s not going to be the most visually stunning game you’ve ever seen, but it gives you a solid starting point to work with. You can take this basic concept and make it your own, adding new challenges or making the game more complex. The sky’s the limit!
The first big decision we need to make when designing this environment is what kind of observation space and action space the agent will use. Think of the observation space as how the chopper “sees” the environment, and the action space as the set of things it can do. Each can be either discrete or continuous, and this choice affects how the agent interacts with the world around it.
A discrete space has a fixed, countable set of possibilities. Picture a grid world where each cell represents a specific position the agent could be in. The agent can only be in one of these cells at any time, and it chooses from a short list of actions. So, for example, in a grid-based game, the agent might be able to move left, move right, or jump, but it can’t do anything finer-grained, like jump a little higher or a little lower.
Now, contrast that with a continuous space, which gives much more freedom. Here, values are described using real numbers. Think of a game like Angry Birds, where you don’t just pull back a slingshot and let it go. Instead, you control how far back you stretch the slingshot, adjusting the force and direction of the shot. This gives you a lot more control over what the agent does.
So why does this matter? Well, whether you go with a continuous or discrete action space will change how your agent behaves and interacts with the environment. It’s a pretty important decision that shapes the whole feel of your game. Whether you want simple, predefined actions or a more flexible, dynamic setup, this choice sets the tone for everything!
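To make the distinction concrete, here is a minimal standard-library sketch of what sampling from each kind of action space looks like. In Gym itself these would normally be declared with spaces.Discrete and spaces.Box; the action count of 5 and the [-1.0, 1.0] range below are illustrative assumptions, not values taken from this tutorial.

```python
import random

# Assumed for illustration: five discrete actions and a throttle range of [-1, 1].
N_DISCRETE_ACTIONS = 5
THROTTLE_LOW, THROTTLE_HIGH = -1.0, 1.0


def sample_discrete_action(rng):
    """Pick one of a fixed set of actions, similar in spirit to spaces.Discrete(5)."""
    return rng.randrange(N_DISCRETE_ACTIONS)


def sample_continuous_action(rng):
    """Pick any real value in a range, similar in spirit to a 1-D spaces.Box."""
    return rng.uniform(THROTTLE_LOW, THROTTLE_HIGH)


rng = random.Random(0)
discrete_action = sample_discrete_action(rng)      # an int in {0, 1, 2, 3, 4}
continuous_action = sample_continuous_action(rng)  # a float between -1.0 and 1.0
print(discrete_action, continuous_action)
```

The discrete sampler can only ever return one of five integers, while the continuous one can return any float in the range, which is the practical difference an agent has to cope with.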
Elements of the Environment
Alright, now that we’ve got the action space and the observation space all figured out, let’s move on to defining the elements that will fill our custom environment. Imagine this step as setting up the characters and props for a game: a Chopper, some Birds, and Fuel Tanks. The Birds and Fuel Tanks are what our main character, the Chopper, will interact with throughout the game. To keep things organized, we’ll create a separate class for each element, and they’ll all inherit from a common base class called Point.
Point Base Class
So, what’s this Point class all about? Think of it as the blueprint for every object in the game world. The Point class defines any arbitrary point on our observation image (that’s the game screen, of course). Every element, whether it’s the Chopper, a Bird, or a Fuel Tank, will be treated as a point that we can move around within the game.
Let’s break down the parts that make this class work:
- Attributes:
  - (x, y): These are the coordinates of the point on the screen, telling us exactly where it is.
  - (x_min, x_max, y_min, y_max): These values define the boundaries within which the point can move. We wouldn’t want our elements to fly off the screen, right? If they go out of bounds, the values get “clamped” back to the limits we set.
  - name: This is just the name of the point, something like “Chopper,” “Bird,” or “Fuel.”
- Methods:
  - get_position(): This function returns the current coordinates of the point.
  - set_position(x, y): This one sets the point’s position to the (x, y) coordinates we give it, making sure it stays within the screen’s boundaries.
  - move(del_x, del_y): If we want to move the point by a certain amount, this method does the trick.
  - clamp(n, minn, maxn): A handy helper method that ensures a value stays within the minimum and maximum limits. It’s like a safety net for our points.
Here’s how we implement the Point class in code:
class Point(object):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        self.x = 0
        self.y = 0
        self.x_min = x_min
        self.x_max = x_max
        self.y_min = y_min
        self.y_max = y_max
        self.name = name

    def set_position(self, x, y):
        # icon_w and icon_h are set by the subclasses (Chopper, Bird, Fuel)
        self.x = self.clamp(x, self.x_min, self.x_max - self.icon_w)
        self.y = self.clamp(y, self.y_min, self.y_max - self.icon_h)

    def get_position(self):
        return (self.x, self.y)

    def move(self, del_x, del_y):
        self.x += del_x
        self.y += del_y
        self.x = self.clamp(self.x, self.x_min, self.x_max - self.icon_w)
        self.y = self.clamp(self.y, self.y_min, self.y_max - self.icon_h)

    def clamp(self, n, minn, maxn):
        return max(min(maxn, n), minn)
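To see what clamping does in practice, here is a small worked example that pulls the clamp logic out into a standalone function. The 800-pixel screen width and 64-pixel icon width are assumed numbers chosen for illustration.

```python
def clamp(n, minn, maxn):
    """Return n limited to the closed range [minn, maxn]."""
    return max(min(maxn, n), minn)


# Assumed numbers for illustration: an 800-pixel-wide screen and a 64-pixel icon,
# so the largest valid x for the icon's left edge is 800 - 64 = 736.
x_min, x_max, icon_w = 0, 800, 64

print(clamp(-20, x_min, x_max - icon_w))  # → 0   (-20 is left of the screen)
print(clamp(300, x_min, x_max - icon_w))  # → 300 (already valid, unchanged)
print(clamp(790, x_min, x_max - icon_w))  # → 736 (790 would push the icon off the right edge)
```

Subtracting the icon width from x_max is what keeps the whole icon, not just its top-left corner, inside the screen.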
Defining the Chopper, Bird, and Fuel Classes
With the Point base class in place, it’s time to define our game elements: the Chopper, Birds, and Fuel Tanks. These elements will each inherit from the Point class and bring their own unique characteristics to the game.
Chopper Class
The Chopper class is the star of the show—it’s the character that the player controls. We start by giving it an image (think of it as choosing a costume for our character) and then resizing that image so it fits perfectly in the game world. This class uses OpenCV to read and process the Chopper’s image.
Here’s the code for the Chopper class:
class Chopper(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Chopper, self).__init__(name, x_max, x_min, y_max, y_min)
        self.icon = cv2.imread("chopper.png") / 255.0  # Read and normalize the image
        self.icon_w = 64  # Width of the icon
        self.icon_h = 64  # Height of the icon
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))  # Resize the icon
Bird Class
Next up, the Birds. These guys are the Chopper’s enemies—they swoop in and try to take the Chopper down. Just like the Chopper, they have their own image, which we read and resize. The Bird class is very similar to the Chopper class but with a different image.
Here’s the Bird class:
class Bird(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Bird, self).__init__(name, x_max, x_min, y_max, y_min)
        self.icon = cv2.imread("bird.png") / 255.0  # Read and normalize the image
        self.icon_w = 32  # Width of the bird icon
        self.icon_h = 32  # Height of the bird icon
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))  # Resize the bird icon
Fuel Class
Finally, we have the Fuel Tanks. These floating tanks provide a way for the Chopper to refuel and keep flying. Like the Bird and Chopper, the Fuel Tank has an image and dimensions. The only difference is that these icons will be floating up from the bottom of the screen for the Chopper to collect.
Here’s the Fuel class:
class Fuel(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Fuel, self).__init__(name, x_max, x_min, y_max, y_min)
        self.icon = cv2.imread("fuel.png") / 255.0  # Read and normalize the image
        self.icon_w = 32  # Width of the fuel icon
        self.icon_h = 32  # Height of the fuel icon
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))  # Resize the fuel icon
Summary
We’ve now laid the foundation for our game environment. The Chopper, Birds, and Fuel are all set up, each represented by its own class that inherits from the Point class. The Point class handles the essentials (storing a position, moving it, and clamping it to the screen bounds), while the specific elements like the Chopper, Birds, and Fuel Tanks each add their own icon and size.
This structure is key for managing how these elements interact with each other in the environment. With these components, we’re one step closer to building a fully interactive game where the Chopper dodges Birds and collects Fuel to survive. The real fun begins when we start defining how these elements move and interact in the game world!
Point Base Class
Let’s start by talking about the Point class. This class is the backbone of our environment—it defines the basic properties for any object that will appear on the screen, such as the Chopper, Birds, and Fuel Tanks. Imagine it like the coordinates on a map—it’s the foundation that ensures everything behaves the way it’s supposed to in our custom environment.
Attributes of the Point Class:
- (x, y): These are the coordinates that tell us where the point (or element) is located on the screen. The x value controls how far left or right the element is, and the y value handles its position up or down.
- (x_min, x_max, y_min, y_max): These define the boundaries for where the element can be placed. It’s like saying, “You can only move between these walls!” If we try to set a point outside these boundaries, the position will be adjusted, or “clamped,” to fit within the screen.
- name: This is simply a label for each point. For instance, we’ll name our points things like “Chopper,” “Bird,” or “Fuel.” It’s like giving a nickname to each element in the game so we can keep track of them.
Member Functions of the Point Class:
- get_position() : This method gives us the current coordinates of the point (basically the location on the map). It’s like asking, “Where am I right now?”
- set_position(x, y) : This method sets the position to specific coordinates on the screen. But don’t worry, it makes sure the position stays within the allowed area, thanks to the clamp function.
- move(del_x, del_y) : Here’s where the action happens! We can move the point by a specified number of steps ( del_x for horizontal and del_y for vertical). After moving, it checks if the new position is still within the boundaries—just like a friendly reminder not to step outside the lines.
- clamp(n, minn, maxn) : This is the magic that keeps things in check. If a position is too far out of bounds, this function brings it back, so it never goes past the boundaries we set.
Code Implementation of the Point Class:
class Point(object):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        self.x = 0  # Initial x-coordinate of the point
        self.y = 0  # Initial y-coordinate of the point
        self.x_min = x_min  # Minimum x-coordinate (boundary)
        self.x_max = x_max  # Maximum x-coordinate (boundary)
        self.y_min = y_min  # Minimum y-coordinate (boundary)
        self.y_max = y_max  # Maximum y-coordinate (boundary)
        self.name = name  # Name of the point (e.g., "Chopper", "Bird")

    def set_position(self, x, y):
        # Set the position, ensuring it stays within the boundaries
        self.x = self.clamp(x, self.x_min, self.x_max - self.icon_w)
        self.y = self.clamp(y, self.y_min, self.y_max - self.icon_h)

    def get_position(self):
        # Return the current position of the point as a tuple (x, y)
        return (self.x, self.y)

    def move(self, del_x, del_y):
        # Move the point by a certain amount (del_x, del_y)
        self.x += del_x
        self.y += del_y
        # Ensure the new position stays within the boundaries
        self.x = self.clamp(self.x, self.x_min, self.x_max - self.icon_w)
        self.y = self.clamp(self.y, self.y_min, self.y_max - self.icon_h)

    def clamp(self, n, minn, maxn):
        # Ensure the value n is within the range of minn and maxn
        return max(min(maxn, n), minn)
Explanation of the Code:
The __init__ function sets up the basic attributes. It starts the position at (0, 0) and defines the boundaries for where the point can move, using x_min , x_max , y_min , and y_max . The name attribute helps us identify the point.
When we use the set_position method, it takes the new position ( x , y ) and adjusts it with the help of the clamp method to make sure it stays within the boundaries. The get_position method just returns the current location of the point—basically a “Where am I?” check.
The move method updates the position by adding del_x and del_y to the current coordinates. After the move, it checks if the new position is still within the defined boundaries, ensuring the point doesn’t wander off-screen.
Lastly, the clamp function makes sure that if a point is out of bounds, it gets adjusted back to a valid spot. No one wants to see their elements disappear off the edge of the screen, right?
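The behavior described above can be tried out without any image files. The sketch below repeats a compact Point class and adds a hypothetical Marker subclass (not part of the tutorial) whose only job is to supply the icon_w and icon_h attributes that set_position and move rely on. The 800×600 bounds are assumed for illustration.

```python
class Point(object):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        self.x = 0
        self.y = 0
        self.x_min, self.x_max = x_min, x_max
        self.y_min, self.y_max = y_min, y_max
        self.name = name

    def set_position(self, x, y):
        self.x = self.clamp(x, self.x_min, self.x_max - self.icon_w)
        self.y = self.clamp(y, self.y_min, self.y_max - self.icon_h)

    def get_position(self):
        return (self.x, self.y)

    def move(self, del_x, del_y):
        # Same result as the tutorial's version: add the offsets, then clamp.
        self.set_position(self.x + del_x, self.y + del_y)

    def clamp(self, n, minn, maxn):
        return max(min(maxn, n), minn)


class Marker(Point):
    """Hypothetical stand-in for Chopper/Bird/Fuel that needs no image file."""

    def __init__(self, name, x_max, x_min, y_max, y_min):
        super().__init__(name, x_max, x_min, y_max, y_min)
        self.icon_w = 64
        self.icon_h = 64


marker = Marker("Marker", x_max=800, x_min=0, y_max=600, y_min=0)
marker.set_position(100, 100)
marker.move(-500, 50)          # x would become -400, so it is clamped to 0
print(marker.get_position())   # → (0, 150)
marker.move(5000, 5000)        # both coordinates overshoot and are clamped
print(marker.get_position())   # → (736, 536)
```

Because Point itself never defines icon_w or icon_h, calling set_position on a bare Point would raise an AttributeError; only subclasses that set those attributes can be positioned.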
Defining the Chopper, Bird, and Fuel Classes
Now that we’ve got the Point class down, we can start defining the specific classes for the Chopper, Birds, and Fuel Tanks. Each of these classes will inherit from Point , which means they’ll have all the positioning and movement functionality we just set up.
Chopper Class
The Chopper class is the star of the show. It’s the character that the player controls in the game. We’ll assign it an image (think of it as the Chopper’s avatar) and then resize it to fit in the game world. We use OpenCV to load and resize the Chopper’s image.
Here’s the code for the Chopper class:
class Chopper(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Chopper, self).__init__(name, x_max, x_min, y_max, y_min)
        self.icon = cv2.imread("chopper.png") / 255.0  # Read and normalize the image
        self.icon_w = 64  # Width of the icon
        self.icon_h = 64  # Height of the icon
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))  # Resize the icon
Bird Class
Next up, the Birds. These are the enemies in our game, the ones that the Chopper needs to avoid. Like the Chopper, the Bird class also has an image and size, but its icon is different to reflect its nature.
Here’s the Bird class:
class Bird(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Bird, self).__init__(name, x_max, x_min, y_max, y_min)
        self.icon = cv2.imread("bird.png") / 255.0  # Read and normalize the image
        self.icon_w = 32  # Width of the bird icon
        self.icon_h = 32  # Height of the bird icon
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))  # Resize the bird icon
Fuel Class
Finally, we have the Fuel Tanks. These are scattered around the game, waiting to be collected by the Chopper to refuel. Just like the Bird and Chopper, the Fuel class is initialized with an image that represents it on screen.
Here’s the Fuel class:
class Fuel(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Fuel, self).__init__(name, x_max, x_min, y_max, y_min)
        self.icon = cv2.imread("fuel.png") / 255.0  # Read and normalize the image
        self.icon_w = 32  # Width of the fuel icon
        self.icon_h = 32  # Height of the fuel icon
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))  # Resize the fuel icon
Summary
Now, we’ve got the building blocks for our game! The Chopper, Birds, and Fuel Tanks are all set up as individual classes, each inheriting from the Point class. This allows us to control their positions, movements, and interactions within the environment.
With this structure in place, we’re ready to move on to the next step: creating a dynamic world where these elements will interact, and the Chopper can dodge Birds and collect Fuel Tanks to stay airborne. It’s time to bring this environment to life!
Chopper Class
Picture this: you’re in control of a Chopper flying through a challenging landscape, dodging birds and collecting floating fuel tanks to stay alive. Sounds fun, right? Well, that’s exactly what our Chopper class does! It’s the main character in our game, the agent that drives all the action. But how does it interact with the environment? How does it move and respond to the world around it? That’s where the magic of coding comes in.
Inheriting from the Point Class
The Chopper class inherits from the Point class, which is like giving it a solid foundation to stand on. It inherits all the tools it needs to control its position and movement within the game world, just like any other object in the environment. But the Chopper isn’t just about where it is—it’s about what it looks like, how big it is, and how it moves. Let’s dive into what makes the Chopper class tick.
Key Components of the Chopper Class:
- icon: The Chopper needs to have an image, of course! This icon attribute represents the Chopper’s visual appearance in the game. Using OpenCV, we load the Chopper image with cv2.imread("chopper.png") and normalize it by dividing the pixel values by 255.0.
- icon_w and icon_h: These are the width and height of the Chopper’s image. Right now, we’ve got it set to 64×64 pixels, but you could easily change this if you wanted the Chopper to be a bit bigger or smaller. It’s all about customization, right?
- cv2.resize(self.icon, (self.icon_h, self.icon_w)): This line of code scales the Chopper’s image to the exact dimensions we set earlier (64×64 pixels), whatever size the source file happens to be. Note that cv2.resize expects its size argument as (width, height); passing (icon_h, icon_w) works here only because the icon is square.
Code Implementation of the Chopper Class:
class Chopper(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Chopper, self).__init__(name, x_max, x_min, y_max, y_min)
        # Load and normalize the Chopper image
        self.icon = cv2.imread("chopper.png") / 255.0
        # Define the dimensions of the Chopper's icon
        self.icon_w = 64
        self.icon_h = 64
        # Resize the Chopper's icon to the specified width and height
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))
Explanation of the Code:
- Inheritance: Here’s the cool part: the Chopper class inherits all the positioning and movement functionalities from the Point class. That means it already knows how to track its location and move around within the boundaries. Pretty neat, right?
- Loading and Normalizing the Image: To make the Chopper look good on the screen, we load the image using cv2.imread("chopper.png") and divide the result by 255.0, which rescales the 8-bit pixel values (0 to 255) into floats between 0.0 and 1.0.
- Resizing the Image: Since images come in all sizes, we need to resize the Chopper’s image so it fits just right. We use cv2.resize(self.icon, (self.icon_h, self.icon_w)) to scale it to 64×64 pixels.
The Big Picture: In the game, Chopper is the agent that you control. It needs to move, avoid birds, and collect fuel tanks to keep going. This class defines what the Chopper looks like, how it moves, and ensures that it stays within the visible bounds of the game screen. It inherits from the Point class, so it automatically knows how to keep track of its position and how to move around. But, it goes a step further by giving the Chopper a unique appearance, making it a true character in the game.
And there you have it—the Chopper class is ready to go, looking good, moving smoothly, and ready for action. Now, it’s all about bringing this flying hero to life in the custom environment we’ve been building!
Bird Class
Imagine you’re flying the Chopper, soaring through the game world. You feel the wind in your virtual hair as you scoop up fuel tanks, but suddenly, out of nowhere, wham! A bird swoops in front of you. If you’re not quick enough to avoid it, your journey comes to an abrupt end. The Bird class is the villain in our game, adding that extra challenge. It’s the pesky obstacle that the Chopper has to avoid to keep flying and earning rewards.
Inheriting from the Point Class
Just like the Chopper, the Bird class inherits from the Point class. So, it inherits all the cool abilities of the Point class to track its position and move within the environment. But the Bird isn’t just about where it’s placed. It has its own unique traits, like its image and how it behaves during the game.
Key Components of the Bird Class:
- icon: The icon represents the bird’s image on the screen. Using OpenCV, we load the bird image with cv2.imread("bird.png") and then normalize it. This normalization (dividing by 255.0) rescales the 8-bit pixel values from the range 0 to 255 into floats between 0.0 and 1.0.
- icon_w and icon_h : These two attributes determine how big the bird is on screen. We’ve set the bird’s size to 32×32 pixels, which you can change if you want a larger or smaller bird. You’re in control!
- cv2.resize(self.icon, (self.icon_h, self.icon_w)) : This is the magic that resizes the bird’s image to the perfect 32×32 pixel size. It ensures the bird fits perfectly within the visual scale of the game, and doesn’t look out of place next to the Chopper or the fuel tanks.
Code Implementation of the Bird Class:
class Bird(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Bird, self).__init__(name, x_max, x_min, y_max, y_min)
        # Load and normalize the Bird image
        self.icon = cv2.imread("bird.png") / 255.0
        # Define the dimensions of the Bird's icon
        self.icon_w = 32
        self.icon_h = 32
        # Resize the Bird's icon to the specified width and height
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))
Explanation of the Code:
- Inheritance: The Bird class inherits from the Point class. This means that, just like the Chopper, the bird knows where it is and how to move around within the environment. It benefits from the same functionality, ensuring that all game elements behave in the same consistent way. This makes things a lot easier when you’re managing multiple objects in the game.
- Loading and Normalizing the Image: The bird’s image is loaded from the file using cv2.imread("bird.png"). We then normalize the image’s pixel values by dividing them by 255.0. This is a common step in image processing: it turns the 8-bit values (0 to 255) into floats between 0.0 and 1.0, so every icon is on the same scale.
- Resizing the Image: The bird’s image is resized to match the dimensions we’ve set—32×32 pixels. We use cv2.resize() to adjust the size. This ensures that the bird fits seamlessly into the game world, and doesn’t overpower the other elements, like the Chopper.
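The normalization step is easy to check by hand. The sketch below uses a tiny made-up 2×2 grayscale image written as nested lists, so it needs neither an image file nor OpenCV; the pixel values are arbitrary and chosen only for illustration.

```python
# A tiny 2x2 grayscale "image" with 8-bit pixel values (0 to 255),
# written as nested lists so the example needs no image file or library.
pixels = [[0, 51],
          [204, 255]]

# Dividing each value by 255.0 rescales it to a float between 0.0 and 1.0.
# `cv2.imread(...) / 255.0` applies the same division to every element of the icon array.
normalized = [[value / 255.0 for value in row] for row in pixels]

print(normalized)  # → [[0.0, 0.2], [0.8, 1.0]]
```

Black (0) maps to 0.0 and white (255) maps to 1.0, with everything else landing proportionally in between.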
The Big Picture:
The Bird class adds a real challenge to our game world. Its behavior is simple: it moves, it can collide with the Chopper, and if it does, the game ends. But how it moves, how it looks, and how it interacts with the Chopper are all defined by this class. By inheriting from the Point class, it shares the same basic positioning functionality as the other game elements, but it has its own unique properties—its image, its size, and how it contributes to the gameplay.
Now, whenever the Chopper encounters a bird, the game gets a bit more intense, a little more exciting, and a lot more fun. So, make sure you steer clear of those birds while you’re flying your Chopper through the world!
Make sure to adjust the size and behavior of the bird to match the overall feel of your game.
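One way to decide whether a bird has hit the Chopper is an axis-aligned bounding-box test on the two icons. This is only a sketch under assumptions: the function name boxes_overlap and the rectangle-overlap rule are illustrative choices, and the environment’s actual collision check may work differently.

```python
def boxes_overlap(x1, y1, w1, h1, x2, y2, w2, h2):
    """Return True when two axis-aligned rectangles share any area.

    Each rectangle is given by its top-left corner (x, y) and its size (w, h).
    """
    overlap_x = x1 < x2 + w2 and x2 < x1 + w1
    overlap_y = y1 < y2 + h2 and y2 < y1 + h1
    return overlap_x and overlap_y


# A 64x64 chopper at (100, 100) and a 32x32 bird at (150, 120): they overlap.
print(boxes_overlap(100, 100, 64, 64, 150, 120, 32, 32))  # → True
# The same chopper and a bird far to the right at (400, 120): no overlap.
print(boxes_overlap(100, 100, 64, 64, 400, 120, 32, 32))  # → False
```

The inputs map directly onto what each element already stores: get_position() gives the top-left corner, and icon_w and icon_h give the size.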
Fuel Class
Imagine you’re zooming through the game world, dodging birds and trying to stay in the air. But wait—your fuel’s running low! You need to find a fuel tank before you run out of energy and fall from the sky. That’s where the Fuel class comes into play. It’s the hero that keeps the Chopper flying, offering that vital boost to keep your journey going.
The Role of the Fuel Class
The Fuel class is a key part of our custom environment. It’s what lets the Chopper refuel, giving it the ability to keep moving, dodging, and surviving. Like other game elements, the Fuel class is derived from the Point class, so it inherits the same abilities to manage its position and movement within the game. However, it also has unique properties that make it distinct—like how it looks and how it interacts with the Chopper.
Key Components of the Fuel Class:
- icon: The icon is what represents the fuel on the screen. It’s the image that you’ll see floating around in the environment. We load the fuel image using OpenCV’s cv2.imread() function. Then, we normalize it by dividing each pixel value by 255.0, which rescales the 8-bit values (0 to 255) into floats between 0.0 and 1.0.
- icon_w and icon_h : These two attributes determine the size of the fuel icon on the screen. For our game, we’ve set the fuel icon to 32×32 pixels. Of course, you can adjust the size if you prefer a bigger or smaller fuel icon. It’s all about finding the right balance for your game.
- cv2.resize(self.icon, (self.icon_h, self.icon_w)) : This line of code resizes the fuel icon to ensure that it fits the specified dimensions (32×32 pixels). Resizing ensures that all the elements in the game, including the fuel, align properly and look consistent in terms of visual scale.
Code Implementation of the Fuel Class:
class Fuel(Point):
    def __init__(self, name, x_max, x_min, y_max, y_min):
        super(Fuel, self).__init__(name, x_max, x_min, y_max, y_min)
        # Load and normalize the Fuel image
        self.icon = cv2.imread("fuel.png") / 255.0
        # Define the dimensions of the Fuel icon
        self.icon_w = 32
        self.icon_h = 32
        # Resize the Fuel icon to the specified width and height
        self.icon = cv2.resize(self.icon, (self.icon_h, self.icon_w))
Explanation of the Code:
- Inheritance: The Fuel class inherits from the Point class, which means it gets all the positioning and movement features right off the bat. This inheritance ensures that the fuel behaves like the other game elements, like the Chopper and Birds, while also adding its own qualities like its image and size.
- Loading and Normalizing the Image: The fuel image is loaded using cv2.imread("fuel.png"), and we normalize the pixel values by dividing by 255.0. This rescales the 8-bit values (0 to 255) into floats between 0.0 and 1.0, the same scale used for the Chopper and Bird icons.
- Resizing the Image: We then resize the fuel icon to the desired size (32×32 pixels) using the cv2.resize() function. This makes sure that the fuel looks just the right size in the game, fitting in with the other elements like the Chopper and the Birds.
The Big Picture:
The Fuel class is more than just a simple game element—it’s what keeps the Chopper from running out of energy and falling from the sky. Its unique attributes, like the icon and its size, are what make the fuel stand out in the game environment. By inheriting from the Point class, the Fuel class ensures it behaves consistently with other game elements, making the game world feel cohesive and dynamic.
And let’s not forget the strategy it introduces: The Chopper has to collect the fuel in order to stay in the game. So, every time a fuel tank appears, it becomes a race against time. Can you get to it before the fuel runs out? That’s where the excitement lies!
Back to the ChopperScape Class
Alright, let’s dive back into the heart of the action with the ChopperScape class. In this part, we’re going to implement two of the most crucial functions for our environment: reset and step. These functions are the backbone of how the game state is controlled and how the Chopper interacts with the environment. Plus, we’ll introduce some helper functions that make rendering the environment and updating its elements a breeze.
Reset Function: Starting Fresh
Think of the reset function as the game’s way of hitting the “refresh” button. Every time you reset, the game goes back to square one, and everything gets set to its initial state. Fuel levels? Reset. Rewards? Back to zero. The Chopper? It’s back in its starting position, ready to face the challenge once again.
The reset function takes care of all the variables that track the state of the game. It handles things like fuel consumption, the cumulative reward, and the number of elements like birds and fuel tanks on the screen. When the environment resets, only the Chopper starts the game fresh—everything else (like the birds and fuel tanks) will be spawned dynamically as the game progresses.
So, here’s the deal: We initialize the Chopper at a random position in the top-left section of the screen. Its x coordinate is drawn from the range covering 5-10% of the screen’s width, and its y coordinate from the range covering 15-20% of the screen’s height. Since nothing else is on the screen at reset, the Chopper cannot overlap with any other element.
We also define a helper function called draw_elements_on_canvas, which renders all game elements, like the Chopper, birds, and fuel, onto the canvas at their current positions. The helper itself does not move anything; keeping an element on screen is the job of the bounds (x_min, x_max, y_min, y_max) that each element is created with. The helper also writes essential information like the remaining fuel and current rewards onto the canvas, so you can always see how you’re doing in the game.
Finally, the reset function returns the updated canvas—this is what you’ll see when you start a new episode.
Here’s the code for reset and its helper method draw_elements_on_canvas:
def draw_elements_on_canvas(self):
    # Start from a blank white canvas (floats in the 0-1 range)
    self.canvas = np.ones(self.observation_shape)

    # Draw the Chopper and other elements on the canvas
    for elem in self.elements:
        # icon.shape is (height, width, channels)
        icon_h, icon_w = elem.icon.shape[:2]
        x, y = elem.x, elem.y
        self.canvas[y:y + icon_h, x:x + icon_w] = elem.icon

    # Display the remaining fuel and rewards on the canvas.
    # `font` is assumed to be a module-level OpenCV font constant,
    # for example cv2.FONT_HERSHEY_COMPLEX_SMALL.
    text = 'Fuel Left: {} | Rewards: {}'.format(self.fuel_left, self.ep_return)
    self.canvas = cv2.putText(self.canvas, text, (10, 20), font, 0.8, (0, 0, 0), 1, cv2.LINE_AA)

def reset(self):
    # Reset the remaining fuel to its maximum value
    self.fuel_left = self.max_fuel

    # Reset the total reward to 0
    self.ep_return = 0

    # Initialize counters for the birds and fuel stations
    self.bird_count = 0
    self.fuel_count = 0

    # Pick a random start in the top-left area. The canvas is indexed
    # [y, x], so observation_shape is (height, width, channels): x is
    # drawn from the width (index 1) and y from the height (index 0).
    x = random.randrange(int(self.observation_shape[1] * 0.05), int(self.observation_shape[1] * 0.10))
    y = random.randrange(int(self.observation_shape[0] * 0.15), int(self.observation_shape[0] * 0.20))

    # Initialize the Chopper object
    self.chopper = Chopper("chopper", self.x_max, self.x_min, self.y_max, self.y_min)
    self.chopper.set_position(x, y)

    # Add the Chopper to the elements list
    self.elements = [self.chopper]

    # Draw the elements on a fresh canvas
    self.draw_elements_on_canvas()

    # Return the updated canvas as the observation
    return self.canvas
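Keeping an icon on screen comes down to clamping its position before it is drawn. The sketch below is one way that could look, not the tutorial's own implementation: clamp and paste_icon are hypothetical helpers, and the tiny canvas and icon sizes are chosen only to make the result easy to check. It also shows the NumPy convention that an image's shape is (height, width, channels), which is why rows are sliced with y and columns with x.

```python
import numpy as np

def clamp(value, low, high):
    """Limit value to the closed range [low, high]."""
    return max(low, min(value, high))

def paste_icon(canvas, icon, x, y):
    """Draw icon with its top-left corner at (x, y), clamped to stay on the canvas.

    Assumes the icon is no larger than the canvas.
    """
    canvas_h, canvas_w = canvas.shape[:2]   # NumPy images are (height, width, channels)
    icon_h, icon_w = icon.shape[:2]
    x = clamp(x, 0, canvas_w - icon_w)
    y = clamp(y, 0, canvas_h - icon_h)
    canvas[y:y + icon_h, x:x + icon_w] = icon
    return x, y

canvas = np.ones((10, 12, 3))    # white canvas: 10 rows high, 12 columns wide
icon = np.zeros((4, 3, 3))       # black icon: 4 rows high, 3 columns wide

# Ask for a position that would fall partly off the right and top edges
pos = paste_icon(canvas, icon, x=11, y=-2)
print(pos)  # (9, 0)
```

Without the clamp, a slice that runs past the edge of the canvas would be shorter than the icon and the assignment would fail with a shape mismatch.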
Rendering the Game
Now that we’ve reset the environment, it’s time to render it, so we can see the game in action. The render function lets us visualize the environment in two modes:
- Human Mode: This mode displays the game in a pop-up window, just like how it would look during gameplay.
- RGB Array Mode: This mode returns the environment as a pixel array. It’s especially useful if we want to process the environment in other applications or for testing purposes.
Here’s the code for the render function:
def render(self, mode="human"):
    # Ensure that the mode is either "human" or "rgb_array"
    assert mode in ["human", "rgb_array"], "Invalid mode, must be either 'human' or 'rgb_array'"

    if mode == "human":
        # Display the environment in a pop-up window
        cv2.imshow("Game", self.canvas)
        cv2.waitKey(10)  # Update the display with a short delay

    elif mode == "rgb_array":
        # Return the canvas as an array of pixel values
        return self.canvas
Closing the Game
When the game is done or the environment is no longer needed, we need to close any open windows and clean up. The close function takes care of that by using OpenCV’s cv2.destroyAllWindows() to close any active game windows.
Here’s the code for close:
def close(self):
    # Close all OpenCV windows
    cv2.destroyAllWindows()
Testing the Environment
Now that we’ve got the reset and render functions set up, we can test how the environment looks when it’s first reset. To do that, we create a new instance of the ChopperScape class and visualize the initial observation:
import matplotlib.pyplot as plt

env = ChopperScape()  # Create a new instance of the environment
obs = env.reset()  # Reset the environment and get the initial observation
screen = env.render(mode="rgb_array")  # Render the environment as an RGB array
plt.imshow(screen)  # Display the environment as an image
With this, you’ll see the initial state of the environment: just the Chopper near the top-left corner, along with the fuel and reward readout. Birds and fuel tanks only appear once the game starts stepping.
These functions are key to making our game interactive and fun. The reset function ensures that we can start fresh each time, while the render function lets us see the action unfold. Together, they provide a flexible and dynamic way to test different scenarios, visualize the environment, and guide the Chopper through the challenges that await.
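One caveat when showing the canvas with matplotlib: images loaded with cv2.imread() store their channels in BGR order, while plt.imshow() expects RGB, so colours can appear swapped. The sketch below shows one way to convert a float BGR canvas into an 8-bit RGB array before display. The function name bgr_float_to_rgb_uint8 is hypothetical, and a tiny synthetic canvas stands in for the real one.

```python
import numpy as np

def bgr_float_to_rgb_uint8(canvas):
    """Convert a float BGR canvas in [0, 1] to an 8-bit RGB image."""
    rgb = canvas[..., ::-1]  # reverse the channel axis: BGR -> RGB
    return (np.clip(rgb, 0.0, 1.0) * 255).round().astype(np.uint8)

# A 2x2 canvas that is pure blue in BGR order (blue is channel 0)
canvas = np.zeros((2, 2, 3))
canvas[..., 0] = 1.0

rgb = bgr_float_to_rgb_uint8(canvas)
print(rgb[0, 0])  # [  0   0 255]: blue is now the last channel, as RGB expects
```

Passing the converted array to plt.imshow() would then show the icons in their original colours.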
Step Function
Now that we’ve set the stage with the reset function, it’s time to dive into one of the most important parts of the game—the step function. This function is where the magic happens, where the environment responds to the agent’s actions, and the game moves forward from one state to the next. It’s like the heartbeat of the game, pushing everything forward. Every time the agent takes an action, the step function updates the environment, keeps track of the rewards, and checks the conditions to see if the episode is over.
Breaking Down the Transition Process
The step function can be broken down into two major parts:
- Applying actions to the agent – This is where we define what the Chopper (our agent) can do and how its movements affect its position on the screen.
- Managing the environment’s non-RL actors – These include Birds and Fuel Tanks, which aren’t directly controlled by the agent but still interact with it in various ways. They spawn, move around, and can potentially collide with the Chopper.
Actions for the Agent (Chopper)
In our game, the Chopper has a set of five actions it can choose from. Each action changes the Chopper’s position on the screen, and here’s how they work:
- Move right: The Chopper moves right on the screen.
- Move left: The Chopper moves left.
- Move down: The Chopper moves down.
- Move up: The Chopper moves up.
- Do nothing: The Chopper stays in its current position.
Each of these actions is represented by an integer:
- 0: Move right
- 1: Move left
- 2: Move down
- 3: Move up
- 4: Do nothing
To make these easier to read, we define a helper function, get_action_meanings(), which translates the integer values into human-readable action names. This is especially useful when debugging or tracking the agent’s progress.
Here’s the code for the get_action_meanings() function:
def get_action_meanings(self):
    return {0: "Right", 1: "Left", 2: "Down", 3: "Up", 4: "Do Nothing"}
Before applying any action, we validate whether it’s a valid one. If not, we raise an error.
# Assert that the action is valid
assert self.action_space.contains(action), "Invalid Action"
Once the action is validated, we apply it to the Chopper. Each action moves the Chopper by 5 units in the direction specified. For instance, if the action is “Move right”, the Chopper moves 5 units right. If the action is “Do nothing”, well, the Chopper just stays put.
Here’s how the action is applied:
# Apply the action to the chopper.
# move(dx, dy) shifts x by dx and y by dy, the same convention the birds
# (move(-5, 0) to go left) and fuel tanks (move(0, -5) to go up) use.
if action == 0:
    self.chopper.move(5, 0)   # Move right
elif action == 1:
    self.chopper.move(-5, 0)  # Move left
elif action == 2:
    self.chopper.move(0, 5)   # Move down
elif action == 3:
    self.chopper.move(0, -5)  # Move up
elif action == 4:
    self.chopper.move(0, 0)   # Do nothing
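The same mapping can also be written as a lookup table, which keeps the action numbers, their meanings, and their movement in one place. This is only a sketch under stated assumptions: ACTION_DELTAS and apply_action are hypothetical names, and it assumes movement is given as (dx, dy) with x growing to the right and y growing downward, which is what the bird and fuel tank movement in this tutorial suggests.

```python
# Hypothetical lookup table: action id -> (dx, dy) in screen units
ACTION_DELTAS = {
    0: (5, 0),    # Right
    1: (-5, 0),   # Left
    2: (0, 5),    # Down (y grows toward the bottom of the image)
    3: (0, -5),   # Up
    4: (0, 0),    # Do nothing
}

def apply_action(position, action):
    """Return the new (x, y) after applying an action to a position."""
    if action not in ACTION_DELTAS:
        raise ValueError("Invalid Action: {}".format(action))
    dx, dy = ACTION_DELTAS[action]
    x, y = position
    return x + dx, y + dy

print(apply_action((100, 50), 0))  # (105, 50)
print(apply_action((100, 50), 3))  # (100, 45)
```

A table like this makes it harder for a comment and its movement to drift apart, since each action has exactly one entry.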
Managing the Environment’s Non-RL Actors
After we apply the Chopper’s action, we turn our attention to the non-RL actors in the environment: the Birds and Fuel Tanks. These elements aren’t directly controlled by the agent, but they still play an essential role in the game by interacting with the Chopper.
Birds: These pesky creatures spawn randomly from the right edge of the screen. They have a 1% chance of appearing every frame, and once they spawn, they move left by 5 units every frame. If a Bird collides with the Chopper, the game ends. Otherwise, they disappear once they hit the left edge of the screen.
Fuel Tanks: These little life-savers spawn from the bottom edge of the screen, also with a 1% chance per frame. They move up by 5 units every frame. If the Chopper collides with a Fuel Tank, it gets refueled to full capacity. But if the Fuel Tank hits the top edge of the screen without interacting with the Chopper, it disappears.
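To get a feel for what a 1% spawn chance per frame means in practice, the short sketch below works out two numbers: the average wait between spawns and the chance of seeing at least one spawn within a given number of frames. The function names are hypothetical, and the calculation assumes each frame is an independent 1% draw, as described above.

```python
def expected_frames_between_spawns(p):
    """Mean number of frames until a spawn when each frame succeeds with probability p."""
    return 1.0 / p

def prob_at_least_one_spawn(p, frames):
    """Probability that at least one spawn happens within the given number of frames."""
    return 1.0 - (1.0 - p) ** frames

p = 0.01
print(expected_frames_between_spawns(p))          # 100.0 frames on average
print(round(prob_at_least_one_spawn(p, 100), 3))  # 0.634
```

So a new Bird shows up roughly every 100 frames on average, and the same holds for Fuel Tanks, which is worth keeping in mind when choosing how much fuel the Chopper starts with.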
Detecting Collisions
To check whether two objects have collided, say the Chopper and a Bird or Fuel Tank, we compare their positions with a helper function, has_collided(). On each axis, the two objects overlap when the distance between their coordinates is at most half the sum of their sizes along that axis (widths for x, heights for y). A collision is reported only when they overlap on both axes.
Here’s the has_collided() function:
def has_collided(self, elem1, elem2):
    x_col = False
    y_col = False

    # Get the current positions of the two elements
    elem1_x, elem1_y = elem1.get_position()
    elem2_x, elem2_y = elem2.get_position()

    # Check for horizontal collision
    if 2 * abs(elem1_x - elem2_x) <= (elem1.icon_w + elem2.icon_w):
        x_col = True

    # Check for vertical collision
    if 2 * abs(elem1_y - elem2_y) <= (elem1.icon_h + elem2.icon_h):
        y_col = True

    # Collide only if the elements overlap on both axes
    if x_col and y_col:
        return True

    return False
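A worked example makes the rule concrete. The sketch below reimplements the same check as a standalone function so it can run without the game classes. Box is a hypothetical stand-in for a game element, and the sizes (a 64×64 chopper, 32×32 birds) are illustrative values, not taken from the classes above.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """Hypothetical stand-in for a game element: a position plus an icon size."""
    x: int
    y: int
    icon_w: int = 32
    icon_h: int = 32

def has_collided(a, b):
    """Same rule as above: overlap on x AND overlap on y."""
    x_col = 2 * abs(a.x - b.x) <= a.icon_w + b.icon_w
    y_col = 2 * abs(a.y - b.y) <= a.icon_h + b.icon_h
    return x_col and y_col

chopper = Box(100, 100, icon_w=64, icon_h=64)
near_bird = Box(140, 110)   # x: 2 * 40 = 80 <= 96, y: 2 * 10 = 20 <= 96
far_bird = Box(160, 110)    # x: 2 * 60 = 120 > 96, so no overlap on x

print(has_collided(chopper, near_bird))  # True
print(has_collided(chopper, far_bird))   # False
```

The far bird is close enough vertically but too far away horizontally, so no collision is reported: both axes have to overlap.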
Implementing the Step Function
Now that we’ve defined how to apply actions to the Chopper and how to handle the Birds and Fuel Tanks, it’s time to bring everything together in the step function. This function takes an action, updates the environment, and returns the new state, along with a reward and information about whether the episode is done.
Here’s the full step function implementation:
def step(self, action):
    # Flag that marks the termination of an episode
    done = False

    # Assert that it is a valid action
    assert self.action_space.contains(action), "Invalid Action"

    # Decrease the fuel counter by one for every step
    self.fuel_left -= 1

    # Set the reward for executing a step
    reward = 1

    # Apply the action to the chopper; move(dx, dy) shifts x by dx and y by dy
    if action == 0:
        self.chopper.move(5, 0)   # Move right
    elif action == 1:
        self.chopper.move(-5, 0)  # Move left
    elif action == 2:
        self.chopper.move(0, 5)   # Move down
    elif action == 3:
        self.chopper.move(0, -5)  # Move up
    elif action == 4:
        self.chopper.move(0, 0)   # Do nothing

    # Spawn a bird at the right edge with a 1% probability
    if random.random() < 0.01:
        spawned_bird = Bird("bird_{}".format(self.bird_count), self.x_max, self.x_min, self.y_max, self.y_min)
        self.bird_count += 1
        bird_x = self.x_max
        bird_y = random.randrange(self.y_min, self.y_max)
        spawned_bird.set_position(bird_x, bird_y)
        self.elements.append(spawned_bird)

    # Spawn a fuel tank at the bottom edge with a 1% probability
    if random.random() < 0.01:
        spawned_fuel = Fuel("fuel_{}".format(self.fuel_count), self.x_max, self.x_min, self.y_max, self.y_min)
        self.fuel_count += 1
        fuel_x = random.randrange(self.x_min, self.x_max)
        fuel_y = self.y_max
        spawned_fuel.set_position(fuel_x, fuel_y)
        self.elements.append(spawned_fuel)

    # Update the other elements. Iterate over a copy, because elements
    # are removed from self.elements inside the loop.
    for elem in list(self.elements):
        if isinstance(elem, Bird):
            if elem.get_position()[0] <= self.x_min:
                # The bird reached the left edge: remove it
                self.elements.remove(elem)
                continue
            elem.move(-5, 0)  # Move the bird left
            if self.has_collided(self.chopper, elem):
                # A bird strike ends the episode
                done = True
                reward = -10
                self.elements.remove(self.chopper)
                break

        elif isinstance(elem, Fuel):
            if elem.get_position()[1] <= self.y_min:
                # The tank reached the top edge: remove it
                self.elements.remove(elem)
                continue
            elem.move(0, -5)  # Move the tank up
            if self.has_collided(self.chopper, elem):
                # Collect the tank and refill the fuel
                self.elements.remove(elem)
                self.fuel_left = self.max_fuel

    # Add this step's reward to the episodic return
    self.ep_return += reward

    # Redraw elements on the canvas
    self.draw_elements_on_canvas()

    # End the episode if the Chopper runs out of fuel
    if self.fuel_left <= 0:
        done = True

    # Return the observation, reward, done flag, and an empty info dict
    return self.canvas, reward, done, {}
In this code, the step function does everything: it applies the agent’s action, spawns and moves the Birds and Fuel Tanks, checks for collisions, and computes the reward. A collision with a Bird ends the episode with a reward of -10, a collision with a Fuel Tank refills the fuel, and running out of fuel also ends the episode.
This function is the dynamic core that makes the ChopperScape environment come to life, creating an engaging learning process for the agent. Each step is an opportunity for the Chopper to navigate through the environment, earn rewards, avoid obstacles, and hopefully, stay alive long enough to accumulate high rewards!
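One thing to watch for in loops that update and remove game elements in the same pass: removing items from a Python list while iterating over that same list makes the loop skip the item that follows each removal. The sketch below shows the pitfall and a safer pattern, using plain strings as stand-ins for the game elements.

```python
elements = ["chopper", "bird_0", "bird_1", "fuel_0"]

# Pitfall: removing while iterating over the same list skips the next item
buggy = list(elements)
for name in buggy:
    if name.startswith("bird"):
        buggy.remove(name)

# Safer: iterate over a copy and remove from the original
safe = list(elements)
for name in list(safe):
    if name.startswith("bird"):
        safe.remove(name)

print(buggy)  # ['chopper', 'bird_1', 'fuel_0']  (bird_1 was skipped)
print(safe)   # ['chopper', 'fuel_0']
```

In the game, a skipped element would miss its movement and collision check for that frame, so iterating over a copy of the element list is the safer choice.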
Seeing It in Action
Now that we’ve set up the mechanics of our environment, it’s time to see it in action. Picture this: we have a Chopper pilot in the game, but this time, there’s no strategy involved. Instead, our agent will be taking random actions, and we get to watch how the environment reacts. Every action the Chopper takes will change the game state, and we can see it happen step by step. Think of it like watching a chaotic flight through a landscape filled with birds, fuel tanks, and a ticking fuel meter!
Initial Setup: Let’s Get the Show on the Road
We start by importing the necessary display tools to render the game. These come from the IPython library, which is super useful when you’re working with Jupyter Notebooks or similar environments. It allows us to easily see the game’s output right in the notebook itself.
from IPython import display
Once that’s ready, we initialize the ChopperScape environment by creating an instance of the class and then resetting it. This makes sure everything starts fresh, with the Chopper in its starting position and all game variables like fuel and score reset to their starting values.
env = ChopperScape()
obs = env.reset()
The Agent Takes Control
Now, here’s where things get interesting: we start a loop where the Chopper will take random actions. Instead of making decisions like a human player would, our agent just picks a random action from the environment’s action space. Every time it does, the step function takes over. The step function processes the action, updates the environment, and gives us the new game state, reward, and more.
Here’s how the loop looks:
while True:
    # Take a random action
    action = env.action_space.sample()

    # Apply the action and get the new state, reward, done flag, and info
    obs, reward, done, info = env.step(action)
So, at every step, the Chopper picks a direction, whether it’s moving left, right, up, down, or even doing nothing. The step function processes that action, updates the environment, and returns the new observation, the reward, and whether the game is over (done).
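A note on API versions: the four-value return used here (observation, reward, done, info) is the classic Gym convention. The newer Gymnasium API instead returns five values from step, splitting done into terminated and truncated, and returns an (observation, info) pair from reset. If you need to bridge the two, a small adapter along these lines could help. Both function names are hypothetical, and treating every done as terminated is a simplifying assumption.

```python
def to_five_tuple(step_result):
    """Convert a classic (obs, reward, done, info) result to the five-value form."""
    obs, reward, done, info = step_result
    # Simplifying assumption: every episode end counts as a termination,
    # never as a time-limit truncation
    return obs, reward, done, False, info

def to_done(terminated, truncated):
    """Collapse the two newer flags back into a single done flag."""
    return terminated or truncated

result = to_five_tuple(("frame", 1, True, {}))
print(result)                 # ('frame', 1, True, False, {})
print(to_done(False, True))   # True
```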
Rendering the Game
Once the agent has taken its action, we render the environment to show what’s going on. This visually updates the game’s state, including the Chopper’s position, the birds flying around, and any fuel tanks in the area.
    # Render the game (still inside the loop)
    env.render()
The rendering happens after each action, so we can see the changes in real-time. It’s like hitting ‘refresh’ after every decision the agent makes, letting us track how well the Chopper is doing as it moves through the environment.
Ending the Episode
The game isn’t endless: if the Chopper crashes into a bird or runs out of fuel, the episode ends. When that happens, the done flag becomes True and the loop breaks.
    if done:
        break
Finally, once the game is over, we close the environment to clean up any resources that were used during the gameplay.
env.close()
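Put together, the pieces above form one loop: reset, then repeatedly pick an action, step, and stop when done is True. The sketch below assembles that loop in a form that runs without the game assets. TinyEnv is a hypothetical stand-in that only tracks fuel, and sample_action replaces env.action_space.sample(); with the real environment you would use ChopperScape and call env.render() inside the loop.

```python
import random

class TinyEnv:
    """Hypothetical stand-in with the same reset/step/close shape as ChopperScape."""

    def __init__(self, max_fuel=20):
        self.max_fuel = max_fuel
        self.fuel_left = max_fuel

    def reset(self):
        self.fuel_left = self.max_fuel
        return self.fuel_left            # the observation here is just the fuel level

    def sample_action(self):
        return random.randrange(5)       # stands in for env.action_space.sample()

    def step(self, action):
        self.fuel_left -= 1              # every step burns one unit of fuel
        done = self.fuel_left <= 0
        return self.fuel_left, 1, done, {}

    def close(self):
        pass

def run_episode(env):
    """Run one episode with random actions; return (steps, total_reward)."""
    obs = env.reset()
    steps, total_reward = 0, 0
    while True:
        action = env.sample_action()
        obs, reward, done, info = env.step(action)
        steps += 1
        total_reward += reward
        if done:
            break
    env.close()
    return steps, total_reward

steps, total = run_episode(TinyEnv(max_fuel=20))
print(steps, total)  # 20 20
```

Because TinyEnv has no birds, the episode always lasts exactly as long as the fuel does; in ChopperScape the length varies with collisions and refuels.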
Watching the Agent in Action
And there you have it! By running this process, you can watch the agent navigate the environment through its random actions. It’s like a visual experiment where we can see how each decision impacts the Chopper’s survival, whether it’s dodging birds, collecting fuel tanks, or simply running out of fuel. Because the actions are random, nothing is being learned yet; what you get is a baseline to compare against once a trained agent takes over.
In short, you get a fully interactive visualization of how the environment behaves with each random action the agent takes, which is a quick way to check that spawning, movement, collisions, and rewards all work as intended. It’s like watching a game unfold with a pilot who’s just flying blind, trying to survive the chaos!
Conclusion
In conclusion, creating custom environments in OpenAI Gym offers an exciting way to design and experiment with interactive simulations, like our Chopper game. By setting up the observation and action spaces, coding the reset and step functions, and adding key elements like birds and fuel tanks, you can build a dynamic learning environment for your agent. This tutorial also explored how to render the game for real-time visualization, helping you track progress and improve the agent’s performance. As you advance, consider expanding your environment with new challenges or features, like a life system, to push the boundaries of what your agent can learn. OpenAI Gym is a powerful tool, and with continuous experimentation, the possibilities are endless. Start building your own environments today and let your agent’s learning journey unfold!