8 questions
0 votes · 0 answers · 52 views
Unable to reproduce training results in a DummyVecEnv using stable-baselines3
I created a custom Gymnasium environment and trained an agent using Stable-Baselines3 with DummyVecEnv and VecNormalize. The agent performs well during training and consistently reaches the goal. ...
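A common cause is that the VecNormalize statistics from training are not reused (and frozen) at evaluation time. Below is a minimal sketch of that workflow, assuming CartPole as a stand-in for the custom environment and illustrative file names:

import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

def make_env():
    return gym.make("CartPole-v1")  # stand-in for the custom environment

# Training: VecNormalize accumulates running obs/reward statistics.
train_env = VecNormalize(DummyVecEnv([make_env]), norm_obs=True, norm_reward=True)
model = PPO("MlpPolicy", train_env)
model.learn(total_timesteps=10_000)
model.save("model.zip")
train_env.save("vecnormalize.pkl")  # persist the normalization statistics

# Evaluation: reload the SAME statistics and freeze them, otherwise the agent
# sees differently scaled observations than it did during training.
eval_env = VecNormalize.load("vecnormalize.pkl", DummyVecEnv([make_env]))
eval_env.training = False     # do not update statistics at test time
eval_env.norm_reward = False  # report raw rewards
model = PPO.load("model.zip", env=eval_env)

obs = eval_env.reset()
for _ in range(1_000):
    action, _ = model.predict(obs, deterministic=True)
    obs, rewards, dones, infos = eval_env.step(action)
    if dones[0]:
        break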
0 votes · 0 answers · 44 views
How to implement model.learn() correctly in self-play (Stable-Baselines3 DQN)
I use DQN from sb3 to train a model. I want to train 2 agents that play against each other alternately. The problem is, as soon as I call model.learn(total_timesteps=N), which is the central method to ...
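One pattern that works with SB3 is to keep two independent DQN models and alternate short learn() calls with reset_num_timesteps=False, so each call continues the previous timestep counter and exploration schedule. A minimal sketch, with a toy environment standing in for the real game:

import gymnasium as gym
import numpy as np
from stable_baselines3 import DQN

class SelfPlayEnv(gym.Env):
    # Toy two-player matching game seen from the learning player's side; the
    # opponent (a frozen SB3 model, or None for a random player) acts inside
    # step(). This is only a stand-in for the real game.
    def __init__(self, opponent=None):
        super().__init__()
        self.opponent = opponent
        self.observation_space = gym.spaces.Box(0.0, 1.0, shape=(4,), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(2)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.obs = self.np_random.random(4).astype(np.float32)
        self.t = 0
        return self.obs, {}

    def step(self, action):
        opp_action = (self.action_space.sample() if self.opponent is None
                      else self.opponent.predict(self.obs, deterministic=True)[0])
        reward = 1.0 if action != opp_action else -1.0
        self.obs = self.np_random.random(4).astype(np.float32)
        self.t += 1
        return self.obs, reward, self.t >= 50, False, {}

model_a = DQN("MlpPolicy", SelfPlayEnv(), buffer_size=10_000, learning_starts=500)
model_b = DQN("MlpPolicy", SelfPlayEnv(), buffer_size=10_000, learning_starts=500)

for _ in range(3):
    # Train each model against a frozen copy of the other; reset_num_timesteps=False
    # keeps the timestep counter and exploration schedule running across calls.
    model_a.set_env(SelfPlayEnv(opponent=model_b))
    model_a.learn(total_timesteps=2_000, reset_num_timesteps=False)
    model_b.set_env(SelfPlayEnv(opponent=model_a))
    model_b.learn(total_timesteps=2_000, reset_num_timesteps=False)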
0 votes · 1 answer · 214 views
EvalCallback hangs in stable-baselines3
I'm trying to train an A2C model in stable-baselines3 and the EvalCallback appears to freeze when it is called. I cannot figure out why. Below you will find a script that recreates this problem. ...
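For comparison, here is a minimal EvalCallback setup that does not block, assuming CartPole as a stand-in task. A frequent cause of apparent freezes is an eval environment whose episodes never terminate, since EvalCallback runs n_eval_episodes complete episodes synchronously inside the training loop:

import gymnasium as gym
from stable_baselines3 import A2C
from stable_baselines3.common.callbacks import EvalCallback
from stable_baselines3.common.monitor import Monitor

train_env = gym.make("CartPole-v1")
eval_env = Monitor(gym.make("CartPole-v1"))  # separate instance, no human rendering

eval_callback = EvalCallback(
    eval_env,
    n_eval_episodes=5,
    eval_freq=1_000,      # evaluate every 1000 environment steps
    deterministic=True,
    render=False,
)

model = A2C("MlpPolicy", train_env, verbose=1)
model.learn(total_timesteps=20_000, callback=eval_callback)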
0 votes · 1 answer · 107 views
Stable Baselines3 not generating TensorBoard files for PPO, SAC and TD3
I am comparing A2C, DQN and PPO models. I need TensorBoard graphs to show my teacher. TensorBoard only collects data for the A2C model; when using it for PPO, SAC or TD3 it creates the event ...
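For reference, a minimal sketch that produces separate event files for each algorithm, assuming standard Gym tasks as stand-ins; the key is passing tensorboard_log to every model and a distinct tb_log_name to every learn() call:

import gymnasium as gym
from stable_baselines3 import A2C, PPO, SAC, TD3

log_dir = "./tb_logs/"

runs = [
    (A2C, "CartPole-v1", "a2c"),
    (PPO, "CartPole-v1", "ppo"),
    (SAC, "Pendulum-v1", "sac"),   # SAC and TD3 need a continuous action space
    (TD3, "Pendulum-v1", "td3"),
]

for algo, env_id, name in runs:
    model = algo("MlpPolicy", gym.make(env_id), tensorboard_log=log_dir)
    # tb_log_name separates the runs as tb_logs/a2c_1, tb_logs/ppo_1, ...
    model.learn(total_timesteps=5_000, tb_log_name=name)

# Then inspect with: tensorboard --logdir ./tb_logs/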
1 vote · 0 answers · 59 views
What input should I use when calling predict on an RL model? Should it be scaled or inverse-scaled?
I am using SB3 DQN to train on stock data where my observation is the last 120 candles with 7 features, i.e. open, high, low, close, hour, min, RSI, etc. So the obs shape would be (120, 7) and the output would be discrete with 3 ints: 0, ...
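In general, predict() must receive observations in exactly the form the policy saw during training, so if the training observations were scaled, feed scaled values (never the inverse-transformed prices). A minimal sketch, assuming a MinMaxScaler was used and with random placeholder data standing in for the real candles:

import numpy as np
from sklearn.preprocessing import MinMaxScaler
from stable_baselines3 import DQN

# Placeholder candle history (open, high, low, close, hour, min, rsi, ...).
history = np.random.rand(1_000, 7).astype(np.float32)
scaler = MinMaxScaler().fit(history)        # the SAME scaler used to build training obs

model = DQN.load("dqn_stock_model.zip")     # illustrative path
raw_window = history[-120:]                 # last 120 candles, shape (120, 7)
obs = scaler.transform(raw_window).astype(np.float32)

action, _ = model.predict(obs, deterministic=True)  # discrete action: 0, 1 or 2
# inverse_transform is only for turning scaled values back into readable prices,
# never for the input to predict().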
2 votes · 1 answer · 716 views
Training a Custom Feature Extractor in Stable Baselines3 Starting from Pre-trained Weights?
I am using the following custom feature extractor for my Stable-Baselines3 model:
import torch.nn as nn
from stable_baselines3 import PPO
class Encoder(nn.Module):
def __init__(self, input_dim, ...
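The usual SB3 pattern is to wrap the encoder in a BaseFeaturesExtractor, load the pre-trained weights inside its __init__, and pass it through policy_kwargs. A minimal sketch; the architecture, weight file and CartPole environment are assumptions, since the original Encoder definition is truncated above:

import gymnasium as gym
import torch
import torch.nn as nn
from stable_baselines3 import PPO
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor

class PretrainedEncoderExtractor(BaseFeaturesExtractor):
    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 64):
        super().__init__(observation_space, features_dim)
        input_dim = int(observation_space.shape[0])
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, features_dim), nn.ReLU(),
        )
        # Load weights saved from the separately pre-trained encoder
        # (illustrative path; the state_dict keys must match this module).
        self.encoder.load_state_dict(torch.load("encoder_pretrained.pt", map_location="cpu"))

    def forward(self, observations: torch.Tensor) -> torch.Tensor:
        return self.encoder(observations)

model = PPO(
    "MlpPolicy",
    gym.make("CartPole-v1"),  # stand-in environment
    policy_kwargs=dict(
        features_extractor_class=PretrainedEncoderExtractor,
        features_extractor_kwargs=dict(features_dim=64),
    ),
)
model.learn(total_timesteps=10_000)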
0 votes · 1 answer · 116 views
"requested array would exceed the maximum number of dimensions of 1" issue in gym
Suppose we have the following code:
import gym
from stable_baselines3 import PPO
env = gym.make("CartPole-v1", render_mode="human")
model = PPO("MlpPolicy", env, ...
1 vote · 1 answer · 134 views
Stable-Baselines3 TD3: reset() method "too many values to unpack" error
The environment is Python 3.10 with stable-baselines3 2.3.0, and I'm trying the TD3 algorithm.
I keep getting the same error no matter what I do.
As far as I know, the reset method should return something matching the observation space ...
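For reference, stable-baselines3 2.x expects the Gymnasium API, where reset(seed=..., options=...) returns an (observation, info) tuple and step() returns five values. A minimal sketch with placeholder spaces standing in for the real environment:

import gymnasium as gym
import numpy as np
from stable_baselines3 import TD3

class MyEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.observation_space = gym.spaces.Box(-1.0, 1.0, shape=(3,), dtype=np.float32)
        self.action_space = gym.spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)  # TD3 needs continuous actions

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        return np.zeros(3, dtype=np.float32), {}   # two values: observation AND an info dict

    def step(self, action):
        self.t += 1
        obs = np.zeros(3, dtype=np.float32)
        terminated = False
        truncated = self.t >= 200                  # finite episodes keep TD3's rollout loop bounded
        return obs, 0.0, terminated, truncated, {}

model = TD3("MlpPolicy", MyEnv())
model.learn(total_timesteps=1_000)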