[2024-11-19 22:16:33,891][02170] Saving configuration to /content/train_dir/default_experiment/config.json...
[2024-11-19 22:16:33,894][02170] Rollout worker 0 uses device cpu
[2024-11-19 22:16:33,895][02170] Rollout worker 1 uses device cpu
[2024-11-19 22:16:33,897][02170] Rollout worker 2 uses device cpu
[2024-11-19 22:16:33,898][02170] Rollout worker 3 uses device cpu
[2024-11-19 22:16:33,899][02170] Rollout worker 4 uses device cpu
[2024-11-19 22:16:33,901][02170] Rollout worker 5 uses device cpu
[2024-11-19 22:16:33,902][02170] Rollout worker 6 uses device cpu
[2024-11-19 22:16:33,903][02170] Rollout worker 7 uses device cpu
[2024-11-19 22:16:34,060][02170] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-11-19 22:16:34,062][02170] InferenceWorker_p0-w0: min num requests: 2
[2024-11-19 22:16:34,095][02170] Starting all processes...
[2024-11-19 22:16:34,096][02170] Starting process learner_proc0
[2024-11-19 22:16:34,141][02170] Starting all processes...
[2024-11-19 22:16:34,147][02170] Starting process inference_proc0-0
[2024-11-19 22:16:34,147][02170] Starting process rollout_proc0
[2024-11-19 22:16:34,149][02170] Starting process rollout_proc1
[2024-11-19 22:16:34,149][02170] Starting process rollout_proc2
[2024-11-19 22:16:34,149][02170] Starting process rollout_proc3
[2024-11-19 22:16:34,149][02170] Starting process rollout_proc4
[2024-11-19 22:16:34,149][02170] Starting process rollout_proc5
[2024-11-19 22:16:34,149][02170] Starting process rollout_proc6
[2024-11-19 22:16:34,149][02170] Starting process rollout_proc7
[2024-11-19 22:16:50,990][03246] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-11-19 22:16:50,994][03246] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-11-19 22:16:51,057][03246] Num visible devices: 1
[2024-11-19 22:16:51,107][03246] Starting seed is not provided
[2024-11-19 22:16:51,108][03246] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-11-19 22:16:51,108][03246] Initializing actor-critic model on device cuda:0
[2024-11-19 22:16:51,109][03246] RunningMeanStd input shape: (3, 72, 128)
[2024-11-19 22:16:51,112][03246] RunningMeanStd input shape: (1,)
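The two RunningMeanStd shapes above correspond to the observation normalizer (per-pixel statistics over the (3, 72, 128) frames) and the returns normalizer (a single scalar). A minimal NumPy sketch of that kind of running normalizer, using the standard parallel mean/variance update; this is an illustration, not the framework's in-place TorchScript module:

```python
import numpy as np

class RunningMeanStd:
    """Track a running mean/variance over batches and normalize with them."""

    def __init__(self, shape, eps=1e-4):
        self.mean = np.zeros(shape, dtype=np.float64)
        self.var = np.ones(shape, dtype=np.float64)
        self.count = eps

    def update(self, batch):
        # batch has shape (N, *shape); combine batch stats with running stats
        b_mean, b_var, b_count = batch.mean(0), batch.var(0), batch.shape[0]
        delta = b_mean - self.mean
        total = self.count + b_count
        self.mean = self.mean + delta * b_count / total
        m_a = self.var * self.count
        m_b = b_var * b_count
        self.var = (m_a + m_b + delta ** 2 * self.count * b_count / total) / total
        self.count = total

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + 1e-8)
```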
[2024-11-19 22:16:51,131][03264] Worker 4 uses CPU cores [0]
[2024-11-19 22:16:51,217][03246] ConvEncoder: input_channels=3
[2024-11-19 22:16:51,310][03262] Worker 2 uses CPU cores [0]
[2024-11-19 22:16:51,375][03267] Worker 6 uses CPU cores [0]
[2024-11-19 22:16:51,394][03260] Worker 0 uses CPU cores [0]
[2024-11-19 22:16:51,496][03259] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-11-19 22:16:51,506][03259] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-11-19 22:16:51,555][03263] Worker 3 uses CPU cores [1]
[2024-11-19 22:16:51,570][03259] Num visible devices: 1
[2024-11-19 22:16:51,629][03265] Worker 5 uses CPU cores [1]
[2024-11-19 22:16:51,640][03266] Worker 7 uses CPU cores [1]
[2024-11-19 22:16:51,652][03261] Worker 1 uses CPU cores [1]
[2024-11-19 22:16:51,713][03246] Conv encoder output size: 512
[2024-11-19 22:16:51,713][03246] Policy head output size: 512
[2024-11-19 22:16:51,767][03246] Created Actor Critic model with architecture:
[2024-11-19 22:16:51,767][03246] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2024-11-19 22:16:52,051][03246] Using optimizer <class 'torch.optim.adam.Adam'>
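A compact PyTorch sketch of the model whose repr is printed above, just to make the tensor shapes concrete: the (3, 72, 128) observation passes through a three-layer ELU conv head into a 512-dim embedding, a GRU(512, 512) core, and separate 5-way policy and scalar value heads. The conv kernel/stride values below are assumed defaults (they are not printed in the log); only the 512 widths, the head sizes, and the Adam optimizer come from the log itself.

```python
import torch
import torch.nn as nn

class SketchActorCritic(nn.Module):
    def __init__(self, obs_shape=(3, 72, 128), num_actions=5, hidden=512):
        super().__init__()
        # conv_head: kernel/stride choices are assumptions, not from the log
        self.conv_head = nn.Sequential(
            nn.Conv2d(obs_shape[0], 32, kernel_size=8, stride=4), nn.ELU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ELU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2), nn.ELU(),
        )
        with torch.no_grad():  # infer the flattened conv output size for this input shape
            conv_out = self.conv_head(torch.zeros(1, *obs_shape)).flatten(1).shape[1]
        self.mlp_layers = nn.Sequential(nn.Linear(conv_out, hidden), nn.ELU())  # "Conv encoder output size: 512"
        self.core = nn.GRU(hidden, hidden)                          # ModelCoreRNN: GRU(512, 512)
        self.critic_linear = nn.Linear(hidden, 1)                   # value head
        self.distribution_linear = nn.Linear(hidden, num_actions)   # 5 discrete Doom actions

    def forward(self, obs, rnn_state):
        x = self.mlp_layers(self.conv_head(obs).flatten(1))
        x, rnn_state = self.core(x.unsqueeze(0), rnn_state)         # (seq=1, batch, hidden)
        x = x.squeeze(0)
        return self.distribution_linear(x), self.critic_linear(x), rnn_state

model = SketchActorCritic()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # actual lr would come from config.json
logits, value, h = model(torch.zeros(8, 3, 72, 128), torch.zeros(1, 8, 512))
```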
[2024-11-19 22:16:54,054][02170] Heartbeat connected on Batcher_0
[2024-11-19 22:16:54,061][02170] Heartbeat connected on InferenceWorker_p0-w0
[2024-11-19 22:16:54,071][02170] Heartbeat connected on RolloutWorker_w0
[2024-11-19 22:16:54,074][02170] Heartbeat connected on RolloutWorker_w1
[2024-11-19 22:16:54,078][02170] Heartbeat connected on RolloutWorker_w2
[2024-11-19 22:16:54,083][02170] Heartbeat connected on RolloutWorker_w3
[2024-11-19 22:16:54,084][02170] Heartbeat connected on RolloutWorker_w4
[2024-11-19 22:16:54,095][02170] Heartbeat connected on RolloutWorker_w6
[2024-11-19 22:16:54,096][02170] Heartbeat connected on RolloutWorker_w5
[2024-11-19 22:16:54,097][02170] Heartbeat connected on RolloutWorker_w7
[2024-11-19 22:16:56,358][03246] No checkpoints found
[2024-11-19 22:16:56,359][03246] Did not load from checkpoint, starting from scratch!
[2024-11-19 22:16:56,359][03246] Initialized policy 0 weights for model version 0
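The three lines above are the load-or-initialize branch at learner startup: no checkpoint_*.pth exists under the experiment directory yet, so the policy starts from fresh weights at version 0. A hedged sketch of that logic (helper name and checkpoint dict keys are illustrative, not the framework's API):

```python
import glob
import os

import torch

def load_latest_or_init(model, ckpt_dir):
    """Restore the newest checkpoint if one exists, otherwise start at policy version 0."""
    checkpoints = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))
    if not checkpoints:
        print("No checkpoints found")
        print("Did not load from checkpoint, starting from scratch!")
        return 0
    state = torch.load(checkpoints[-1], map_location="cpu")
    model.load_state_dict(state["model"])
    return state["policy_version"]
```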
[2024-11-19 22:16:56,369][03246] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-11-19 22:16:56,378][03246] LearnerWorker_p0 finished initialization!
[2024-11-19 22:16:56,379][02170] Heartbeat connected on LearnerWorker_p0
[2024-11-19 22:16:56,538][03259] RunningMeanStd input shape: (3, 72, 128)
[2024-11-19 22:16:56,539][03259] RunningMeanStd input shape: (1,)
[2024-11-19 22:16:56,561][03259] ConvEncoder: input_channels=3
[2024-11-19 22:16:56,733][03259] Conv encoder output size: 512
[2024-11-19 22:16:56,734][03259] Policy head output size: 512
[2024-11-19 22:16:56,814][02170] Inference worker 0-0 is ready!
[2024-11-19 22:16:56,817][02170] All inference workers are ready! Signal rollout workers to start!
[2024-11-19 22:16:57,060][03260] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:16:57,059][03267] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:16:57,061][03262] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:16:57,065][03264] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:16:57,194][03265] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:16:57,198][03266] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:16:57,199][03263] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:16:57,200][03261] Doom resolution: 160x120, resize resolution: (128, 72)
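Each rollout worker above reports the same preprocessing: native 160x120 VizDoom frames are resized to 128x72 before being fed to the (3, 72, 128) encoder. A minimal sketch of that step, assuming an OpenCV resize and a channel-first output layout (both assumptions for illustration):

```python
import cv2
import numpy as np

def resize_doom_frame(frame_hwc: np.ndarray, width: int = 128, height: int = 72) -> np.ndarray:
    """(120, 160, 3) uint8 screen buffer -> (3, 72, 128) channel-first observation."""
    resized = cv2.resize(frame_hwc, (width, height), interpolation=cv2.INTER_AREA)
    return np.transpose(resized, (2, 0, 1))
```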
[2024-11-19 22:16:58,033][02170] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-11-19 22:16:58,335][03267] Decorrelating experience for 0 frames...
[2024-11-19 22:16:58,334][03260] Decorrelating experience for 0 frames...
[2024-11-19 22:16:58,735][03260] Decorrelating experience for 32 frames...
[2024-11-19 22:16:58,981][03265] Decorrelating experience for 0 frames...
[2024-11-19 22:16:58,988][03263] Decorrelating experience for 0 frames...
[2024-11-19 22:16:58,996][03266] Decorrelating experience for 0 frames...
[2024-11-19 22:16:59,011][03261] Decorrelating experience for 0 frames...
[2024-11-19 22:16:59,374][03265] Decorrelating experience for 32 frames...
[2024-11-19 22:16:59,659][03267] Decorrelating experience for 32 frames...
[2024-11-19 22:17:00,043][03264] Decorrelating experience for 0 frames...
[2024-11-19 22:17:00,055][03260] Decorrelating experience for 64 frames...
[2024-11-19 22:17:00,094][03266] Decorrelating experience for 32 frames...
[2024-11-19 22:17:00,887][03263] Decorrelating experience for 32 frames...
[2024-11-19 22:17:01,140][03266] Decorrelating experience for 64 frames...
[2024-11-19 22:17:01,233][03267] Decorrelating experience for 64 frames...
[2024-11-19 22:17:01,335][03262] Decorrelating experience for 0 frames...
[2024-11-19 22:17:01,350][03264] Decorrelating experience for 32 frames...
[2024-11-19 22:17:01,921][03260] Decorrelating experience for 96 frames...
[2024-11-19 22:17:02,218][03267] Decorrelating experience for 96 frames...
[2024-11-19 22:17:02,225][03262] Decorrelating experience for 32 frames...
[2024-11-19 22:17:02,737][03263] Decorrelating experience for 64 frames...
[2024-11-19 22:17:03,033][02170] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-11-19 22:17:03,203][03264] Decorrelating experience for 64 frames...
[2024-11-19 22:17:03,841][03265] Decorrelating experience for 64 frames...
[2024-11-19 22:17:03,864][03262] Decorrelating experience for 64 frames...
[2024-11-19 22:17:03,953][03263] Decorrelating experience for 96 frames...
[2024-11-19 22:17:04,746][03261] Decorrelating experience for 32 frames...
[2024-11-19 22:17:05,073][03264] Decorrelating experience for 96 frames...
[2024-11-19 22:17:06,848][03265] Decorrelating experience for 96 frames...
[2024-11-19 22:17:07,058][03262] Decorrelating experience for 96 frames...
[2024-11-19 22:17:08,033][02170] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 89.8. Samples: 898. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-11-19 22:17:08,037][02170] Avg episode reward: [(0, '2.829')]
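The recurring "Fps is (10 sec: ..., 60 sec: ..., 300 sec: ...)" lines are trailing-window throughput estimates over the total environment frame counter, which is why the earliest reports show nan/0.0 before enough samples exist. A rough sketch of that bookkeeping, with illustrative names rather than the framework's internals:

```python
import time
from collections import deque

class FpsMeter:
    def __init__(self, windows=(10, 60, 300)):
        self.windows = windows
        self.samples = deque()  # (timestamp, total_env_frames) pairs

    def record(self, total_frames):
        now = time.time()
        self.samples.append((now, total_frames))
        # keep only the history the largest window needs
        while now - self.samples[0][0] > max(self.windows):
            self.samples.popleft()

    def fps(self):
        now, latest = self.samples[-1]
        report = {}
        for w in self.windows:
            window = [(t, f) for t, f in self.samples if now - t <= w]
            t0, f0 = window[0]
            report[w] = (latest - f0) / max(now - t0, 1e-9)
        return report
```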
[2024-11-19 22:17:09,312][03266] Decorrelating experience for 96 frames...
[2024-11-19 22:17:10,773][03261] Decorrelating experience for 64 frames...
[2024-11-19 22:17:11,191][03246] Signal inference workers to stop experience collection...
[2024-11-19 22:17:11,217][03259] InferenceWorker_p0-w0: stopping experience collection
[2024-11-19 22:17:11,783][03261] Decorrelating experience for 96 frames...
[2024-11-19 22:17:13,033][02170] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 174.0. Samples: 2610. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-11-19 22:17:13,034][02170] Avg episode reward: [(0, '2.792')]
[2024-11-19 22:17:13,568][03246] Signal inference workers to resume experience collection...
[2024-11-19 22:17:13,569][03259] InferenceWorker_p0-w0: resuming experience collection
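The "Decorrelating experience for N frames..." messages show each worker stepping its environments through a different warm-up offset (0, 32, 64, 96 frames here) so rollouts are not phase-aligned across workers; collection is briefly paused and then resumed once the first batch is assembled. A hedged illustration of such a warm-up loop, assuming a Gymnasium-style env API and an illustrative 32-frame reporting interval:

```python
def decorrelate(env, num_frames):
    """Advance the env by num_frames random-action steps before real collection starts."""
    env.reset()
    for frame in range(num_frames):
        if frame % 32 == 0:
            print(f"Decorrelating experience for {frame} frames...")
        _, _, terminated, truncated, _ = env.step(env.action_space.sample())
        if terminated or truncated:
            env.reset()
```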
[2024-11-19 22:17:18,033][02170] Fps is (10 sec: 2457.7, 60 sec: 1228.8, 300 sec: 1228.8). Total num frames: 24576. Throughput: 0: 337.3. Samples: 6746. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2024-11-19 22:17:18,035][02170] Avg episode reward: [(0, '3.542')]
[2024-11-19 22:17:21,981][03259] Updated weights for policy 0, policy_version 10 (0.0023)
[2024-11-19 22:17:23,033][02170] Fps is (10 sec: 4505.6, 60 sec: 1802.2, 300 sec: 1802.2). Total num frames: 45056. Throughput: 0: 391.4. Samples: 9784. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:17:23,035][02170] Avg episode reward: [(0, '4.112')]
[2024-11-19 22:17:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 1911.5, 300 sec: 1911.5). Total num frames: 57344. Throughput: 0: 481.0. Samples: 14430. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:17:28,035][02170] Avg episode reward: [(0, '4.268')]
[2024-11-19 22:17:33,033][02170] Fps is (10 sec: 2457.5, 60 sec: 1989.5, 300 sec: 1989.5). Total num frames: 69632. Throughput: 0: 505.3. Samples: 17684. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:17:33,038][02170] Avg episode reward: [(0, '4.256')]
[2024-11-19 22:17:38,033][02170] Fps is (10 sec: 2048.0, 60 sec: 1945.6, 300 sec: 1945.6). Total num frames: 77824. Throughput: 0: 486.1. Samples: 19444. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-11-19 22:17:38,038][02170] Avg episode reward: [(0, '4.401')]
[2024-11-19 22:17:38,188][03259] Updated weights for policy 0, policy_version 20 (0.0061)
[2024-11-19 22:17:43,033][02170] Fps is (10 sec: 2867.3, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 98304. Throughput: 0: 547.5. Samples: 24636. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:17:43,035][02170] Avg episode reward: [(0, '4.433')]
[2024-11-19 22:17:48,033][02170] Fps is (10 sec: 3276.8, 60 sec: 2211.8, 300 sec: 2211.8). Total num frames: 110592. Throughput: 0: 637.7. Samples: 28696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:17:48,035][02170] Avg episode reward: [(0, '4.478')]
[2024-11-19 22:17:48,052][03246] Saving new best policy, reward=4.478!
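"Saving new best policy, reward=4.478!" is a simple high-water-mark check on the reported average episode reward: a separate best checkpoint is written whenever the average improves. A minimal sketch (class and file names are illustrative):

```python
import torch

class BestPolicyTracker:
    def __init__(self):
        self.best_reward = float("-inf")

    def maybe_save(self, model, avg_episode_reward, path="best_policy.pth"):
        if avg_episode_reward > self.best_reward:
            self.best_reward = avg_episode_reward
            torch.save({"model": model.state_dict(), "reward": avg_episode_reward}, path)
            print(f"Saving new best policy, reward={avg_episode_reward:.3f}!")
```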
[2024-11-19 22:17:51,094][03259] Updated weights for policy 0, policy_version 30 (0.0029)
[2024-11-19 22:17:53,033][02170] Fps is (10 sec: 3276.8, 60 sec: 2383.1, 300 sec: 2383.1). Total num frames: 131072. Throughput: 0: 669.2. Samples: 31014. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:17:53,035][02170] Avg episode reward: [(0, '4.449')]
[2024-11-19 22:17:58,035][02170] Fps is (10 sec: 3685.6, 60 sec: 2457.5, 300 sec: 2457.5). Total num frames: 147456. Throughput: 0: 767.2. Samples: 37136. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:17:58,041][02170] Avg episode reward: [(0, '4.256')]
[2024-11-19 22:18:02,080][03259] Updated weights for policy 0, policy_version 40 (0.0029)
[2024-11-19 22:18:03,033][02170] Fps is (10 sec: 3276.6, 60 sec: 2730.7, 300 sec: 2520.6). Total num frames: 163840. Throughput: 0: 778.8. Samples: 41792. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:18:03,041][02170] Avg episode reward: [(0, '4.319')]
[2024-11-19 22:18:08,033][02170] Fps is (10 sec: 3277.4, 60 sec: 3003.7, 300 sec: 2574.6). Total num frames: 180224. Throughput: 0: 752.0. Samples: 43626. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-11-19 22:18:08,040][02170] Avg episode reward: [(0, '4.452')]
[2024-11-19 22:18:13,033][02170] Fps is (10 sec: 3686.6, 60 sec: 3345.1, 300 sec: 2676.1). Total num frames: 200704. Throughput: 0: 778.4. Samples: 49456. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-11-19 22:18:13,036][02170] Avg episode reward: [(0, '4.635')]
[2024-11-19 22:18:13,042][03246] Saving new best policy, reward=4.635!
[2024-11-19 22:18:14,050][03259] Updated weights for policy 0, policy_version 50 (0.0024)
[2024-11-19 22:18:18,033][02170] Fps is (10 sec: 3686.5, 60 sec: 3208.5, 300 sec: 2713.6). Total num frames: 217088. Throughput: 0: 829.5. Samples: 55010. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:18:18,035][02170] Avg episode reward: [(0, '4.557')]
[2024-11-19 22:18:23,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3072.0, 300 sec: 2698.5). Total num frames: 229376. Throughput: 0: 829.8. Samples: 56784. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:18:23,035][02170] Avg episode reward: [(0, '4.394')]
[2024-11-19 22:18:26,851][03259] Updated weights for policy 0, policy_version 60 (0.0033)
[2024-11-19 22:18:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 2776.2). Total num frames: 249856. Throughput: 0: 826.6. Samples: 61834. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:18:28,037][02170] Avg episode reward: [(0, '4.287')]
[2024-11-19 22:18:28,047][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth...
[2024-11-19 22:18:33,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 2802.5). Total num frames: 266240. Throughput: 0: 869.2. Samples: 67810. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:18:33,035][02170] Avg episode reward: [(0, '4.384')]
[2024-11-19 22:18:38,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 2826.2). Total num frames: 282624. Throughput: 0: 866.1. Samples: 69988. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:18:38,036][02170] Avg episode reward: [(0, '4.498')]
[2024-11-19 22:18:39,709][03259] Updated weights for policy 0, policy_version 70 (0.0060)
[2024-11-19 22:18:43,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 2847.7). Total num frames: 299008. Throughput: 0: 826.1. Samples: 74310. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:18:43,041][02170] Avg episode reward: [(0, '4.502')]
[2024-11-19 22:18:48,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 2904.4). Total num frames: 319488. Throughput: 0: 863.2. Samples: 80636. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:18:48,035][02170] Avg episode reward: [(0, '4.527')]
[2024-11-19 22:18:49,669][03259] Updated weights for policy 0, policy_version 80 (0.0018)
[2024-11-19 22:18:53,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 2920.6). Total num frames: 335872. Throughput: 0: 886.5. Samples: 83518. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:18:53,036][02170] Avg episode reward: [(0, '4.365')]
[2024-11-19 22:18:58,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.2, 300 sec: 2901.3). Total num frames: 348160. Throughput: 0: 842.3. Samples: 87358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:18:58,037][02170] Avg episode reward: [(0, '4.276')]
[2024-11-19 22:19:02,410][03259] Updated weights for policy 0, policy_version 90 (0.0017)
[2024-11-19 22:19:03,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 2949.1). Total num frames: 368640. Throughput: 0: 851.1. Samples: 93308. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:19:03,035][02170] Avg episode reward: [(0, '4.358')]
[2024-11-19 22:19:08,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 2993.2). Total num frames: 389120. Throughput: 0: 882.5. Samples: 96496. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:19:08,035][02170] Avg episode reward: [(0, '4.689')]
[2024-11-19 22:19:08,043][03246] Saving new best policy, reward=4.689!
[2024-11-19 22:19:13,033][02170] Fps is (10 sec: 3276.7, 60 sec: 3345.1, 300 sec: 2973.4). Total num frames: 401408. Throughput: 0: 866.0. Samples: 100806. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:19:13,035][02170] Avg episode reward: [(0, '4.718')]
[2024-11-19 22:19:13,039][03246] Saving new best policy, reward=4.718!
[2024-11-19 22:19:14,969][03259] Updated weights for policy 0, policy_version 100 (0.0032)
[2024-11-19 22:19:18,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3013.5). Total num frames: 421888. Throughput: 0: 851.4. Samples: 106124. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:19:18,035][02170] Avg episode reward: [(0, '4.784')]
[2024-11-19 22:19:18,042][03246] Saving new best policy, reward=4.784!
[2024-11-19 22:19:23,035][02170] Fps is (10 sec: 4095.3, 60 sec: 3549.7, 300 sec: 3050.8). Total num frames: 442368. Throughput: 0: 872.4. Samples: 109248. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:19:23,043][02170] Avg episode reward: [(0, '4.613')]
[2024-11-19 22:19:24,830][03259] Updated weights for policy 0, policy_version 110 (0.0016)
[2024-11-19 22:19:28,038][02170] Fps is (10 sec: 3275.0, 60 sec: 3413.0, 300 sec: 3030.9). Total num frames: 454656. Throughput: 0: 888.6. Samples: 114304. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:19:28,042][02170] Avg episode reward: [(0, '4.409')]
[2024-11-19 22:19:33,033][02170] Fps is (10 sec: 2867.8, 60 sec: 3413.3, 300 sec: 3039.0). Total num frames: 471040. Throughput: 0: 848.4. Samples: 118816. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:19:33,036][02170] Avg episode reward: [(0, '4.389')]
[2024-11-19 22:19:37,287][03259] Updated weights for policy 0, policy_version 120 (0.0021)
[2024-11-19 22:19:38,033][02170] Fps is (10 sec: 3688.4, 60 sec: 3481.6, 300 sec: 3072.0). Total num frames: 491520. Throughput: 0: 855.6. Samples: 122022. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:19:38,036][02170] Avg episode reward: [(0, '4.432')]
[2024-11-19 22:19:43,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3103.0). Total num frames: 512000. Throughput: 0: 905.5. Samples: 128106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:19:43,038][02170] Avg episode reward: [(0, '4.508')]
[2024-11-19 22:19:48,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3084.0). Total num frames: 524288. Throughput: 0: 856.4. Samples: 131844. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:19:48,039][02170] Avg episode reward: [(0, '4.503')]
[2024-11-19 22:19:49,531][03259] Updated weights for policy 0, policy_version 130 (0.0018)
[2024-11-19 22:19:53,033][02170] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3113.0). Total num frames: 544768. Throughput: 0: 855.4. Samples: 134990. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:19:53,039][02170] Avg episode reward: [(0, '4.316')]
[2024-11-19 22:19:58,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3140.3). Total num frames: 565248. Throughput: 0: 900.4. Samples: 141322. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:19:58,042][02170] Avg episode reward: [(0, '4.305')]
[2024-11-19 22:20:00,326][03259] Updated weights for policy 0, policy_version 140 (0.0026)
[2024-11-19 22:20:03,033][02170] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3121.8). Total num frames: 577536. Throughput: 0: 874.8. Samples: 145492. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:20:03,039][02170] Avg episode reward: [(0, '4.365')]
[2024-11-19 22:20:08,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3125.9). Total num frames: 593920. Throughput: 0: 854.5. Samples: 147700. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:20:08,038][02170] Avg episode reward: [(0, '4.257')]
[2024-11-19 22:20:12,219][03259] Updated weights for policy 0, policy_version 150 (0.0026)
[2024-11-19 22:20:13,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3150.8). Total num frames: 614400. Throughput: 0: 878.7. Samples: 153842. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:20:13,038][02170] Avg episode reward: [(0, '4.274')]
[2024-11-19 22:20:18,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3153.9). Total num frames: 630784. Throughput: 0: 889.2. Samples: 158828. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:20:18,035][02170] Avg episode reward: [(0, '4.386')]
[2024-11-19 22:20:23,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3156.9). Total num frames: 647168. Throughput: 0: 857.5. Samples: 160608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:20:23,035][02170] Avg episode reward: [(0, '4.452')]
[2024-11-19 22:20:24,876][03259] Updated weights for policy 0, policy_version 160 (0.0040)
[2024-11-19 22:20:28,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3550.2, 300 sec: 3179.3). Total num frames: 667648. Throughput: 0: 850.0. Samples: 166358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:20:28,035][02170] Avg episode reward: [(0, '4.482')]
[2024-11-19 22:20:28,044][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000163_667648.pth...
[2024-11-19 22:20:33,033][02170] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3181.5). Total num frames: 684032. Throughput: 0: 891.6. Samples: 171964. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:20:33,038][02170] Avg episode reward: [(0, '4.497')]
[2024-11-19 22:20:37,061][03259] Updated weights for policy 0, policy_version 170 (0.0018)
[2024-11-19 22:20:38,036][02170] Fps is (10 sec: 2866.3, 60 sec: 3413.2, 300 sec: 3165.0). Total num frames: 696320. Throughput: 0: 862.1. Samples: 173786. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:20:38,042][02170] Avg episode reward: [(0, '4.553')]
[2024-11-19 22:20:43,033][02170] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3185.8). Total num frames: 716800. Throughput: 0: 828.6. Samples: 178610. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:20:43,039][02170] Avg episode reward: [(0, '4.479')]
[2024-11-19 22:20:48,000][03259] Updated weights for policy 0, policy_version 180 (0.0014)
[2024-11-19 22:20:48,033][02170] Fps is (10 sec: 4097.3, 60 sec: 3549.9, 300 sec: 3205.6). Total num frames: 737280. Throughput: 0: 870.5. Samples: 184664. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:20:48,036][02170] Avg episode reward: [(0, '4.499')]
[2024-11-19 22:20:53,039][02170] Fps is (10 sec: 3274.8, 60 sec: 3413.0, 300 sec: 3189.6). Total num frames: 749568. Throughput: 0: 873.6. Samples: 187018. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:20:53,041][02170] Avg episode reward: [(0, '4.511')]
[2024-11-19 22:20:58,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3191.5). Total num frames: 765952. Throughput: 0: 829.0. Samples: 191148. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:20:58,041][02170] Avg episode reward: [(0, '4.741')]
[2024-11-19 22:21:00,682][03259] Updated weights for policy 0, policy_version 190 (0.0020)
[2024-11-19 22:21:03,033][02170] Fps is (10 sec: 3688.6, 60 sec: 3481.6, 300 sec: 3209.9). Total num frames: 786432. Throughput: 0: 852.6. Samples: 197196. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:21:03,040][02170] Avg episode reward: [(0, '4.900')]
[2024-11-19 22:21:03,043][03246] Saving new best policy, reward=4.900!
[2024-11-19 22:21:08,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3211.3). Total num frames: 802816. Throughput: 0: 881.9. Samples: 200292. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:21:08,042][02170] Avg episode reward: [(0, '4.843')]
[2024-11-19 22:21:13,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3196.5). Total num frames: 815104. Throughput: 0: 836.8. Samples: 204016. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:21:13,034][02170] Avg episode reward: [(0, '4.727')]
[2024-11-19 22:21:13,622][03259] Updated weights for policy 0, policy_version 200 (0.0041)
[2024-11-19 22:21:18,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3213.8). Total num frames: 835584. Throughput: 0: 840.8. Samples: 209802. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:21:18,036][02170] Avg episode reward: [(0, '4.738')]
[2024-11-19 22:21:23,033][02170] Fps is (10 sec: 4095.9, 60 sec: 3481.6, 300 sec: 3230.4). Total num frames: 856064. Throughput: 0: 870.7. Samples: 212964. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:21:23,037][02170] Avg episode reward: [(0, '4.859')]
[2024-11-19 22:21:24,223][03259] Updated weights for policy 0, policy_version 210 (0.0015)
[2024-11-19 22:21:28,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3200.9). Total num frames: 864256. Throughput: 0: 849.0. Samples: 216814. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:21:28,035][02170] Avg episode reward: [(0, '5.013')]
[2024-11-19 22:21:28,053][03246] Saving new best policy, reward=5.013!
[2024-11-19 22:21:33,036][02170] Fps is (10 sec: 2047.4, 60 sec: 3208.4, 300 sec: 3187.4). Total num frames: 876544. Throughput: 0: 778.5. Samples: 219698. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:21:33,039][02170] Avg episode reward: [(0, '4.987')]
[2024-11-19 22:21:38,034][02170] Fps is (10 sec: 3276.4, 60 sec: 3345.2, 300 sec: 3203.6). Total num frames: 897024. Throughput: 0: 790.5. Samples: 222588. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:21:38,037][02170] Avg episode reward: [(0, '4.818')]
[2024-11-19 22:21:38,939][03259] Updated weights for policy 0, policy_version 220 (0.0042)
[2024-11-19 22:21:43,033][02170] Fps is (10 sec: 4097.4, 60 sec: 3345.1, 300 sec: 3219.3). Total num frames: 917504. Throughput: 0: 839.1. Samples: 228908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:21:43,039][02170] Avg episode reward: [(0, '4.755')]
[2024-11-19 22:21:48,033][02170] Fps is (10 sec: 3277.2, 60 sec: 3208.5, 300 sec: 3206.2). Total num frames: 929792. Throughput: 0: 803.6. Samples: 233358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:21:48,037][02170] Avg episode reward: [(0, '4.615')]
[2024-11-19 22:21:51,469][03259] Updated weights for policy 0, policy_version 230 (0.0038)
[2024-11-19 22:21:53,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3277.1, 300 sec: 3207.4). Total num frames: 946176. Throughput: 0: 781.9. Samples: 235478. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:21:53,036][02170] Avg episode reward: [(0, '4.688')]
[2024-11-19 22:21:58,034][02170] Fps is (10 sec: 3686.1, 60 sec: 3345.0, 300 sec: 3276.8). Total num frames: 966656. Throughput: 0: 838.1. Samples: 241730. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:21:58,036][02170] Avg episode reward: [(0, '4.791')]
[2024-11-19 22:22:02,083][03259] Updated weights for policy 0, policy_version 240 (0.0023)
[2024-11-19 22:22:03,036][02170] Fps is (10 sec: 3685.3, 60 sec: 3276.6, 300 sec: 3332.3). Total num frames: 983040. Throughput: 0: 822.3. Samples: 246810. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:22:03,038][02170] Avg episode reward: [(0, '4.947')]
[2024-11-19 22:22:08,033][02170] Fps is (10 sec: 2867.5, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 995328. Throughput: 0: 792.7. Samples: 248636. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:22:08,035][02170] Avg episode reward: [(0, '4.814')]
[2024-11-19 22:22:13,033][02170] Fps is (10 sec: 3277.8, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 1015808. Throughput: 0: 831.4. Samples: 254226. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:22:13,036][02170] Avg episode reward: [(0, '4.883')]
[2024-11-19 22:22:14,154][03259] Updated weights for policy 0, policy_version 250 (0.0023)
[2024-11-19 22:22:18,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3360.1). Total num frames: 1036288. Throughput: 0: 903.0. Samples: 260332. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:22:18,035][02170] Avg episode reward: [(0, '5.328')]
[2024-11-19 22:22:18,048][03246] Saving new best policy, reward=5.328!
[2024-11-19 22:22:23,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3208.6, 300 sec: 3360.1). Total num frames: 1048576. Throughput: 0: 877.8. Samples: 262090. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:22:23,035][02170] Avg episode reward: [(0, '5.276')]
[2024-11-19 22:22:26,762][03259] Updated weights for policy 0, policy_version 260 (0.0021)
[2024-11-19 22:22:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 1069056. Throughput: 0: 846.0. Samples: 266980. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:22:28,036][02170] Avg episode reward: [(0, '5.631')]
[2024-11-19 22:22:28,049][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000261_1069056.pth...
[2024-11-19 22:22:28,190][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000061_249856.pth
[2024-11-19 22:22:28,210][03246] Saving new best policy, reward=5.631!
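The checkpoint lines above show the rotation scheme: a periodic checkpoint named checkpoint_<policy_version>_<env_frames>.pth is written, the oldest checkpoint beyond the keep-limit is removed, and the best-reward policy is tracked separately. A hedged sketch of the save-and-rotate part (the keep-limit of 2 is inferred from the pattern in this log, not confirmed by the framework):

```python
import glob
import os

import torch

def save_and_rotate(model, ckpt_dir, policy_version, env_frames, keep=2):
    path = os.path.join(ckpt_dir, f"checkpoint_{policy_version:09d}_{env_frames}.pth")
    torch.save({"model": model.state_dict(), "policy_version": policy_version}, path)
    print(f"Saving {path}...")
    # zero-padded version numbers make lexicographic order match chronological order
    checkpoints = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))
    for old in checkpoints[:-keep]:
        print(f"Removing {old}")
        os.remove(old)
```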
[2024-11-19 22:22:33,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3550.1, 300 sec: 3429.5). Total num frames: 1089536. Throughput: 0: 882.1. Samples: 273054. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-11-19 22:22:33,041][02170] Avg episode reward: [(0, '5.901')]
[2024-11-19 22:22:33,044][03246] Saving new best policy, reward=5.901!
[2024-11-19 22:22:38,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3401.8). Total num frames: 1101824. Throughput: 0: 886.9. Samples: 275388. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:22:38,035][02170] Avg episode reward: [(0, '5.637')]
[2024-11-19 22:22:38,368][03259] Updated weights for policy 0, policy_version 270 (0.0021)
[2024-11-19 22:22:43,033][02170] Fps is (10 sec: 2867.1, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 1118208. Throughput: 0: 838.7. Samples: 279472. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:22:43,036][02170] Avg episode reward: [(0, '5.810')]
[2024-11-19 22:22:48,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1138688. Throughput: 0: 866.4. Samples: 285796. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:22:48,035][02170] Avg episode reward: [(0, '5.903')]
[2024-11-19 22:22:48,044][03246] Saving new best policy, reward=5.903!
[2024-11-19 22:22:49,324][03259] Updated weights for policy 0, policy_version 280 (0.0020)
[2024-11-19 22:22:53,033][02170] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3429.6). Total num frames: 1159168. Throughput: 0: 894.6. Samples: 288892. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:22:53,040][02170] Avg episode reward: [(0, '5.847')]
[2024-11-19 22:22:58,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.4, 300 sec: 3415.7). Total num frames: 1171456. Throughput: 0: 856.3. Samples: 292760. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:22:58,038][02170] Avg episode reward: [(0, '5.982')]
[2024-11-19 22:22:58,049][03246] Saving new best policy, reward=5.982!
[2024-11-19 22:23:01,798][03259] Updated weights for policy 0, policy_version 290 (0.0036)
[2024-11-19 22:23:03,035][02170] Fps is (10 sec: 3276.8, 60 sec: 3481.8, 300 sec: 3429.5). Total num frames: 1191936. Throughput: 0: 848.9. Samples: 298534. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:23:03,040][02170] Avg episode reward: [(0, '5.940')]
[2024-11-19 22:23:08,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 1212416. Throughput: 0: 875.8. Samples: 301502. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:23:08,036][02170] Avg episode reward: [(0, '5.964')]
[2024-11-19 22:23:13,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1224704. Throughput: 0: 870.4. Samples: 306148. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:23:13,037][02170] Avg episode reward: [(0, '6.172')]
[2024-11-19 22:23:13,043][03246] Saving new best policy, reward=6.172!
[2024-11-19 22:23:14,314][03259] Updated weights for policy 0, policy_version 300 (0.0024)
[2024-11-19 22:23:18,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 1241088. Throughput: 0: 846.7. Samples: 311156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:23:18,042][02170] Avg episode reward: [(0, '6.601')]
[2024-11-19 22:23:18,052][03246] Saving new best policy, reward=6.601!
[2024-11-19 22:23:23,033][02170] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1261568. Throughput: 0: 861.2. Samples: 314142. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:23:23,039][02170] Avg episode reward: [(0, '7.086')]
[2024-11-19 22:23:23,042][03246] Saving new best policy, reward=7.086!
[2024-11-19 22:23:24,467][03259] Updated weights for policy 0, policy_version 310 (0.0019)
[2024-11-19 22:23:28,033][02170] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1277952. Throughput: 0: 892.8. Samples: 319650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:23:28,037][02170] Avg episode reward: [(0, '7.009')]
[2024-11-19 22:23:33,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 1290240. Throughput: 0: 844.7. Samples: 323806. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:23:33,035][02170] Avg episode reward: [(0, '7.150')]
[2024-11-19 22:23:33,039][03246] Saving new best policy, reward=7.150!
[2024-11-19 22:23:37,302][03259] Updated weights for policy 0, policy_version 320 (0.0027)
[2024-11-19 22:23:38,033][02170] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1310720. Throughput: 0: 841.5. Samples: 326760. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:23:38,042][02170] Avg episode reward: [(0, '7.295')]
[2024-11-19 22:23:38,052][03246] Saving new best policy, reward=7.295!
[2024-11-19 22:23:43,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1331200. Throughput: 0: 892.3. Samples: 332912. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:23:43,035][02170] Avg episode reward: [(0, '8.127')]
[2024-11-19 22:23:43,037][03246] Saving new best policy, reward=8.127!
[2024-11-19 22:23:48,034][02170] Fps is (10 sec: 3276.4, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 1343488. Throughput: 0: 846.1. Samples: 336610. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:23:48,040][02170] Avg episode reward: [(0, '8.181')]
[2024-11-19 22:23:48,049][03246] Saving new best policy, reward=8.181!
[2024-11-19 22:23:50,031][03259] Updated weights for policy 0, policy_version 330 (0.0019)
[2024-11-19 22:23:53,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 1363968. Throughput: 0: 841.5. Samples: 339368. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:23:53,035][02170] Avg episode reward: [(0, '8.119')]
[2024-11-19 22:23:58,033][02170] Fps is (10 sec: 4096.5, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 1384448. Throughput: 0: 874.6. Samples: 345506. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:23:58,035][02170] Avg episode reward: [(0, '8.562')]
[2024-11-19 22:23:58,044][03246] Saving new best policy, reward=8.562!
[2024-11-19 22:24:01,163][03259] Updated weights for policy 0, policy_version 340 (0.0024)
[2024-11-19 22:24:03,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 1396736. Throughput: 0: 860.2. Samples: 349866. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:24:03,037][02170] Avg episode reward: [(0, '8.438')]
[2024-11-19 22:24:08,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 1413120. Throughput: 0: 835.9. Samples: 351756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:24:08,037][02170] Avg episode reward: [(0, '8.491')]
[2024-11-19 22:24:12,804][03259] Updated weights for policy 0, policy_version 350 (0.0018)
[2024-11-19 22:24:13,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 1433600. Throughput: 0: 849.0. Samples: 357856. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:24:13,035][02170] Avg episode reward: [(0, '8.203')]
[2024-11-19 22:24:18,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3415.7). Total num frames: 1449984. Throughput: 0: 871.2. Samples: 363008. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2024-11-19 22:24:18,039][02170] Avg episode reward: [(0, '8.074')]
[2024-11-19 22:24:23,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 1462272. Throughput: 0: 847.3. Samples: 364890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:24:23,036][02170] Avg episode reward: [(0, '7.856')]
[2024-11-19 22:24:25,878][03259] Updated weights for policy 0, policy_version 360 (0.0025)
[2024-11-19 22:24:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 1482752. Throughput: 0: 830.8. Samples: 370298. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:24:28,036][02170] Avg episode reward: [(0, '7.944')]
[2024-11-19 22:24:28,045][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000362_1482752.pth...
[2024-11-19 22:24:28,174][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000163_667648.pth
[2024-11-19 22:24:33,033][02170] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 1503232. Throughput: 0: 883.9. Samples: 376384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:24:33,035][02170] Avg episode reward: [(0, '8.504')]
[2024-11-19 22:24:38,016][03259] Updated weights for policy 0, policy_version 370 (0.0026)
[2024-11-19 22:24:38,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 1515520. Throughput: 0: 863.0. Samples: 378202. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:24:38,035][02170] Avg episode reward: [(0, '9.305')]
[2024-11-19 22:24:38,041][03246] Saving new best policy, reward=9.305!
[2024-11-19 22:24:43,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 1531904. Throughput: 0: 825.9. Samples: 382672. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:24:43,034][02170] Avg episode reward: [(0, '9.985')]
[2024-11-19 22:24:43,041][03246] Saving new best policy, reward=9.985!
[2024-11-19 22:24:48,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3415.7). Total num frames: 1552384. Throughput: 0: 865.6. Samples: 388818. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:24:48,037][02170] Avg episode reward: [(0, '10.101')]
[2024-11-19 22:24:48,050][03246] Saving new best policy, reward=10.101!
[2024-11-19 22:24:48,808][03259] Updated weights for policy 0, policy_version 380 (0.0025)
[2024-11-19 22:24:53,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 1568768. Throughput: 0: 883.0. Samples: 391492. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:24:53,035][02170] Avg episode reward: [(0, '10.074')]
[2024-11-19 22:24:58,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 1581056. Throughput: 0: 829.0. Samples: 395160. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:24:58,035][02170] Avg episode reward: [(0, '9.421')]
[2024-11-19 22:25:01,692][03259] Updated weights for policy 0, policy_version 390 (0.0028)
[2024-11-19 22:25:03,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 1601536. Throughput: 0: 850.1. Samples: 401264. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:25:03,039][02170] Avg episode reward: [(0, '9.790')]
[2024-11-19 22:25:08,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 1622016. Throughput: 0: 875.4. Samples: 404282. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:25:08,035][02170] Avg episode reward: [(0, '8.933')]
[2024-11-19 22:25:13,035][02170] Fps is (10 sec: 3276.1, 60 sec: 3344.9, 300 sec: 3401.7). Total num frames: 1634304. Throughput: 0: 846.2. Samples: 408378. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:25:13,037][02170] Avg episode reward: [(0, '8.620')]
[2024-11-19 22:25:14,989][03259] Updated weights for policy 0, policy_version 400 (0.0040)
[2024-11-19 22:25:18,033][02170] Fps is (10 sec: 2048.0, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 1642496. Throughput: 0: 785.8. Samples: 411744. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:25:18,039][02170] Avg episode reward: [(0, '8.110')]
[2024-11-19 22:25:23,033][02170] Fps is (10 sec: 2867.9, 60 sec: 3345.1, 300 sec: 3374.0). Total num frames: 1662976. Throughput: 0: 797.6. Samples: 414094. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:25:23,038][02170] Avg episode reward: [(0, '8.360')]
[2024-11-19 22:25:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3360.1). Total num frames: 1675264. Throughput: 0: 819.2. Samples: 419536. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:25:28,038][02170] Avg episode reward: [(0, '8.759')]
[2024-11-19 22:25:28,371][03259] Updated weights for policy 0, policy_version 410 (0.0023)
[2024-11-19 22:25:33,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3374.0). Total num frames: 1691648. Throughput: 0: 771.7. Samples: 423546. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:25:33,035][02170] Avg episode reward: [(0, '9.244')]
[2024-11-19 22:25:38,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 1712128. Throughput: 0: 779.3. Samples: 426560. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:25:38,035][02170] Avg episode reward: [(0, '11.127')]
[2024-11-19 22:25:38,048][03246] Saving new best policy, reward=11.127!
[2024-11-19 22:25:39,976][03259] Updated weights for policy 0, policy_version 420 (0.0018)
[2024-11-19 22:25:43,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3360.1). Total num frames: 1728512. Throughput: 0: 832.8. Samples: 432638. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:25:43,034][02170] Avg episode reward: [(0, '11.915')]
[2024-11-19 22:25:43,036][03246] Saving new best policy, reward=11.915!
[2024-11-19 22:25:48,034][02170] Fps is (10 sec: 2866.8, 60 sec: 3140.2, 300 sec: 3360.2). Total num frames: 1740800. Throughput: 0: 779.8. Samples: 436358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:25:48,038][02170] Avg episode reward: [(0, '11.719')]
[2024-11-19 22:25:52,646][03259] Updated weights for policy 0, policy_version 430 (0.0035)
[2024-11-19 22:25:53,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 1761280. Throughput: 0: 772.9. Samples: 439062. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:25:53,035][02170] Avg episode reward: [(0, '11.015')]
[2024-11-19 22:25:58,033][02170] Fps is (10 sec: 4096.5, 60 sec: 3345.1, 300 sec: 3374.0). Total num frames: 1781760. Throughput: 0: 822.0. Samples: 445364. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:25:58,035][02170] Avg episode reward: [(0, '10.338')]
[2024-11-19 22:26:03,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 1798144. Throughput: 0: 849.8. Samples: 449984. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:26:03,039][02170] Avg episode reward: [(0, '10.596')]
[2024-11-19 22:26:04,576][03259] Updated weights for policy 0, policy_version 440 (0.0024)
[2024-11-19 22:26:08,035][02170] Fps is (10 sec: 3275.9, 60 sec: 3208.4, 300 sec: 3387.8). Total num frames: 1814528. Throughput: 0: 840.9. Samples: 451936. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:26:08,038][02170] Avg episode reward: [(0, '11.579')]
[2024-11-19 22:26:13,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3387.9). Total num frames: 1835008. Throughput: 0: 858.0. Samples: 458146. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:26:13,040][02170] Avg episode reward: [(0, '12.049')]
[2024-11-19 22:26:13,046][03246] Saving new best policy, reward=12.049!
[2024-11-19 22:26:14,983][03259] Updated weights for policy 0, policy_version 450 (0.0023)
[2024-11-19 22:26:18,033][02170] Fps is (10 sec: 3687.4, 60 sec: 3481.6, 300 sec: 3374.0). Total num frames: 1851392. Throughput: 0: 886.4. Samples: 463434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:26:18,039][02170] Avg episode reward: [(0, '12.587')]
[2024-11-19 22:26:18,048][03246] Saving new best policy, reward=12.587!
[2024-11-19 22:26:23,033][02170] Fps is (10 sec: 2867.1, 60 sec: 3345.0, 300 sec: 3387.9). Total num frames: 1863680. Throughput: 0: 860.3. Samples: 465272. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:26:23,035][02170] Avg episode reward: [(0, '12.591')]
[2024-11-19 22:26:23,038][03246] Saving new best policy, reward=12.591!
[2024-11-19 22:26:27,614][03259] Updated weights for policy 0, policy_version 460 (0.0040)
[2024-11-19 22:26:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3415.7). Total num frames: 1884160. Throughput: 0: 844.3. Samples: 470632. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:26:28,039][02170] Avg episode reward: [(0, '12.666')]
[2024-11-19 22:26:28,049][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000460_1884160.pth...
[2024-11-19 22:26:28,184][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000261_1069056.pth
[2024-11-19 22:26:28,197][03246] Saving new best policy, reward=12.666!
[2024-11-19 22:26:33,033][02170] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3415.7). Total num frames: 1904640. Throughput: 0: 893.0. Samples: 476540. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:26:33,035][02170] Avg episode reward: [(0, '13.201')]
[2024-11-19 22:26:33,041][03246] Saving new best policy, reward=13.201!
[2024-11-19 22:26:38,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3374.0). Total num frames: 1912832. Throughput: 0: 873.2. Samples: 478358. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:26:38,035][02170] Avg episode reward: [(0, '13.502')]
[2024-11-19 22:26:38,099][03246] Saving new best policy, reward=13.502!
[2024-11-19 22:26:40,712][03259] Updated weights for policy 0, policy_version 470 (0.0017)
[2024-11-19 22:26:43,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 1933312. Throughput: 0: 834.7. Samples: 482924. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:26:43,039][02170] Avg episode reward: [(0, '14.101')]
[2024-11-19 22:26:43,043][03246] Saving new best policy, reward=14.101!
[2024-11-19 22:26:48,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 1953792. Throughput: 0: 870.9. Samples: 489174. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:26:48,035][02170] Avg episode reward: [(0, '15.646')]
[2024-11-19 22:26:48,043][03246] Saving new best policy, reward=15.646!
[2024-11-19 22:26:51,169][03259] Updated weights for policy 0, policy_version 480 (0.0018)
[2024-11-19 22:26:53,033][02170] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 1970176. Throughput: 0: 886.3. Samples: 491816. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:26:53,036][02170] Avg episode reward: [(0, '15.411')]
[2024-11-19 22:26:58,033][02170] Fps is (10 sec: 2867.1, 60 sec: 3345.0, 300 sec: 3387.9). Total num frames: 1982464. Throughput: 0: 837.5. Samples: 495832. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:26:58,037][02170] Avg episode reward: [(0, '14.814')]
[2024-11-19 22:27:02,883][03259] Updated weights for policy 0, policy_version 490 (0.0017)
[2024-11-19 22:27:03,033][02170] Fps is (10 sec: 3686.5, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2007040. Throughput: 0: 861.6. Samples: 502208. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:27:03,038][02170] Avg episode reward: [(0, '15.988')]
[2024-11-19 22:27:03,040][03246] Saving new best policy, reward=15.988!
[2024-11-19 22:27:08,035][02170] Fps is (10 sec: 4095.2, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 2023424. Throughput: 0: 889.6. Samples: 505308. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:27:08,037][02170] Avg episode reward: [(0, '15.891')]
[2024-11-19 22:27:13,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3387.9). Total num frames: 2035712. Throughput: 0: 857.7. Samples: 509228. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:27:13,036][02170] Avg episode reward: [(0, '16.183')]
[2024-11-19 22:27:13,043][03246] Saving new best policy, reward=16.183!
[2024-11-19 22:27:15,615][03259] Updated weights for policy 0, policy_version 500 (0.0028)
[2024-11-19 22:27:18,033][02170] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2056192. Throughput: 0: 852.0. Samples: 514878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:27:18,035][02170] Avg episode reward: [(0, '16.820')]
[2024-11-19 22:27:18,054][03246] Saving new best policy, reward=16.820!
[2024-11-19 22:27:23,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 2076672. Throughput: 0: 878.0. Samples: 517870. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:27:23,037][02170] Avg episode reward: [(0, '17.302')]
[2024-11-19 22:27:23,042][03246] Saving new best policy, reward=17.302!
[2024-11-19 22:27:27,204][03259] Updated weights for policy 0, policy_version 510 (0.0026)
[2024-11-19 22:27:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 2088960. Throughput: 0: 881.5. Samples: 522590. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:27:28,035][02170] Avg episode reward: [(0, '16.749')]
[2024-11-19 22:27:33,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 2105344. Throughput: 0: 855.6. Samples: 527678. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:27:33,035][02170] Avg episode reward: [(0, '16.160')]
[2024-11-19 22:27:37,894][03259] Updated weights for policy 0, policy_version 520 (0.0022)
[2024-11-19 22:27:38,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 2129920. Throughput: 0: 866.6. Samples: 530812. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:27:38,037][02170] Avg episode reward: [(0, '16.217')]
[2024-11-19 22:27:43,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3401.8). Total num frames: 2142208. Throughput: 0: 900.1. Samples: 536336. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:27:43,037][02170] Avg episode reward: [(0, '16.557')]
[2024-11-19 22:27:48,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 2158592. Throughput: 0: 852.8. Samples: 540582. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:27:48,035][02170] Avg episode reward: [(0, '15.725')]
[2024-11-19 22:27:50,298][03259] Updated weights for policy 0, policy_version 530 (0.0018)
[2024-11-19 22:27:53,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3415.6). Total num frames: 2179072. Throughput: 0: 853.2. Samples: 543700. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:27:53,035][02170] Avg episode reward: [(0, '17.473')]
[2024-11-19 22:27:53,039][03246] Saving new best policy, reward=17.473!
[2024-11-19 22:27:58,037][02170] Fps is (10 sec: 4094.3, 60 sec: 3617.9, 300 sec: 3415.6). Total num frames: 2199552. Throughput: 0: 905.4. Samples: 549976. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:27:58,039][02170] Avg episode reward: [(0, '18.279')]
[2024-11-19 22:27:58,061][03246] Saving new best policy, reward=18.279!
[2024-11-19 22:28:02,255][03259] Updated weights for policy 0, policy_version 540 (0.0028)
[2024-11-19 22:28:03,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 2211840. Throughput: 0: 862.5. Samples: 553692. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-11-19 22:28:03,035][02170] Avg episode reward: [(0, '18.545')]
[2024-11-19 22:28:03,038][03246] Saving new best policy, reward=18.545!
[2024-11-19 22:28:08,033][02170] Fps is (10 sec: 3278.2, 60 sec: 3481.7, 300 sec: 3415.6). Total num frames: 2232320. Throughput: 0: 858.5. Samples: 556504. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:28:08,035][02170] Avg episode reward: [(0, '18.395')]
[2024-11-19 22:28:12,652][03259] Updated weights for policy 0, policy_version 550 (0.0029)
[2024-11-19 22:28:13,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3429.5). Total num frames: 2252800. Throughput: 0: 892.3. Samples: 562742. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:28:13,038][02170] Avg episode reward: [(0, '19.650')]
[2024-11-19 22:28:13,040][03246] Saving new best policy, reward=19.650!
[2024-11-19 22:28:18,037][02170] Fps is (10 sec: 3275.4, 60 sec: 3481.4, 300 sec: 3401.7). Total num frames: 2265088. Throughput: 0: 877.0. Samples: 567146. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:28:18,040][02170] Avg episode reward: [(0, '20.400')]
[2024-11-19 22:28:18,056][03246] Saving new best policy, reward=20.400!
[2024-11-19 22:28:23,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 2281472. Throughput: 0: 852.7. Samples: 569182. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2024-11-19 22:28:23,038][02170] Avg episode reward: [(0, '19.420')]
[2024-11-19 22:28:25,401][03259] Updated weights for policy 0, policy_version 560 (0.0034)
[2024-11-19 22:28:28,033][02170] Fps is (10 sec: 3687.9, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2301952. Throughput: 0: 871.2. Samples: 575542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:28:28,040][02170] Avg episode reward: [(0, '19.761')]
[2024-11-19 22:28:28,049][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000562_2301952.pth...
[2024-11-19 22:28:28,201][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000362_1482752.pth
[2024-11-19 22:28:33,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 2318336. Throughput: 0: 892.2. Samples: 580732. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:28:33,035][02170] Avg episode reward: [(0, '21.611')]
[2024-11-19 22:28:33,041][03246] Saving new best policy, reward=21.611!
[2024-11-19 22:28:38,008][03259] Updated weights for policy 0, policy_version 570 (0.0015)
[2024-11-19 22:28:38,033][02170] Fps is (10 sec: 3276.9, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 2334720. Throughput: 0: 863.2. Samples: 582542. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:28:38,040][02170] Avg episode reward: [(0, '21.846')]
[2024-11-19 22:28:38,050][03246] Saving new best policy, reward=21.846!
[2024-11-19 22:28:43,035][02170] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3415.6). Total num frames: 2351104. Throughput: 0: 847.0. Samples: 588088. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:28:43,040][02170] Avg episode reward: [(0, '21.488')]
[2024-11-19 22:28:48,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3415.6). Total num frames: 2371584. Throughput: 0: 900.1. Samples: 594198. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:28:48,035][02170] Avg episode reward: [(0, '21.378')]
[2024-11-19 22:28:48,246][03259] Updated weights for policy 0, policy_version 580 (0.0014)
[2024-11-19 22:28:53,033][02170] Fps is (10 sec: 3277.5, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 2383872. Throughput: 0: 877.7. Samples: 596000. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:28:53,037][02170] Avg episode reward: [(0, '22.995')]
[2024-11-19 22:28:53,128][03246] Saving new best policy, reward=22.995!
[2024-11-19 22:28:58,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.6, 300 sec: 3415.6). Total num frames: 2404352. Throughput: 0: 844.2. Samples: 600732. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:28:58,042][02170] Avg episode reward: [(0, '21.532')]
[2024-11-19 22:29:00,722][03259] Updated weights for policy 0, policy_version 590 (0.0025)
[2024-11-19 22:29:03,033][02170] Fps is (10 sec: 4095.9, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2424832. Throughput: 0: 880.8. Samples: 606780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:29:03,035][02170] Avg episode reward: [(0, '19.112')]
[2024-11-19 22:29:08,034][02170] Fps is (10 sec: 2866.8, 60 sec: 3345.0, 300 sec: 3387.9). Total num frames: 2433024. Throughput: 0: 871.4. Samples: 608398. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:29:08,039][02170] Avg episode reward: [(0, '19.631')]
[2024-11-19 22:29:13,033][02170] Fps is (10 sec: 2048.0, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 2445312. Throughput: 0: 798.7. Samples: 611482. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:29:13,041][02170] Avg episode reward: [(0, '19.753')]
[2024-11-19 22:29:16,127][03259] Updated weights for policy 0, policy_version 600 (0.0027)
[2024-11-19 22:29:18,033][02170] Fps is (10 sec: 2867.6, 60 sec: 3277.0, 300 sec: 3387.9). Total num frames: 2461696. Throughput: 0: 800.8. Samples: 616766. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:29:18,036][02170] Avg episode reward: [(0, '19.188')]
[2024-11-19 22:29:23,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 2486272. Throughput: 0: 830.3. Samples: 619906. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:29:23,035][02170] Avg episode reward: [(0, '19.184')]
[2024-11-19 22:29:27,191][03259] Updated weights for policy 0, policy_version 610 (0.0020)
[2024-11-19 22:29:28,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3374.0). Total num frames: 2498560. Throughput: 0: 820.2. Samples: 624994. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:29:28,035][02170] Avg episode reward: [(0, '19.031')]
[2024-11-19 22:29:33,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 2514944. Throughput: 0: 786.8. Samples: 629602. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:29:33,040][02170] Avg episode reward: [(0, '18.411')]
[2024-11-19 22:29:38,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 2535424. Throughput: 0: 816.8. Samples: 632758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:29:38,039][02170] Avg episode reward: [(0, '17.497')]
[2024-11-19 22:29:38,430][03259] Updated weights for policy 0, policy_version 620 (0.0032)
[2024-11-19 22:29:43,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3345.2, 300 sec: 3387.9). Total num frames: 2551808. Throughput: 0: 839.3. Samples: 638502. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:29:43,038][02170] Avg episode reward: [(0, '17.260')]
[2024-11-19 22:29:48,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 2568192. Throughput: 0: 792.2. Samples: 642430. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:29:48,035][02170] Avg episode reward: [(0, '17.537')]
[2024-11-19 22:29:50,934][03259] Updated weights for policy 0, policy_version 630 (0.0025)
[2024-11-19 22:29:53,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2588672. Throughput: 0: 826.5. Samples: 645588. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:29:53,039][02170] Avg episode reward: [(0, '18.798')]
[2024-11-19 22:29:58,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 2609152. Throughput: 0: 898.4. Samples: 651908. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:29:58,039][02170] Avg episode reward: [(0, '18.846')]
[2024-11-19 22:30:02,448][03259] Updated weights for policy 0, policy_version 640 (0.0017)
[2024-11-19 22:30:03,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3387.9). Total num frames: 2621440. Throughput: 0: 874.7. Samples: 656128. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:03,040][02170] Avg episode reward: [(0, '19.518')]
[2024-11-19 22:30:08,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.4, 300 sec: 3401.8). Total num frames: 2637824. Throughput: 0: 861.4. Samples: 658670. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:08,038][02170] Avg episode reward: [(0, '19.697')]
[2024-11-19 22:30:13,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2658304. Throughput: 0: 882.2. Samples: 664694. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:13,035][02170] Avg episode reward: [(0, '19.891')]
[2024-11-19 22:30:13,254][03259] Updated weights for policy 0, policy_version 650 (0.0025)
[2024-11-19 22:30:18,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3429.5). Total num frames: 2674688. Throughput: 0: 885.8. Samples: 669462. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:18,036][02170] Avg episode reward: [(0, '18.630')]
[2024-11-19 22:30:23,033][02170] Fps is (10 sec: 2867.0, 60 sec: 3345.0, 300 sec: 3429.5). Total num frames: 2686976. Throughput: 0: 858.4. Samples: 671386. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:30:23,036][02170] Avg episode reward: [(0, '18.450')]
[2024-11-19 22:30:25,975][03259] Updated weights for policy 0, policy_version 660 (0.0026)
[2024-11-19 22:30:28,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2711552. Throughput: 0: 862.9. Samples: 677334. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:28,037][02170] Avg episode reward: [(0, '17.399')]
[2024-11-19 22:30:28,051][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000662_2711552.pth...
[2024-11-19 22:30:28,203][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000460_1884160.pth
[2024-11-19 22:30:33,033][02170] Fps is (10 sec: 4096.3, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2727936. Throughput: 0: 899.2. Samples: 682894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:33,040][02170] Avg episode reward: [(0, '17.082')]
[2024-11-19 22:30:38,033][02170] Fps is (10 sec: 2867.1, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2740224. Throughput: 0: 868.1. Samples: 684652. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:30:38,038][02170] Avg episode reward: [(0, '17.346')]
[2024-11-19 22:30:38,679][03259] Updated weights for policy 0, policy_version 670 (0.0017)
[2024-11-19 22:30:43,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 2760704. Throughput: 0: 843.2. Samples: 689852. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:43,035][02170] Avg episode reward: [(0, '17.041')]
[2024-11-19 22:30:48,033][02170] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2781184. Throughput: 0: 886.9. Samples: 696038. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:48,042][02170] Avg episode reward: [(0, '18.150')]
[2024-11-19 22:30:48,935][03259] Updated weights for policy 0, policy_version 680 (0.0017)
[2024-11-19 22:30:53,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 2793472. Throughput: 0: 875.4. Samples: 698064. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:53,039][02170] Avg episode reward: [(0, '18.045')]
[2024-11-19 22:30:58,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 2809856. Throughput: 0: 838.1. Samples: 702410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:30:58,038][02170] Avg episode reward: [(0, '19.586')]
[2024-11-19 22:31:01,224][03259] Updated weights for policy 0, policy_version 690 (0.0014)
[2024-11-19 22:31:03,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2830336. Throughput: 0: 868.2. Samples: 708530. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-11-19 22:31:03,041][02170] Avg episode reward: [(0, '20.032')]
[2024-11-19 22:31:08,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 2846720. Throughput: 0: 888.0. Samples: 711344. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:31:08,036][02170] Avg episode reward: [(0, '19.731')]
[2024-11-19 22:31:13,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2859008. Throughput: 0: 839.0. Samples: 715088. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:31:13,035][02170] Avg episode reward: [(0, '18.859')]
[2024-11-19 22:31:14,355][03259] Updated weights for policy 0, policy_version 700 (0.0030)
[2024-11-19 22:31:18,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 2879488. Throughput: 0: 843.8. Samples: 720866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:31:18,037][02170] Avg episode reward: [(0, '18.997')]
[2024-11-19 22:31:23,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 2899968. Throughput: 0: 872.8. Samples: 723926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:31:23,035][02170] Avg episode reward: [(0, '18.771')]
[2024-11-19 22:31:25,737][03259] Updated weights for policy 0, policy_version 710 (0.0040)
[2024-11-19 22:31:28,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 2912256. Throughput: 0: 854.7. Samples: 728314. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:31:28,038][02170] Avg episode reward: [(0, '17.840')]
[2024-11-19 22:31:33,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3457.3). Total num frames: 2932736. Throughput: 0: 835.2. Samples: 733620. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:31:33,035][02170] Avg episode reward: [(0, '17.981')]
[2024-11-19 22:31:36,710][03259] Updated weights for policy 0, policy_version 720 (0.0026)
[2024-11-19 22:31:38,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3457.3). Total num frames: 2953216. Throughput: 0: 859.7. Samples: 736750. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:31:38,037][02170] Avg episode reward: [(0, '19.980')]
[2024-11-19 22:31:43,033][02170] Fps is (10 sec: 3686.3, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 2969600. Throughput: 0: 883.3. Samples: 742160. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:31:43,036][02170] Avg episode reward: [(0, '19.275')]
[2024-11-19 22:31:48,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3429.5). Total num frames: 2981888. Throughput: 0: 845.8. Samples: 746590. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-11-19 22:31:48,034][02170] Avg episode reward: [(0, '18.424')]
[2024-11-19 22:31:49,174][03259] Updated weights for policy 0, policy_version 730 (0.0020)
[2024-11-19 22:31:53,033][02170] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 3002368. Throughput: 0: 852.3. Samples: 749696. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:31:53,035][02170] Avg episode reward: [(0, '18.377')]
[2024-11-19 22:31:58,035][02170] Fps is (10 sec: 4095.0, 60 sec: 3549.7, 300 sec: 3443.4). Total num frames: 3022848. Throughput: 0: 907.8. Samples: 755940. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:31:58,043][02170] Avg episode reward: [(0, '18.501')]
[2024-11-19 22:32:00,660][03259] Updated weights for policy 0, policy_version 740 (0.0021)
[2024-11-19 22:32:03,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.6). Total num frames: 3035136. Throughput: 0: 865.1. Samples: 759794. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:32:03,040][02170] Avg episode reward: [(0, '18.716')]
[2024-11-19 22:32:08,033][02170] Fps is (10 sec: 3277.6, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 3055616. Throughput: 0: 859.9. Samples: 762622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:32:08,037][02170] Avg episode reward: [(0, '18.255')]
[2024-11-19 22:32:11,455][03259] Updated weights for policy 0, policy_version 750 (0.0019)
[2024-11-19 22:32:13,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 3076096. Throughput: 0: 905.6. Samples: 769066. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:32:13,036][02170] Avg episode reward: [(0, '20.581')]
[2024-11-19 22:32:18,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 3088384. Throughput: 0: 884.0. Samples: 773398. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:32:18,040][02170] Avg episode reward: [(0, '21.484')]
[2024-11-19 22:32:23,033][02170] Fps is (10 sec: 3276.7, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 3108864. Throughput: 0: 863.8. Samples: 775622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:32:23,037][02170] Avg episode reward: [(0, '22.606')]
[2024-11-19 22:32:23,841][03259] Updated weights for policy 0, policy_version 760 (0.0018)
[2024-11-19 22:32:28,033][02170] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3471.2). Total num frames: 3129344. Throughput: 0: 883.6. Samples: 781924. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:32:28,040][02170] Avg episode reward: [(0, '22.810')]
[2024-11-19 22:32:28,050][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000764_3129344.pth...
[2024-11-19 22:32:28,184][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000562_2301952.pth
[2024-11-19 22:32:33,033][02170] Fps is (10 sec: 3276.9, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 3141632. Throughput: 0: 892.3. Samples: 786744. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:32:33,035][02170] Avg episode reward: [(0, '23.007')]
[2024-11-19 22:32:33,038][03246] Saving new best policy, reward=23.007!
[2024-11-19 22:32:36,731][03259] Updated weights for policy 0, policy_version 770 (0.0015)
[2024-11-19 22:32:38,033][02170] Fps is (10 sec: 2867.3, 60 sec: 3413.3, 300 sec: 3443.4). Total num frames: 3158016. Throughput: 0: 861.7. Samples: 788474. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:32:38,035][02170] Avg episode reward: [(0, '22.748')]
[2024-11-19 22:32:43,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3457.3). Total num frames: 3178496. Throughput: 0: 850.0. Samples: 794190. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:32:43,040][02170] Avg episode reward: [(0, '21.760')]
[2024-11-19 22:32:46,888][03259] Updated weights for policy 0, policy_version 780 (0.0029)
[2024-11-19 22:32:48,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3194880. Throughput: 0: 893.4. Samples: 799996. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:32:48,040][02170] Avg episode reward: [(0, '19.915')]
[2024-11-19 22:32:53,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3415.7). Total num frames: 3207168. Throughput: 0: 871.0. Samples: 801818. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:32:53,039][02170] Avg episode reward: [(0, '19.222')]
[2024-11-19 22:32:58,033][02170] Fps is (10 sec: 2457.6, 60 sec: 3276.9, 300 sec: 3415.6). Total num frames: 3219456. Throughput: 0: 809.0. Samples: 805472. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:32:58,035][02170] Avg episode reward: [(0, '19.967')]
[2024-11-19 22:33:02,110][03259] Updated weights for policy 0, policy_version 790 (0.0016)
[2024-11-19 22:33:03,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3235840. Throughput: 0: 817.7. Samples: 810194. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:33:03,039][02170] Avg episode reward: [(0, '21.020')]
[2024-11-19 22:33:08,034][02170] Fps is (10 sec: 3276.5, 60 sec: 3276.7, 300 sec: 3387.9). Total num frames: 3252224. Throughput: 0: 819.6. Samples: 812506. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:08,035][02170] Avg episode reward: [(0, '22.812')]
[2024-11-19 22:33:13,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 3268608. Throughput: 0: 773.9. Samples: 816748. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:33:13,035][02170] Avg episode reward: [(0, '23.159')]
[2024-11-19 22:33:13,040][03246] Saving new best policy, reward=23.159!
[2024-11-19 22:33:14,865][03259] Updated weights for policy 0, policy_version 800 (0.0021)
[2024-11-19 22:33:18,033][02170] Fps is (10 sec: 3686.8, 60 sec: 3345.1, 300 sec: 3415.6). Total num frames: 3289088. Throughput: 0: 803.6. Samples: 822904. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:18,040][02170] Avg episode reward: [(0, '22.743')]
[2024-11-19 22:33:23,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3305472. Throughput: 0: 832.9. Samples: 825954. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:33:23,038][02170] Avg episode reward: [(0, '24.615')]
[2024-11-19 22:33:23,040][03246] Saving new best policy, reward=24.615!
[2024-11-19 22:33:27,416][03259] Updated weights for policy 0, policy_version 810 (0.0020)
[2024-11-19 22:33:28,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3140.3, 300 sec: 3387.9). Total num frames: 3317760. Throughput: 0: 787.4. Samples: 829622. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:33:28,035][02170] Avg episode reward: [(0, '25.212')]
[2024-11-19 22:33:28,048][03246] Saving new best policy, reward=25.212!
[2024-11-19 22:33:33,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3338240. Throughput: 0: 788.8. Samples: 835490. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:33,035][02170] Avg episode reward: [(0, '22.663')]
[2024-11-19 22:33:37,481][03259] Updated weights for policy 0, policy_version 820 (0.0030)
[2024-11-19 22:33:38,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3345.1, 300 sec: 3415.7). Total num frames: 3358720. Throughput: 0: 816.0. Samples: 838540. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:38,039][02170] Avg episode reward: [(0, '21.825')]
[2024-11-19 22:33:43,034][02170] Fps is (10 sec: 3276.4, 60 sec: 3208.5, 300 sec: 3387.9). Total num frames: 3371008. Throughput: 0: 834.4. Samples: 843020. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:43,040][02170] Avg episode reward: [(0, '21.991')]
[2024-11-19 22:33:48,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3208.5, 300 sec: 3401.8). Total num frames: 3387392. Throughput: 0: 840.1. Samples: 847998. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:48,035][02170] Avg episode reward: [(0, '21.444')]
[2024-11-19 22:33:50,298][03259] Updated weights for policy 0, policy_version 830 (0.0017)
[2024-11-19 22:33:53,033][02170] Fps is (10 sec: 3686.9, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3407872. Throughput: 0: 856.6. Samples: 851050. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:53,040][02170] Avg episode reward: [(0, '20.288')]
[2024-11-19 22:33:58,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3387.9). Total num frames: 3424256. Throughput: 0: 880.0. Samples: 856350. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:33:58,035][02170] Avg episode reward: [(0, '21.166')]
[2024-11-19 22:34:02,908][03259] Updated weights for policy 0, policy_version 840 (0.0024)
[2024-11-19 22:34:03,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.7). Total num frames: 3440640. Throughput: 0: 838.4. Samples: 860634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:34:03,040][02170] Avg episode reward: [(0, '20.418')]
[2024-11-19 22:34:08,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.7, 300 sec: 3443.4). Total num frames: 3461120. Throughput: 0: 839.7. Samples: 863740. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:34:08,040][02170] Avg episode reward: [(0, '20.569')]
[2024-11-19 22:34:13,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 3477504. Throughput: 0: 896.1. Samples: 869946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:34:13,037][02170] Avg episode reward: [(0, '20.659')]
[2024-11-19 22:34:13,316][03259] Updated weights for policy 0, policy_version 850 (0.0026)
[2024-11-19 22:34:18,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3489792. Throughput: 0: 848.6. Samples: 873676. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2024-11-19 22:34:18,037][02170] Avg episode reward: [(0, '20.574')]
[2024-11-19 22:34:23,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 3510272. Throughput: 0: 844.8. Samples: 876554. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:34:23,037][02170] Avg episode reward: [(0, '20.099')]
[2024-11-19 22:34:25,259][03259] Updated weights for policy 0, policy_version 860 (0.0015)
[2024-11-19 22:34:28,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3530752. Throughput: 0: 885.7. Samples: 882874. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:34:28,037][02170] Avg episode reward: [(0, '19.916')]
[2024-11-19 22:34:28,045][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000862_3530752.pth...
[2024-11-19 22:34:28,170][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000662_2711552.pth
[2024-11-19 22:34:33,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 3547136. Throughput: 0: 872.7. Samples: 887270. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:34:33,040][02170] Avg episode reward: [(0, '19.448')]
[2024-11-19 22:34:37,886][03259] Updated weights for policy 0, policy_version 870 (0.0024)
[2024-11-19 22:34:38,033][02170] Fps is (10 sec: 3276.7, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 3563520. Throughput: 0: 849.0. Samples: 889254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:34:38,036][02170] Avg episode reward: [(0, '18.880')]
[2024-11-19 22:34:43,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3584000. Throughput: 0: 870.4. Samples: 895520. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:34:43,035][02170] Avg episode reward: [(0, '20.644')]
[2024-11-19 22:34:48,034][02170] Fps is (10 sec: 3685.9, 60 sec: 3549.8, 300 sec: 3429.5). Total num frames: 3600384. Throughput: 0: 893.0. Samples: 900820. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:34:48,037][02170] Avg episode reward: [(0, '20.360')]
[2024-11-19 22:34:49,159][03259] Updated weights for policy 0, policy_version 880 (0.0016)
[2024-11-19 22:34:53,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3401.8). Total num frames: 3612672. Throughput: 0: 865.1. Samples: 902668. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:34:53,038][02170] Avg episode reward: [(0, '21.733')]
[2024-11-19 22:34:58,033][02170] Fps is (10 sec: 3277.3, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 3633152. Throughput: 0: 855.3. Samples: 908434. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:34:58,040][02170] Avg episode reward: [(0, '23.358')]
[2024-11-19 22:35:00,140][03259] Updated weights for policy 0, policy_version 890 (0.0016)
[2024-11-19 22:35:03,036][02170] Fps is (10 sec: 4094.6, 60 sec: 3549.7, 300 sec: 3443.4). Total num frames: 3653632. Throughput: 0: 910.8. Samples: 914664. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:35:03,040][02170] Avg episode reward: [(0, '23.675')]
[2024-11-19 22:35:08,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 3665920. Throughput: 0: 888.2. Samples: 916522. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-11-19 22:35:08,040][02170] Avg episode reward: [(0, '24.322')]
[2024-11-19 22:35:12,600][03259] Updated weights for policy 0, policy_version 900 (0.0021)
[2024-11-19 22:35:13,033][02170] Fps is (10 sec: 3277.9, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 3686400. Throughput: 0: 858.2. Samples: 921492. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-11-19 22:35:13,037][02170] Avg episode reward: [(0, '23.775')]
[2024-11-19 22:35:18,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3457.3). Total num frames: 3706880. Throughput: 0: 894.8. Samples: 927538. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-11-19 22:35:18,036][02170] Avg episode reward: [(0, '24.576')]
[2024-11-19 22:35:23,035][02170] Fps is (10 sec: 3276.1, 60 sec: 3481.5, 300 sec: 3415.6). Total num frames: 3719168. Throughput: 0: 900.3. Samples: 929768. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:35:23,038][02170] Avg episode reward: [(0, '23.592')]
[2024-11-19 22:35:25,168][03259] Updated weights for policy 0, policy_version 910 (0.0025)
[2024-11-19 22:35:28,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 3735552. Throughput: 0: 847.2. Samples: 933646. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:35:28,035][02170] Avg episode reward: [(0, '23.395')]
[2024-11-19 22:35:33,033][02170] Fps is (10 sec: 3687.2, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 3756032. Throughput: 0: 864.7. Samples: 939728. Policy #0 lag: (min: 0.0, avg: 0.8, max: 2.0)
[2024-11-19 22:35:33,037][02170] Avg episode reward: [(0, '23.761')]
[2024-11-19 22:35:35,903][03259] Updated weights for policy 0, policy_version 920 (0.0030)
[2024-11-19 22:35:38,034][02170] Fps is (10 sec: 3685.9, 60 sec: 3481.5, 300 sec: 3429.5). Total num frames: 3772416. Throughput: 0: 891.2. Samples: 942772. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-11-19 22:35:38,042][02170] Avg episode reward: [(0, '22.816')]
[2024-11-19 22:35:43,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3784704. Throughput: 0: 851.6. Samples: 946756. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2024-11-19 22:35:43,035][02170] Avg episode reward: [(0, '23.232')]
[2024-11-19 22:35:48,033][02170] Fps is (10 sec: 3277.2, 60 sec: 3413.4, 300 sec: 3429.5). Total num frames: 3805184. Throughput: 0: 830.2. Samples: 952020. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:35:48,035][02170] Avg episode reward: [(0, '22.548')]
[2024-11-19 22:35:48,947][03259] Updated weights for policy 0, policy_version 930 (0.0027)
[2024-11-19 22:35:53,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3443.4). Total num frames: 3825664. Throughput: 0: 855.4. Samples: 955014. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:35:53,036][02170] Avg episode reward: [(0, '22.044')]
[2024-11-19 22:35:58,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 3837952. Throughput: 0: 850.8. Samples: 959780. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2024-11-19 22:35:58,039][02170] Avg episode reward: [(0, '22.035')]
[2024-11-19 22:36:01,799][03259] Updated weights for policy 0, policy_version 940 (0.0024)
[2024-11-19 22:36:03,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.3, 300 sec: 3415.6). Total num frames: 3854336. Throughput: 0: 815.6. Samples: 964238. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:36:03,035][02170] Avg episode reward: [(0, '20.522')]
[2024-11-19 22:36:08,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 3874816. Throughput: 0: 833.9. Samples: 967290. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:36:08,039][02170] Avg episode reward: [(0, '20.029')]
[2024-11-19 22:36:12,623][03259] Updated weights for policy 0, policy_version 950 (0.0013)
[2024-11-19 22:36:13,033][02170] Fps is (10 sec: 3686.4, 60 sec: 3413.3, 300 sec: 3429.5). Total num frames: 3891200. Throughput: 0: 871.0. Samples: 972840. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0)
[2024-11-19 22:36:13,040][02170] Avg episode reward: [(0, '19.292')]
[2024-11-19 22:36:18,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3276.8, 300 sec: 3401.8). Total num frames: 3903488. Throughput: 0: 816.0. Samples: 976450. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2024-11-19 22:36:18,037][02170] Avg episode reward: [(0, '19.572')]
[2024-11-19 22:36:23,033][02170] Fps is (10 sec: 3276.8, 60 sec: 3413.5, 300 sec: 3429.5). Total num frames: 3923968. Throughput: 0: 817.6. Samples: 979564. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2024-11-19 22:36:23,036][02170] Avg episode reward: [(0, '20.129')]
[2024-11-19 22:36:24,902][03259] Updated weights for policy 0, policy_version 960 (0.0018)
[2024-11-19 22:36:28,033][02170] Fps is (10 sec: 4096.0, 60 sec: 3481.6, 300 sec: 3429.5). Total num frames: 3944448. Throughput: 0: 862.4. Samples: 985566. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:36:28,037][02170] Avg episode reward: [(0, '20.413')]
[2024-11-19 22:36:28,059][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000963_3944448.pth...
[2024-11-19 22:36:28,250][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000764_3129344.pth
[2024-11-19 22:36:33,034][02170] Fps is (10 sec: 2866.9, 60 sec: 3276.7, 300 sec: 3387.9). Total num frames: 3952640. Throughput: 0: 834.5. Samples: 989574. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2024-11-19 22:36:33,049][02170] Avg episode reward: [(0, '21.750')]
[2024-11-19 22:36:37,891][03259] Updated weights for policy 0, policy_version 970 (0.0031)
[2024-11-19 22:36:38,033][02170] Fps is (10 sec: 2867.2, 60 sec: 3345.1, 300 sec: 3401.8). Total num frames: 3973120. Throughput: 0: 818.7. Samples: 991854. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2024-11-19 22:36:38,041][02170] Avg episode reward: [(0, '22.083')]
[2024-11-19 22:36:43,033][02170] Fps is (10 sec: 3686.8, 60 sec: 3413.3, 300 sec: 3415.6). Total num frames: 3989504. Throughput: 0: 838.9. Samples: 997530. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:36:43,038][02170] Avg episode reward: [(0, '22.151')]
[2024-11-19 22:36:48,034][02170] Fps is (10 sec: 2457.3, 60 sec: 3208.5, 300 sec: 3374.0). Total num frames: 3997696. Throughput: 0: 814.5. Samples: 1000890. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2024-11-19 22:36:48,041][02170] Avg episode reward: [(0, '22.389')]
[2024-11-19 22:36:50,177][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-11-19 22:36:50,178][03246] Stopping Batcher_0...
[2024-11-19 22:36:50,192][03246] Loop batcher_evt_loop terminating...
[2024-11-19 22:36:50,187][02170] Component Batcher_0 stopped!
[2024-11-19 22:36:50,296][03259] Weights refcount: 2 0
[2024-11-19 22:36:50,305][03259] Stopping InferenceWorker_p0-w0...
[2024-11-19 22:36:50,305][03259] Loop inference_proc0-0_evt_loop terminating...
[2024-11-19 22:36:50,320][02170] Component InferenceWorker_p0-w0 stopped!
[2024-11-19 22:36:50,383][03246] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000862_3530752.pth
[2024-11-19 22:36:50,412][03246] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-11-19 22:36:50,704][03246] Stopping LearnerWorker_p0...
[2024-11-19 22:36:50,704][03246] Loop learner_proc0_evt_loop terminating...
[2024-11-19 22:36:50,706][02170] Component LearnerWorker_p0 stopped!
[2024-11-19 22:36:50,747][02170] Component RolloutWorker_w0 stopped!
[2024-11-19 22:36:50,750][03260] Stopping RolloutWorker_w0...
[2024-11-19 22:36:50,757][03260] Loop rollout_proc0_evt_loop terminating...
[2024-11-19 22:36:50,802][02170] Component RolloutWorker_w4 stopped!
[2024-11-19 22:36:50,809][03264] Stopping RolloutWorker_w4...
[2024-11-19 22:36:50,810][03264] Loop rollout_proc4_evt_loop terminating...
[2024-11-19 22:36:50,823][02170] Component RolloutWorker_w6 stopped!
[2024-11-19 22:36:50,832][03267] Stopping RolloutWorker_w6...
[2024-11-19 22:36:50,832][03267] Loop rollout_proc6_evt_loop terminating...
[2024-11-19 22:36:50,883][02170] Component RolloutWorker_w2 stopped!
[2024-11-19 22:36:50,890][03262] Stopping RolloutWorker_w2...
[2024-11-19 22:36:50,894][03262] Loop rollout_proc2_evt_loop terminating...
[2024-11-19 22:36:50,967][02170] Component RolloutWorker_w5 stopped!
[2024-11-19 22:36:50,967][03265] Stopping RolloutWorker_w5...
[2024-11-19 22:36:50,975][03265] Loop rollout_proc5_evt_loop terminating...
[2024-11-19 22:36:50,980][03266] Stopping RolloutWorker_w7...
[2024-11-19 22:36:50,980][02170] Component RolloutWorker_w7 stopped!
[2024-11-19 22:36:50,980][03266] Loop rollout_proc7_evt_loop terminating...
[2024-11-19 22:36:51,005][03261] Stopping RolloutWorker_w1...
[2024-11-19 22:36:51,005][02170] Component RolloutWorker_w1 stopped!
[2024-11-19 22:36:51,006][03261] Loop rollout_proc1_evt_loop terminating...
[2024-11-19 22:36:51,058][03263] Stopping RolloutWorker_w3...
[2024-11-19 22:36:51,058][03263] Loop rollout_proc3_evt_loop terminating...
[2024-11-19 22:36:51,057][02170] Component RolloutWorker_w3 stopped!
[2024-11-19 22:36:51,066][02170] Waiting for process learner_proc0 to stop...
[2024-11-19 22:36:52,922][02170] Waiting for process inference_proc0-0 to join...
[2024-11-19 22:36:52,927][02170] Waiting for process rollout_proc0 to join...
[2024-11-19 22:36:55,159][02170] Waiting for process rollout_proc1 to join...
[2024-11-19 22:36:55,163][02170] Waiting for process rollout_proc2 to join...
[2024-11-19 22:36:55,166][02170] Waiting for process rollout_proc3 to join...
[2024-11-19 22:36:55,170][02170] Waiting for process rollout_proc4 to join...
[2024-11-19 22:36:55,174][02170] Waiting for process rollout_proc5 to join...
[2024-11-19 22:36:55,177][02170] Waiting for process rollout_proc6 to join...
[2024-11-19 22:36:55,181][02170] Waiting for process rollout_proc7 to join...
[2024-11-19 22:36:55,184][02170] Batcher 0 profile tree view:
batching: 27.3470, releasing_batches: 0.0377
[2024-11-19 22:36:55,187][02170] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 453.8767
update_model: 10.3135
weight_update: 0.0040
one_step: 0.0209
handle_policy_step: 676.7872
deserialize: 17.6384, stack: 3.7512, obs_to_device_normalize: 139.5276, forward: 348.0653, send_messages: 33.6286
prepare_outputs: 99.5771
to_cpu: 57.1343
[2024-11-19 22:36:55,190][02170] Learner 0 profile tree view:
misc: 0.0059, prepare_batch: 13.9512
train: 76.6419
epoch_init: 0.0065, minibatch_init: 0.0069, losses_postprocess: 0.6339, kl_divergence: 0.7658, after_optimizer: 33.9420
calculate_losses: 28.1026
losses_init: 0.0122, forward_head: 1.5081, bptt_initial: 18.7807, tail: 1.3050, advantages_returns: 0.3126, losses: 3.7130
bptt: 2.1595
bptt_forward_core: 2.0617
update: 12.4456
clip: 0.9799
[2024-11-19 22:36:55,192][02170] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 0.3050, enqueue_policy_requests: 123.8314, env_step: 923.5497, overhead: 19.2210, complete_rollouts: 8.1184
save_policy_outputs: 24.6937
split_output_tensors: 9.9086
[2024-11-19 22:36:55,195][02170] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.3456, enqueue_policy_requests: 124.6163, env_step: 916.5866, overhead: 18.7237, complete_rollouts: 7.1781
save_policy_outputs: 25.6469
split_output_tensors: 10.3271
[2024-11-19 22:36:55,197][02170] Loop Runner_EvtLoop terminating...
[2024-11-19 22:36:55,200][02170] Runner profile tree view:
main_loop: 1221.1050
[2024-11-19 22:36:55,201][02170] Collected {0: 4005888}, FPS: 3280.5
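The two summary lines above are consistent with each other: the reported 3280.5 FPS is simply the 4,005,888 collected frames divided by the runner's 1221.1050-second main loop. A quick check:

```python
# Sanity check of the final throughput figure reported above.
total_frames = 4_005_888        # "Collected {0: 4005888}"
main_loop_seconds = 1221.1050   # "main_loop: 1221.1050" from the Runner profile
print(round(total_frames / main_loop_seconds, 1))  # -> 3280.5
```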
[2024-11-19 22:37:06,824][02170] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-11-19 22:37:06,825][02170] Overriding arg 'num_workers' with value 1 passed from command line
[2024-11-19 22:37:06,827][02170] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-11-19 22:37:06,829][02170] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-11-19 22:37:06,831][02170] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-11-19 22:37:06,833][02170] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-11-19 22:37:06,834][02170] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2024-11-19 22:37:06,836][02170] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-11-19 22:37:06,837][02170] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2024-11-19 22:37:06,838][02170] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2024-11-19 22:37:06,839][02170] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-11-19 22:37:06,840][02170] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-11-19 22:37:06,841][02170] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-11-19 22:37:06,842][02170] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-11-19 22:37:06,844][02170] Using frameskip 1 and render_action_repeat=4 for evaluation
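The block above is the evaluation ("enjoy") pass: the saved training config is reloaded and overridden for a single worker, no rendering, video saving, and a 10-episode cap. A hedged sketch of the call that typically produces these overrides, where `parse_vizdoom_cfg` is a hypothetical config helper (e.g. one defined in the surrounding notebook) and not part of sample_factory itself; the env name and exact flag spellings are assumptions inferred from this log:

```python
# Hedged sketch of the evaluation run logged above. The flags mirror the
# "Adding new argument ..." lines; parse_vizdoom_cfg is a hypothetical helper.
from sample_factory.enjoy import enjoy  # assumed import path for Sample Factory 2.x

cfg = parse_vizdoom_cfg(  # hypothetical: builds an SF config for the VizDoom env
    argv=[
        "--env=doom_health_gathering_supreme",  # assumed from the repo name below
        "--num_workers=1",
        "--no_render",
        "--save_video",
        "--max_num_episodes=10",
    ],
    evaluation=True,
)
status = enjoy(cfg)  # rolls out episodes with the latest checkpoint and writes replay.mp4
```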
[2024-11-19 22:37:06,880][02170] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-11-19 22:37:06,885][02170] RunningMeanStd input shape: (3, 72, 128)
[2024-11-19 22:37:06,888][02170] RunningMeanStd input shape: (1,)
[2024-11-19 22:37:06,904][02170] ConvEncoder: input_channels=3
[2024-11-19 22:37:07,012][02170] Conv encoder output size: 512
[2024-11-19 22:37:07,013][02170] Policy head output size: 512
[2024-11-19 22:37:07,197][02170] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-11-19 22:37:08,039][02170] Num frames 100...
[2024-11-19 22:37:08,162][02170] Num frames 200...
[2024-11-19 22:37:08,286][02170] Num frames 300...
[2024-11-19 22:37:08,406][02170] Num frames 400...
[2024-11-19 22:37:08,537][02170] Num frames 500...
[2024-11-19 22:37:08,672][02170] Num frames 600...
[2024-11-19 22:37:08,789][02170] Num frames 700...
[2024-11-19 22:37:08,913][02170] Num frames 800...
[2024-11-19 22:37:09,038][02170] Num frames 900...
[2024-11-19 22:37:09,122][02170] Avg episode rewards: #0: 21.170, true rewards: #0: 9.170
[2024-11-19 22:37:09,124][02170] Avg episode reward: 21.170, avg true_objective: 9.170
[2024-11-19 22:37:09,229][02170] Num frames 1000...
[2024-11-19 22:37:09,355][02170] Num frames 1100...
[2024-11-19 22:37:09,481][02170] Num frames 1200...
[2024-11-19 22:37:09,611][02170] Num frames 1300...
[2024-11-19 22:37:09,746][02170] Num frames 1400...
[2024-11-19 22:37:09,868][02170] Num frames 1500...
[2024-11-19 22:37:09,995][02170] Num frames 1600...
[2024-11-19 22:37:10,155][02170] Avg episode rewards: #0: 18.925, true rewards: #0: 8.425
[2024-11-19 22:37:10,156][02170] Avg episode reward: 18.925, avg true_objective: 8.425
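The "Avg episode rewards" lines appear to be cumulative means over the episodes completed so far, not per-episode values: 21.170 after episode 1 and 18.925 after episode 2 implies the second episode scored roughly 16.68. A minimal sketch of that bookkeeping, under that assumption:

```python
# Running (cumulative) mean of per-episode rewards, as the eval log appears to report.
rewards: list[float] = []

def report(episode_reward: float) -> float:
    rewards.append(episode_reward)
    return sum(rewards) / len(rewards)

print(report(21.170))  # 21.170 after episode 1
print(report(16.680))  # ~18.925 after episode 2 (implied second-episode reward)
```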
[2024-11-19 22:37:10,179][02170] Num frames 1700...
[2024-11-19 22:37:10,314][02170] Num frames 1800...
[2024-11-19 22:37:10,438][02170] Num frames 1900...
[2024-11-19 22:37:10,562][02170] Num frames 2000...
[2024-11-19 22:37:10,707][02170] Num frames 2100...
[2024-11-19 22:37:10,842][02170] Num frames 2200...
[2024-11-19 22:37:10,965][02170] Num frames 2300...
[2024-11-19 22:37:11,089][02170] Num frames 2400...
[2024-11-19 22:37:11,224][02170] Num frames 2500...
[2024-11-19 22:37:11,347][02170] Num frames 2600...
[2024-11-19 22:37:11,513][02170] Avg episode rewards: #0: 19.924, true rewards: #0: 8.923
[2024-11-19 22:37:11,515][02170] Avg episode reward: 19.924, avg true_objective: 8.923
[2024-11-19 22:37:11,550][02170] Num frames 2700...
[2024-11-19 22:37:11,692][02170] Num frames 2800...
[2024-11-19 22:37:11,815][02170] Num frames 2900...
[2024-11-19 22:37:11,943][02170] Num frames 3000...
[2024-11-19 22:37:12,072][02170] Num frames 3100...
[2024-11-19 22:37:12,199][02170] Num frames 3200...
[2024-11-19 22:37:12,332][02170] Num frames 3300...
[2024-11-19 22:37:12,460][02170] Num frames 3400...
[2024-11-19 22:37:12,582][02170] Num frames 3500...
[2024-11-19 22:37:12,714][02170] Num frames 3600...
[2024-11-19 22:37:12,846][02170] Num frames 3700...
[2024-11-19 22:37:13,023][02170] Avg episode rewards: #0: 20.978, true rewards: #0: 9.477
[2024-11-19 22:37:13,025][02170] Avg episode reward: 20.978, avg true_objective: 9.477
[2024-11-19 22:37:13,042][02170] Num frames 3800...
[2024-11-19 22:37:13,162][02170] Num frames 3900...
[2024-11-19 22:37:13,302][02170] Num frames 4000...
[2024-11-19 22:37:13,429][02170] Num frames 4100...
[2024-11-19 22:37:13,553][02170] Num frames 4200...
[2024-11-19 22:37:13,683][02170] Num frames 4300...
[2024-11-19 22:37:13,814][02170] Num frames 4400...
[2024-11-19 22:37:13,961][02170] Num frames 4500...
[2024-11-19 22:37:14,148][02170] Num frames 4600...
[2024-11-19 22:37:14,329][02170] Num frames 4700...
[2024-11-19 22:37:14,420][02170] Avg episode rewards: #0: 20.838, true rewards: #0: 9.438
[2024-11-19 22:37:14,423][02170] Avg episode reward: 20.838, avg true_objective: 9.438
[2024-11-19 22:37:14,569][02170] Num frames 4800...
[2024-11-19 22:37:14,748][02170] Num frames 4900...
[2024-11-19 22:37:14,913][02170] Num frames 5000...
[2024-11-19 22:37:15,083][02170] Num frames 5100...
[2024-11-19 22:37:15,258][02170] Num frames 5200...
[2024-11-19 22:37:15,435][02170] Num frames 5300...
[2024-11-19 22:37:15,614][02170] Num frames 5400...
[2024-11-19 22:37:15,791][02170] Num frames 5500...
[2024-11-19 22:37:15,989][02170] Num frames 5600...
[2024-11-19 22:37:16,171][02170] Num frames 5700...
[2024-11-19 22:37:16,336][02170] Num frames 5800...
[2024-11-19 22:37:16,463][02170] Num frames 5900...
[2024-11-19 22:37:16,594][02170] Num frames 6000...
[2024-11-19 22:37:16,720][02170] Num frames 6100...
[2024-11-19 22:37:16,845][02170] Num frames 6200...
[2024-11-19 22:37:16,982][02170] Num frames 6300...
[2024-11-19 22:37:17,106][02170] Num frames 6400...
[2024-11-19 22:37:17,238][02170] Num frames 6500...
[2024-11-19 22:37:17,366][02170] Num frames 6600...
[2024-11-19 22:37:17,487][02170] Num frames 6700...
[2024-11-19 22:37:17,616][02170] Num frames 6800...
[2024-11-19 22:37:17,696][02170] Avg episode rewards: #0: 27.698, true rewards: #0: 11.365
[2024-11-19 22:37:17,698][02170] Avg episode reward: 27.698, avg true_objective: 11.365
[2024-11-19 22:37:17,805][02170] Num frames 6900...
[2024-11-19 22:37:17,935][02170] Num frames 7000...
[2024-11-19 22:37:18,089][02170] Num frames 7100...
[2024-11-19 22:37:18,231][02170] Num frames 7200...
[2024-11-19 22:37:18,354][02170] Num frames 7300...
[2024-11-19 22:37:18,485][02170] Num frames 7400...
[2024-11-19 22:37:18,613][02170] Num frames 7500...
[2024-11-19 22:37:18,742][02170] Num frames 7600...
[2024-11-19 22:37:18,865][02170] Num frames 7700...
[2024-11-19 22:37:19,004][02170] Num frames 7800...
[2024-11-19 22:37:19,128][02170] Num frames 7900...
[2024-11-19 22:37:19,261][02170] Avg episode rewards: #0: 27.081, true rewards: #0: 11.367
[2024-11-19 22:37:19,264][02170] Avg episode reward: 27.081, avg true_objective: 11.367
[2024-11-19 22:37:19,320][02170] Num frames 8000...
[2024-11-19 22:37:19,447][02170] Num frames 8100...
[2024-11-19 22:37:19,571][02170] Num frames 8200...
[2024-11-19 22:37:19,691][02170] Num frames 8300...
[2024-11-19 22:37:19,812][02170] Num frames 8400...
[2024-11-19 22:37:19,934][02170] Num frames 8500...
[2024-11-19 22:37:20,064][02170] Num frames 8600...
[2024-11-19 22:37:20,155][02170] Avg episode rewards: #0: 25.536, true rewards: #0: 10.786
[2024-11-19 22:37:20,157][02170] Avg episode reward: 25.536, avg true_objective: 10.786
[2024-11-19 22:37:20,260][02170] Num frames 8700...
[2024-11-19 22:37:20,381][02170] Num frames 8800...
[2024-11-19 22:37:20,507][02170] Num frames 8900...
[2024-11-19 22:37:20,634][02170] Num frames 9000...
[2024-11-19 22:37:20,767][02170] Num frames 9100...
[2024-11-19 22:37:20,912][02170] Avg episode rewards: #0: 23.859, true rewards: #0: 10.192
[2024-11-19 22:37:20,913][02170] Avg episode reward: 23.859, avg true_objective: 10.192
[2024-11-19 22:37:20,954][02170] Num frames 9200...
[2024-11-19 22:37:21,096][02170] Num frames 9300...
[2024-11-19 22:37:21,237][02170] Num frames 9400...
[2024-11-19 22:37:21,364][02170] Num frames 9500...
[2024-11-19 22:37:21,496][02170] Num frames 9600...
[2024-11-19 22:37:21,624][02170] Num frames 9700...
[2024-11-19 22:37:21,748][02170] Num frames 9800...
[2024-11-19 22:37:21,883][02170] Num frames 9900...
[2024-11-19 22:37:22,025][02170] Num frames 10000...
[2024-11-19 22:37:22,088][02170] Avg episode rewards: #0: 22.905, true rewards: #0: 10.005
[2024-11-19 22:37:22,090][02170] Avg episode reward: 22.905, avg true_objective: 10.005
[2024-11-19 22:38:27,409][02170] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2024-11-19 22:44:23,944][02170] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2024-11-19 22:44:23,951][02170] Overriding arg 'num_workers' with value 1 passed from command line
[2024-11-19 22:44:23,952][02170] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-11-19 22:44:23,954][02170] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-11-19 22:44:23,956][02170] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-11-19 22:44:23,957][02170] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-11-19 22:44:23,959][02170] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2024-11-19 22:44:23,960][02170] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2024-11-19 22:44:23,961][02170] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2024-11-19 22:44:23,963][02170] Adding new argument 'hf_repository'='Vagnus/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2024-11-19 22:44:23,963][02170] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2024-11-19 22:44:23,964][02170] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2024-11-19 22:44:23,966][02170] Adding new argument 'train_script'=None that is not in the saved config file!
[2024-11-19 22:44:23,969][02170] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2024-11-19 22:44:23,971][02170] Using frameskip 1 and render_action_repeat=4 for evaluation
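This second pass is identical to the earlier evaluation except that the hub-upload arguments are set ('push_to_hub'=True with target repository 'Vagnus/rl_course_vizdoom_health_gathering_supreme') and max_num_frames is capped at 100000. A hedged sketch of the corresponding call, again using the hypothetical `parse_vizdoom_cfg` helper:

```python
# Hedged sketch of the push-to-hub evaluation logged above; only the extra
# hub-related flags differ from the previous evaluation sketch.
from sample_factory.enjoy import enjoy  # assumed import path for Sample Factory 2.x

cfg = parse_vizdoom_cfg(  # hypothetical helper, as before
    argv=[
        "--env=doom_health_gathering_supreme",  # assumed from the repository name
        "--num_workers=1",
        "--no_render",
        "--save_video",
        "--max_num_frames=100000",
        "--max_num_episodes=10",
        "--push_to_hub",
        "--hf_repository=Vagnus/rl_course_vizdoom_health_gathering_supreme",
    ],
    evaluation=True,
)
status = enjoy(cfg)  # evaluates, saves replay.mp4, then uploads the model and video to the Hub
```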
[2024-11-19 22:44:24,028][02170] RunningMeanStd input shape: (3, 72, 128)
[2024-11-19 22:44:24,030][02170] RunningMeanStd input shape: (1,)
[2024-11-19 22:44:24,051][02170] ConvEncoder: input_channels=3
[2024-11-19 22:44:24,119][02170] Conv encoder output size: 512
[2024-11-19 22:44:24,123][02170] Policy head output size: 512
[2024-11-19 22:44:24,154][02170] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2024-11-19 22:44:24,819][02170] Num frames 100...
[2024-11-19 22:44:24,940][02170] Num frames 200...
[2024-11-19 22:44:25,070][02170] Num frames 300...
[2024-11-19 22:44:25,195][02170] Num frames 400...
[2024-11-19 22:44:25,330][02170] Num frames 500...
[2024-11-19 22:44:25,466][02170] Num frames 600...
[2024-11-19 22:44:25,590][02170] Num frames 700...
[2024-11-19 22:44:25,722][02170] Num frames 800...
[2024-11-19 22:44:25,865][02170] Num frames 900...
[2024-11-19 22:44:26,044][02170] Num frames 1000...
[2024-11-19 22:44:26,221][02170] Num frames 1100...
[2024-11-19 22:44:26,422][02170] Num frames 1200...
[2024-11-19 22:44:26,594][02170] Num frames 1300...
[2024-11-19 22:44:26,765][02170] Num frames 1400...
[2024-11-19 22:44:26,935][02170] Num frames 1500...
[2024-11-19 22:44:27,105][02170] Num frames 1600...
[2024-11-19 22:44:27,287][02170] Avg episode rewards: #0: 37.640, true rewards: #0: 16.640
[2024-11-19 22:44:27,289][02170] Avg episode reward: 37.640, avg true_objective: 16.640
[2024-11-19 22:44:27,357][02170] Num frames 1700...
[2024-11-19 22:44:27,550][02170] Num frames 1800...
[2024-11-19 22:44:27,734][02170] Num frames 1900...
[2024-11-19 22:44:27,918][02170] Num frames 2000...
[2024-11-19 22:44:28,107][02170] Num frames 2100...
[2024-11-19 22:44:28,316][02170] Num frames 2200...
[2024-11-19 22:44:28,513][02170] Num frames 2300...
[2024-11-19 22:44:28,662][02170] Num frames 2400...
[2024-11-19 22:44:28,843][02170] Avg episode rewards: #0: 28.980, true rewards: #0: 12.480
[2024-11-19 22:44:28,844][02170] Avg episode reward: 28.980, avg true_objective: 12.480
[2024-11-19 22:44:28,856][02170] Num frames 2500...
[2024-11-19 22:44:28,994][02170] Num frames 2600...
[2024-11-19 22:44:29,125][02170] Num frames 2700...
[2024-11-19 22:44:29,271][02170] Num frames 2800...
[2024-11-19 22:44:29,415][02170] Num frames 2900...
[2024-11-19 22:44:29,542][02170] Num frames 3000...
[2024-11-19 22:44:29,682][02170] Num frames 3100...
[2024-11-19 22:44:29,815][02170] Num frames 3200...
[2024-11-19 22:44:29,938][02170] Num frames 3300...
[2024-11-19 22:44:30,065][02170] Num frames 3400...
[2024-11-19 22:44:30,199][02170] Num frames 3500...
[2024-11-19 22:44:30,344][02170] Num frames 3600...
[2024-11-19 22:44:30,474][02170] Num frames 3700...
[2024-11-19 22:44:30,630][02170] Avg episode rewards: #0: 29.253, true rewards: #0: 12.587
[2024-11-19 22:44:30,631][02170] Avg episode reward: 29.253, avg true_objective: 12.587
[2024-11-19 22:44:30,670][02170] Num frames 3800...
[2024-11-19 22:44:30,799][02170] Num frames 3900...
[2024-11-19 22:44:30,923][02170] Num frames 4000...
[2024-11-19 22:44:31,049][02170] Num frames 4100...
[2024-11-19 22:44:31,176][02170] Num frames 4200...
[2024-11-19 22:44:31,319][02170] Num frames 4300...
[2024-11-19 22:44:31,446][02170] Num frames 4400...
[2024-11-19 22:44:31,579][02170] Num frames 4500...
[2024-11-19 22:44:31,713][02170] Num frames 4600...
[2024-11-19 22:44:31,841][02170] Num frames 4700...
[2024-11-19 22:44:32,017][02170] Avg episode rewards: #0: 28.243, true rewards: #0: 11.992
[2024-11-19 22:44:32,019][02170] Avg episode reward: 28.243, avg true_objective: 11.992
[2024-11-19 22:44:32,027][02170] Num frames 4800...
[2024-11-19 22:44:32,157][02170] Num frames 4900...
[2024-11-19 22:44:32,303][02170] Num frames 5000...
[2024-11-19 22:44:32,430][02170] Num frames 5100...
[2024-11-19 22:44:32,558][02170] Num frames 5200...
[2024-11-19 22:44:32,693][02170] Num frames 5300...
[2024-11-19 22:44:32,820][02170] Num frames 5400...
[2024-11-19 22:44:32,887][02170] Avg episode rewards: #0: 24.810, true rewards: #0: 10.810
[2024-11-19 22:44:32,888][02170] Avg episode reward: 24.810, avg true_objective: 10.810
[2024-11-19 22:44:33,005][02170] Num frames 5500...
[2024-11-19 22:44:33,130][02170] Num frames 5600...
[2024-11-19 22:44:33,274][02170] Num frames 5700...
[2024-11-19 22:44:33,403][02170] Num frames 5800...
[2024-11-19 22:44:33,526][02170] Num frames 5900...
[2024-11-19 22:44:33,660][02170] Num frames 6000...
[2024-11-19 22:44:33,781][02170] Avg episode rewards: #0: 22.742, true rewards: #0: 10.075
[2024-11-19 22:44:33,782][02170] Avg episode reward: 22.742, avg true_objective: 10.075
[2024-11-19 22:44:33,853][02170] Num frames 6100...
[2024-11-19 22:44:33,978][02170] Num frames 6200...
[2024-11-19 22:44:34,106][02170] Num frames 6300...
[2024-11-19 22:44:34,240][02170] Num frames 6400...
[2024-11-19 22:44:34,376][02170] Num frames 6500...
[2024-11-19 22:44:34,499][02170] Num frames 6600...
[2024-11-19 22:44:34,628][02170] Num frames 6700...
[2024-11-19 22:44:34,796][02170] Avg episode rewards: #0: 21.402, true rewards: #0: 9.687
[2024-11-19 22:44:34,798][02170] Avg episode reward: 21.402, avg true_objective: 9.687
[2024-11-19 22:44:34,834][02170] Num frames 6800...
[2024-11-19 22:44:34,958][02170] Num frames 6900...
[2024-11-19 22:44:35,086][02170] Num frames 7000...
[2024-11-19 22:44:35,216][02170] Num frames 7100...
[2024-11-19 22:44:35,359][02170] Num frames 7200...
[2024-11-19 22:44:35,488][02170] Num frames 7300...
[2024-11-19 22:44:35,558][02170] Avg episode rewards: #0: 19.886, true rewards: #0: 9.136
[2024-11-19 22:44:35,560][02170] Avg episode reward: 19.886, avg true_objective: 9.136
[2024-11-19 22:44:35,676][02170] Num frames 7400...
[2024-11-19 22:44:35,809][02170] Num frames 7500...
[2024-11-19 22:44:35,929][02170] Num frames 7600...
[2024-11-19 22:44:36,052][02170] Num frames 7700...
[2024-11-19 22:44:36,181][02170] Num frames 7800...
[2024-11-19 22:44:36,324][02170] Num frames 7900...
[2024-11-19 22:44:36,449][02170] Num frames 8000...
[2024-11-19 22:44:36,576][02170] Num frames 8100...
[2024-11-19 22:44:36,704][02170] Num frames 8200...
[2024-11-19 22:44:36,853][02170] Avg episode rewards: #0: 19.957, true rewards: #0: 9.179
[2024-11-19 22:44:36,855][02170] Avg episode reward: 19.957, avg true_objective: 9.179
[2024-11-19 22:44:36,908][02170] Num frames 8300...
[2024-11-19 22:44:37,037][02170] Num frames 8400...
[2024-11-19 22:44:37,164][02170] Num frames 8500...
[2024-11-19 22:44:37,308][02170] Num frames 8600...
[2024-11-19 22:44:37,436][02170] Num frames 8700...
[2024-11-19 22:44:37,564][02170] Num frames 8800...
[2024-11-19 22:44:37,688][02170] Num frames 8900...
[2024-11-19 22:44:37,850][02170] Num frames 9000...
[2024-11-19 22:44:37,983][02170] Num frames 9100...
[2024-11-19 22:44:38,108][02170] Num frames 9200...
[2024-11-19 22:44:38,246][02170] Num frames 9300...
[2024-11-19 22:44:38,382][02170] Num frames 9400...
[2024-11-19 22:44:38,509][02170] Num frames 9500...
[2024-11-19 22:44:38,663][02170] Num frames 9600...
[2024-11-19 22:44:38,860][02170] Num frames 9700...
[2024-11-19 22:44:39,033][02170] Num frames 9800...
[2024-11-19 22:44:39,220][02170] Num frames 9900...
[2024-11-19 22:44:39,416][02170] Num frames 10000...
[2024-11-19 22:44:39,597][02170] Num frames 10100...
[2024-11-19 22:44:39,766][02170] Num frames 10200...
[2024-11-19 22:44:39,953][02170] Num frames 10300...
[2024-11-19 22:44:40,135][02170] Avg episode rewards: #0: 23.661, true rewards: #0: 10.361
[2024-11-19 22:44:40,137][02170] Avg episode reward: 23.661, avg true_objective: 10.361
[2024-11-19 22:45:45,510][02170] Replay video saved to /content/train_dir/default_experiment/replay.mp4!